Allow unicode domain name and path #423

Open
wants to merge 1 commit into base: master

Conversation

LiuQhahah

Problem

The current Java version of this library fails to recognize URLs that contain Unicode characters, even though browsers support such URLs and they can be registered and used. For instance, "http://www.詹姆斯.com/詹姆斯" is not identified as a valid URL. The cause is that the regular expressions in the Java code do not accept Unicode characters as valid components of domain names and paths.

Solution

To address this issue, I have extended the regular expressions used for URL validation in the Java code. Specifically, I added the Unicode property classes \p{L} (letters) and \p{M} (combining marks) to the regular expressions that validate the domain name and path of the URL, so that the library can identify and validate URLs containing Unicode characters.

Result

With these changes, the library can now correctly identify URLs that include Unicode characters in their domain name or path as valid URLs. For example, a URL like "http://www.詹姆斯.com/詹姆斯" will now be correctly identified as a valid URL. This enhancement broadens the range of URLs that the library can recognize and validate, aligning it more closely with the behavior of modern web browsers.
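
For illustration, here is a minimal sketch of the idea, not the library's actual regular expressions. The class name `UnicodeUrlSketch`, both patterns, and the helper methods are hypothetical and deliberately simplified; they only show how adding `\p{L}` and `\p{M}` to the domain-label and path character classes changes the outcome for the example URL.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified sketch: these are NOT the library's real patterns.
public class UnicodeUrlSketch {

    // Assumed stand-in for the old behaviour: ASCII-only domain labels and path.
    private static final Pattern ASCII_URL = Pattern.compile(
        "https?://[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)+(/[a-zA-Z0-9._~%/-]*)?");

    // Same shape, with \p{L} (any letter) and \p{M} (combining marks)
    // allowed in the domain labels and in the path.
    private static final Pattern UNICODE_URL = Pattern.compile(
        "https?://[\\p{L}\\p{M}0-9-]+(\\.[\\p{L}\\p{M}0-9-]+)+(/[\\p{L}\\p{M}0-9._~%/-]*)?");

    static boolean matchesAscii(String url) {
        return ASCII_URL.matcher(url).matches();
    }

    static boolean matchesUnicode(String url) {
        return UNICODE_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        // "http://www.詹姆斯.com/詹姆斯", written with Unicode escapes so the
        // source file compiles regardless of the compiler's default encoding.
        String url = "http://www.\u8A79\u59C6\u65AF.com/\u8A79\u59C6\u65AF";

        System.out.println(matchesAscii(url));   // false
        System.out.println(matchesUnicode(url)); // true
    }
}
```

Plain ASCII URLs such as "http://example.com/path" match both patterns in this sketch, since `\p{L}` also covers the ASCII letters.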

CLAassistant commented Apr 26, 2024

CLA assistant check
All committers have signed the CLA.

LiuQhahah changed the title from "add unicode domain name and path" to "Allow unicode domain name and path" on Apr 26, 2024