Feature Request
Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":
pydantic version: 1.3
pydantic compiled: True
install path: /usr/local/lib/python3.8/site-packages/pydantic
python version: 3.8.1 (default, Jan 3 2020, 22:55:55) [GCC 8.3.0]
platform: Linux-4.9.184-linuxkit-x86_64-with-glibc2.2.5
optional deps. installed: ['typing-extensions']
The IANA has approved many TLDs that are not matched by the TLD domain ending regex used for HttpUrl validation. There are currently ~152 such TLDs: see the entries in the authoritative list of TLDs containing the ASCII Compatible Encoding prefix xn--.
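To illustrate the xn-- ASCII Compatible Encoding mentioned above, here is a minimal sketch using Python's built-in idna codec (which implements the older IDNA 2003 rules). The three TLDs are the ones used later in this issue; the variable names are only illustrative.

```python
# ACE (ASCII Compatible Encoding) forms of three internationalized TLDs.
ace_tlds = ["xn--p1ai", "xn--vermgensberatung-pwb", "xn--zfr164b"]

# Decode each ACE label to its Unicode form with the built-in "idna" codec.
unicode_tlds = [tld.encode("ascii").decode("idna") for tld in ace_tlds]

# Encoding the Unicode form again should give back the original ACE label.
round_trip = [tld.encode("idna").decode("ascii") for tld in unicode_tlds]

for ace, uni in zip(ace_tlds, unicode_tlds):
    print(f"{ace} <-> {uni}")
```

Both spellings refer to the same TLD, so a validator that accepts one form should arguably accept the other.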
One approach to adding compatibility for such TLDs would be to modify the domain ending regex pattern to allow for Unicode characters, as well as the corresponding internationalized ASCII strings. For example:
_domain_ending = r"(?P<tld>(\.[^\W\d_]{2,63})|(\.(?:xn--)[_0-9a-z-]{2,63}))?\.?"

Such a change would allow for the following to run successfully:

from pydantic import BaseModel, HttpUrl, ValidationError


class Domain(BaseModel):
    domain: HttpUrl


ascii_domains = ["https://example.com"]
idna_domains = [
    "https://example.xn--p1ai",
    "https://example.xn--vermgensberatung-pwb",
    "https://example.xn--zfr164b",
]
unicode_domains = [str.encode(domain).decode("idna") for domain in idna_domains]
valid_domains = ascii_domains + idna_domains + unicode_domains
invalid_domains = ["https://example.123", "https://example.ab34"]

for domain in valid_domains:
    Domain(domain=domain)

for invalid_domain in invalid_domains:
    try:
        Domain(domain=invalid_domain)
    except ValidationError:
        pass

Would you accept a PR for this?
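For a quick check that does not depend on pydantic, the proposed pattern can also be exercised on its own with the standard re module. This is only a sketch: the helper name tld_matches is hypothetical, and using fullmatch() on the bare domain ending is an assumption made for testing the fragment in isolation, not how pydantic applies the pattern inside its full URL regex.

```python
import re

# Pattern copied verbatim from the proposal above.
DOMAIN_ENDING = re.compile(
    r"(?P<tld>(\.[^\W\d_]{2,63})|(\.(?:xn--)[_0-9a-z-]{2,63}))?\.?"
)


def tld_matches(ending: str) -> bool:
    """Return True if `ending` (e.g. ".com") is fully matched with a TLD group.

    Hypothetical helper: the whole string must match and the optional
    `tld` group must actually have captured something.
    """
    match = DOMAIN_ENDING.fullmatch(ending)
    return match is not None and match.group("tld") is not None


# ASCII, ACE (xn--) and Unicode endings, followed by two invalid ones.
for ending in [".com", ".xn--p1ai", ".рф", ".com.", ".123", ".ab34"]:
    print(ending, tld_matches(ending))
```

The first alternative, [^\W\d_], matches any Unicode letter, while the second accepts the xn-- form; endings that contain digits without the xn-- prefix match neither.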