Support uppercase characters in host #1441

@xZise

Please confirm the following

  • I understand this is open source software provided for free and that I might not receive a timely response.
  • I am positive I am NOT reporting a (potential) security vulnerability, to the best of my knowledge. (These must be shared by submitting this report form instead, if any hesitation exists.)
  • I am willing to submit a pull request with reproducers as xfailing test cases or even an entire fix. (Assign this issue to me.)

Describe the bug

When using uppercase characters in the hostname they get reported as invalid:

>>> URL.build(scheme="http", host="A", port=port, path="/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\yarl\_url.py", line 386, in build
    _host = _encode_host(host, validate_host=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\yarl\_url.py", line 1496, in _encode_host
    raise ValueError(
ValueError: Host 'A' cannot contain 'A' (at position 0)
>>> URL.build(scheme="http", host="a", port=port, path="/")
URL('http://a/')

It appears to me that the host needs to be converted to lowercase before it is validated. The host property says the value gets converted to lowercase (which is the case when using __init__()), and the comment at NOT_REG_NAME mentions that the pattern is only used on lower-cased ASCII values:

# this pattern matches anything that is *not* in those classes. and is only used
# on lower-cased ASCII values.

When a non-ASCII string is passed as host, it gets encoded, so it seems odd that uppercase ASCII strings aren't "encoded" into lowercase in the same way. #386 is related, but it uses __init__(), which works correctly.
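
To illustrate the idea, here is a minimal sketch of lowercasing before validation. It is not yarl's actual code: the function name encode_host_sketch and the pattern _NOT_REG_NAME_SKETCH are hypothetical, and the pattern is a simplified approximation of an RFC 3986 reg-name check rather than a copy of NOT_REG_NAME.

```python
import re

# Hypothetical, simplified stand-in for a "not reg-name" pattern: it matches
# any character outside lowercase ASCII letters, digits, "%" and the RFC 3986
# unreserved / sub-delims punctuation. Like the comment quoted above says,
# it is only meaningful on lower-cased ASCII values.
_NOT_REG_NAME_SKETCH = re.compile(r"[^a-z0-9\-._~%!$&'()*+,;=]")


def encode_host_sketch(host: str) -> str:
    """Lowercase an ASCII host first, then validate the lowered value."""
    if not host.isascii():
        # Non-ASCII hosts would need IDNA encoding, which this sketch omits.
        raise NotImplementedError("non-ASCII hosts are out of scope here")
    lowered = host.lower()
    invalid = _NOT_REG_NAME_SKETCH.search(lowered)
    if invalid is not None:
        raise ValueError(
            f"Host {host!r} cannot contain {invalid.group()!r} "
            f"(at position {invalid.start()})"
        )
    return lowered
```

With this ordering, encode_host_sketch("A") returns "a" instead of raising, while a host such as "a b" is still rejected because of the space.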

To Reproduce

  1. Install yarl
  2. Call yarl.URL.build(scheme="http", host="A", port=port, path="/")

Expected behavior

A valid URL that is identical to the URL built with the lowercase host.

Logs/tracebacks

>>> URL.build(scheme="http", host="A", port=port, path="/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\yarl\_url.py", line 386, in build
    _host = _encode_host(host, validate_host=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\yarl\_url.py", line 1496, in _encode_host
    raise ValueError(
ValueError: Host 'A' cannot contain 'A' (at position 0)
>>> URL.build(scheme="http", host="a", port=port, path="/")
URL('http://a/')

Python Version

$ python --version
Python 3.12.2

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.1.0
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: ..\Lib\site-packages
Requires:
Required-by: yarl

propcache Version

$ python -m pip show propcache
Name: propcache
Version: 0.2.0
Summary: Accelerated property cache
Home-page: https://github.com/aio-libs/propcache
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: ..\Lib\site-packages
Requires:
Required-by: yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.18.0
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: ..\Lib\site-packages
Requires: idna, multidict, propcache
Required-by:

OS

Windows 10

Additional context

No response
