parse_qsl method does not parse query strings without "=" #110843

@anthonyoteri

Bug report

Bug description:

According to RFC 3986, query strings containing name-only fields such as foo&bar&baz are valid. In this case, the presence of the parameter itself conveys information. However, parsing such a query string with strict_parsing=True raises a ValueError, and with strict_parsing=False (the default) these parameters are silently dropped.

Typically these name-only parameters are used for boolean options, where the presence of the parameter indicates something meaningful to the underlying service. Good API design practice would suggest changing the URL to the format "name=true" or something to that effect. However, we are not always in control of the URL and may need to parse query strings in this format.
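To illustrate what treating name-only parameters as flags could look like, here is a minimal sketch of a hand-rolled parser. The helper name parse_qsl_with_flags is hypothetical and not part of urllib.parse. It assumes "&" as the only separator and decodes fields with unquote_plus. It is not a proposed patch.

```python
from urllib.parse import unquote_plus


def parse_qsl_with_flags(qs, separator="&"):
    """Hypothetical helper (not in the stdlib): parse a query string into
    (name, value) pairs, using None as the value of name-only fields."""
    pairs = []
    for field in qs.split(separator):
        if not field:
            continue  # skip empty fields such as the one in "a&&b"
        name, has_eq, value = field.partition("=")
        # A field without "=" is kept as a flag with value None,
        # so "bar" stays distinguishable from "bar=".
        pairs.append((unquote_plus(name), unquote_plus(value) if has_eq else None))
    return pairs


print(parse_qsl_with_flags("foo=1&bar&baz="))
# [('foo', '1'), ('bar', None), ('baz', '')]
```

Using None rather than an empty string keeps "bar" and "bar=" apart, which is the distinction the report is asking about.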

The RFC says only the following about the query component:

The query component contains non-hierarchical data that, along with
data in the path component (Section 3.3), serves to identify a
resource within the scope of the URI's scheme and naming authority
(if any). The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.

  query       = *( pchar / "/" / "?" )

The characters slash ("/") and question mark ("?") may represent data
within the query component. Beware that some older, erroneous
implementations may not handle such data correctly when it is used as
the base URI for relative references (Section 5.1), apparently
because they fail to distinguish query data from path data when
looking for hierarchical separators. However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.

Nowhere does this text state that the query must consist of name-value pairs with "=" as the separator character.
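As a quick sanity check of that reading, the grammar quoted above can be transcribed into a regular expression. This is only a sketch: the pchar expansion (unreserved, pct-encoded, sub-delims, ":" and "@") comes from RFC 3986 and is not part of the excerpt, and the function name is_valid_query is made up for illustration.

```python
import re

# query = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
QUERY_RE = re.compile(
    r"(?:[" + UNRESERVED + SUB_DELIMS + r":@/?]|%[0-9A-Fa-f]{2})*"
)


def is_valid_query(query):
    """Return True if the whole string matches the RFC 3986 query rule."""
    return QUERY_RE.fullmatch(query) is not None


print(is_valid_query("foo&bar&baz"))  # True
print(is_valid_query("foo=1&bar"))    # True
print(is_valid_query("foo bar"))      # False: a raw space is not allowed
```

Both "&" and "=" are ordinary sub-delims at this level, so the generic syntax accepts fields with or without "=".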

Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import parse_qsl
>>> parse_qsl("foo=1&bar")
[('foo', '1')]
>>> parse_qsl("foo=1&bar", strict_parsing=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.11/urllib/parse.py", line 780, in parse_qsl
    raise ValueError("bad query field: %r" % (name_value,))
ValueError: bad query field: 'bar'
>>>
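
For comparison, a partial workaround that is not part of the report above: parse_qsl accepts a keep_blank_values argument, and with it set to True the name-only field is kept with an empty string as its value. A short sketch:

```python
from urllib.parse import parse_qsl

# keep_blank_values=True retains fields whose value is empty,
# including name-only fields that have no "=" at all.
pairs = parse_qsl("foo=1&bar", keep_blank_values=True)
print(pairs)  # [('foo', '1'), ('bar', '')]

# The result is the same when the "=" is present, so the two
# spellings cannot be told apart after parsing.
same = parse_qsl("foo=1&bar=", keep_blank_values=True)
print(same)  # [('foo', '1'), ('bar', '')]
```

This does not help with strict_parsing=True, which still raises the ValueError shown in the traceback above, and it cannot distinguish "bar" from "bar=".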

CPython versions tested on:

3.11

Operating systems tested on:

Linux

Labels:

stdlib, type-bug
