gh-82151: Improve docs for urllib3.parse #18631

Open
wants to merge 15 commits into
base: main

Conversation

idomic
Contributor

@idomic idomic commented Feb 23, 2020

@idomic
Contributor Author

idomic commented Mar 1, 2020

Tagging @taleinat

Contributor

@taleinat taleinat left a comment

The fix here is good and important.

However, this does not address the central issue brought up in bpo-37970, regarding this line of the docs being misleading, since the parts of the netloc are available as additional attributes on the returned named-tuple object:

The components are not broken up in smaller parts (for example, the network location is a single string)

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@idomic
Contributor Author

idomic commented Mar 15, 2020

The fix here is good and important.

However, this does not address the central issue brought up in bpo-37970, regarding this line of the docs being misleading, since the parts of the netloc are available as additional attributes on the returned named-tuple object:

The components are not broken up in smaller parts (for example, the network location is a single string)

You're right, I don't know why I missed this part. Fixed for both urlsplit and urlparse.

@idomic
Contributor Author

idomic commented Mar 15, 2020

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@taleinat: please review the changes made to this pull request.

Contributor

@taleinat taleinat left a comment

I think this change to the documentation is still not clear enough. Writing "The components are broken up into smaller parts (for example, ...)" is confusing, since for example the query is not broken down into its parts.

What really happens is that the parts as described in the beginning are used as-is for the named-tuple fields. Additionally, only the netloc is broken down into username, password, hostname, and port, and those are added as additional attributes on the returned object. Further, these extra attributes are only accessible by name, but not by index - they are not technically part of the named-tuple's fields.
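For illustration, a minimal sketch of the behavior described here. The URL is made up; only `urllib.parse.urlparse` from the standard library is used:

```python
from urllib.parse import urlparse

# Example URL invented for illustration.
result = urlparse("https://user:secret@example.com:8443/a/b?x=1&y=2#frag")

# The six named-tuple fields are the top-level components, used as-is.
print(result._fields)
# ('scheme', 'netloc', 'path', 'params', 'query', 'fragment')
print(len(result))   # 6
print(result[1])     # 'user:secret@example.com:8443' (netloc, by index)
print(result.query)  # 'x=1&y=2' (the query is not broken down further)

# The netloc sub-parts are extra attributes: reachable by name,
# but not by index, and not listed in _fields.
print(result.username, result.password, result.hostname, result.port)
# user secret example.com 8443
```

Note that `port` comes back as an integer, while the other three are strings.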

We need to convey this clearly and concisely, which needs some more thought. The currently suggested wording doesn't achieve this.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@idomic
Contributor Author

idomic commented Mar 22, 2020

@taleinat I've changed one description. Once we decide on the format, I'll apply the same wording to the other function.

@idomic
Contributor Author

idomic commented May 17, 2020

@taleinat ping on that

@taleinat
Contributor

Apologies for the delay, @idomic.

What you've written is better, I like the direction :)

I suggest omitting the first sentence, which I find redundant, changing the order of the sentences, and changing "expanded" to "decoded":

The delimiters as shown above
are not part of the result, except for a leading slash in the path component, which is
retained if present.

Additionally, the netloc item is broken down into: username, password, hostname, and port. These are added as additional attributes of the returned object.

% escapes are not decoded.

For example: ...

What do you think?
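As a quick check of the proposed wording, here is a small sketch (the URL is invented for illustration) showing that the delimiters are dropped, the leading slash of the path is kept, and % escapes are left undecoded:

```python
from urllib.parse import urlsplit, unquote

# Example URL invented for illustration.
parts = urlsplit("http://example.com/a%20b/c?q=x%26y#sec%201")
scheme, netloc, path, query, fragment = parts

# Delimiters ("://", "?", "#") are not part of the result;
# the leading slash of the path is retained.
print(scheme)    # 'http'
print(netloc)    # 'example.com'
print(path)      # '/a%20b/c'
print(query)     # 'q=x%26y'
print(fragment)  # 'sec%201'

# Percent-escapes are left as-is; decoding is a separate, explicit step.
decoded_path = unquote(path)
print(decoded_path)  # '/a b/c'
```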

@idomic
Contributor Author

idomic commented May 24, 2020

@taleinat Agree, I like the decoded change, and the first paragraph.
What do you think about:

Additionally, the netloc property is broken down into these additional attributes in the returned object: username, password, hostname, and port.

@idomic
Contributor Author

idomic commented Jun 5, 2020

@idomic, please review the urlsplit() docs as currently suggested by this PR much more thoroughly.

@taleinat I think I've added the missing descriptions and adapted the text to fit urlsplit. I also tried to remove redundant text.

@idomic
Contributor Author

idomic commented Jun 5, 2020

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@taleinat: please review the changes made to this pull request.

@idomic
Contributor Author

idomic commented Jun 20, 2020

@taleinat Added the urlsplit() changes; let me know what you think.

@csabella csabella requested review from taleinat and removed request for taleinat June 22, 2020 23:44
Contributor

@taleinat taleinat left a comment

Two small comments.

Also, please make sure to wrap the lines at 80 characters, in keeping with [the style guide](https://devguide.python.org/documenting/#style-guide).

Doc/library/urllib.parse.rst (outdated)
Comment on lines 279 to 282
This should generally be used instead of :func:`urlparse` if the more recent URL
syntax allowing parameters to be applied to each segment of the *path* portion
of the URL (see :rfc:`2396`) is wanted. A separate function is needed to
separate the path segments and parameters. This function returns a 5-item
:term:`named tuple`::

(addressing scheme, network location, path, query, fragment identifier).
separate the path segments and parameters.
Contributor

I suggest keeping these lines together with the first line ("This is similar..."), like it was before.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Comment on lines 46 to 50
The delimiters as shown above are not part of the result, except for a leading slash in the path
component, which is retained if present.

Additionally, the netloc property is broken down into these additional attributes added to
the returned object: username, password, hostname, and port.
Contributor

This part of the urlparse docs is still not wrapped to 79 characters.

@idomic
Contributor Author

idomic commented Jun 28, 2020

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@taleinat: please review the changes made to this pull request.

farazs-github pushed a commit to MediaTek-Labs/cpython that referenced this pull request Nov 12, 2021
@AlexWaygood AlexWaygood changed the title bpo-37970: Added documentation fixes bpo-37970: Improve docs for urllib3.parse Oct 30, 2022
@AlexWaygood AlexWaygood changed the title bpo-37970: Improve docs for urllib3.parse gh-82151: Improve docs for urllib3.parse Oct 30, 2022

@tuukkamustonen tuukkamustonen left a comment

Looking into this years-old PR.

The function docstrings were adjusted in #16458. What remains are updates to the reference docs (this PR).

As I see it, the author has done all the fixes requested.

Added a few comments/suggestions. But in general, it looks good to me.

Maybe @zware @taleinat @encukou could have a look? @idomic, are you still around to potentially adjust?

(Looking into this as part of EuroPython 2024 sprint - I'm just a user, not maintainer, fyi.)


Additionally, the netloc property is broken down into these additional
attributes added to the returned object: username, password, hostname,
and port.

Same comment as above, for formatting the fields.

recent URL syntax allowing parameters to be applied to each segment of the
*path* portion of the URL (see :rfc:`2396`) is wanted. A separate function
is needed to separate the path segments and parameters.
This function returns a 5-item :term:`named tuple`::

The paragraph here feels a bit confusing to the reader. Could it be rephrased a little, with an example given? For instance:

The function is similar to :func:`urlparse`, but does not split the parameters from the
URL. This is useful if the given *url* follows the alternative URL syntax that allows parameters
per path segment (e.g. ``http://hostname/path;arg=value/bar``, see :rfc:`2396`).
In that case, the path segments and parameters need to be separated with a separate function.

The function returns a 5-item :term:`named tuple`::

I'd also like to see an explicit mention that the "separate function" referenced there is not provided in the stdlib. As written, it leaves the reader hanging ("okay, which function???").

On the other hand, RFC 2396 (from 1998) has been obsoleted by RFC 3986 (in 2005), and the latter no longer has the query-params-per-path-segment format in the spec (https://stackoverflow.com/a/6548553/165629).

I wonder how useful it is to reference a spec which is no longer valid and which probably few people use. Could the paragraph be dropped altogether?

Then again, it makes perfect sense to explain why there are both urlparse and urlsplit, if the only difference is whether the parameters are parsed or not. (I wonder what reasoning remains, though: it doesn't do real harm to try to parse the `;` parameters when they don't exist.)
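To make the difference concrete, a small sketch under stated assumptions: the URLs are invented, and `split_segment_params` is a hypothetical helper written here for illustration, not something the standard library provides.

```python
from urllib.parse import urlparse, urlsplit

# Example URLs invented for illustration.
last_segment_url = "http://hostname/path/bar;arg=value"
per_segment_url = "http://hostname/path;arg=value/bar"

# urlparse() splits parameters off the last path segment only.
parsed = urlparse(last_segment_url)
print(parsed.path, parsed.params)  # /path/bar arg=value

# urlsplit() has no params field; the parameters stay in the path.
split = urlsplit(last_segment_url)
print(split.path)  # /path/bar;arg=value

# Parameters on an earlier segment are left in the path by both functions.
print(urlparse(per_segment_url).path)  # /path;arg=value/bar
print(urlsplit(per_segment_url).path)  # /path;arg=value/bar


def split_segment_params(path):
    """Hypothetical helper (not in the stdlib): split each path segment
    into a (name, params) pair at the first ';'."""
    segments = []
    for segment in path.split("/"):
        name, _, params = segment.partition(";")
        segments.append((name, params))
    return segments


# The leading slash yields an empty first segment.
print(split_segment_params(urlsplit(per_segment_url).path))
# [('', ''), ('path', 'arg=value'), ('bar', '')]
```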


Additionally, the netloc property is broken down into these additional
attributes added to the returned object: username, password, hostname, and
port.

@tuukkamustonen tuukkamustonen Jul 14, 2024

The same information is given in the below table, so I'm not sure how necessary this paragraph is?

I don't see harm in having it, though.

The field names should be formatted:

   Additionally, the netloc property is broken down into these additional
   attributes added to the returned object: :attr:`username`, :attr:`password`, :attr:`hostname`,
   and :attr:`port`.
