gh-82151: Improve docs for urllib.parse #18631
idomic commented Feb 23, 2020 (edited by AlexWaygood)
- Issue: urllib.parse docstrings incomplete #82151
Tagging @taleinat
The fix here is good and important.
However, this does not address the central issue brought up in bpo-37970, regarding this line of the docs being misleading, since the parts of the netloc are available as additional attributes on the returned named-tuple object:
The components are not broken up in smaller parts (for example, the network location is a single string)
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase
You're right, I don't know why I missed this part. Fixed for both urlsplit and urlparse.
I have made the requested changes; please review again
Thanks for making the requested changes! @taleinat: please review the changes made to this pull request.
I think this change to the documentation is still not clear enough. Writing "The components are broken up into smaller parts (for example, ...)" is confusing, since for example the query is not broken down into its parts.
What really happens is that the parts as described in the beginning are used as-is for the named-tuple fields. Additionally, only the netloc is broken down into username, password, hostname, and port, and those are added as additional attributes on the returned object. Further, these extra attributes are only accessible by name, but not by index - they are not technically part of the named-tuple's fields.
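The behavior described above can be checked with a short sketch. The URL below is made up for illustration; only `urlparse` and the attributes documented for its result are used.

```python
from urllib.parse import urlparse

# Example URL chosen for illustration only.
result = urlparse("https://user:secret@example.com:8443/a/b?x=1&y=2#frag")

# The named tuple has exactly six fields, reachable by index or by name.
print(result._fields)
print(len(result))   # 6
print(result[1])     # 'user:secret@example.com:8443' (netloc stays one string)
print(result.query)  # 'x=1&y=2' (the query is not broken down further)

# The decoded netloc parts are extra attributes: available by name only,
# and not part of the tuple's fields.
print(result.username, result.password, result.hostname, result.port)
print("hostname" in result._fields)  # False
```

Iterating over or indexing the result therefore never yields `username`, `password`, `hostname`, or `port`; they have to be read as attributes.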
We need to convey this clearly and concisely, which needs some more thought. The currently suggested wording doesn't achieve this.
@taleinat I've changed one description; once we decide on the format, I'll replicate it for the other function.
@taleinat ping on that
Apologies for the delay, @idomic. What you've written is better, I like the direction :) I suggest omitting the first sentence, which I find redundant, changing the order of the sentences, and changing "expanded" to "decoded":
What do you think?
@taleinat Agree, I like the decoded change, and the first paragraph. Additionally, the netloc property is broken down into these additional attributes in the returned object: username, password, hostname, and port.
I have made the requested changes; please review again
@taleinat Added the urlsplit() changes; let me know what you think.
Two small comments.
Also, please make sure to wrap the lines at 80 characters, in keeping with [the style guide](https://devguide.python.org/documenting/#style-guide).
Doc/library/urllib.parse.rst
Outdated
This should generally be used instead of :func:`urlparse` if the more recent URL
syntax allowing parameters to be applied to each segment of the *path* portion
of the URL (see :rfc:`2396`) is wanted. A separate function is needed to
separate the path segments and parameters. This function returns a 5-item
:term:`named tuple`::

   (addressing scheme, network location, path, query, fragment identifier).

separate the path segments and parameters.
I suggest keeping these lines together with the first line ("This is similar..."), like it was before.
Doc/library/urllib.parse.rst
Outdated
The delimiters as shown above are not part of the result, except for a leading slash in the path
component, which is retained if present.

Additionally, the netloc property is broken down into these additional attributes added to
the returned object: username, password, hostname, and port.
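A minimal sketch of the delimiter rule quoted in this hunk, using an illustrative URL that is not taken from the PR: the `://`, `?`, and `#` separators are dropped, while the leading slash of the path is kept.

```python
from urllib.parse import urlsplit

# Example URL chosen for illustration only.
parts = urlsplit("http://example.com/docs/index.html?lang=en#intro")

print(parts.scheme)    # 'http' (no trailing '://')
print(parts.netloc)    # 'example.com'
print(parts.path)      # '/docs/index.html' (leading slash retained)
print(parts.query)     # 'lang=en' (no leading '?')
print(parts.fragment)  # 'intro' (no leading '#')
```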
This part of the urlparse docs is still not wrapped to 79 characters.
I have made the requested changes; please review again
Looking into this years-old PR.
The function docstrings were adjusted in #16458. What remains are updates to the reference docs (this PR).
As I see it, the author has done all the fixes requested.
Added a few comments/suggestions. But in general, it looks good to me.
Maybe @zware @taleinat @encukou could have a look? @idomic, are you still around to potentially adjust?
(Looking into this as part of EuroPython 2024 sprint - I'm just a user, not maintainer, fyi.)
Additionally, the netloc property is broken down into these additional
attributes added to the returned object: username, password, hostname,
and port.
Same comment as above, for formatting the fields.
recent URL syntax allowing parameters to be applied to each segment of the
*path* portion of the URL (see :rfc:`2396`) is wanted. A separate function
is needed to separate the path segments and parameters.
This function returns a 5-item :term:`named tuple`::
The paragraph here feels a bit confusing to the reader. Could it be rephrased a little, with an example given? For instance:
The function is similar to :func:`urlparse`, but does not split the parameters from the
URL. This is useful if the given :attr:`url` follows the alternative URL syntax that allows parameters
per path segment (e.g. ``http://hostname/path;arg=value/bar``, see :rfc:`2396`).
In that case, the path segments and parameters need to be separated with a separate function.
The function returns a 5-item :term:`named tuple`::
I'd also like to see an explicit mention there, that the "separate function" referenced there is not provided in the stdlib. Now it just leaves the reader dry on that ("okay, which function???").
On the other hand, RFC 2396 (from 1998) has been obsoleted by RFC 3986 (in 2005), and the latter no longer has the query-params-per-path-segment format in the spec (https://stackoverflow.com/a/6548553/165629).
I wonder how useful it is to reference a spec which is no longer valid and which (probably few people) use. Could the paragraph be lifted altogether?
Then again, it makes perfect sense to explain why there are both urlparse and urlsplit, when/if the only difference is parsing the parameters vs. not (I wonder what reasoning remains; it doesn't do real harm to try to parse the ? query parameters when they don't exist).
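To make the difference discussed in this comment concrete, here is a small sketch. The URLs are illustrative, and `split_segment_params` is a hypothetical helper written only for this example; as noted above, the standard library does not provide such a function.

```python
from urllib.parse import urlparse, urlsplit

# urlparse splits ';' parameters off the last path segment only;
# urlsplit leaves them inside the path.
url = "http://hostname/path/bar;arg=value"
parsed = urlparse(url)
split = urlsplit(url)
print(parsed.path, parsed.params)  # '/path/bar' 'arg=value'
print(split.path)                  # '/path/bar;arg=value'


def split_segment_params(path):
    """Hypothetical helper (not in the stdlib): pair each path segment
    with the raw parameter string following its first ';', if any."""
    pairs = []
    for segment in path.split("/"):
        name, _, params = segment.partition(";")
        pairs.append((name, params))
    return pairs


# With parameters on a non-final segment, urlparse reports no params,
# so the path from urlsplit has to be post-processed by hand.
per_segment_url = "http://hostname/path;arg=value/bar"
print(urlparse(per_segment_url).params)  # ''
segments = split_segment_params(urlsplit(per_segment_url).path)
print(segments)  # [('', ''), ('path', 'arg=value'), ('bar', '')]
```

The leading `('', '')` pair comes from the empty string before the path's leading slash; a real helper might choose to drop it.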
Additionally, the netloc property is broken down into these additional
attributes added to the returned object: username, password, hostname, and
port.
The same information is given in the below table, so I'm not sure how necessary this paragraph is?
I don't see harm in having it, though.
The field names should be formatted:
Additionally, the netloc property is broken down into these additional
attributes added to the returned object: :attr:`username`, :attr:`password`, :attr:`hostname`,
and :attr:`port`.