-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
request.full_url: unexpected results on assignment #61474
Comments
When assigning a value to an already instantiated Request object's full_url, unexpected results are found as a consequence in the attributes selector, type and fragment. Selector, type and fragment are only assigned to during instantiation. Unless you know to call Request._parse() after assignment to Request.full_url, the other attributes will not be reassigned new values. The attached patch changes full_url to be a @Property, essentially moving what was going on in the constructor into the property method(s). All tests pass. I ran into this issue when working on an OAuth 2.0 client library when attempting to mutate an already instantiated Request object, injecting the access_token into the query params. Sandboxed code to repro the issue below: In [1]: from urllib.request import Request # as expected In [5]: r.full_url = 'https://example.com?foo=bar&access_token=token_value#baz' # unexpected results in selector In [7]: Request('https://example.com?foo=bar&access_token=token_value#baz') # these results should match that of the full_url setter |
I also meant to mention that of course, the obvious workaround would simply be to instantiate a new Request object with the new URL, but in my mind, this is something that should likely be supported by the Request object itself. |
Well, full_url was not designed to be assignable, and is documented that way (the docs say it is the URL passed to the constructor). Whether or not it is reasonable to make it assignable is an interesting question. |
It could entirely be my lack of being Dutch that didn't make this obvious without reading the docs ;) I guess there's one of three ways that this could go: To help clarify the intention through code, either a) Change full_url to _full_url, indicating that this should be treated as a private member and only expose the url through get_full_url(). This will obviously have the negative side effect of breaking backwards compatibility for anyone using full_url directly. b) Keep the property implementation in the patch, but replace the setter with a read-only exception. And the third option is what's in this patch (there are likely other options that I'm just not seeing at the moment as well). Having said all that, if the mutability of full_url is made explicit, then safeguards should also be put in place for the rest of the attributes. I couldn't think of any hard reason as to why the state of a Request instance /shouldn't/ be mutable and the user should be required to instantiate a new Request in order to use a new URL. |
One of those other options is: do nothing :) Python doesn't normally make things read-only just because mutating them does nothing useful. Sometimes we make things read-only when mutating them does nasty stuff, but even then sometimes we don't. So the real question is, do others consider it a sensible and useful API change to make setting it do something useful, and how many other changes would be required to make the rest of the API consistent with that change? We may be in python-ideas territory here, I'm not sure. |
Yes, I realized that I had forgotten to add the "do nothing" option after posting but figured it was relatively obvious :) "Python doesn't normally make things read-only just because mutating them does nothing useful. Sometimes we make things read-only when mutating them does nasty stuff, but even then sometimes we don't." I realize that this is a higher level question and outside of the scope of this particular issue (and likely belonging more in python-ideas), but doesn't this somewhat contradict "There should be one-- and preferably only one --obvious way to do it."? I'd assume that this should not only apply to sandboxed object implementations, but also at a higher level across /all/ objects. From your post, I assume inconsistency between modules, which would imply non-obvious ways to "do something" when moving from one module (or object) implementation to the next. I'm definitely interested to hear whether or not others would find this change useful as I do. There /should/ be no other changes required for consistency as no other attributes of the Request class that don't already implement assignment methods (i.e. "data") are affected by side effects within __init__ (or anywhere else). |
I believe the issue of reusing request objects after modification has come up before, either on the tracker (but I cannot find an issue) or elsewhere. Senthil may remember better and certainly may have an opinion. I agree that python-idea would be a better place for other opinions. |
Sorry for taking long time to respond. full_url has been in the shape it is for many versions, changing it in backwards incompatible way be do more harm than good.
|
No worries. The change is not backwards incompatible. test_urllib2 test pass without any modification (I'm getting a segfault in test_heapq atm so can't run the full suite). I've simply moved the side effects cause by __init__ to a setter so that full_url may be set after instantiation and will still incur the same results as after initial creation. The biggest problem with this particular attribute and the way that it's currently handled is inconsistent with the rest of the class. The only other attribute that incurs side effects when set (data) is implemented with @Property getter and setters. When set, it incurs side effects on the headers (removing Content-Length). Unless I'm missing something, other attributes are directly mutable and do not incur side effects on instance state when set. In my mind, if full_url is publicly accessible, then it should be publicly accessible /and/ settable. It currently is not without causing invalid state within a given Request instance. |
Having looked at the current handling of the data attribute in the context of another issue, I am now inclined to agree with you that full_url should be updated in order to have the API have a consistent behavior. Although it isn't backward incompatible API wise, it is possible for existing code to depend on the other values *not* being recomputed (presumably unintentionally, since that would be a rather odd thing to do :), so I still think the change should only be made in 3.4. I'm open to argument on that, though. |
Oh, data being a property is 3.4 only. So if consistency is the goal, this should definately be a 3.4 only change. |
Based on your and Andrew's comment in issue bpo-16464 (new behaviour so it should only apply to 3.4), I agree that this should be a 3.4-only change. |
Patch updated (acks update, fixed order) per Terry's instructions on core-mentorship. |
Thanks for working on this, Demian. I made some review comments, mostly style things about the tests. There's one substantial comment about the change in behaivor of the full_url property though (before patch it does not include the fragment, after the patch it does). We need to think about the implications of that change in terms of backward compatibility. It makes more sense, but how likely is it to break working code? |
Senthil, are you saying (in the review) that we should treat the current return value of full_url as a bug, and fix it in maintenance releases? It is certainly true that its value does not match the documentation. Ah. I see that get_full_url used to have the same bug before you fixed it in 3.1. So I guess I agree with you :) |
As suggested by Senthil, I've broken this up into two patches: One that implements this reusable Request (this one) and one that implements the new (consistent) behaviour, in having full_url return the fragment. I will also add a bug/patch requesting the deprecation of get_full_url as it will simply be a passthrough to full_url. |
New changeset 2d4189e9bbe8 by Senthil Kumaran in branch 'default': |
I have committed the first patch which makes Request.full_url a descriptor. The change for .full_url not to include fragments was introduced in 63817:bf3359b7ed2e which says that for reasons that HTTP request should not include fragments, it was done. That's correct. Now, I would fear to introduce that bug again with the second patch wherein we inadvertently send a URL with fragment in a HTTP request. To ensure this will not be the case, I think, increase in test coverage, understanding and documenting the exact expectation will be necessary if we have to change Request.full_url behavior. I will be spending a little more time on it. I thought I will write down the points which should be taken care. |
bpo-8280 reports the error by way of urlopen(). In light of that, would it not make more sense to have the *Opener be responsible for determining which parts of the url should be used? i.e. request.py, line ~1255: r.url = req.get_full_url() might instead be r.url = req.full_url.split('#')[0] To me, it would make more sense to have the client code (in this case, meaning the consumer of the Request object, which is the *Opener) be smart enough to know what parts of the url should be used in the HTTP request than to have the Request object have inconsistencies in the set and subsequent get values for full_url. I wouldn't have an issue at all with adding a new patch that (on top of implementing full_url):
I don't believe that this change would be any more of a backwards compatibility risk than what's currently in the patch as the logic in terms of urlopen is kept. The risk of dependencies on fragment-less full_urls however, would remain. |
Here is patch with tests and docs. I see no changes to opener is required and the selector which is sent to HTTP request is the correct one. I have added tests for redirect url with #fragment too (For testing scenario reported in bpo-8280). I shall close this issue once I commit this patch. In a separate report/ commit, get_full_url should be deprecated. Also, I like to see splittag being replaced with urlparse itself, so that our test and expectation with fragment will be consistent. |
New changeset 51c5870144e7 by Senthil Kumaran in branch 'default': |
This is fixed in 3.4. I dont think backporting is a good idea just to support assignment to .full_url. Thinking of this further, the idea of reusing request by assigning to .full_url may not be a good idea, because if you set .full_url to a completely different URL from different host, the request becomes meaningless. Closing this issue as the enhancement is done. |
New changeset e6d862886e5c by R David Murray in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: