request.full_url: unexpected results on assignment #61474

demianbrecht · 2013-02-22T05:56:43Z

BPO	17272
Nosy	@terryjreedy, @orsenthil, @ezio-melotti, @bitdancer, @demianbrecht
Files	request.patch request_17272.1.patch request_17272.1.reusable.patch: apply first request_17272.2.full_url_w_fragment.patch: apply second 17272-3.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/orsenthil'
closed_at = <Date 2013-05-24.16:19:04.597>
created_at = <Date 2013-02-22.05:56:42.857>
labels = ['type-feature', 'library']
title = 'request.full_url: unexpected results on assignment'
updated_at = <Date 2014-03-10.22:11:24.984>
user = 'https://github.com/demianbrecht'

bugs.python.org fields:

activity = <Date 2014-03-10.22:11:24.984>
actor = 'python-dev'
assignee = 'orsenthil'
closed = True
closed_date = <Date 2013-05-24.16:19:04.597>
closer = 'orsenthil'
components = ['Library (Lib)']
creation = <Date 2013-02-22.05:56:42.857>
creator = 'demian.brecht'
dependencies = []
files = ['29156', '29589', '29994', '29995', '30340']
hgrepos = []
issue_num = 17272
keywords = ['patch']
message_count = 23.0
messages = ['182642', '182643', '182664', '182670', '182671', '182674', '182696', '184456', '184461', '184788', '184789', '185304', '185309', '187420', '187654', '187675', '187779', '187786', '188134', '189815', '189915', '189918', '213101']
nosy_count = 6.0
nosy_names = ['terry.reedy', 'orsenthil', 'ezio.melotti', 'r.david.murray', 'python-dev', 'demian.brecht']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue17272'
versions = ['Python 3.4']

demianbrecht · 2013-02-22T05:56:42Z

When assigning a value to an already instantiated Request object's full_url, unexpected results are found as a consequence in the attributes selector, type and fragment. Selector, type and fragment are only assigned to during instantiation. Unless you know to call Request._parse() after assignment to Request.full_url, the other attributes will not be reassigned new values.

The attached patch changes full_url to be a @Property, essentially moving what was going on in the constructor into the property method(s). All tests pass.

I ran into this issue when working on an OAuth 2.0 client library when attempting to mutate an already instantiated Request object, injecting the access_token into the query params.

Sandboxed code to repro the issue below:

In [1]: from urllib.request import Request
In [2]: r = Request('https://example.com?foo=bar#baz')

# as expected
In [4]: r.__dict__
Out[4]:
{'_data': None,
'_tunnel_host': None,
'fragment': 'baz',
'full_url': 'https://example.com?foo=bar',
'headers': {},
'host': 'example.com',
'method': None,
'origin_req_host': 'example.com',
'selector': '/?foo=bar',
'type': 'https',
'unredirected_hdrs': {},
'unverifiable': False}

In [5]: r.full_url = 'https://example.com?foo=bar&access_token=token_value#baz'

# unexpected results in selector
In [6]: r.__dict__
Out[6]:
{'_data': None,
'_tunnel_host': None,
'fragment': 'baz',
'full_url': 'https://example.com?foo=bar&access_token=token_value#baz',
'headers': {},
'host': 'example.com',
'method': None,
'origin_req_host': 'example.com',
'selector': '/?foo=bar',
'type': 'https',
'unredirected_hdrs': {},
'unverifiable': False}

In [7]: Request('https://example.com?foo=bar&access_token=token_value#baz')
Out[7]: <urllib.request.Request at 0x10ef6ce90>

# these results should match that of the full_url setter
In [8]: Request('https://example.com?foo=bar&access_token=token_value#baz').\_\_dict__
Out[8]:
{'_data': None,
'_tunnel_host': None,
'fragment': 'baz',
'full_url': 'https://example.com?foo=bar&access_token=token_value',
'headers': {},
'host': 'example.com',
'method': None,
'origin_req_host': 'example.com',
'selector': '/?foo=bar&access_token=token_value',
'type': 'https',
'unredirected_hdrs': {},
'unverifiable': False}

demianbrecht · 2013-02-22T05:59:01Z

I also meant to mention that of course, the obvious workaround would simply be to instantiate a new Request object with the new URL, but in my mind, this is something that should likely be supported by the Request object itself.

bitdancer · 2013-02-22T13:33:14Z

Well, full_url was not designed to be assignable, and is documented that way (the docs say it is the URL passed to the constructor). Whether or not it is reasonable to make it assignable is an interesting question.

demianbrecht · 2013-02-22T15:33:18Z

It could entirely be my lack of being Dutch that didn't make this obvious without reading the docs ;)

I guess there's one of three ways that this could go:

To help clarify the intention through code, either

a) Change full_url to _full_url, indicating that this should be treated as a private member and only expose the url through get_full_url(). This will obviously have the negative side effect of breaking backwards compatibility for anyone using full_url directly.

b) Keep the property implementation in the patch, but replace the setter with a read-only exception.

And the third option is what's in this patch (there are likely other options that I'm just not seeing at the moment as well).

Having said all that, if the mutability of full_url is made explicit, then safeguards should also be put in place for the rest of the attributes.

I couldn't think of any hard reason as to why the state of a Request instance /shouldn't/ be mutable and the user should be required to instantiate a new Request in order to use a new URL.

bitdancer · 2013-02-22T16:19:31Z

One of those other options is: do nothing :)

Python doesn't normally make things read-only just because mutating them does nothing useful. Sometimes we make things read-only when mutating them does nasty stuff, but even then sometimes we don't.

So the real question is, do others consider it a sensible and useful API change to make setting it do something useful, and how many other changes would be required to make the rest of the API consistent with that change?

We may be in python-ideas territory here, I'm not sure.

demianbrecht · 2013-02-22T17:10:26Z

Yes, I realized that I had forgotten to add the "do nothing" option after posting but figured it was relatively obvious :)

"Python doesn't normally make things read-only just because mutating them does nothing useful. Sometimes we make things read-only when mutating them does nasty stuff, but even then sometimes we don't."

I realize that this is a higher level question and outside of the scope of this particular issue (and likely belonging more in python-ideas), but doesn't this somewhat contradict "There should be one-- and preferably only one --obvious way to do it."? I'd assume that this should not only apply to sandboxed object implementations, but also at a higher level across /all/ objects. From your post, I assume inconsistency between modules, which would imply non-obvious ways to "do something" when moving from one module (or object) implementation to the next.

I'm definitely interested to hear whether or not others would find this change useful as I do. There /should/ be no other changes required for consistency as no other attributes of the Request class that don't already implement assignment methods (i.e. "data") are affected by side effects within __init__ (or anywhere else).

terryjreedy · 2013-02-22T20:16:37Z

I believe the issue of reusing request objects after modification has come up before, either on the tracker (but I cannot find an issue) or elsewhere. Senthil may remember better and certainly may have an opinion. I agree that python-idea would be a better place for other opinions.

orsenthil · 2013-03-18T14:25:55Z

Sorry for taking long time to respond. full_url has been in the shape it is for many versions, changing it in backwards incompatible way be do more harm than good.

I would be really interested to know why the present form of full_url presented any limitation for developing an client? Apart from get_full_url there are other ways to get full_url and url associated. Any details on "Why this is must"? The explain snippet below gives the assumption on full_url setting, I could not get need for change from this. Personally, I am 0 to -1 on this too, as in I cannot see a clear need for this change.

demianbrecht · 2013-03-18T15:28:17Z

No worries.

The change is not backwards incompatible. test_urllib2 test pass without any modification (I'm getting a segfault in test_heapq atm so can't run the full suite). I've simply moved the side effects cause by __init__ to a setter so that full_url may be set after instantiation and will still incur the same results as after initial creation.

The biggest problem with this particular attribute and the way that it's currently handled is inconsistent with the rest of the class. The only other attribute that incurs side effects when set (data) is implemented with @Property getter and setters. When set, it incurs side effects on the headers (removing Content-Length). Unless I'm missing something, other attributes are directly mutable and do not incur side effects on instance state when set.

In my mind, if full_url is publicly accessible, then it should be publicly accessible /and/ settable. It currently is not without causing invalid state within a given Request instance.

bitdancer · 2013-03-20T20:33:11Z

Having looked at the current handling of the data attribute in the context of another issue, I am now inclined to agree with you that full_url should be updated in order to have the API have a consistent behavior. Although it isn't backward incompatible API wise, it is possible for existing code to depend on the other values *not* being recomputed (presumably unintentionally, since that would be a rather odd thing to do :), so I still think the change should only be made in 3.4. I'm open to argument on that, though.

bitdancer · 2013-03-20T20:38:02Z

Oh, data being a property is 3.4 only. So if consistency is the goal, this should definately be a 3.4 only change.

demianbrecht · 2013-03-26T21:15:14Z

Based on your and Andrew's comment in issue bpo-16464 (new behaviour so it should only apply to 3.4), I agree that this should be a 3.4-only change.

demianbrecht · 2013-03-26T22:04:29Z

Patch updated (acks update, fixed order) per Terry's instructions on core-mentorship.

bitdancer · 2013-04-20T12:41:29Z

Thanks for working on this, Demian. I made some review comments, mostly style things about the tests.

There's one substantial comment about the change in behaivor of the full_url property though (before patch it does not include the fragment, after the patch it does). We need to think about the implications of that change in terms of backward compatibility. It makes more sense, but how likely is it to break working code?

bitdancer · 2013-04-23T15:50:13Z

Senthil, are you saying (in the review) that we should treat the current return value of full_url as a bug, and fix it in maintenance releases? It is certainly true that its value does not match the documentation.

Ah. I see that get_full_url used to have the same bug before you fixed it in 3.1. So I guess I agree with you :)

demianbrecht · 2013-04-23T22:57:45Z

As suggested by Senthil, I've broken this up into two patches: One that implements this reusable Request (this one) and one that implements the new (consistent) behaviour, in having full_url return the fragment.

I will also add a bug/patch requesting the deprecation of get_full_url as it will simply be a passthrough to full_url.

python-dev · 2013-04-25T12:45:58Z

New changeset 2d4189e9bbe8 by Senthil Kumaran in branch 'default':
Issue bpo-17272: Making the urllib.request's Request.full_url a descriptor. Fixes
http://hg.python.org/cpython/rev/2d4189e9bbe8

orsenthil · 2013-04-25T14:41:48Z

I have committed the first patch which makes Request.full_url a descriptor.
As I was looking at the changes to be introduced by second patch, I noticed that we do not have comprehensive test coverage for .full_url public attribute. All the tests are testing the methods like get_full_url.

The change for .full_url not to include fragments was introduced in 63817:bf3359b7ed2e which says that for reasons that HTTP request should not include fragments, it was done. That's correct. Now, I would fear to introduce that bug again with the second patch wherein we inadvertently send a URL with fragment in a HTTP request. To ensure this will not be the case, I think, increase in test coverage, understanding and documenting the exact expectation will be necessary if we have to change Request.full_url behavior. I will be spending a little more time on it. I thought I will write down the points which should be taken care.

demianbrecht · 2013-04-30T07:15:33Z

bpo-8280 reports the error by way of urlopen(). In light of that, would it not make more sense to have the *Opener be responsible for determining which parts of the url should be used?

i.e. request.py, line ~1255:

r.url = req.get_full_url()

might instead be

r.url = req.full_url.split('#')[0]

To me, it would make more sense to have the client code (in this case, meaning the consumer of the Request object, which is the *Opener) be smart enough to know what parts of the url should be used in the HTTP request than to have the Request object have inconsistencies in the set and subsequent get values for full_url.

I wouldn't have an issue at all with adding a new patch that (on top of implementing full_url):

Adds identical tests to get_full_url for full_url
Changes the client code in *Opener (and anywhere else known to exhibit the same behaviour) to strip the URL fragment for the HTTP request

I don't believe that this change would be any more of a backwards compatibility risk than what's currently in the patch as the logic in terms of urlopen is kept. The risk of dependencies on fragment-less full_urls however, would remain.

orsenthil · 2013-05-22T13:32:20Z

Here is patch with tests and docs. I see no changes to opener is required and the selector which is sent to HTTP request is the correct one. I have added tests for redirect url with #fragment too (For testing scenario reported in bpo-8280).

I shall close this issue once I commit this patch. In a separate report/ commit, get_full_url should be deprecated. Also, I like to see splittag being replaced with urlparse itself, so that our test and expectation with fragment will be consistent.

python-dev · 2013-05-24T16:14:34Z

New changeset 51c5870144e7 by Senthil Kumaran in branch 'default':
Fix bpo-17272 - Make Request.full_url and Request.get_full_url return same result under all circumstances.
http://hg.python.org/cpython/rev/51c5870144e7

orsenthil · 2013-05-24T16:19:04Z

This is fixed in 3.4. I dont think backporting is a good idea just to support assignment to .full_url. Thinking of this further, the idea of reusing request by assigning to .full_url may not be a good idea, because if you set .full_url to a completely different URL from different host, the request becomes meaningless.

Closing this issue as the enhancement is done.

python-dev · 2014-03-10T22:11:25Z

New changeset e6d862886e5c by R David Murray in branch 'default':
whatsnew: urllib Request objects are now reusable.
http://hg.python.org/cpython/rev/e6d862886e5c

demianbrecht mannequin added the stdlib Python modules in the Lib dir label Feb 22, 2013

bitdancer added the type-feature A feature request or enhancement label Feb 22, 2013

orsenthil self-assigned this Apr 25, 2013

orsenthil closed this as completed May 24, 2013

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

request.full_url: unexpected results on assignment #61474

request.full_url: unexpected results on assignment #61474

demianbrecht mannequin commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

bitdancer commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

bitdancer commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

terryjreedy commented Feb 22, 2013

orsenthil commented Mar 18, 2013

demianbrecht mannequin commented Mar 18, 2013

bitdancer commented Mar 20, 2013

bitdancer commented Mar 20, 2013

demianbrecht mannequin commented Mar 26, 2013

demianbrecht mannequin commented Mar 26, 2013

bitdancer commented Apr 20, 2013

bitdancer commented Apr 23, 2013

demianbrecht mannequin commented Apr 23, 2013

python-dev mannequin commented Apr 25, 2013

orsenthil commented Apr 25, 2013

demianbrecht mannequin commented Apr 30, 2013

orsenthil commented May 22, 2013

python-dev mannequin commented May 24, 2013

orsenthil commented May 24, 2013

python-dev mannequin commented Mar 10, 2014

request.full_url: unexpected results on assignment #61474

request.full_url: unexpected results on assignment #61474

Comments

demianbrecht mannequin commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

bitdancer commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

bitdancer commented Feb 22, 2013

demianbrecht mannequin commented Feb 22, 2013

terryjreedy commented Feb 22, 2013

orsenthil commented Mar 18, 2013

demianbrecht mannequin commented Mar 18, 2013

bitdancer commented Mar 20, 2013

bitdancer commented Mar 20, 2013

demianbrecht mannequin commented Mar 26, 2013

demianbrecht mannequin commented Mar 26, 2013

bitdancer commented Apr 20, 2013

bitdancer commented Apr 23, 2013

demianbrecht mannequin commented Apr 23, 2013

python-dev mannequin commented Apr 25, 2013

orsenthil commented Apr 25, 2013

demianbrecht mannequin commented Apr 30, 2013

orsenthil commented May 22, 2013

python-dev mannequin commented May 24, 2013

orsenthil commented May 24, 2013

python-dev mannequin commented Mar 10, 2014