Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Required changes to support redirects on HTTP 307/308 #36809

Merged
merged 6 commits into from
Apr 6, 2018

Conversation

dagwieers
Copy link
Contributor

@dagwieers dagwieers commented Feb 28, 2018

SUMMARY

This PR includes fixes to follow_redirects=all:

  • Ensure HTTP 307 and 308 will redirect the request to the new location without modification. (including support for POST/PUT and payload)
  • Ensure that HTTP 301, 302 and 303 do GET requests only
ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

module_utils/urls.py

ANSIBLE VERSION

v2.5

@dagwieers dagwieers added c:module_utils/urls meraki Cisco Meraki community labels Feb 28, 2018
@dagwieers dagwieers added this to the 2.5.0 milestone Feb 28, 2018
@ansibot ansibot added bugfix_pull_request support:core This issue/PR relates to code supported by the Ansible Engineering Team. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Feb 28, 2018
@dagwieers
Copy link
Contributor Author

dagwieers commented Feb 28, 2018

Here is a reproducer for the problem using urllib2:

#!/usr/bin/python

from ansible.module_utils.urls import fetch_url

class AltModule():
    params = dict(
        follow_redirects='urllib2',
#        follow_redirects='all',
    )
    tmpdir='/tmp'

module = AltModule()
resp, info = fetch_url(
    module,
#    'http://httpstat.us/301',
#    'http://httpstat.us/307',
    'http://httpstat.us/308',
    force=True,
    method='POST',
    data='{"name": "Somewhere", "type": "wireless",  "tags": "a_tag",  "timeZone": "America/Chicago"}',
    headers={
        'Content-Type': 'application/json',
    },
)
print(info['status'])
print(info['url'])

On Python 2.7 this reproducer only seems to show the problem with HTTP 307 on urllib2. But shows the problem for both HTTP 307 and 308 on both urllib2 and all.

Copy link
Member

@sivel sivel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow_redirects=urllib2 should not be updated to support this. The specific purpose of follow_redirects=urllib2 is to work exactly how urllib2 works. When using follow_redirects=urllib2 the expected outcome is that an HTTPError should be raised.

Any change to expand the accepted status codes should only happen to all

@dagwieers
Copy link
Contributor Author

dagwieers commented Feb 28, 2018

Hmm, that seems weird. urllib2 is lacking any support for HTTP 308 redirects (as HTTP 308 predates urllib2 since its RFC says April 2015) and doesn't support 307 with POST/PUT.

https://github.com/python/cpython/blob/2.7/Lib/urllib2.py#L577

I don't see why anyone would want that honestly. And worse, urllib2 is the default method.

https://stackoverflow.com/questions/42136829/whats-difference-between-http-301-and-308-status-codes

Adding HTTP 307 and HTTP 308 support to urllib2 does not make sense either, but would have been to proper course of action. Won't fix Python 2.6 and highly unlikely for Python 2.7. That's why we are turning to this.

Can you explain why urllib2 method needs to stay broken ? And why it is the default method ?

@kbreit
Copy link
Contributor

kbreit commented Feb 28, 2018

I did some tests to check results. These were done with Python 3.5.4.

Devel Branch (Before)
            HTTP 307      HTTP 308
urllib2       307              308
all             200              308

PR Branch (After)
            HTTP 307      HTTP 308
urllib2       200              200
all             200              200

@sivel
Copy link
Member

sivel commented Feb 28, 2018

The urllib2 method exists to explicitly match the default redirect behavior in python.

This ensures that a request made with urllib2/urllib.request outside of ansible, and within ansible follow the same redirect rules.

all is the correct method to accept more redirect rules, which redirects for everything, safe redirects for even fewer. urllib2 is a value somewhere between what httplib2 considered safe, and just allowing all redirects.

A default value of follow_redirects can be made whatever you want. The default in open_url is urllib2, but it doesn't exist in url_argument_spec, so you can default it to whatever value makes the most sense for your use case in the specific modules argument_spec, or by setting it in the module.params explicitly, not allowing a user to override.

@dagwieers
Copy link
Contributor Author

dagwieers commented Feb 28, 2018

@sivel The new implementation does what you want. Tested and verified for follow_redirects=all.

Thanks !

urllib2 is a value somewhere between what httplib2 considered safe, and just allowing all redirects.

It actually doesn't allow all redirects. HTTP 307 and 308 remain broken for urllib2 after this revisited PR.

@dagwieers dagwieers added this to To Do in 2.5.x blocker list via automation Feb 28, 2018
@sivel
Copy link
Member

sivel commented Feb 28, 2018

@dagwieers thanks. I'll likely get back around this tomorrow.


return RequestWithMethod(newurl,
method=method,
headers=req.headers,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we are dropping newheaders here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No probably a left-over of reworking the code. It works fine for our test-cases because there's no rewrite of the URL.

newurl = newurl.replace(' ', '%20')
newheaders = dict((k, v) for k, v in req.headers.items()
if k.lower() not in ("content-length", "content-type"))
if k.lower() not in ("content-length"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we modifying this line, and then not using it later?

Copy link
Contributor Author

@dagwieers dagwieers Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same answer as before.

@ansibot ansibot added the bug This issue/PR relates to a bug. label Mar 1, 2018
@dagwieers
Copy link
Contributor Author

So I have been looking at urllib3 (from v2.7) and urllib (from v3.6) and noticed that rewriting spaces and Content-Length is not performed in urllib3 and I wonder if that is necessary, or just an old work-around for broken servers ?

@dagwieers
Copy link
Contributor Author

@sivel Is this ready to be merged now ?

@sivel
Copy link
Member

sivel commented Mar 2, 2018

I think the only thing left is tests. Integration tests will work to test the behavior of all and safe, but the uri module doesn't accept urllib2 as an option. I am leaning more towards just adding urllib2 as an option to uri so that it can be tested the same way. Otherwise we would need a unit test, but the unit test should not actually make an HTTP request, we would just test the outcome of RedirectHandlerFactory and what the resultant redirect_request would do.

url: "http://{{ httpbin_host }}/redirect-to?status_code=308&url=http%3A%2F%2Fexample.com%2F"

@dagwieers
Copy link
Contributor Author

Well, You need to test the actual request and see if the request used the proper method and did contain the payload. Otherwise the HTTP 307 error would not have been detected, as it did redirect to the correct location, but using the wrong method and no payload.

Not sure if our current integration-testing methodology allow to test redirection + feedback about used method and payload. If you like I can add integration tests for all this based on https://httpstat.us/ and get this merged ASAP. Everything else will likely need infra adaptations.

@sivel
Copy link
Member

sivel commented Mar 2, 2018

httpbin can be made to work the same way, by redirecting to /post that could accept a post. All of our integration tests that perform this type of action utilize httpbin, which would be required. the httptester image uses httpbin.

I need to think on some other things as well. This fundamentally makes changes to allow the body to be redirected as well, which has never been the case before, and could result in unexpected and backwards incompatible changes for people.

I think this need to be it's own, new, separate follow_redirects option.

EDIT: Also, with respect to a unit test, I'd expect things to be mocked and inspected in typical unit test style, not actually making a request.

@dagwieers
Copy link
Contributor Author

@sivel Good luck creating 2 implementations for every bug in our code.

The whole point of HTTP 307 and 308 using POST/PUT is to redo the request to the new URI (and not turning it into a GET without payload or returning 307/308). This did not work before. So anything depending on this to fail, would not be usable. There's no universe where causing a HTTP 200 (or a 307 or 308) and not actually redoing the request is correct. It's broken, can't work.

That's why I don't understand why we wouldn't be fixing urllib2 too. Other HTTP clients don't have options for every incorrect behaviour of the past, regardless of someone was relying on the wrong behaviour. And in this case, you can't rely on the wrong behaviour, as it simply doesn't do anything you depend on. (You might as well NOT do the request...)

@dagwieers dagwieers force-pushed the urllib-redirect-308 branch 3 times, most recently from de12e6e to 9b3ff09 Compare April 4, 2018 10:06
ryancurrah pushed a commit to ryancurrah/ansible that referenced this pull request Apr 4, 2018
This is required if we want to ensure that ansible#36809 doesn't cause any
important behavioral changes.

This PR changes the uri module to support follow_redirects=urllib2

It also adds a better error message when the connection closes before
any data was returned.
@sivel
Copy link
Member

sivel commented Apr 6, 2018

Everything looks good to me so far. It passes all of my recently added unit tests (with small manipulations due to using RequestWithMethod).

After this merges, I'll update my urls.py unit tests, and merge them as well. Potentially adding a few more to verify some additional functionality, but as is, I am ok with this merging now.

Thanks for all the work, and bearing with my review.

@sivel
Copy link
Member

sivel commented Apr 6, 2018

rebuild_merge

@ansibot ansibot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Apr 6, 2018
@dagwieers
Copy link
Contributor Author

@sivel And thanks for persisting despite me not getting initially what you were saying...

@ansibot ansibot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Apr 6, 2018
@sivel
Copy link
Member

sivel commented Apr 6, 2018

CI failures. Unrelated. I'll merge manually.

@sivel sivel merged commit 9bb1ee3 into ansible:devel Apr 6, 2018
@nitzmahone nitzmahone moved this from 2.5.1 Holding to 2.5.2 holding in 2.5.x blocker list Apr 18, 2018
@nitzmahone nitzmahone moved this from 2.5.2 holding to 2.5.5 holding in 2.5.x blocker list May 30, 2018
ilicmilan pushed a commit to ilicmilan/ansible that referenced this pull request Nov 7, 2018
This is required if we want to ensure that ansible#36809 doesn't cause any
important behavioral changes.

This PR changes the uri module to support follow_redirects=urllib2

It also adds a better error message when the connection closes before
any data was returned.
ilicmilan pushed a commit to ilicmilan/ansible that referenced this pull request Nov 7, 2018
* Required changes to support redirects on HTTP 307/308

This ensures HTTP 307 and 308 will redirect the request to the new
location without modification.

* Fix the unused newheaders reference

* Be more compliant

* Add integration tests for follow_redirects=all

* Improve other tests for new behaviour

* Make follow_redirects values more strict
@dagwieers dagwieers added the cisco Cisco technologies label Feb 22, 2019
@ansible ansible locked and limited conversation to collaborators Apr 27, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue/PR relates to a bug. c:module_utils/urls cisco Cisco technologies meraki Cisco Meraki community needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. support:core This issue/PR relates to code supported by the Ansible Engineering Team. test This PR relates to tests.
Projects
No open projects
2.5.x blocker list
  
2.5.5 holding (needs backport PR)
Development

Successfully merging this pull request may close these issues.

None yet

5 participants