Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default Authentication Overwrites Authentication Header #2062

Closed
joe42 opened this issue May 24, 2014 · 28 comments · Fixed by #2945
Closed

Default Authentication Overwrites Authentication Header #2062

joe42 opened this issue May 24, 2014 · 28 comments · Fixed by #2945

Comments

@joe42
Copy link

joe42 commented May 24, 2014

Hi,

And thanks for maintaining requests. When I set an Authorization header in a request, I do not expect it to be overwritten. I set up default credentials in my .netrc file (for automatizing the cadaver command line tool), and suddenly I could not authenticate to my service any longer.
Please check if an Authorization header does already exist in your default authorization handler, instead of overwriting it. I see that you documented this behaviour in the authentication section, but since I never used this form of authentication it took quite some time to find the problem. Also, overwriting a header that the user set does not seem to be a sensible implementation.

Thanks,
Joe

@sigmavirus24
Copy link
Contributor

Hey @joe42 There are several headers we don't expect users to set on their own because they really shouldn't be. If you want to prevent us from overwriting the header you should use the auth= parameter to your request. By default it takes a two-tuple (username, and password) or any other Authentication Handler.

@joe42
Copy link
Author

joe42 commented May 24, 2014

@sigmavirus24 Sorry, I should have mentioned that I already circumvented the problem by writing an authentication handler that does not modify the request. My post was meant as a suggestion to improve requests, as I suppose other users might run into the same problem.

Hey @joe42 There are several headers we don't expect users to set on their own because they really shouldn't be.

I see. I do set several headers directly. Can you point me to a resource explaining which headers should not be set and why? Setting the headers directly seemed to be the way to go after reading the documentation's "Custom Headers" section.

Thx,
Joe

@Lukasa
Copy link
Member

Lukasa commented May 25, 2014

This is another of those situations where the right thing to do is not entirely obvious. For instance, what should the output be if this happens:

import requests

r = requests.get(url, headers={'Authorization': auth}, auth=(user, pass))

Which of the two do we respect first? Our general principle is to respect the specific argument over the general one: auth over headers, body over Content-Length header etc.

For this reason, if you can auth via an Auth handler you should: that's how requests expects you to do it.

@joe42
Copy link
Author

joe42 commented May 25, 2014

@Lukasa : I can understand why you choose to design it this way. But I think that the result of the following request with default credentials in .netrc is not obvious at all. The point I do still not get is that I cannot imagine a case where a user would actually set a specific header (like in your example) and not expect it to be used by requests. Why should he bother doing this? So it might be good to either make it obvious (by documentation/warning/error) or to respect the users explicit settings.

import requests
r = requests.get(url, headers={'Authorization': auth}, auth=None)

@sigmavirus24
Copy link
Contributor

So it might be good to either make it obvious (by documentation/warning/error) or to respect the users explicit settings.

I appreciate that you've shared your opinion @joe42. You're right that explicit is better than implicit and in this case you can explicitly turn off requests using your .netrc file. With it off, you wouldn't have this trouble. That said, the documentation should be improved to explain which headers users should refrain from setting themselves. We will not try to internally keep track of which headers were provided by the users and which we have generated or which ones are generated by urllib3 or which ones are provided by httplib. It is certainly possible to do, but it is not something we will do. Further, if we allow users to set their own headers in cases like this (and others) it will likely lead to far more cases where the user is surprised by requests' behaviour. We're doing the right thing here.

@migurski
Copy link

I had a related problem in #2066.

My issue arose using Heroku’s API, which has two confusing factors relevant to this issue. The toolbelt quietly creates a .netrc file (mine was a year old; I didn't know it existed) and Heroku asks users to create a custom Authorization header:

For example, given the access token 01234567-89ab-cdef-0123-456789abcdef, request headers should be set to Authorization: Bearer 01234567-89ab-cdef-0123-456789abcdef.

I am now using Session.trust_env to prevent this issue thanks to @Lukasa, but I found the silent overwrite of the Authorization header to be a nasty surprise. I’m in agreement with @joe42 on this: specifically providing a header ought to overrule environmental factors. Alternatively, could the .netrc behavior be documented under API, custom authentication, and quickstart? My searches for “auth” in the requests documentation should have mentioned this.

@Lukasa
Copy link
Member

Lukasa commented May 26, 2014

First, I'd like to address something:

My searches for “auth” in the requests documentation should have mentioned this.

Your searching was not terribly effective. The docs front page contains a heading entitled 'Authentication'. That section contains a heading entitled netrc Authentication. This is not very difficult to find, and it's in exactly the location in the documentation I'd expect to find it. I'm open to linking to the Authentication section from the Custom Authentication section, but anything more seems incredible to me.

Now, I feel like we've seen several related issues recently that warrant a more careful and broader response. The key problem is this: lots of users with previous HTTP client experience are expecting that requests will treat the headers they set as a fundamental source of truth: anything set in the headers should be treated as true by requests, or at least not removed.

Requests has the opposite approach: headers are the lowest source of truth, and we feel quite happy to replace many kinds of headers that the user sets. Some examples:

  • We will replace Authorization headers when an alternative auth source is found.
  • We will remove Authorization headers when you get redirected off-host.
  • We will replace Proxy-Authorization headers when you provide proxy credentials in the URL.
  • We will replace Content-Length headers when we can determine the length of the content.

We do this for very good reason: a headers-based API is utterly terrible. Affecting the library behaviour based on arbitrary key-value pairs set in the headers dictionary is asking for all kinds of terrible bugs, and is the kind of API-design-footgunnery that I'd expect to see from the OpenSSL project.

Perhaps we need a section of the docs that explicitly says: you cannot trust that your headers will remain unmodified. Maybe even a list of the headers you absolutely should not set yourself (currently I think it's Authorization, Proxy-Authorization, and Content-Length).

@migurski
Copy link

You’re right, I don't know how I missed that section and it mentions .netrc. My mistake!

@Lukasa
Copy link
Member

Lukasa commented May 26, 2014

@migurski Whew, I was worried we had a bigger discoverability problem. Our docs are pretty large, sadly, which makes it easy to miss things. =(

@migurski
Copy link

A central explanation of the headers approach you describe linked throughout the docs, wherever headers is documented, would be amazingly helpful. It’s counterintuitive to me for certain headers, but I respect that Requests has a design intent here. Thanks for an excellent library!

@joe42
Copy link
Author

joe42 commented May 26, 2014

@Lukasa: Thanks for your detailed answer.

headers are the lowest source of truth, and we feel quite happy to replace many kinds of headers that the user sets. Some examples:

We will replace Authorization headers when an alternative auth source is found.
We will remove Authorization headers when you get redirected off-host.
We will replace Proxy-Authorization headers when you provide proxy credentials in the URL.
We will replace Content-Length headers when we can determine the length of the content.

We do this for very good reason: a headers-based API is utterly terrible. Affecting the library behaviour based on arbitrary key-value pairs set in the headers dictionary is asking for all kinds of terrible bugs

It would be great if you could add this sentence to the "Custom Headers" section, after it states:

If you’d like to add HTTP headers to a request, simply pass in a dict to the headers parameter.

@Lukasa
Copy link
Member

Lukasa commented May 26, 2014

@sigmavirus24 Do you feel like this would be a good addition to the docs?

@sigmavirus24
Copy link
Contributor

I feel like there have been a small number of users very violently surprised by the fact that we don't analyze the headers and do "smart" things based upon their presence (or lack thereof). I think adding to the Custom Headers section to say "By the way, we do no analysis of the headers you set so setting different headers will have no effect on the way requests handles your request." We should also mention the fact that we will happily (and with good reason) ignore headers set by the user in preference for headers we generate ourselves. That said, the idea of linking this throughout the docs wherever headers are mentioned is clearly overkill so please don't do that.

@migurski
Copy link

At minimum, include a link from places where the headers param is documented. “Simply pass in a dict” is not accurate without some explanation of order of operations: first the dict is applied, then the environment is inspected for conflicting information to use instead.

@benjaminran
Copy link
Contributor

Borrowing from past comments, here's what I think would address the issue well in the documentation:

Custom headers are given less precedence than more specific sources of information. For example:

  • Authorization headers will be overridden by any alternative auth source found.
  • Authorization headers will be removed if you get redirected off-host.
  • Proxy-Authorization headers will be overridden by proxy credentials provided in the URL.
  • Content-Length headers will be overridden when we can determine the length of the content.

Furthermore, Requests does not change its behavior at all based on which custom headers are specified. The headers are simply passed on into the final request."

Is there anything this doesn't address? Also, I feel like it should be more concise. Thoughts?

@Lukasa
Copy link
Member

Lukasa commented Apr 3, 2015

@benjaminran Agreed, that looks like extremely useful documentation to me. If you open a pull request that adds it, I'll happily merge it. =)

@sigmavirus24
Copy link
Contributor

@benjaminran one thing: documentation shouldn't be overly verbose but they also shouldn't be too concise. The difficulty in writing documentation is finding the right balance such that everyone can understand what's going on without reading a novel. ;)

@migurski
Copy link

migurski commented Apr 3, 2015

As long as it’s made clear what is covered by “any alternative auth source,” then yes.

@benjaminran
Copy link
Contributor

That's a good point--that would probably confuse me if I saw it in unfamiliar documentation. How's this?

Authorization headers will be overridden if an auth= parameter is used.

Also, is it clear from context that 'Authorization headers' refers to custom headers?

@sigmavirus24
Copy link
Contributor

A more accurate description would be:

Authorization headers will be overridden if credentials are passed via the auth parameter or are specified in a .netrc accessible in the environment.

The latter part is the root of this issue.

@ibnIrshad
Copy link

I was about to suggest that we close this issue, since the documentation given by @benjaminran is now merged into the master docs (in Quickstart->Custom Headers). However I have one more question: what happens when the auth= parameter is used, AND the .netrc environment config is set? It is not clear from the docs which one takes precedence. And as @migurski pointed out, we have a definite case where a hosting provider (Heroku) quietly created a .netrc file, so it would be important to know this.

I suspect that auth= should override .netrc, but I wanted to confirm before creating a PR.

@migurski
Copy link

+1 to preferring that auth= overrides .netrc.

ibnIrshad pushed a commit to ibnIrshad/requests that referenced this issue Dec 21, 2015
This closes psf#2062 by clarifying which auth header takes precedence:
1. auth=
2. .netrc
3. headers=
@sigmavirus24
Copy link
Contributor

@ibnIrshad please do not create a PR. That would be a breaking change that we will not accept until 3.0.0

@ibnIrshad
Copy link

What I meant is I will create a PR to clarify the current expected behaviour in the docs, not to change anything. Is it true that auth= has precedence over .netrc currently? It is not clear from the docs, to me atleast. Whether it should or should not is a different issue, but right now I simply want to document it.

@ibnIrshad
Copy link

After scouring the tests, it seems auth= IS designed to override .netrc. Great! I'll just create PR to update the docs.

test_requests.py, line 372

            # Given auth should override and fail.
            r = requests.get(url, auth=wrong_auth)
            assert r.status_code == 401

ibnIrshad pushed a commit to ibnIrshad/requests that referenced this issue Dec 21, 2015
This closes psf#2062 by clarifying in the docs which auth
header takes precedence:
1st auth=
2nd .netrc
3rd headers=

This precedence order is already tested in test_requests.py,
in the test_basicauth_with_netrc method. Perhaps we should
add further tests for non-basic auth schemes.
@migurski
Copy link

In my experience previously, .netrc overrode auth= —should I re-run my tests to see if this is now a real bug?

@ibnIrshad
Copy link

Well the tests only cover basic auth. What kind of auth are you doing?

@migurski
Copy link

Looking back on my original issue, looks I was setting and Authorization header: https://github.com/kennethreitz/requests/issues/2066

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
6 participants