Remove header in middle of redirection #3490

Closed
saveman71 opened this issue Aug 9, 2016 · 8 comments

Comments

@saveman71

saveman71 commented Aug 9, 2016

Hey there,

I'm having a pretty specific issue while trying to reproduce the login sequence of a heavy professional website (lots of redirects and cookies involved).
The login sequence was tested and works in Chrome.

After some debugging I tracked the issue down: the Content-Type header, which is required on the first request, is rejected by the server on the first redirect, so the second request fails with a 500.

First request shown in Chrome's dev inspector:
screenshot from 2016-08-09 20-03-26

Second request (first redirect) shown in Chrome's dev inspector:
screenshot from 2016-08-09 20-15-02

Notice that the second request has no Content-Type header. The presence of a Content-Type header makes the request fail.

The code (URLs are mangled):

#!/usr/bin/env python

import requests
import urllib.parse

headers = {
    'Origin': 'https://www.example.com',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.8,fr-FR;q=0.6,fr;q=0.4',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Referer': 'https://www.example.com/',
    'Connection': 'keep-alive',
}

data = 'some_data'

s = requests.Session()

r = s.post('https://www.exampleonanotherdomain.com/openam/UI/Login?realm=front_office&service=EEService&goto=https://www.exampleonanotherdomain.com/ice/rest/aiguillagemp/redirect?dest=',
           headers=headers, data=data)

I worked around it by removing the Content-Type header at runtime on the third send call (the first call sends the first request's headers, the second sends its body), which confirmed that this was the issue.

Patch code added before the request.

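import http.client as http_client

def patch():
    # Wrap HTTPConnection.send so every outgoing chunk is printed and the
    # third send (the first redirect's headers) loses its Content-Type line.
    old_send = http_client.HTTPConnection.send
    def new_send(self, data):
        print(data.decode('utf-8'))
        if new_send.i == 2:
            data = '\n'.join([line for line in data.decode('utf-8').split('\n') if not line.startswith('Content-Type:')]).encode('utf-8')
        print(data.decode('utf-8'))
        new_send.i += 1
        return old_send(self, data)
    new_send.i = 0  # counts calls to send()
    http_client.HTTPConnection.send = new_send

    # Wrap HTTPResponse.read to print every chunk of the response body.
    old_read = http_client.HTTPResponse.read
    def new_read(self, amt=None):  # amt must stay optional, as in the original read()
        data = old_read(self, amt)
        print(data)
        return data
    http_client.HTTPResponse.read = new_read

patch()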
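
The header-stripping step of this workaround can be isolated into a small helper, which makes it easier to check without monkeypatching anything. This is only a sketch: strip_header is a hypothetical name, and it assumes the bytes passed in are a raw HTTP/1.1 header block with CRLF line endings.

```python
def strip_header(raw: bytes, name: str) -> bytes:
    """Return `raw` with every `name:` header line removed.

    Assumes `raw` is an HTTP/1.1 header block using CRLF line endings.
    Header names are compared case-insensitively.
    """
    prefix = name.lower().encode("ascii") + b":"
    lines = raw.split(b"\r\n")
    kept = [line for line in lines if not line.lower().startswith(prefix)]
    return b"\r\n".join(kept)


# Example header block, modelled on the redirected request in this issue.
request = (
    b"GET /ice/rest/aiguillagemp/redirect?dest= HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Accept: */*\r\n"
    b"\r\n"
)
cleaned = strip_header(request, "Content-Type")
```

Working on bytes and matching case-insensitively avoids the decode/encode round trip and still catches a lower-case header name.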

Any idea of how to solve this one?

Note: I have no control at all over the remote server.
Note: I intend to make a lot of requests while logged in, so I would like to keep using a session.

@Lukasa
Member

Lukasa commented Aug 9, 2016

Can you run the code with a normal, unchanged Requests, and then once the request is complete run this:

for h in r.history:
    print(h.headers)
print(r.headers)

And then show us the output?

@Lukasa
Member

Lukasa commented Aug 9, 2016

Oh, sorry.

Just stop setting the header yourself! You should avoid setting the Content-Type and Accept-Encoding headers in this case, as requests will handle those for you.
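
To illustrate what is being handled automatically: when data is a dict, Requests form-encodes it and sets Content-Type: application/x-www-form-urlencoded on that request itself. The standard-library sketch below shows the equivalent encoding step; the field names and values are placeholders.

```python
from urllib.parse import urlencode

# Placeholder form fields; not real credentials.
form = {"user": "user.name@example.com", "pass": "password"}

# Roughly what happens to a dict passed as `data=`:
# it is form-encoded into the request body.
body = urlencode(form)
print(body)
```

With Requests itself, this means passing the dict directly as data and leaving Content-Type out of the headers dict.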

@saveman71
Author

@Lukasa That was it! I was using the data field preformatted with urllib.parse.urlencode, my bad.
Felt bad when I read this in the official documentation two minutes later:

Requests allows you to send organic, grass-fed HTTP/1.1 requests, without the need for manual labor.

There's no need to manually add query strings to your URLs, or to form-encode your POST data.

Thank you for your quick and effective answer 👍

@Lukasa
Member

Lukasa commented Aug 9, 2016

My pleasure!

@saveman71
Author

saveman71 commented Aug 10, 2016

Hey @Lukasa,

Yesterday I tested it quickly and thought it worked. Today I'm not able to reproduce what worked...

#!/usr/bin/env python

import requests

ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36'

headers = {
    'User-Agent': ua,
}

s = requests.Session()

data = {
    'user': 'user.name@example.com',
    'pass': 'password'
}

def patch():
    import http.client as http_client
    old_send = http_client.HTTPConnection.send
    def new_send( self, data ):
        print('=== SEND CALL NB {}'.format(new_send.i))
        print(data.decode('utf-8'))
        if new_send.i == 2:
            data = '\n'.join([l for l in data.decode('utf-8').split('\n') if not l.startswith('Content-Type:')]).encode('utf-8')
            print('=== ALTERED DATA')
            print(data.decode('utf-8'))
        print('=== END')
        new_send.i += 1
        return old_send(self, data)
    new_send.i = 0
    http_client.HTTPConnection.send = new_send

patch()

r = s.post('https://www.example.com/openam/UI/Login?realm=front_office&service=EEService&goto=https://www.example.com/ice/rest/aiguillagemp/redirect?dest=',
           headers=headers, data=data)

The output with the header-removal workaround commented out is:

=== SEND CALL NB 0
POST /openam/UI/Login?realm=front_office&service=EEService&goto=https://www.example.com/ice/rest/aiguillagemp/redirect?dest= HTTP/1.1
Host: www.example.com
Connection: keep-alive
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Content-Length: 168
Content-Type: application/x-www-form-urlencoded


=== END
=== SEND CALL NB 1
user=user.name@example.com&pass=password
=== END
=== SEND CALL NB 2
GET /ice/rest/aiguillagemp/redirect?dest= HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie

=== END
{"errorId":"20160810-16a27d9fb6be5720:46364bb9:1567196d7d5:-3b1b","codeEvt":"portail.core.java.ko.runtimeerreur","type":"tech","message":"Le service est actuellement indisponible, veuillez réessayer ultérieurement","errorData":{}}

You can see that the Content-Type header is repeated in the redirect, even though this time it was not set explicitly in the request headers.

The log with the workaround enabled, suppressing the header for that specific request:

=== SEND CALL NB 0
POST /openam/UI/Login?realm=front_office&service=EEService&goto=https://www.example.com/ice/rest/aiguillagemp/redirect?dest= HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Length: 168
Content-Type: application/x-www-form-urlencoded


=== END
=== SEND CALL NB 1
user=user.name@example.com&pass=password
=== END
=== SEND CALL NB 2
GET /ice/rest/aiguillagemp/redirect?dest= HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie


=== ALTERED DATA
GET /ice/rest/aiguillagemp/redirect?dest= HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie


=== END
=== SEND CALL NB 3
GET /eefo/servlet/login HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie; dtCookie=somecookie|somecookie; SITE=Autre; APPSSESSIONID=somecookie; XSRF-TOKEN=gGlv1DzI


=== END
=== SEND CALL NB 4
GET /eefo/servlet/login?authenticate HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie; dtCookie=somecookie|somecookie; SITE=Autre; APPSSESSIONID=somecookie; EEAtnToken=user.name@example.com; XSRF-TOKEN=gGlv1DzI; FOCOOKIENAME=somecookie


=== END
=== SEND CALL NB 5
GET /eefo/appmanager/portail/clients?_nfpb=true&_pageLabel=eefo_page_accueil_membre HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Accept: */*
Connection: keep-alive
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Cookie: AMAuthCookie=somecookie; amlbcookie=01; iPlanetDirectoryPro=somecookie; dtCookie=somecookie|somecookie; SITE=Autre; APPSSESSIONID=somecookie; userType=userFO; USER=user.name@example.com; FOCOOKIENAME=somecookie; XSRF-TOKEN=gGlv1DzI; _WL_AUTHCOOKIE_FOCOOKIENAME=somecookie


=== END

Sorry for the French in the error messages and URLs (the error message above means "The service is currently unavailable, please try again later"). Unfortunately, I cannot reveal which website is responsible for the misbehaviour (NDA).

Am I doing something wrong here?

@saveman71 saveman71 reopened this Aug 10, 2016
@Lukasa
Member

Lukasa commented Aug 10, 2016

Hrm, no. It does look like we don't strip the content-type header when we remove the body. We should probably do that. This is a bug.

@Lukasa
Member

Lukasa commented Aug 10, 2016

For any contributor who wants to submit a patch for this, look at resolve_redirects, around line 143 of sessions.py. We want to do the same for Content-Type as we do for Content-Length, and probably for Transfer-Encoding too.
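
A rough sketch of the behaviour described here, not the actual Requests code: when a redirect is followed without re-sending the body, the headers that describe the body are dropped as well. The function name, the plain-dict header representation, and the assumption that 307 and 308 are the statuses that keep the body are all illustrative.

```python
# Headers that only make sense when a request carries a body.
BODY_HEADERS = ("content-length", "content-type", "transfer-encoding")

# Assumption: these redirect statuses require the method and body to be preserved.
BODY_PRESERVING_STATUSES = (307, 308)


def headers_for_redirect(headers: dict, status_code: int) -> dict:
    """Return a copy of `headers` suitable for the redirected request."""
    if status_code in BODY_PRESERVING_STATUSES:
        # The body is re-sent, so its headers stay.
        return dict(headers)
    # The body is dropped, so drop the headers describing it too.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in BODY_HEADERS
    }


original = {
    "User-Agent": "example-agent",
    "Content-Length": "168",
    "Content-Type": "application/x-www-form-urlencoded",
}
redirected = headers_for_redirect(original, 302)
```

Treating the three headers as one group keeps the redirected request consistent: either it carries a body with its describing headers, or it carries neither.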

sigmavirus24 added a commit that referenced this issue Aug 12, 2016
#3490 removing Content-Type and Transfer-Encoding headers on redirect
@sigmavirus24
Contributor

This was fixed in #3493
