
Content-Length is missing #223

Closed
foxx opened this Issue · 63 comments

13 participants

Cal Leeming Kenneth Reitz Zoran Zaric Berndt Jung Arve Knudsen Shivaram Lingamneni Dan Fairs Mark Nottingham nettok Ian Cordasco Arup Malakar Piotr Dobrogost Cory Benfield
Cal Leeming

Please see the following example


import requests

_data = """callCount=1 
page=/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265

httpSessionId=
scriptSessionId=DD51AC690A8BC3BEDB1AEDF1B2A7A9DD575
c0-scriptName=PresentationRulesFacade
c0-methodName=execute
c0-id=0
c0-param0=string:VehicleForm c0-param1=string:VehicleForm%3Amake
c0-param2=string:%7B%22VehicleForm%3AdaysDriven%22%20%3A%20%221%22%2C%20%22VehicleForm%3AmilesDriven%22%20%3A%20%220%22%2C%20%22VehicleForm%3Ayear%22%20%3A%20%221998%22%2C%20%22VehicleForm%3AestimatedMileage%22%20%3A%20%22%22%2C%20%22VehicleForm%3AodometerReading%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amake%22%20%3A%20%22AUDI%22%2C%20%22VehicleForm%3Ayear-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amake-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amodel-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3AmodelDesc%22%20%3A%20%22Other%22%2C%20%22VehicleForm%3Amodel%22%20%3A%20%22Other%22%2C%20%22VehicleForm%3Abodystyle%22%20%3A%20%22VAN%22%7D
c0-param3=boolean:false
batchId=327
"""

print len(_data)
_headers = {
    'referer' : 'https://sales2.geico.com/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265'
}
r = requests.post(
    url = 'https://sales2.geico.com/internetsales/dwr/call/plaincall/PresentationRulesFacade.execute.dwr',
    data = _data,
    headers = _headers
)

print r.request.headers
{'referer': 'https://sales2.geico.com/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265', 'Accept-Encoding': 'identity, deflate, compress, gzip', 'User-Agent': 'python-requests/0.7.3'}

In this example, the 'Content-Length' is missing.

Haven't got any spare time to try and patch the bug - as this was only a quick test to explore new libs.

Cal

[revised post 3 - multiple edits due to mistake on bug report]

Kenneth Reitz

Should be good now

Zoran Zaric
import requests
URL = "http://github.com"

g_url = "http://clients6.google.com/rpc?key=AIzaSyCKSbrvQasunBoV16zDH9R33D88CeLr9gQ"
params = {
    'method': 'pos.plusones.get',
    'id': 'p',
    'params': {
        'nolog': 'true',
        'id': URL,
        'source': 'widget',
        'userId': '@viewer',
        'groupId':'@self'
    },
    'jsonrpc': '2.0',
    'key': 'p',
    'apiVersion': 'v1'
}
headers = {
    'Content-type': 'application/json'
}

r = requests.post(g_url, params=params, headers=headers)
print r.request.headers

Content-Length still isn't set by this code with version 0.8.3.

Kenneth Reitz

You're not uploading any body. Why would there be a content-length?

Cal Leeming

kennethreitz, iirc - the Content-length header should always be set (even if it's zero) when a POST is involved.

But by looking at the code pasted by zoranzaric, he is indeed sending a POST with a request body - so the content-length should surely be included?

Am I missing something here??

Cal

Zoran Zaric

foxx, I thought setting params was correct for a POST. What am I doing wrong?

kennethreitz, the Google API that I'm talking to responds with 411 (Length Required)

Kenneth Reitz

@zoranzaric: params is used for query url parameters. data is used for body data.
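
To make the distinction concrete, here is a minimal standard-library sketch (Python 3) of where each argument ends up. It is not Requests' actual implementation; the build_post helper and the example.invalid URL are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode, urlsplit


def build_post(url, params=None, data=None):
    """Return (url, body, headers) for a form-style POST.

    Hypothetical helper: params are appended to the query string,
    data is form-encoded into the request body.
    """
    if params:
        separator = "&" if urlsplit(url).query else "?"
        url = url + separator + urlencode(params)
    body = urlencode(data).encode("ascii") if data else b""
    headers = {"Content-Length": str(len(body))}
    if body:
        headers["Content-Type"] = "application/x-www-form-urlencoded"
    return url, body, headers


# params only: the body stays empty, so the declared length is 0.
print(build_post("http://example.invalid/rpc", params={"key": "p"}))
# data only: the URL is untouched and the body carries the form fields.
print(build_post("http://example.invalid/rpc", data={"foo": "bar"}))
```

With params alone there is no body for a server to read, which is consistent with the 411 mentioned above.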

Zoran Zaric

@kennethreitz ok yeah snap... with data it works... thanks and sorry for the trouble!

Kenneth Reitz

@zoranzaric, no worries! A lot of other libraries are quite inconsistent, so I completely understand the confusion. That's the whole reason I started Requests :)

Cal Leeming

Ah - sorry I missed the 'params' / 'data' difference. My bad!

Berndt Jung

This is still broken with regard to sending a blank POST request:

r = requests.post(full_uri)

While a Content-Length header is sent, the value is blank (it should be 0 per the RFC). The request as the server sees it is below:

{'CONTENT_LENGTH': '',
 'CONTENT_TYPE': '',
 'HTTP_ACCEPT': '*/*',
 'HTTP_ACCEPT_ENCODING': 'identity, deflate, compress, gzip',
 'HTTP_HOST': 'localhost:8086',
 'HTTP_USER_AGENT': 'python-requests/0.8.3',
 'PATH_INFO': '/__snap__/sldb/cc/heartbeat/4f14a8282be5c42b87000002',
 'QUERY_STRING': '',
 'REMOTE_ADDR': '127.0.0.1',
 'REMOTE_PORT': 49676,
 'REQUEST_METHOD': 'POST',
 'SCRIPT_NAME': '',
 'SERVER_NAME': '127.0.0.1',
 'SERVER_PORT': '8086',
 'SERVER_PROTOCOL': 'HTTP/1.1',
 'SERVER_SOFTWARE': 'Werkzeug/0.8.1',
 'werkzeug.request': <BaseRequest 'http://localhost:8086/__snap__/sldb/cc/heartbeat/4f14a8282be5c42b87000002' [POST]>,
 'werkzeug.server.shutdown': <function shutdown_server at 0x105cd3de8>,
 'wsgi.errors': <open file '<stderr>', mode 'w' at 0x105373270>,
 'wsgi.input': <socket._fileobject object at 0x105d62350>,
 'wsgi.multiprocess': False,
 'wsgi.multithread': False,
 'wsgi.run_once': False,
 'wsgi.url_scheme': 'http',
 'wsgi.version': (1, 0)}
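
On the server side, a blank CONTENT_LENGTH like the one above can be handled defensively. A minimal sketch, assuming a plain WSGI environ dict; content_length is a hypothetical helper name, not part of Werkzeug:

```python
def content_length(environ):
    """Return the request body length from a WSGI environ.

    A missing or blank CONTENT_LENGTH is treated as 0.
    """
    raw = (environ.get("CONTENT_LENGTH") or "").strip()
    if not raw:
        return 0
    length = int(raw)  # raises ValueError for non-numeric values
    if length < 0:
        raise ValueError("negative Content-Length: %r" % raw)
    return length


# A blank value, as in the environ dump above.
print(content_length({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": ""}))
# A normal value.
print(content_length({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"}))
```
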
Arve Knudsen
aknuds1 commented

Are you working on the issue of the invalid content-length for POST requests without a data payload? I was just bitten by this bug while trying to POST and PUT to a WCF service. Luckily, I could work around it by specifying a content-length of 0 via the 'headers' keyword argument to the 'post' and 'put' functions.

Shivaram Lingamneni

@aknuds1 what version are you using? This is what I see in trunk:

>>> import requests
>>> print requests.post('http://httpbin.org/post').text
{
  "origin": "[REDACTED]", 
  "files": {}, 
  "form": {}, 
  "url": "http://httpbin.org/post", 
  "args": {}, 
  "headers": {
    "Content-Length": "0", 
    "Accept-Encoding": "identity, deflate, compress, gzip", 
    "Connection": "keep-alive", 
    "Accept": "*/*", 
    "User-Agent": "python-requests/0.11.3", 
    "Host": "httpbin.org", 
    "Content-Type": ""
  }, 
  "json": null, 
  "data": ""
}
Arve Knudsen
aknuds1 commented

@slingamn Ah, maybe it's fixed in trunk then. I'm just using the version installed via pip, i.e. 0.11.2.

Shivaram Lingamneni

Oh, weird, I get "Content-Length": "0" on 0.11.2 (2159c80) also.

Can you try the above test case and post the result?

Arve Knudsen
aknuds1 commented

I tried your case and get the following output:

{
  "origin": "[REDACTED]",
  "files": {},
  "form": {},
  "url": "http://httpbin.org/post",
  "args": {},
  "headers": {
    "Content-Length": "0",
    "Via": "1.1 EUR-PRXY-13",
    "Connection": "keep-alive",
    "Accept": "*/*",
    "User-Agent": "python-requests/0.11.2",
    "Host": "httpbin.org",
    "Content-Type": ""
  },
  "json": null,
  "data": ""
}

Is there a bug at httpbin.org then, since there is clearly a difference in the request when I define the content-length header myself?

Shivaram Lingamneni

Sorry, I'm confused. Isn't that the expected output?

$ curl -d "" http://httpbin.org/post
{
  "origin": "[REDACTED]", 
  "files": {}, 
  "form": {}, 
  "url": "http://httpbin.org/post", 
  "args": {}, 
  "headers": {
    "Content-Length": "0", 
    "Connection": "keep-alive", 
    "Accept": "*/*", 
    "User-Agent": "curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.3.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7", 
    "Host": "httpbin.org", 
    "Content-Type": "application/x-www-form-urlencoded"
  }, 
  "json": null, 
  "data": ""
}
Arve Knudsen
aknuds1 commented

@slingamn It's the desired output, yes, but why is there a difference towards WCF when I define the content-length header myself?

I mean, httpbin.org reports that the request has defined content-length as "0", even though the same request directed at WCF fails with error 411.

Shivaram Lingamneni

I'm inclined to suspect a bug with the WCF service, since I get the same result using http://requestb.in.

>>> requests.post('http://requestb.in/10uyxph1')
<Response [200]>
>>> 

and Requestbin saw the following headers:

Content-Length 0
Accept-Encoding identity, deflate, compress, gzip
Connection keep-alive
Accept */*
User-Agent python-requests/0.11.1
Host requestb.in
Arve Knudsen
aknuds1 commented

But there must be a difference wrt. content-length, since defining that header myself in the client library (requests) results in a request accepted by WCF. I'll see if I can get hold of the POST request received by WCF.

Shivaram Lingamneni

One possible explanation: this results in the headers appearing in a different order, and the WCF service (incorrectly) takes note of this.

To get the POST request, you could (as a last resort) try something like Wireshark.

Arve Knudsen
aknuds1 commented

According to Fiddler, the POST request without user-defined content-length header looks like so:

POST [REDACTED] HTTP/1.1
Host: [REDACTED]
Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2

If I define the content-length header, via the 'headers' argument to requests.post, however, the request looks like so:

POST [REDACTED] HTTP/1.1
Host: [REDACTED]
content-length: 0
Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2
Shivaram Lingamneni

Interesting! The line of code that produces the bad behavior is just requests.post(my_url), right?

Shivaram Lingamneni

I'm on Python 2.7.2. Here's what I see in my httplib.py:

    def _set_content_length(self, body):
        # Set the content-length based on the body.
        thelen = None
        try:
            thelen = str(len(body))
        except TypeError, te:
            # If this is a file-like object, try to
            # fstat its file descriptor
            try:
                thelen = str(os.fstat(body.fileno()).st_size)
            except (AttributeError, OSError):
                # Don't send a length if this failed
                if self.debuglevel > 0: print "Cannot stat!!"

        if thelen is not None:
            self.putheader('Content-Length', thelen)

If you can't provide a reproducible test case for security/privacy reasons (totally understandable), can you try and monkeypatch your httplib and see if that except block is being hit? You could also try setting debuglevel from the client code in requests.packages.urllib3.

Arve Knudsen
aknuds1 commented

My code is as simple as this (the HTTP proxy is Fiddler):

import requests

resp = requests.post("http://[REDACTED]", proxies={"http": "localhost:8888"})
resp.raise_for_status()

If you run this script against Fiddler on your machine, you would probably see the same request as I do.

I can't see that httplib._set_content_length is called at all. I injected some code in there to see if it was called by requests, but that code wasn't hit.

Shivaram Lingamneni

And Fiddler shows a Content-Length: 0 header when querying httpbin.org and requestb.in?

Arve Knudsen
aknuds1 commented
resp = requests.post("http://httpbin.org/post", proxies={"http": "localhost:8888"})
POST http://httpbin.org/post HTTP/1.1
Host: httpbin.org
Proxy-Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2
Shivaram Lingamneni

Neat, I'll check up on that.

By the way, what's your platform and Python version?

Shivaram Lingamneni

Also, just for kicks, you might want to upgrade to Requests 0.12.0 (although I don't think any of the changes are relevant to this issue).

Arve Knudsen
aknuds1 commented

I'm on Python 2.7.2/Windows 7. I tried Requests 0.12.0, it made no difference unfortunately.

Shivaram Lingamneni

You, sir, are correct!

>>> print requests.post('http://httpbin.org/post').content
{
  "origin": "[REDACTED]", 
  "files": {}, 
  "form": {}, 
  "url": "http://httpbin.org/post", 
  "args": {}, 
  "headers": {
    "Content-Length": "0", 
    "Accept-Encoding": "identity, deflate, compress, gzip", 
    "Connection": "keep-alive", 
    "Accept": "*/*", 
    "User-Agent": "python-requests/0.12.0", 
    "Host": "httpbin.org", 
    "Content-Type": ""
  }, 
  "json": null, 
  "data": ""
}

but strace -v -e trace=network -p 32122 -s 2000 says:

connect(4, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("107.21.123.247")}, 16) = 0
sendto(4, "POST /post HTTP/1.1\r\nHost: httpbin.org\r\nAccept-Encoding: identity, deflate, compress, gzip\r\nAccept: */*\r\nUser-Agent: python-requests/0.12.0\r\n\r\n", 143, 0, NULL, 0) = 143

Content-length is not being set on zero-data POSTs; httpbin.org is reporting it anyway.
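
The same check can be scripted. A small sketch (Python 3 standard library; request_headers is a hypothetical helper) that parses the bytes from the sendto() call above and confirms the header is absent from what actually went over the wire:

```python
import http.client
import io

# The payload exactly as reported by strace above.
RAW = (
    b"POST /post HTTP/1.1\r\n"
    b"Host: httpbin.org\r\n"
    b"Accept-Encoding: identity, deflate, compress, gzip\r\n"
    b"Accept: */*\r\n"
    b"User-Agent: python-requests/0.12.0\r\n"
    b"\r\n"
)


def request_headers(raw):
    """Parse the header block of a raw HTTP request into a message object."""
    stream = io.BytesIO(raw)
    stream.readline()  # skip the request line ("POST /post HTTP/1.1")
    return http.client.parse_headers(stream)


headers = request_headers(RAW)
print(len(RAW))                       # byte count, comparable to strace's
print(headers.get("Content-Length"))  # None means the header was not sent
```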

Shivaram Lingamneni

Filed kennethreitz/httpbin#46 for the incorrect reporting issue; I'll see whether this specific bug lies on Requests or urllib3.

Shivaram Lingamneni

OK, there's good news and there's bad news. This is from my 2.7.2 httplib.py:

    def _send_request(self, method, url, body, headers):
        # Honor explicitly requested Host: and Accept-Encoding: headers.
        header_names = dict.fromkeys([k.lower() for k in headers])
        skips = {}
        if 'host' in header_names:
            skips['skip_host'] = 1
        if 'accept-encoding' in header_names:
            skips['skip_accept_encoding'] = 1

        self.putrequest(method, url, **skips)

        if body and ('content-length' not in header_names):
            self._set_content_length(body)
        for hdr, value in headers.iteritems():
            self.putheader(hdr, value)
        self.endheaders(body)

The relevant piece of code is the "if body and" condition: httplib explicitly declines to set Content-Length for an empty body.
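
For anyone who wants to check what their own interpreter does, here is a minimal sketch that needs no network. It uses Python 3's http.client (the renamed httplib) rather than the 2.7.2 code quoted above, and overrides send() so the request bytes are recorded instead of written to a socket. CapturingConnection, raw_request and the example.invalid host are hypothetical names for illustration; the result depends on the Python version in use.

```python
import http.client


class CapturingConnection(http.client.HTTPConnection):
    """HTTPConnection that records outgoing bytes instead of opening a socket."""

    def __init__(self, host):
        super().__init__(host)
        self.sent = b""

    def send(self, data):
        # Called by http.client for the header block and then for the body.
        self.sent += data


def raw_request(method, body=None, headers=None):
    """Return the request http.client would send, as text."""
    conn = CapturingConnection("example.invalid")
    conn.request(method, "/", body=body, headers=headers or {})
    return conn.sent.decode("iso-8859-1")


# Compare an empty POST, an empty GET and a POST with a short body.
print(raw_request("POST"))
print(raw_request("GET"))
print(raw_request("POST", body="spam"))
```

Recent Python 3 releases are expected to include "Content-Length: 0" for the empty POST but not for the empty GET, unlike the 2.7.2 code path shown above.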

The RFCs are not entirely clear to me. http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html says:

For compatibility with HTTP/1.0 applications, HTTP/1.1 requests containing a message-body MUST include a valid Content-Length header field unless the server is known to be HTTP/1.1 compliant.

so it depends on whether a zero-length body counts as "containing a message-body". But http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html says:

Applications SHOULD use this field to indicate the transfer-length of the message-body, unless this is prohibited by the rules in section 4.4.

so it seems like adding the header wouldn't hurt.

@aknuds1 do you think it would be appropriate to file a bug against Python?

Shivaram Lingamneni

A ruling from the Python maintainers would clarify the case for adding a workaround for the issue inside Requests.

Anyway, thanks for putting up with my interrogation!

Arve Knudsen
aknuds1 commented

@slingamn Yes, it seems appropriate to me to file a bug against Python, considering this is a real world problem. I suspect it's actually IIS 7.5 that rejects the requests without defined content-length, rather than WCF, since these rejected requests never show up in my WCF log. I've enabled logging of all levels and malformed messages etc, so I'm quite sure these client errors would be logged had they been raised by WCF (as opposed to IIS), the way that for instance invalid service operation invocations are.

I found a StackOverflow question on this problem re. IIS, which confirms that this server does indeed require the content-length header on POST requests.

Would you like to file the bug against Python, or should I?

Shivaram Lingamneni

@aknuds1 can you do it? :-) And if you could post a link to the ticket here, that'd be awesome.

Arve Knudsen
aknuds1 commented

I raised an issue at python.org: http://bugs.python.org/issue14721.

Shivaram Lingamneni

Fantastic, thanks.

Arve Knudsen
aknuds1 commented

@slingamn Which HTTP methods do you think content-length should be defined for? POST and PUT? I've only seen it mentioned so far that content-length should be defined for requests that intend to place something on the server, and I'm hardly the HTTP expert myself.

Arve Knudsen
aknuds1 commented

Update: I can see that Chrome defines content-length: 0 for the following methods: POST, PUT, PATCH, DELETE and HEAD. I figure it'd be a good idea to model this behaviour.
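
If one wanted to model that behaviour on the client side, a minimal sketch could look like the following. The method set is only the one observed in Chrome above, not something mandated by the RFC, and ensure_content_length is a hypothetical helper name.

```python
# Methods for which Chrome was observed to send "content-length: 0"
# on an empty request (taken from the observation above).
EMPTY_LENGTH_METHODS = {"POST", "PUT", "PATCH", "DELETE", "HEAD"}


def ensure_content_length(method, headers=None, body=b""):
    """Return a copy of headers with Content-Length filled in where needed.

    body is expected to be bytes so that len() is the on-the-wire length.
    """
    result = dict(headers or {})
    if any(name.lower() == "content-length" for name in result):
        return result  # respect an explicit caller-supplied value
    if body:
        result["Content-Length"] = str(len(body))
    elif method.upper() in EMPTY_LENGTH_METHODS:
        result["Content-Length"] = "0"
    return result


print(ensure_content_length("POST"))
print(ensure_content_length("GET"))
```

The returned dict can be passed as the 'headers' keyword argument, which is the same workaround described earlier in this thread.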

Shivaram Lingamneni

I found it difficult to interpret the RFC. Someone attempted a close reading of its language here:

http://stackoverflow.com/questions/299628/is-an-entity-body-allowed-for-an-http-delete-request

This seems to indicate that Content-length could be set even on GET (which was also my reading of the RFC), but as you noted in the Python ticket, it looks like that breaks a test of some kind.

Sounds like you could do worse than use Chrome as a reference implementation :-) This is a longshot, but if you check the Chromium source, you might find some explanation or clarification?

Kenneth Reitz
Owner

GETs can definitely have request content.

The question is, will including content-length with every request break anything?

Arve Knudsen
aknuds1 commented

Yes, I only noted that content-length is not specified for certain request methods so long as requests are content-less. I don't know why Chrome does it like this, but there could be some reason?

Dan Fairs

As an aside - this also affects PUT requests. I bumped into it while trying to PUT to a URL with an empty body, while proxying through Nginx. Nginx returned a 411.

Kenneth Reitz
Owner

Anyone want to send a pull request attaching content length to everything?

Mark Nottingham
mnot commented

We've clarified this in HTTPbis (although it's there in 2616 too). Content-Length is not required on requests; if it's missing, the default is a length of 0. See:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-20#section-3.3.3 (item 6)
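
Read from the receiving side, that rule amounts to roughly the following. This is a simplified sketch of the request cases only (it ignores repeated or conflicting headers), and request_body_length is a hypothetical name:

```python
import re


def request_body_length(headers):
    """Return the body length of a request, or None if it is chunked.

    headers is a plain dict of header name to value.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    if "transfer-encoding" in lowered:
        codings = [c.strip().lower() for c in lowered["transfer-encoding"].split(",")]
        if codings[-1] == "chunked":
            return None  # length is determined by the chunked framing
        raise ValueError("request Transfer-Encoding must end in chunked")

    if "content-length" in lowered:
        value = lowered["content-length"].strip()
        if not re.fullmatch(r"[0-9]+", value):
            raise ValueError("invalid Content-Length: %r" % value)
        return int(value)

    # Neither header present: a request is assumed to have no body.
    return 0


print(request_body_length({"Host": "example.invalid"}))
print(request_body_length({"Content-Length": "7"}))
```

Note that a blank Content-Length value, like the one reported earlier in this thread, is rejected here rather than treated as 0.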

nettok

I need to set the "Content-Length" for a GET request. Does anybody know how to do this?

Ian Cordasco
Collaborator

@nettok something along the lines of:

from requests import get
r = get('http://httpbin.org', headers={'Content-Length': '0'})

But of course if you're sending data, it wouldn't be zero. If you're sending a body, e.g.,

r = get('http://httpbin.org/get', data='spam')
r.json['headers']['Content-Length'] == '4'  # True
Ian Cordasco
Collaborator

Also, just realized this had bitten me a while back here. If @kennethreitz still wants a PR attaching Content-Length to everything, I'd be happy to do so.

Kenneth Reitz
Owner

Yes.

Ian Cordasco
Collaborator

@kennethreitz where should I branch from, develop or adapters?

Kenneth Reitz
Owner

develop, adapters will be a while :)

Ian Cordasco
Collaborator

Roger that.

Arup Malakar

I am getting a 411 while using the Netflix API too. It happens while trying to get the request token using the requests-oauth module.

Things were working fine, but the API endpoint has changed recently, which may indicate additional changes on the server side as well.

Code: https://github.com/amalakar/pyflix2/blob/master/pyflix2/pyflix2.py#L76

Surprisingly even though I am passing post data as data={'oauth_callback': 'oob'}
I am still seeing no content-length header.

Headers sent are:
{'Accept-Encoding': 'identity, deflate, compress, gzip', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.0 CPython/2.7.3 Darwin/10.8.0'}

I am on requests-0.14.0, not sure if the issue is in requests or the oauth module. I would be happy to provide additional information.

Arup Malakar amalakar referenced this issue in amalakar/pyflix2
Closed

netflix api url has changed #8

Arup Malakar

@kennethreitz It looks like the content-length is being set for GET requests as well, even when the data being passed goes into the URL as query params. This could trigger a 400 Bad Request (which is what is happening for me).

Following is an example of headers sent:
{'Content-Length': u'33', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}

And this was for a GET request.

Ian Cordasco
Collaborator

Can you provide the call @amalakar? This must be a mistake on my part and I'd like to fix it.

Ian Cordasco
Collaborator

To clarify @amalakar, testing against HTTPBIN, with requests.get('http://httpbin.org/get', params={'foo': 'bar'}) I get in the response Headers like:

{u'Content-Length': u'', u'Accept-Encoding': u'gzip, deflate, compress', u'Connection': u'keep-alive', u'Accept': u'*/*', u'User-Agent': u'python-requests/0.14.2 CPython/2.6.6 Linux/2.6.37.6', u'Host': u'httpbin.org', u'Content-Type': u''}

which is a bit bizarre but probably a bug on HTTPBIN's end. (The fact that Content-Length is u'' is what I find bizarre.)

Arup Malakar

Notice I am using data argument here, not params. The server side script is a simple php script that prints all the http headers it received in the http request. You can see that it got the Content-Length as 7.

>>> r = requests.get('http://localhost/~malakar/headers.php', data={'foo': 'bar'})
>>> print r.content
Host: localhost
Content-Length: 7
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0
>>> print r.request.headers
{'Content-Length': u'7', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}
Ian Cordasco
Collaborator
Cory Benfield
Collaborator

Yup, the server is doing the wrong thing here. (Though, in its defense, your request is semantically meaningless, as @piotr-dobrogost has pointed out. =D )

Piotr Dobrogost

(...) your request is semantically meaningless (...)

Well, on the contrary. The semantics of such a request is clearly defined :)

Arup Malakar

Well from the documentation I get the following:

  1. A body is allowed with a GET request
  2. The content-length in that case should reflect the size of that body

Given this can I conclude that if I provide the dictionary as data param it will go as the body and if I provide it as params it would go as query params? And both are valid use cases in case of GET too.

It seems I have another problem, though. So far I have been using https://github.com/maraujop/requests-oauth, and there the data argument has been passed as query params. Following is an output snippet from my code:

URL: http://api-public.netflix.com/catalog/titles/autocomplete?term=the+matrix&oauth_nonce=27743677&oauth_timestamp=1354038508&oauth_consumer_key=3u2tmnge5649gfx9h9yr9p2j&oauth_signature_method=HMAC-SHA1&oauth_version=1.0&v=2.0&output=json&oauth_signature=NQFw4Tv0R%2Fd8VwvVuBE68p1i76w%3D
Method: GET
Data: {'output': 'json', 'term': 'the matrix', 'v': 2.0}
Headers sent: {'Content-Length': u'33', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}
Response: <?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>400 - Bad Request</title>
 </head>
 <body>
  <h1>400 - Bad Request</h1>
 </body>
</html>

Notice that here the data dictionary has become query params, as seen in the URL. This seems inconsistent with requests.get, where the data dictionary is sent not as query params but as the body.

Ian Cordasco
Collaborator
Shivaram Lingamneni slingamn referenced this issue in kennethreitz/httpbin
Open

inaccurate headers reported in some cases #46

Kristoffer Berdal flexd referenced this issue in eve-val/evelink
Closed

Fix: Proper usage of requests #172
