Content-Length is missing #223

Closed
foxx opened this issue Oct 24, 2011 · 63 comments

@foxx

foxx commented Oct 24, 2011

Please see the following example


import requests

_data = """callCount=1
page=/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265

httpSessionId=
scriptSessionId=DD51AC690A8BC3BEDB1AEDF1B2A7A9DD575
c0-scriptName=PresentationRulesFacade
c0-methodName=execute
c0-id=0
c0-param0=string:VehicleForm c0-param1=string:VehicleForm%3Amake
c0-param2=string:%7B%22VehicleForm%3AdaysDriven%22%20%3A%20%221%22%2C%20%22VehicleForm%3AmilesDriven%22%20%3A%20%220%22%2C%20%22VehicleForm%3Ayear%22%20%3A%20%221998%22%2C%20%22VehicleForm%3AestimatedMileage%22%20%3A%20%22%22%2C%20%22VehicleForm%3AodometerReading%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amake%22%20%3A%20%22AUDI%22%2C%20%22VehicleForm%3Ayear-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amake-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3Amodel-txt%22%20%3A%20%22%22%2C%20%22VehicleForm%3AmodelDesc%22%20%3A%20%22Other%22%2C%20%22VehicleForm%3Amodel%22%20%3A%20%22Other%22%2C%20%22VehicleForm%3Abodystyle%22%20%3A%20%22VAN%22%7D
c0-param3=boolean:false
batchId=327
"""

print len(_data)
_headers = {
    'referer' : 'https://sales2.geico.com/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265'
}
r = requests.post(
    url = 'https://sales2.geico.com/internetsales/dwr/call/plaincall/PresentationRulesFacade.execute.dwr',
    data = _data,
    headers = _headers
)

print r.request.headers
{'referer': 'https://sales2.geico.com/internetsales/iSnapVehicle.xhtml?_gk=_c083894BB-4392-5290-8580-DE159D03B2B0_k9C5EF4F0-0269-3CC9-0994-633224433265', 'Accept-Encoding': 'identity, deflate, compress, gzip', 'User-Agent': 'python-requests/0.7.3'}

In this example, the 'Content-Length' is missing.

Haven't got any spare time to try and patch the bug - as this was only a quick test to explore new libs.

Cal

[revised post 3 - multiple edits due to mistake on bug report]

@kennethreitz
Contributor

Should be good now

@zoranzaric

import requests
URL = "http://github.com"

g_url = "http://clients6.google.com/rpc?key=AIzaSyCKSbrvQasunBoV16zDH9R33D88CeLr9gQ"
params = {
    'method': 'pos.plusones.get',
    'id': 'p',
    'params': {
        'nolog': 'true',
        'id': URL,
        'source': 'widget',
        'userId': '@viewer',
        'groupId':'@self'
    },
    'jsonrpc': '2.0',
    'key': 'p',
    'apiVersion': 'v1'
}
headers = {
    'Content-type': 'application/json'
}

r = requests.post(g_url, params=params, headers=headers)
print r.request.headers

Content-Length still isn't set by this code with version 0.8.3.

@kennethreitz
Contributor

You're not uploading any body. Why would there be a content-length?

@foxx
Author

foxx commented Nov 29, 2011

kennethreitz, iirc - the Content-length header should always be set (even if it's zero) when a POST is involved.

But by looking at the code pasted by zoranzaric, he is indeed sending a POST with a request body - so the content-length should surely be included?

Am I missing something here??

Cal

@zoranzaric

foxx, I thought setting params was correct for a POST. What am I doing wrong?

kennethreitz, the Google API that I'm talking to responds with 411 (Length Required)

@kennethreitz
Contributor

@zoranzaric: params is used for URL query parameters. data is used for the request body.
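
A minimal sketch of that distinction, written for Python 3 with only the standard library so it runs without network access. The endpoint and field values are placeholders, and urllib.request.Request merely stands in for Requests here; this is not how Requests is implemented.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and fields, for illustration only.
BASE_URL = "http://example.invalid/rpc"
fields = {"method": "pos.plusones.get", "id": "p"}

encoded = urlencode(fields)

# params-style: the fields end up in the URL; there is no request body.
as_query = Request(BASE_URL + "?" + encoded, method="POST")

# data-style: the fields become the request body, which is what a
# Content-Length header would describe.
as_body = Request(BASE_URL, data=encoded.encode("ascii"), method="POST")

print(as_query.full_url)   # the fields appear in the query string
print(as_query.data)       # no body, so there is nothing to measure
print(len(as_body.data))   # number of bytes in the body
```

Only the second request carries a body, so only that one gives a client anything to report in Content-Length.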

@zoranzaric

@kennethreitz ok yeah snap... with data it works... thanks and sorry for the trouble!

@kennethreitz
Contributor

@zoranzaric, no worries! A lot of other libraries are quite inconsistent, so I completely understand the confusion. That's the whole reason I started Requests :)

@foxx
Author

foxx commented Nov 29, 2011

Ah - sorry I missed the 'params' / 'data' difference. My bad!

@berndtj

berndtj commented Jan 16, 2012

This is still broken with regard to sending a blank POST request:

r = requests.post(full_uri)

While a Content-Length header is sent, the value is blank (it should be 0 per the RFC). The request as the server sees it is below:

{'CONTENT_LENGTH': '',
 'CONTENT_TYPE': '',
 'HTTP_ACCEPT': '*/*',
 'HTTP_ACCEPT_ENCODING': 'identity, deflate, compress, gzip',
 'HTTP_HOST': 'localhost:8086',
 'HTTP_USER_AGENT': 'python-requests/0.8.3',
 'PATH_INFO': '/__snap__/sldb/cc/heartbeat/4f14a8282be5c42b87000002',
 'QUERY_STRING': '',
 'REMOTE_ADDR': '127.0.0.1',
 'REMOTE_PORT': 49676,
 'REQUEST_METHOD': 'POST',
 'SCRIPT_NAME': '',
 'SERVER_NAME': '127.0.0.1',
 'SERVER_PORT': '8086',
 'SERVER_PROTOCOL': 'HTTP/1.1',
 'SERVER_SOFTWARE': 'Werkzeug/0.8.1',
 'werkzeug.request': <BaseRequest 'http://localhost:8086/__snap__/sldb/cc/heartbeat/4f14a8282be5c42b87000002' [POST]>,
 'werkzeug.server.shutdown': <function shutdown_server at 0x105cd3de8>,
 'wsgi.errors': <open file '<stderr>', mode 'w' at 0x105373270>,
 'wsgi.input': <socket._fileobject object at 0x105d62350>,
 'wsgi.multiprocess': False,
 'wsgi.multithread': False,
 'wsgi.run_once': False,
 'wsgi.url_scheme': 'http',
 'wsgi.version': (1, 0)}

@aknuds1

aknuds1 commented May 2, 2012

Are you working on the issue of the invalid Content-Length for POST requests without a data payload? I was just bitten by this bug while trying to POST and PUT to a WCF service. Luckily, I could work around it by specifying a content-length of 0 via the 'headers' keyword argument to the 'post' and 'put' functions.

@slingamn
Contributor

slingamn commented May 2, 2012

@aknuds1 what version are you using? This is what I see in trunk:

>>> import requests
>>> print requests.post('http://httpbin.org/post').text
{
  "origin": "[REDACTED]", 
  "files": {}, 
  "form": {}, 
  "url": "http://httpbin.org/post", 
  "args": {}, 
  "headers": {
    "Content-Length": "0", 
    "Accept-Encoding": "identity, deflate, compress, gzip", 
    "Connection": "keep-alive", 
    "Accept": "*/*", 
    "User-Agent": "python-requests/0.11.3", 
    "Host": "httpbin.org", 
    "Content-Type": ""
  }, 
  "json": null, 
  "data": ""
}

@aknuds1

aknuds1 commented May 2, 2012

@slingamn Ah, maybe it's fixed in trunk then. I'm just using the version installed via pip, i.e. 0.11.2.

@slingamn
Contributor

slingamn commented May 2, 2012

Oh, weird, I get "Content-Length": "0" on 0.11.2 (2159c80) also.

Can you try the above test case and post the result?

@aknuds1

aknuds1 commented May 2, 2012

I tried your case and get the following output:

{
  "origin": "[REDACTED]",
  "files": {},
  "form": {},
  "url": "http://httpbin.org/post",
  "args": {},
  "headers": {
    "Content-Length": "0",
    "Via": "1.1 EUR-PRXY-13",
    "Connection": "keep-alive",
    "Accept": "*/*",
    "User-Agent": "python-requests/0.11.2",
    "Host": "httpbin.org",
    "Content-Type": ""
  },
  "json": null,
  "data": ""
}

Is there a bug at httpbin.org then, since there is clearly a difference in the request when I define the content-length header myself?

@slingamn
Contributor

slingamn commented May 2, 2012

Sorry, I'm confused. Isn't that the expected output?

$ curl -d "" http://httpbin.org/post
{
  "origin": "[REDACTED]", 
  "files": {}, 
  "form": {}, 
  "url": "http://httpbin.org/post", 
  "args": {}, 
  "headers": {
    "Content-Length": "0", 
    "Connection": "keep-alive", 
    "Accept": "*/*", 
    "User-Agent": "curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.3.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7", 
    "Host": "httpbin.org", 
    "Content-Type": "application/x-www-form-urlencoded"
  }, 
  "json": null, 
  "data": ""
}

@aknuds1

aknuds1 commented May 2, 2012

@slingamn It's the desired output, yes, but why is there a difference towards WCF when I define the content-length header myself?

I mean, httpbin.org reports that the request has defined content-length as "0", even though the same request directed at WCF fails with error 411.

@slingamn
Contributor

slingamn commented May 2, 2012

I'm inclined to suspect a bug with the WCF service, since I get the same result using http://requestb.in.

>>> requests.post('http://requestb.in/10uyxph1')
<Response [200]>
>>> 

and Requestbin saw the following headers:

Content-Length 0
Accept-Encoding identity, deflate, compress, gzip
Connection keep-alive
Accept */*
User-Agent python-requests/0.11.1
Host requestb.in

@aknuds1

aknuds1 commented May 2, 2012

But there must be a difference wrt. content-length, since defining that header myself in the client library (requests) results in a request accepted by WCF. I'll see if I can get hold of the POST request received by WCF.

@slingamn
Contributor

slingamn commented May 2, 2012

One possible explanation: this results in the headers appearing in a different order, and the WCF service (incorrectly) takes note of this.

To get the POST request, you could (as a last resort) try something like Wireshark.

@aknuds1

aknuds1 commented May 3, 2012

According to Fiddler, the POST request without user-defined content-length header looks like so:

POST [REDACTED] HTTP/1.1
Host: [REDACTED]
Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2

If I define the content-length header, via the 'headers' argument to requests.post, however, the request looks like so:

POST [REDACTED] HTTP/1.1
Host: [REDACTED]
content-length: 0
Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2

@slingamn
Contributor

slingamn commented May 3, 2012

Interesting! The line of code that produces the bad behavior is just requests.post(my_url), right?

@slingamn
Contributor

slingamn commented May 3, 2012

I'm on Python 2.7.2. Here's what I see in my httplib.py:

    def _set_content_length(self, body):
        # Set the content-length based on the body.
        thelen = None
        try:
            thelen = str(len(body))
        except TypeError, te:
            # If this is a file-like object, try to
            # fstat its file descriptor
            try:
                thelen = str(os.fstat(body.fileno()).st_size)
            except (AttributeError, OSError):
                # Don't send a length if this failed
                if self.debuglevel > 0: print "Cannot stat!!"

        if thelen is not None:
            self.putheader('Content-Length', thelen)

If you can't provide a reproducible test case for security/privacy reasons (totally understandable), can you try and monkeypatch your httplib and see if that except block is being hit? You could also try setting debuglevel from the client code in requests.packages.urllib3.
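
One offline way to see exactly what the standard library would put on the wire is to subclass the connection class and capture what it tries to send. The sketch below assumes Python 3, where httplib is named http.client and the Content-Length logic differs from the 2.7.2 code quoted above, so it illustrates the inspection technique, not that particular bug. RecordingConnection, wire_bytes and the host name are made up for the example.

```python
import http.client


class RecordingConnection(http.client.HTTPConnection):
    """Collects the bytes that would be sent instead of opening a socket."""

    def __init__(self, host):
        super().__init__(host)
        self.sent = []

    def send(self, data):
        # Overrides the real send(), so no network connection is made.
        self.sent.append(bytes(data))


def wire_bytes(method, path, body=None, headers=None):
    """Return the raw request (request line, headers, body) as bytes."""
    conn = RecordingConnection("example.invalid")  # hypothetical host
    conn.request(method, path, body=body, headers=headers or {})
    return b"".join(conn.sent)


# A POST without a body, then a POST with a four-byte body.
print(wire_bytes("POST", "/post").decode("ascii"))
print(wire_bytes("POST", "/post", body=b"spam").decode("ascii"))
```

On current Python 3 releases the bodiless POST should show Content-Length: 0 and the second request Content-Length: 4; comparing such a dump with what a proxy like Fiddler records helps narrow down which layer drops or alters the header.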

@aknuds1

aknuds1 commented May 3, 2012

My code is as simple as this (the HTTP proxy is Fiddler):

import requests

resp = requests.post("http://[REDACTED]", proxies={"http": "localhost:8888"})
resp.raise_for_status()

If you run this script against Fiddler on your machine, you would probably see the same request as I do.

I can't see that httplib._set_content_length is called at all. I injected some code in there to see if it was called by requests, but that code wasn't hit.

@slingamn
Contributor

slingamn commented May 3, 2012

And Fiddler shows a Content-Length: 0 header when querying httpbin.org and requestb.in?

@aknuds1

aknuds1 commented May 3, 2012

resp = requests.post("http://httpbin.org/post", proxies={"http": "localhost:8888"})

POST http://httpbin.org/post HTTP/1.1
Host: httpbin.org
Proxy-Connection: Keep-Alive
Accept-Encoding: identity, deflate, compress, gzip
Accept: */*
User-Agent: python-requests/0.11.2

@mnot

mnot commented Aug 5, 2012

We've clarified this in HTTPbis (although it's there in 2616 too). Content-Length is not required on requests; if it's missing, the default is a length of 0. See:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-20#section-3.3.3 (item 6)
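
As a rough illustration of that rule, here is a simplified Python 3 sketch of how a recipient could decide the body length of a request. The function name request_body_length is hypothetical, and the sketch collapses the draft's list of cases to the three that matter for this thread; treating any Transfer-Encoding as chunked is a simplification.

```python
def request_body_length(headers):
    """Decide how the body of a *request* is framed.

    `headers` maps header names to values. Returns the string "chunked"
    or an integer number of octets.
    """
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    if "transfer-encoding" in lowered:
        # Simplification: any Transfer-Encoding is treated as chunked.
        return "chunked"

    if "content-length" in lowered:
        value = lowered["content-length"]
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid Content-Length: %r" % value)
        return int(value)

    # Neither header is present on a request: the body length is zero.
    return 0


print(request_body_length({}))
print(request_body_length({"Content-Length": "4"}))
```

Under this reading a bodiless POST without Content-Length is well formed, although servers that answer 411 clearly still insist on the header.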

@nettok

nettok commented Sep 5, 2012

I need to set the "Content-Length" for a GET request. Does anybody know how to do this?

@sigmavirus24
Contributor

@nettok something along the lines of:

from requests import get
r = get('http://httpbin.org', headers={'Content-Length': '0'})

But of course if you're sending data, it wouldn't be zero. If you're sending a body, e.g.,

r = get('http://httpbin.org/get', data='spam')
r.json['headers']['Content-Length'] == '4'  # True

@sigmavirus24
Contributor

Also, just realized this had bitten me a while back here. If @kennethreitz still wants a PR attaching Content-Length to everything, I'd be happy to do so.

@kennethreitz
Contributor

Yes.

@sigmavirus24
Contributor

@kennethreitz where should I branch from, develop or adapters?

@kennethreitz
Contributor

develop, adapters will be a while :)

@sigmavirus24
Contributor

Roger that.

@amalakar
Contributor

I am getting a 411 while using the Netflix API too. It happens while trying to get the request token using the requests-oauth module.

Things were working fine, but the API endpoint has changed recently, which may indicate additional changes on the server side as well.

Code: https://github.com/amalakar/pyflix2/blob/master/pyflix2/pyflix2.py#L76

Surprisingly even though I am passing post data as data={'oauth_callback': 'oob'}
I am still seeing no content-length header.

Headers sent are:
{'Accept-Encoding': 'identity, deflate, compress, gzip', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.0 CPython/2.7.3 Darwin/10.8.0'}

I am on requests-0.14.0, not sure if the issue is in requests or the oauth module. I would be happy to provide additional information.

@amalakar
Contributor

@kennethreitz Looks like Content-Length is being set for GET requests as well, even when the data being passed goes into the URL as query params. This could explain a 400 Bad Request (which is what is happening for me).

Following is an example of headers sent:
{'Content-Length': u'33', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}

And this was for a GET request.

@sigmavirus24
Contributor

Can you provide the call @amalakar? This must be a mistake on my part and I'd like to fix it.

@sigmavirus24
Contributor

To clarify @amalakar, testing against HTTPBIN, with requests.get('http://httpbin.org/get', params={'foo': 'bar'}) I get in the response Headers like:

{u'Content-Length': u'', u'Accept-Encoding': u'gzip, deflate, compress', u'Connection': u'keep-alive', u'Accept': u'*/*', u'User-Agent': u'python-requests/0.14.2 CPython/2.6.6 Linux/2.6.37.6', u'Host': u'httpbin.org', u'Content-Type': u''}

which is a bit bizarre but probably a bug on HTTPBIN's end. (The fact that Content-Length is u'' is what I find bizarre.)

@amalakar
Contributor

Notice I am using data argument here, not params. The server side script is a simple php script that prints all the http headers it received in the http request. You can see that it got the Content-Length as 7.

>>> r = requests.get('http://localhost/~malakar/headers.php', data={'foo': 'bar'})
>>> print r.content
Host: localhost
Content-Length: 7
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0
>>> print r.request.headers
{'Content-Length': u'7', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}

@sigmavirus24
Contributor

Well this seems to be more of a problem with the server if I'm reading RFC 2616 correctly:

https://tools.ietf.org/html/rfc2616#page-119

The Content-Length entity-header field indicates the size of the
entity-body, in decimal number of OCTETs, sent to the recipient or,
in the case of the HEAD method, the size of the entity-body that
would have been sent had the request been a GET.

   Content-Length    = "Content-Length" ":" 1*DIGIT

An example is

   Content-Length: 3495

Applications SHOULD use this field to indicate the transfer-length of
the message-body, unless this is prohibited by the rules in section
4.4.

(And section 4.4 doesn't mention anything about sending data on a GET
request.)

Any Content-Length greater than or equal to zero is a valid value.
Section 4.4 describes how to determine the length of a message-body
if a Content-Length is not given.

Note that the meaning of this field is significantly different from
the corresponding definition in MIME, where it is an optional field
used within the "message/external-body" content-type. In HTTP, it
SHOULD be sent whenever the message's length can be determined prior
to being transferred, unless this is prohibited by the rules in
section 4.4.
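
Two details of the quoted text are easy to get wrong in client code: the value counts octets (bytes), not characters, and it must consist of one or more decimal digits, so an empty string is not valid. A small Python 3 sketch of both points follows; the helper names are hypothetical and not part of Requests.

```python
import re

# Content-Length = 1*DIGIT, i.e. one or more ASCII digits.
CONTENT_LENGTH_RE = re.compile(r"[0-9]+")


def content_length_for(body_text, encoding="utf-8"):
    """Return the header value for a text body: the count of encoded octets."""
    return str(len(body_text.encode(encoding)))


def is_valid_content_length(value):
    """Check a header value against the 1*DIGIT grammar."""
    return CONTENT_LENGTH_RE.fullmatch(value) is not None


print(content_length_for("spam"))
print(content_length_for("caf\u00e9"))  # four characters, more octets in UTF-8
print(is_valid_content_length("3495"), is_valid_content_length(""))
```

A value of 0 passes the grammar check while a blank value does not, which matches the complaint earlier in the thread about an empty Content-Length.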

@piotr-dobrogost

@Lukasa
Member

Lukasa commented Nov 27, 2012

Yup, the server is doing the wrong thing here. (Though, in its defense, your request is semantically meaningless, as @piotr-dobrogost has pointed out. =D )

@piotr-dobrogost

(...) your request is semantically meaningless (...)

Well, on the contrary. The semantics of such a request is clearly defined :)

@amalakar
Contributor

Well from the documentation I get the following:

  1. It is allowed to provide a body with a GET request
  2. The Content-Length in that case should reflect the size of that body

Given this, can I conclude that if I provide the dictionary as the data param it will go as the body, and if I provide it as params it will go as query params? And both are valid use cases for GET too.

Well, I have another problem, it seems. So far I have been using https://github.com/maraujop/requests-oauth, and the data argument has been passed as query params in such cases. The following is an output snippet from my code:

URL: http://api-public.netflix.com/catalog/titles/autocomplete?term=the+matrix&oauth_nonce=27743677&oauth_timestamp=1354038508&oauth_consumer_key=3u2tmnge5649gfx9h9yr9p2j&oauth_signature_method=HMAC-SHA1&oauth_version=1.0&v=2.0&output=json&oauth_signature=NQFw4Tv0R%2Fd8VwvVuBE68p1i76w%3D
Method: GET
Data: {'output': 'json', 'term': 'the matrix', 'v': 2.0}
Headers sent: {'Content-Length': u'33', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate, compress', 'Accept': '*/*', 'User-Agent': 'python-requests/0.14.2 CPython/2.7.3 Darwin/10.8.0'}
Response: <?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>400 - Bad Request</title>
 </head>
 <body>
  <h1>400 - Bad Request</h1>
 </body>
</html>

Notice that here the data dictionary has become query params, as seen in the URL. This seems inconsistent with requests.get, where the data dictionary is not sent as query params but as the body.

@sigmavirus24
Contributor

As far as I know, data has always been for the body and params for the query string. requests-oauth must have messed that up unless he did it purposefully.
