It would be nice to have support for the Expect: 100-continue flow. This is especially important for uploads, since we would not need to transfer the entire body before encountering, for example, a 401.
A typical use-case would be a streaming upload, with the data coming from a client and proxied on-the-fly to our destination server. Requests would read the data from the input source socket only after the destination server has sent the 100 Continue interim response. When dealing with S3, 401 errors are common, and if the data has not been read we can retry the request.
Requests could check the request headers, and if it finds an "Expect" header, wait for the 100 Continue response from the server. If that response does not come, the error should flow to the caller in a way that lets it distinguish a network problem from a failed expectation.
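The flow described above can be sketched at the socket level. This is purely illustrative (the server, helper names, and wire handling are all assumptions, not requests' API): the client sends headers with `Expect: 100-continue`, waits for the interim response, and only then streams the body.

```python
import socket
import threading

def tiny_server(listener):
    """Toy server: honours Expect: 100-continue, then replies 200."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:       # read request headers only
            data += conn.recv(4096)
        if b"Expect: 100-continue" in data:
            conn.sendall(b"HTTP/1.1 100 Continue\r\n\r\n")
        conn.recv(4096)                       # body arrives after the 100
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")

def upload_with_expect(host, port, body):
    """Send headers, wait for 100 Continue, then (and only then) the body."""
    s = socket.create_connection((host, port))
    with s:
        headers = (
            "PUT /obj HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Expect: 100-continue\r\n\r\n"
        )
        s.sendall(headers.encode())
        s.settimeout(1.0)
        interim = s.recv(4096)
        if b"100 Continue" not in interim:
            return interim.decode()           # e.g. an early 401: body never sent
        s.sendall(body)                       # server agreed: stream the body
        return s.recv(4096).decode()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=tiny_server, args=(listener,), daemon=True).start()

response = upload_with_expect("127.0.0.1", port, b"payload")
print(response.split("\r\n")[0])
```

The key property for the retry use-case is the early-return branch: if the server answers with a final status (such as 401) instead of 100 Continue, the body is never read from the input stream, so the request can be retried.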
This would be wonderful.
I believe @durin42 has some thoughts about this as well.
*nod* We can do this, but it'll be easiest with a rewritten http client library. I've got one that's mostly API compatible with httplib (http://code.google.com/p/httpplus/) that I'm writing as part of my 20% work on Mercurial that we could try.
Note that the flow proposed in the initial bug isn't strictly RFC compliant - you can't wait indefinitely for the response to the Expect: header. You have to wait an unspecified delay and then continue optimistically, as servers SHOULD (not MUST) understand Expect: headers. Reverse proxies are allowed to strip Expect headers entirely, and many do so.
@kennethreitz I don't have any particular time to dedicate to this, but I'm happy to chat more about what we can do to make this better. Are you going to be at OSCON?
Since we now have streaming uploads (if I remember correctly) is anyone interested in taking care of this?
@kennethreitz do you have support for streaming uploads? I've not followed development of requests closely.
@durin42 yessir! http://docs.python-requests.org/en/latest/user/advanced/#streaming-uploads
I haven't really thought about this for any length of time, but is this a Requests-level issue or a urllib3-level issue? Put another way, does it belong in the HTTP Adapter or in urllib3?
I think we can handle it on our end.
Having looked at this a little this morning, it'll be very challenging to implement fully proper Expect header handling, because some servers assume that a client sending an Expect header can handle an early response, and may close the socket early in a way that breaks httplib.
That said, you should be able to fix this by waiting ~500ms after you send headers and peeking for a response; if you get a non-100-continue response, pretend to the httplib guts that the request body was completely sent.
In particular, if you tried to read from `low_conn.sock` here with a timeout, I think that might be the right jumping-off point:
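The "wait briefly and peek" idea above could look something like this. It's a sketch under assumptions: `early_response_waiting` is a hypothetical helper name, and `MSG_PEEK` is used so that any status line stays in the buffer for httplib to parse normally afterwards.

```python
import select
import socket

def early_response_waiting(sock, wait=0.5):
    """Peek for an early status line without consuming it; None if silent."""
    readable, _, _ = select.select([sock], [], [], wait)
    if not readable:
        return None                             # nothing yet: continue optimistically
    # MSG_PEEK leaves the bytes in the kernel buffer for the real parser.
    return sock.recv(1024, socket.MSG_PEEK) or None

# Demonstration with a socketpair standing in for the server connection.
client, server = socket.socketpair()
assert early_response_waiting(client, wait=0.1) is None   # server silent: send body
server.sendall(b"HTTP/1.1 401 Unauthorized\r\n\r\n")
early = early_response_waiting(client, wait=0.1)
print(early.split(b"\r\n")[0])
client.close()
server.close()
```

This matches the RFC-compliant behaviour @durin42 describes: wait a bounded delay, then proceed optimistically if the server has said nothing.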
Ah, `time.sleep`: the cause of, and solution to, all of life's problems.
We could do:
if 'Expect' in request.headers:
So we don't degrade performance for other cases, correct?
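To make the wait opt-in as suggested, a small guard at the adapter level would be enough. This is a sketch, not the actual requests code: HTTP header names are case-insensitive, so the check normalises them first, and the function name is hypothetical.

```python
def wants_100_continue(headers):
    """True only when the caller explicitly asked for the 100-continue flow."""
    normalised = {k.lower(): v.lower() for k, v in headers.items()}
    return normalised.get("expect") == "100-continue"

print(wants_100_continue({"Expect": "100-continue"}))
print(wants_100_continue({"Content-Length": "42"}))
```

Requests that never set the header skip the delay entirely, so the common path pays no performance cost.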
Any progress on this?
@edevil Nope. Feel free to pick it up if you want something to hack on. =)
I don't have the Python/requests/httplib proficiency to do this, but as a user and big fan of HTTPie, I'd love to see this implemented. =) Thanks for the consideration, and good luck with the implementation!
So I'm starting to wonder if this is entirely an adapter-level change. I hadn't thought much about this, but I think it would be easiest and make the most sense to do it at the adapter level. My reasoning:
Any update on this?
This would be very useful for our use case (uploading large objects to OpenStack Swift).
We're awaiting this feature.
There's been no progress on this, and it's not high on the list of priorities for any of the core development team. This is only likely to happen any time soon if someone else develops it. =)
Are there any updates on this bit? We're facing an issue and 100-continue can help make our lives much better.
See my above comment.
For those interested, this is probably not possible with our current stack. We talked with @durin42 tonight and it seems that httplib just swallows the 100-Continue and does not really wait for it either. This may be far more difficult than originally thought.
To continue keeping this issue up to date, the HTTPbis WG is considering deprecating the entire 1XX series of status codes as an erratum to RFCs 7230-7235. It's not entirely clear that this will happen, but what is clear is that the prevailing opinion in the WG is that 100 Continue is bad.
The most recent relevant WG thread is here. Note that the discussion is heavily informed by HTTP/2, which has no need of the 100 Continue mechanism. It's not clear exactly where this will go, but it puts the 1XX codes at risk.
I think this issue can now be characterised as "impossible without massive rewrite of httplib". For that reason, I want to close this down: having issues we cannot take action on is problematic.
If anyone needs a feature like this I ended up going with a patched httplib:
@edevil Is your patched httplib available anywhere as a package?
BTW, Amazon S3 recommends using 100-continue: http://docs.aws.amazon.com/AmazonS3/latest/dev/RESTRedirect.html
@cancan101 Sorry, no. But you have the patch in the ticket: http://bugs.python.org/file26357/issue1346874-273.patch
This is how this is supported in the AWS S3 Python CLI:
This would be useful for OpenStack Swift for the same reasons it's useful for S3: you get a 401 back before you've uploaded any data. This prevents wasted (potentially large) uploads and allows you to retry if your input is a stream.
We're desperately trying to avoid further patching httplib. I highly recommend trying to make this enhancement upstream if OpenStack values it: that would make it much easier for us to backport that support to earlier versions.
Upstream, i.e. httplib?
Right, it would definitely be nice and clean to have it in there.
The patch to httplib referenced above (http://bugs.python.org/issue1346874) has been up since 2005!
Have the requests folks had any luck in fixing functionality gaps by pushing stuff upstream?
Some, but it's not been as fast as we'd like in all cases. However, our only other option is to jettison httplib entirely, which is a bit excessive and puts a substantial maintenance burden on us. Where possible, working with upstream is the ideal thing to do.