Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSL3_WRITE_PENDING error from urllib3.contrib.pyopenssl#sendall #717

Closed
evnm opened this issue Oct 14, 2015 · 24 comments
Closed

SSL3_WRITE_PENDING error from urllib3.contrib.pyopenssl#sendall #717

evnm opened this issue Oct 14, 2015 · 24 comments

Comments

@evnm
Copy link
Contributor

evnm commented Oct 14, 2015

I'm writing a program that makes somewhat large (e.g. tens to hundreds of kilobytes) POST requests over HTTPS using requests. Such POSTs fail due to SSL3_WRITE_PENDING errors from OpenSSL. In short, SSL3_WRITE_PENDING occurs when an incomplete write operation (i.e. a write that resulted in a WantWriteError) is retried with arguments not exactly equal to those passed in the initial call (see here and here).

Stracktrace from my call into requests down through urllib3:

File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 511, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 370, in send
timeout=timeout
File "/usr/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 353, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/local/lib/python2.7/httplib.py", line 1053, in request
self._send_request(method, url, body, headers)
File "/usr/local/lib/python2.7/httplib.py", line 1093, in _send_request
self.endheaders(body)
File "/usr/local/lib/python2.7/httplib.py", line 1049, in endheaders
self._send_output(message_body)
File "/usr/local/lib/python2.7/httplib.py", line 893, in _send_output
self.send(msg)
File "/usr/local/lib/python2.7/httplib.py", line 869, in send
self.sock.sendall(data)
File "/usr/local/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 220, in sendall
sent = self._send_until_done(data[total_sent:total_sent+SSL_WRITE_BLOCKSIZE])
File "/usr/local/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 206, in _send_until_done
return self.connection.send(data)
File "/usr/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1271, in send
self._raise_ssl_error(self._ssl, result)
File "/usr/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1187, in _raise_ssl_error
_raise_current_error()
File "/usr/local/lib/python2.7/site-packages/OpenSSL/_util.py", line 48, in exception_from_error_queue
raise exception_type(errors)
Error: [('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]

A bit of digging revealed #412 and #626 as issues that relate to the relevant code in urllib3. WrappedSocket.sendall proxies to WrappedSocket._send_until_done, which invokes OpenSSL.SSL.send in a loop. The conditions for loop termination are a) the write succeeds or b) a timeout expires while waiting for the socket to become writable in the case of a WantWriteError.

I think that this issue boils down to:

  1. WrappedSocket.sendall converts the data to be sent to a memoryview.
  2. WrappedSocket._send_until_done invokes OpenSSL.SSL.send in a loop according to the conditions above.
  3. OpenSSL.SSL.send calls the memoryview.tobytes() and passes the bytestring result in the OpenSSL write call.
  4. In the WantWriteError case, 2 and 3 are repeated and a new bytestring is passed to the OpenSSL write, resulting in a SSL3_WRITE_PENDING error.

I was able to get around this issue by patching urllib3/contrib/pyopenssl.py to enforce that a single bytestring is used for all calls to OpenSSL.SSL.send: https://gist.github.com/evnm/af92092c6a7776e08339

Please advise on whether this is a good way to fix the issue. I tried adding a test case to test/with_dummyserver/test_https.py, but haven't figured out how to tickle the specific issue I've run into.

Relevant versions in use:

  • Python 2.7.10
  • requests 2.8.1
  • pyOpenSSL 0.15.1
  • urllib3 1.12
@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Oct 15, 2015

Thanks for this enormously detailed bug report @evnm! You've done a lot of really great digging here, which really makes our job a lot easier. 🍮

So your analysis seems generally to be correct. I haven't dived into whether it's correct that openssl's send method requires exactly the same data (the same pointers), but I am prepared to believe that it is, especially as your patch apparently resolves the problem.

This behaviour was introduced in #628, where we pressed for the use of a memoryview despite the fact that PyOpenSSL will roundtrip back to bytes. Unfortunately, that was bad advice. It was predicated on the assumption that PyOpenSSL could in future behave better with treating memoryviews as buffers and avoid the need to convert them to bytestrings.

Having since worked more closely with CFFI in this area, I don't believe this is likely to ever happen because of limitations in CFFI. Specifically, the discussions in CFFI issue 47 indicate that CFFI does not expect to support creating a char * object directly from a memoryview.

For this reason, PyOpenSSL can't make the enhancement we thought it could make, which means the memoryview is not an optimisation but a pessimisation. With that in mind, I think the correct fix is actually to remove the cast to memoryview and simply accept the copy when we slice the string.

@evnm Given that you did such good work diagnosing the bug and proposing a draft patch, would you like to open a pull request with a fix?

@evnm
Copy link
Contributor Author

evnm commented Oct 15, 2015

Sounds good. I confirmed that removing the memoryview usage entirely from WrappedSocket.sendall also fixes my issue. I spent some more time trying to write a unit test for this case, but gave up after failing to find a way to accurately mock the specific interaction I'm observing with OpenSSL. I will submit a PR and we can continue any remaining discussion there.

@shazow
Copy link
Member

shazow commented Oct 17, 2015

Fixed in #719

@kengruven
Copy link

I upgraded my requests library from 2.8.1 to 2.9.1 yesterday, and I started seeing this bug. I've confirmed in production that it's running requests 2.9.1, with its included urllib3 1.13.1, and that the absolute pathnames in the stack trace are the right ones. (It's a freshly built Docker image, so there's pretty much no chance of an old library sitting around.)

The only difference in the stack trace is that in pyopenssl.py, it says "line 218" for me (the 1.13 location of _send_until_done()) rather than "line 220" (the 1.12 location of that call).

This is on Python 2.7.10 here.

I'm going to try downgrading requests to 2.7.0 to see if that does any good. It's before the memoryview got added, but then, there's no memoryview in the library I've got, either.

Ideas?

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Jan 7, 2016

If you can provide a good repro of this that would be enormously helpful!

@kengruven
Copy link

I'll see if I can isolate it, and provide a simple repro.

Things we (think we) know so far:

  • We're using requests here indirectly, to connect to ElasticSearch via the elasticsearch Python library, with connection_class=elasticsearch.connection.RequestsHttpConnection
  • It still connects, and can run queries just fine
  • Sometimes, adding data works. The problem only seems to occur when adding a lot of data to ES, which we do via elasticsearch.helpers.bulk
  • We call bulk() with the defaults, which are up to 500 documents at once (we probably hit this sometimes) and 100MB (we never hit this)
  • It started happening when we upgraded requests from 2.8.1 to 2.9.1 (we also used 2.3.0 for a while before upgrading to 2.8.1), but a few other things changed around the same time

@nitzmahone
Copy link

I've got this very reproducible while developing a requests-based fork of pywinrm (against a Windows wsman endpoint), and I'm getting some form of it on everything from 2.3.0 to today's master. If I back off to requests 2.3.0-2.4.0, it fails with a WantWriteError that I've seen referenced lots of places, but even using the latest requests master (which has urllib3 1.13.1) still exhibits the issue with the naked OpenSSL SSL3_WRITE_PENDING/bad write retry for me. The larger the payload, the more likely it is to occur- wsman breaks after about 450k, so I can't go higher than that, but right near 450k, it'll happen on ~1:5 requests. OSX 10.11.2/Python 2.7.9, talking to a local VirtualBox-hosted Windows Server 2012R2 instance.

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Mar 31, 2016

@nitzmahone And your stack trace is exactly the same as this?

@nitzmahone
Copy link

Yup (this one is from PyPI-installed release 2.9.1):

Traceback (most recent call last):   
(my stuff elided...)
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/adapters.py", line 376, in send
timeout=timeout   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 353,
in _make_request     conn.request(method, url, **httplib_request_kw)   
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1001, in request     self._send_request(method, url,
body, headers)   
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1035, in _send_request
self.endheaders(body)   
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 997, in endheaders
self._send_output(message_body)   
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 854, in
_send_output     self.send(message_body)   
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 826,
in send     self.sock.sendall(data)   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py",
line 218, in sendall     sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 207, in _send_until_done     return
self.connection.send(data)   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1271, in send
self._raise_ssl_error(self._ssl, result)   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1187, in
_raise_ssl_error     _raise_current_error()   
File "/Users/mdavis/.virtualenvs/ansiblev2/lib/python2.7/site-packages/OpenSSL/_util.py", line 48, in
exception_from_error_queue     raise exception_type(errors) Error: [('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Mar 31, 2016

Are you sending your large payload from a file object or something else?

@nitzmahone
Copy link

No- manually constructing the payloads as strings (b64-encoded file data inside SOAP envelope).

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Mar 31, 2016

@nitzmahone Does this work reproducibly in the same virtual environment just repeatedly POSTing large payloads (e.g. 800kB of data) to httpbin.org?

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Mar 31, 2016

What I'm aiming to do here is to get something that I can reproduce on my own machine.

@nitzmahone
Copy link

The client's a Mac- it fails against my local Windows VM with or without Fiddler in the middle. I'll try to put together an isolated repro that hits the problem on that setup, then run it on something else...

@nitzmahone
Copy link

@Lukasa : Finally figured out what's making this fail- the data string has to be of type unicode (instead of str), then it'll blow up quite reliably on an SSL retry. The following repro blows up great against httpbin.org for me (on a Mac, anyway):

import os
from requests import Request, Session
from random import randint
import base64

endpoint = 'https://httpbin.org/post'

def sendbig():

    session = Session()
    session.trust_env = True
    settings = session.merge_environment_settings(endpoint, {}, None, False, None)
    session.proxies = settings['proxies']

    for x in range(1,1000):
        data = os.urandom(randint(400000, 500000))
        bare_req = Request('POST', endpoint, data=unicode(base64.b64encode(data)))

        req = session.prepare_request(bare_req)
        print "sending %d bytes" % len(data)
        response = session.send(req, timeout=60, verify=False)
        print "done - %s" % str(x)
        response.raise_for_status()

sendbig()

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Apr 5, 2016

Ah, good catch. Ironically, now that you've pointed that out I can tell you that if you had deprecation warnings turned on you'd have had warnings pop out that pointed the finger at exactly this problem. =( The ongoing downside of having deprecation warnings disabled in Python by default.

The simplest fix here is to just not send unicode data. It's generally unwise to rely on the auto-encoding of unicode objects when attempting to send them over the wire, as you can get stunningly unexpected results. This is one of them.

The core problem is that PyOpenSSL itself will convert a unicode object to a bytes object before sending it to OpenSSL: it has to, as OpenSSL has no concept of handling a unicode object. This necessarily creates a new object each time it's done. That means that if a unicode object is passed to PyOpenSSL and a WantWriteError fires, we will leak the bytes object and cause the exception you're now seeing.

I have mixed feelings on whether or not urllib3 should handle this appropriately. I'm inclined to say "no" because Python 3 doesn't allow you to send unicode objects into sockets (it'll raise a TypeError). So my position on this is that your code is in error, though I'm willing to be convinced otherwise.

@nitzmahone
Copy link

Yep, once I realized it only failed on unicode-typed input, I figured that might be the case after following the previous issues with WantWriteError and SSL_WRITE_PENDING. Just wanted to follow up with my resolution in case others trip over this later. There's no great reason for our input string to be unicode-typed in this case anyway (it was kinda accidental just because of the source of the original data), so easy fix for us. An outright early failure on HTTPS + unicode-typed input would've caught this (since we're on Python 2.6/2.7 and don't run w/ deprecation warnings), but I suspect that'd surprise a lot of folks as a breaking change (even though they're already subject to Heisenbugs from this now on Python 2.x). I prefer fail-early over fail-maybe, but it's your show, boss. ;) Thanks!

@Akdora
Copy link

Akdora commented Mar 6, 2017

I had the same problem. I was doing fn_some_replacements(unicode(data)) before sending the data.
After I commented this row, (not doing unicode process) it worked. thanks for the idea. unicoding the data causes some internal problems.

@cristianocca
Copy link

I'm having the exact problem, using python elasticsearch, and requests_aws4auth, any work around or expected fixes?

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Sep 15, 2017

@cristianocca Yes: stop sending unicode objects as request bodies. 😁

@cristianocca
Copy link

I have tried that but it doesn't seem I actually have control, somewhere in requests, requests_aws4auth or python-elasticsearch sends unicode. Guess I might have to report on those instead

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Sep 15, 2017

It won't be requests. I have a fuzzy recollection that python-elasticsearch has had a tendency of doing this.

@haikuginger
Copy link
Contributor

haikuginger commented Sep 15, 2017

Yeah - a past version of requests could cause this issue, but I think we've mostly seen it in the context of python-elasticsearch, when that library attempts to send unicode.

@cristianocca
Copy link

The issue seems to come from requests_aws4auth:

tedder/requests-aws4auth#24

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants