Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSLContext.wrap_socket() throws OSError with errno == 0 #75305

Closed
nikratio mannequin opened this issue Aug 4, 2017 · 18 comments
Closed

SSLContext.wrap_socket() throws OSError with errno == 0 #75305

nikratio mannequin opened this issue Aug 4, 2017 · 18 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes topic-SSL type-bug An unexpected behavior, bug, or error

Comments

@nikratio
Copy link
Mannequin

nikratio mannequin commented Aug 4, 2017

BPO 31122
Nosy @gpshead, @tiran, @jab, @bdarnell, @vadmium, @dimaqq, @Safihre, @xgid, @miss-islington, @remilapeyre, @shevisjohnson
PRs
  • bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation #18772
  • [3.9] bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772) #21888
  • [3.8] bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772) #21889
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2020-08-15.17:06:01.924>
    created_at = <Date 2017-08-04.19:16:41.456>
    labels = ['expert-SSL', '3.8', 'type-bug', '3.7', '3.9']
    title = 'SSLContext.wrap_socket() throws OSError with errno == 0'
    updated_at = <Date 2020-08-15.17:45:01.059>
    user = 'https://bugs.python.org/nikratio'

    bugs.python.org fields:

    activity = <Date 2020-08-15.17:45:01.059>
    actor = 'miss-islington'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2020-08-15.17:06:01.924>
    closer = 'gregory.p.smith'
    components = ['SSL']
    creation = <Date 2017-08-04.19:16:41.456>
    creator = 'nikratio'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31122
    keywords = ['patch']
    message_count = 18.0
    messages = ['299759', '299766', '299793', '328093', '332779', '336732', '363318', '363322', '363329', '363333', '364067', '367335', '375471', '375478', '375479', '375481', '375482', '375484']
    nosy_count = 12.0
    nosy_names = ['gregory.p.smith', 'christian.heimes', 'jab', 'nikratio', 'Ben.Darnell', 'martin.panter', 'Dima.Tisnek', 'Safihre', 'xgdomingo', 'miss-islington', 'remi.lapeyre', 'shevis']
    pr_nums = ['18772', '21888', '21889']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue31122'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @nikratio
    Copy link
    Mannequin Author

    nikratio mannequin commented Aug 4, 2017

    With a particularly atrocious network connection, I often get the following exception:

    File "/usr/lib/python3/dist-packages/dugong/init.py", line 503, in connect
    self._sock = self.ssl_context.wrap_socket(self._sock, server_hostname=server_hostname)
    File "/usr/lib/python3.5/ssl.py", line 385, in wrap_socket
    _context=self)
    File "/usr/lib/python3.5/ssl.py", line 760, in __init__
    self.do_handshake()
    File "/usr/lib/python3.5/ssl.py", line 996, in do_handshake
    self._sslobj.do_handshake()
    File "/usr/lib/python3.5/ssl.py", line 641, in do_handshake
    self._sslobj.do_handshake()
    OSError: [Errno 0] Error

    I don't think an error with errno == 0 should ever be raised by Python.

    @nikratio nikratio mannequin assigned tiran Aug 4, 2017
    @nikratio nikratio mannequin added topic-SSL type-bug An unexpected behavior, bug, or error labels Aug 4, 2017
    @vadmium
    Copy link
    Member

    vadmium commented Aug 5, 2017

    It might help if you explained what “atrocities” are happening on your network. Is there a proxy or man-in-the-middle (or the remote peer) that shuts down TCP connections?

    If so, perhaps this is similar to bpo-10808. From my memory, in that case an OS “recv” or “read” call returns zero to indicate a connection was shut down. Python and/or Open SSL correctly treats this as an error, but then assumes “errno” is valid. Perhaps Python should be raising SSLEOFError in this situation.

    Maybe also check the “suppress_ragged_eofs” setting, but I think that only affects later stages, after the handshake succeeds.

    @nikratio
    Copy link
    Mannequin Author

    nikratio mannequin commented Aug 5, 2017

    Regarding "atrocious connection": I wish I knew, but I have no control of the connection. All I can tell is that there are frequent disconnects, occasional latency spikes, my remote ip address seems to change frequently (while the apparent local one stays as-is, so presumably some NAT in between), temporary bandwidth drops, etc.

    @bdarnell
    Copy link
    Mannequin

    bdarnell mannequin commented Oct 20, 2018

    We have an easy reproduction of this "[Errno 0] Error" on the server side in tornadoweb/tornado#2504 (comment)

    It is triggered by a connection from nc -z (which I think is doing a TCP handshake and shutting down the connection cleanly, but I'm not sure. It might just send an RST instead of the clean shutdown). On macos, I get SSL_ERROR_EOF (as expected), but on linux it raises an OSError with errno 0. (Note that the script as posted has a small mistake in that it is using a client-side SSLContext on the server side. The same error is seen when that mistake is fixed)

    I'm going to add "errno 0" to the list of errors that Tornado should swallow silently here, so if you're trying to reproduce this in the future use Tornado 5.1.1.

    @gpshead
    Copy link
    Member

    gpshead commented Dec 31, 2018

    This flakiness just caused a PR merge to be blocked by AppVeyor for me:

    ======================================================================
    ERROR: test_with_statement (test.test_nntplib.NetworkedNNTP_SSLTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "C:\projects\cpython\lib\test\test_nntplib.py", line 242, in wrapped
        meth(self)
      File "C:\projects\cpython\lib\test\test_nntplib.py", line 264, in test_with_statement
        with self.NNTP_CLASS(self.NNTP_HOST, timeout=TIMEOUT, usenetrc=False) as server:
      File "C:\projects\cpython\lib\nntplib.py", line 1077, in __init__
        self.sock = _encrypt_on(self.sock, ssl_context, host)
      File "C:\projects\cpython\lib\nntplib.py", line 292, in _encrypt_on
        return context.wrap_socket(sock, server_hostname=hostname)
      File "C:\projects\cpython\lib\ssl.py", line 405, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\projects\cpython\lib\ssl.py", line 853, in _create
        self.do_handshake()
      File "C:\projects\cpython\lib\ssl.py", line 1117, in do_handshake
        self._sslobj.do_handshake()
    OSError: [Errno 0] Error

    https://ci.appveyor.com/project/python/cpython/builds/21299396

    I lucked out by kicking it with a no-op change to re-trigger an appveyor run where it passed.

    @gpshead gpshead added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Dec 31, 2018
    @Safihre
    Copy link
    Mannequin

    Safihre mannequin commented Feb 27, 2019

    In the CherryPy project it is also observed on Windows with Python 3,7.2.
    In CherryPy it's triggered by Checker plugin, which connects to the app listening to the socket port in TLS mode via plain HTTP during startup (from the same process).
    It has been around for a while: cherrypy/cherrypy#1618 (comment)

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Mar 4, 2020

    I volunteer to test the theory that the connection is closed mid-flight.

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Mar 4, 2020

    Rejoice: https://github.com/dimaqq/bpo-31122
    Short, easy to reproduce :)
    (I've tested on Mac: 3.7, 3.8, 3.9a from python.org, linked against OpenSSL 1.1.1c/d)
    Funnily enough, Python 2.7 raises an ssl.SSLEOFError instead 🤷‍♂️

    @dimaqq dimaqq mannequin added the 3.9 only security fixes label Mar 4, 2020
    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Mar 4, 2020

    I've traced it down to here:

    cpython/Modules/_ssl.c

    Lines 791 to 798 in be501ca

    if (err.c) {
    errno = err.c;
    return PyErr_SetFromErrno(PyExc_OSError);
    }
    Py_INCREF(s);
    s->errorhandler();
    Py_DECREF(s);
    return NULL;

    err.c (errno) == 0, no error, and err.ssl == 5, SSL_ERROR_SYSCALL, helpfully commented "look at error stack/return value/errno" in openssl/ssl.h 😅

    I'm a bit suspicious about s->errorhandler() which is some old convention (git blame: 8 years ago), commented "checks errno, returns NULL, set a Python exception", but at this point, we know that errno is 0, so why call it?

    I'm thinking to just change that to SSLEOFError, but I wonder if something else might break?

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Mar 4, 2020

    #18772 posted

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Mar 13, 2020

    If someone can review #18772 then pretty-please review 🙏

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Apr 27, 2020

    I know Christian is very busy, so what can I do to have this patch reviewed?

    • it's concise
    • there's a reproducer

    @Safihre
    Copy link
    Mannequin

    Safihre mannequin commented Aug 15, 2020

    Would anyone be able to review this? People keep reporting this bug in my project.
    Months are just passing while the PR is just a few lines of code:
    https://github.com/python/cpython/pull/18772/files

    @miss-islington
    Copy link
    Contributor

    New changeset 495bd03 by Dima Tisnek in branch 'master':
    bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772)
    495bd03

    @gpshead
    Copy link
    Member

    gpshead commented Aug 15, 2020

    Thanks!

    fyi for confirmation incase anyone doubted:

    >>> issubclass(ssl.SSLEOFError, OSError)
    True

    So from a code point of view, anything already catching the error still catches the error. 100% bugfix.

    @gpshead gpshead assigned gpshead and unassigned tiran Aug 15, 2020
    @gpshead
    Copy link
    Member

    gpshead commented Aug 15, 2020

    While this is present in 3.7 (and earlier?), 3.7 is EOL - security fix only stage. the 3.8 and 3.9 PRs should automerge after CI finishes.

    please reopen the issue or ping me on those PRs if they somehow fail to do so.

    @gpshead gpshead closed this as completed Aug 15, 2020
    @miss-islington
    Copy link
    Contributor

    New changeset 2434581 by Miss Islington (bot) in branch '3.8':
    bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772)
    2434581

    @miss-islington
    Copy link
    Contributor

    New changeset fc8ffe2 by Miss Islington (bot) in branch '3.9':
    bpo-31122: ssl.wrap_socket() now raises ssl.SSLEOFError rather than OSError when peer closes connection during TLS negotiation (GH-18772)
    fc8ffe2

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes topic-SSL type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants