Severe open file leakage running asyncio SSL server #74156
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = 'https://github.com/1st1'
closed_at = <Date 2017-12-19.20:02:01.780>
created_at = <Date 2017-04-03.14:10:15.014>
labels = ['expert-asyncio', '3.7', 'performance']
title = 'Severe open file leakage running asyncio SSL server'
updated_at = <Date 2017-12-20.19:27:29.217>
user = 'https://github.com/kyuupichan'

activity = <Date 2017-12-20.19:27:29.217>
actor = 'asvetlov'
assignee = 'yselivanov'
closed = True
closed_date = <Date 2017-12-19.20:02:01.780>
closer = 'asvetlov'
components = ['asyncio']
creation = <Date 2017-04-03.14:10:15.014>
creator = 'kyuupichan'
dependencies =
files =
hgrepos =
issue_num = 29970
keywords = ['patch']
message_count = 14.0
messages = ['291071', '291135', '296294', '296297', '296298', '308084', '308146', '308673', '308676', '308765', '308769', '308784', '308785', '308786']
nosy_count = 6.0
nosy_names = ['giampaolo.rodola', 'asvetlov', 'mocmocamoc', 'yselivanov', 'fafhrd91', 'kyuupichan']
pr_nums = ['4825', '4939']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue29970'
versions = ['Python 3.7']
Original report at old repo here: python/asyncio#483
There, the problem is reported as fixed by #480.
I wish to report that whilst the above patch might have a small positive effect, it is far from solving the actual issue. Several users report eventual exhaustion of the open file resource running SSL asyncio servers.
Here are graphs provided by a friend running my ElectrumX server software, the first accepting SSL connections and the second accepting TCP connections only. Both servers were monkey-patched with the pull-480 fix above, so this is evidence that it isn't solving the issue.
As you can see, the TCP server (which has far fewer connections; most users use SSL) has no leaked file handles, whereas the SSL server has over 300.
This becomes an easy denial of service vector against asyncio servers. One way to trigger this (though I doubt it explains the numbers above) is simply to connect to the SSL server from telnet and do nothing. asyncio doesn't time you out; the telnet session seems to sit there forever, and the open file resources are lost in the SSL handshake stage until the remote host kindly decides to disconnect.
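The telnet experiment can be approximated in a few lines of Python. This is only a sketch for testing your own server: the helper name open_idle_connections and its parameters are made up for illustration, and it does nothing more than open TCP connections and stay silent, so the server side is left waiting for a ClientHello that never arrives.

```python
import socket


def open_idle_connections(host, port, count):
    """Open `count` plain TCP connections and send nothing on any of them.

    Hypothetical helper for illustration: against an SSL server, each
    connection sits in the handshake stage until one side closes it.
    """
    conns = []
    for _ in range(count):
        # create_connection() completes the TCP handshake only;
        # no TLS data is ever written.
        conns.append(socket.create_connection((host, port), timeout=5))
    return conns
```

Keeping the returned sockets open while watching the server's open file count (for example with lsof) should show one descriptor held per idle connection; closing them releases the descriptors again.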
I suspect these resource issues all revolve around the SSL handshake process, certainly at the opening of a connection, but also perhaps when closing.
As the application author, I am not informed by asyncio of a potential connection until the initial handshake is complete, so I cannot do anything to close these phantom socket connections. I have to rely on asyncio to handle DoS issues properly, and it is not currently doing so robustly.
I think there's been some confusion about what PR 480 was meant to fix. It helps in cases where connections are closed during the handshake, but if a server connection is waiting for a handshake and never receives any data at all, it stays in that state forever.
As for a fix, how about giving SSLProtocol a method like:
    def checkHandshakeDone(self):
        if self._in_handshake:
            self._abort()
and then at the end of _start_handshake() adding:
Then if the handshake is not complete within ten seconds of starting, the connection will be aborted.
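For illustration, here is a minimal, self-contained sketch of the timeout idea. It uses a toy stand-in class rather than asyncio's real SSLProtocol, and it assumes the check is scheduled with the event loop's call_later. The names ToyHandshakeProtocol, HANDSHAKE_TIMEOUT, _on_handshake_complete and demo are hypothetical.

```python
import asyncio

HANDSHAKE_TIMEOUT = 10.0  # seconds, the value suggested above


class ToyHandshakeProtocol:
    """Toy stand-in for SSLProtocol; it only models the handshake flag."""

    def __init__(self, loop, timeout=HANDSHAKE_TIMEOUT):
        self._loop = loop
        self._timeout = timeout
        self._in_handshake = False
        self._timer = None
        self.aborted = False

    def _abort(self):
        # A real protocol would abort the underlying transport here.
        self.aborted = True
        self._in_handshake = False

    def _check_handshake_done(self):
        # Runs once, `timeout` seconds after the handshake started.
        if self._in_handshake:
            self._abort()

    def _start_handshake(self):
        self._in_handshake = True
        # Assumed scheduling call: arm a one-shot timer for the check.
        self._timer = self._loop.call_later(
            self._timeout, self._check_handshake_done)

    def _on_handshake_complete(self):
        self._in_handshake = False
        if self._timer is not None:
            self._timer.cancel()  # no need to fire the check any more


async def demo():
    loop = asyncio.get_running_loop()
    stalled = ToyHandshakeProtocol(loop, timeout=0.05)
    finished = ToyHandshakeProtocol(loop, timeout=0.05)
    stalled._start_handshake()       # peer never sends anything
    finished._start_handshake()
    finished._on_handshake_complete()
    await asyncio.sleep(0.2)         # let the timers expire
    return stalled.aborted, finished.aborted
```

Running asyncio.run(demo()) exercises both paths: the stalled handshake is aborted by the timer, while the completed one cancels its timer and is left alone. Cancelling the timer on completion is a design choice of this sketch; the flag check alone would also make a late callback harmless.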