Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent asyncio.open_connection / futures.InvalidStateError #65646

Closed
ryderlewis mannequin opened this issue May 6, 2014 · 8 comments
Closed

Intermittent asyncio.open_connection / futures.InvalidStateError #65646

ryderlewis mannequin opened this issue May 6, 2014 · 8 comments
Labels
topic-asyncio topic-IO type-bug An unexpected behavior, bug, or error

Comments

@ryderlewis
Copy link
Mannequin

ryderlewis mannequin commented May 6, 2014

BPO 21447
Nosy @gvanrossum, @pitrou, @vstinner, @giampaolo, @asvetlov, @1st1

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2014-06-30.14:09:02.143>
created_at = <Date 2014-05-06.18:28:19.321>
labels = ['type-bug', 'expert-IO', 'expert-asyncio']
title = 'Intermittent asyncio.open_connection / futures.InvalidStateError'
updated_at = <Date 2014-07-05.13:31:25.954>
user = 'https://bugs.python.org/ryderlewis'

bugs.python.org fields:

activity = <Date 2014-07-05.13:31:25.954>
actor = 'python-dev'
assignee = 'none'
closed = True
closed_date = <Date 2014-06-30.14:09:02.143>
closer = 'vstinner'
components = ['IO', 'asyncio']
creation = <Date 2014-05-06.18:28:19.321>
creator = 'ryder.lewis'
dependencies = []
files = []
hgrepos = []
issue_num = 21447
keywords = []
message_count = 8.0
messages = ['217999', '218003', '218018', '218025', '218027', '220999', '221962', '222358']
nosy_count = 8.0
nosy_names = ['gvanrossum', 'pitrou', 'vstinner', 'giampaolo.rodola', 'asvetlov', 'python-dev', 'yselivanov', 'ryder.lewis']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue21447'
versions = ['Python 3.4']

@ryderlewis
Copy link
Mannequin Author

ryderlewis mannequin commented May 6, 2014

Intermittently, when using asyncio.wait_for() with asyncio.open_connection() to cancel the open_connection() task after a timeout period, an asyncio.futures.InvalidStateError is raised. It seems to be a race condition, if the open_connection() call succeeds roughly at the same time as the timeout.

In order to recreate this issue, I used a slightly-modified version of the example at https://docs.python.org/3.4/library/asyncio-stream.html?highlight=open_connection#example

==== example.py ====

import asyncio
import urllib.parse
import sys

@asyncio.coroutine
def print_http_headers(url, timeout):
    url = urllib.parse.urlsplit(url)
    try:
        reader, writer = yield from asyncio.wait_for(asyncio.open_connection(url.hostname, 80), timeout)
    except asyncio.TimeoutError:
        print('Timed out with {}-second timeout'.format(timeout))
        return

    query = ('HEAD {url.path} HTTP/1.0\r\n'
             'Host: {url.hostname}\r\n'
             '\r\n').format(url=url)
    writer.write(query.encode('latin-1'))
    while True:
        line = yield from reader.readline()
        if not line:
            break
        line = line.decode('latin1').rstrip()
        if line:
            print('HTTP header> %s' % line)

    print('Success with {}-second timeout'.format(timeout))

url = sys.argv[1]
loop = asyncio.get_event_loop()

for timeout in range(5, 0, -1):
    task = asyncio.async(print_http_headers(url, timeout/100))
    loop.run_until_complete(task)

loop.close()

==== Output of "./example.py http://www.yahoo.com/" after running multiple times ====

HTTP header> HTTP/1.0 301 Redirect
HTTP header> Date: Tue, 06 May 2014 18:07:43 GMT
HTTP header> Connection: close
HTTP header> Via: http/1.1 ir12.fp.ne1.yahoo.com (ApacheTrafficServer/4.0.2)
HTTP header> Server: ATS
HTTP header> Cache-Control: no-store
HTTP header> Content-Type: text/html; charset=utf-8
HTTP header> Content-Language: en
HTTP header> Location: https://www.yahoo.com/
HTTP header> Content-Length: 214
Success with 0.04-second timeout
Timed out with 0.03-second timeout
Timed out with 0.02-second timeout
Exception in callback <bound method Future.set_result of Future<CANCELLED>>(None,)
handle: Handle(<bound method Future.set_result of Future<CANCELLED>>, (None,))
Traceback (most recent call last):
  File "/opt/python3/lib/python3.4/asyncio/events.py", line 39, in _run
    self._callback(*self._args)
  File "/opt/python3/lib/python3.4/asyncio/futures.py", line 298, in set_result
    raise InvalidStateError('{}: {!r}'.format(self._state, self))
asyncio.futures.InvalidStateError: CANCELLED: Future<CANCELLED>
Timed out with 0.01-second timeout

@ryderlewis ryderlewis mannequin added topic-IO type-bug An unexpected behavior, bug, or error labels May 6, 2014
@ryderlewis
Copy link
Mannequin Author

ryderlewis mannequin commented May 6, 2014

Another run raised a different exception,
again running ./example.py http://www.yahoo.com/

Timed out with 0.05-second timeout
HTTP header> HTTP/1.0 301 Redirect
HTTP header> Date: Tue, 06 May 2014 18:58:53 GMT
HTTP header> Connection: close
HTTP header> Via: http/1.1 ir38.fp.bf1.yahoo.com (ApacheTrafficServer/4.0.2)
HTTP header> Server: ATS
HTTP header> Cache-Control: no-store
HTTP header> Content-Type: text/html; charset=utf-8
HTTP header> Content-Language: en
HTTP header> Location: https://www.yahoo.com/
HTTP header> Content-Length: 214
Success with 0.04-second timeout
HTTP header> HTTP/1.0 301 Redirect
HTTP header> Date: Tue, 06 May 2014 18:58:53 GMT
HTTP header> Connection: close
HTTP header> Via: http/1.1 ir45.fp.bf1.yahoo.com (ApacheTrafficServer/4.0.2)
HTTP header> Server: ATS
HTTP header> Cache-Control: no-store
HTTP header> Content-Type: text/html; charset=utf-8
HTTP header> Content-Language: en
HTTP header> Location: https://www.yahoo.com/
HTTP header> Content-Length: 214
Success with 0.03-second timeout
Timed out with 0.02-second timeout
Timed out with 0.01-second timeout
exception calling callback for <Future at 0x7f24e43aa710 state=finished returned list>
Traceback (most recent call last):
  File "/opt/python3/lib/python3.4/concurrent/futures/_base.py", line 297, in _invoke_callbacks
    callback(self)
  File "/opt/python3/lib/python3.4/asyncio/futures.py", line 370, in <lambda>
    new_future._copy_state, fut))
  File "/opt/python3/lib/python3.4/asyncio/base_events.py", line 285, in call_soon_threadsafe
    self._write_to_self()
  File "/opt/python3/lib/python3.4/asyncio/selector_events.py", line 91, in _write_to_self
    self._csock.send(b'x')
AttributeError: 'NoneType' object has no attribute 'send'

@gvanrossum
Copy link
Member

The second error is easy to explain and fix: it's a race condition between the OS thread used to call getaddrinfo() and the main thread. The method _write_to_self() in selector_events.py is hopelessly naive. It should probably become something like this:

    def _write_to_self(self):
        csock = self._csock
        if csock is not None:
            try:
                self._csock.send(b'x')
            except OSError:
		pass

It is possible that the main thread closes csock at any time, and calling send() on a closed socket will raise OSError with errno=9 (EBADF). Fortunately this is because close() sets the fd to -1; so there is no worry about reuse of the fd.

I will investigate the first traceback next.

@python-dev
Copy link
Mannequin

python-dev mannequin commented May 6, 2014

New changeset afe8c4bd3230 by Guido van Rossum in branch '3.4':
asyncio: Fix the second half of issue bpo-21447: race in _write_to_self().
http://hg.python.org/cpython/rev/afe8c4bd3230

New changeset c0538334f4df by Guido van Rossum in branch 'default':
Merge 3.4->default: asyncio: Fix the second half of issue bpo-21447: race in _write_to_self().
http://hg.python.org/cpython/rev/c0538334f4df

@gvanrossum
Copy link
Member

I'm a little closer to understanding the first (more common) traceback. I can repro it by running your demo program in a loop -- it may take a while but occasionally I do get the same InvalidStateError.

This appears to be an example of the problem speculated about in message #5 of https://code.google.com/p/tulip/issues/detail?id=58.

In order to find out which occurrence of call_later(future.set_result, None) is causing this, I replaced all of them (about half a dozen) with an equivalent lambda -- the lambda shows up in the traceback, unlike the location where call_soon() is called.

In this particular example, it is the last line of _SelectorSocketTransport.__init__() in selector_events.py (line 446).

I suspect there are other places where a similar problem can occur. I still have to think more about how to fix this (without changing set_result() to always ignore a cancelled Future -- such a change would mask certain errors that I find important to catch).

@vstinner
Copy link
Member

I'm unable to reproduce the issue with Python 3.5 (development version).

@vstinner
Copy link
Member

This issue contains two sub-issues:

  • race condition in_write_to_self() => already fixed
  • race condition with scheduled call to future.set_result(), InvalidStateError => I just opened the issue bpo-21886 to discuss it

@ryder: If you are able to reproduce the second issue (InvalidStateError), please use set the environment variable PYTHONASYNCIODEBUG=1 to see the traceback where the call to set_result() was scheduled. It requires the latest development version of Tulip, Python 3.4 or Python 3.5 to get the traceback.

I close this issue because I prefer to discuss the InvalidStateError in the issue bpo-21886.

Thanks for the report Ryder. Thanks for the fix for the race condition in _write_to_self() Guido.

@python-dev
Copy link
Mannequin

python-dev mannequin commented Jul 5, 2014

New changeset d7e4efd5e279 by Victor Stinner in branch '3.4':
Closes bpo-21886, bpo-21447: Fix a race condition in asyncio when setting the result
http://hg.python.org/cpython/rev/d7e4efd5e279

New changeset 50c995bdc00a by Victor Stinner in branch 'default':
(Merge 3.4) Closes bpo-21886, bpo-21447: Fix a race condition in asyncio when setting
http://hg.python.org/cpython/rev/50c995bdc00a

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-asyncio topic-IO type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants