Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement asyncio Future in C to improve performance #70269

Closed
1st1 opened this issue Jan 11, 2016 · 50 comments
Closed

Implement asyncio Future in C to improve performance #70269

1st1 opened this issue Jan 11, 2016 · 50 comments
Assignees
Labels
3.7 expert-asyncio performance Performance or resource usage

Comments

@1st1
Copy link
Member

1st1 commented Jan 11, 2016

BPO 26081
Nosy @gvanrossum, @vstinner, @giampaolo, @ned-deily, @methane, @serhiy-storchaka, @1st1, @mpaolini, @RemiCardona
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • futures.patch
  • futures.patch: bugfix and implement remove_done_callback
  • futures.patch
  • fastfuture.patch: merge master
  • fastfuture.patch: recreate patch
  • fastfuture.patch: _blocking -> _asyncio_future_blocking
  • fastfuture.patch
  • fastfuture2.patch
  • fastfuture3-wip.patch
  • fastfuture3.patch
  • fastfuture4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/1st1'
    closed_at = <Date 2016-11-09.00:06:18.117>
    created_at = <Date 2016-01-11.17:23:42.941>
    labels = ['expert-asyncio', '3.7', 'performance']
    title = 'Implement asyncio Future in C to improve performance'
    updated_at = <Date 2017-03-31.16:36:12.613>
    user = 'https://github.com/1st1'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:12.613>
    actor = 'dstufft'
    assignee = 'yselivanov'
    closed = True
    closed_date = <Date 2016-11-09.00:06:18.117>
    closer = 'yselivanov'
    components = ['asyncio']
    creation = <Date 2016-01-11.17:23:42.941>
    creator = 'yselivanov'
    dependencies = []
    files = ['41579', '43670', '44081', '44531', '44544', '44547', '44670', '44993', '45003', '45012', '45023']
    hgrepos = []
    issue_num = 26081
    keywords = ['patch']
    message_count = 50.0
    messages = ['257984', '269773', '270037', '270055', '270068', '270069', '270070', '270175', '270183', '270221', '270232', '270924', '270928', '272242', '272245', '272262', '272278', '272285', '272327', '272347', '272498', '273802', '275658', '275729', '275734', '275740', '275903', '275904', '275905', '275906', '276494', '276504', '276505', '276513', '278204', '278206', '278224', '278227', '278243', '278294', '278325', '278328', '278348', '278357', '278365', '278366', '278368', '278370', '280359', '280363']
    nosy_count = 10.0
    nosy_names = ['gvanrossum', 'vstinner', 'giampaolo.rodola', 'ned.deily', 'methane', 'python-dev', 'serhiy.storchaka', 'yselivanov', 'mpaolini', 'RemiCardona']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue26081'
    versions = ['Python 3.6', 'Python 3.7']

    @1st1
    Copy link
    Member Author

    1st1 commented Jan 11, 2016

    Some info on this: python/asyncio#282 (comment) Long story short, Future implemented in C can speedup some asyncio code up to 25%.

    I'm attaching a patch with a WIP implementation. There are some failing assertions deep in GC, which I need to track down. 'Future.remove_done_callback' still needs to be properly implemented.

    @1st1 1st1 self-assigned this Jan 11, 2016
    @1st1 1st1 added expert-asyncio performance Performance or resource usage labels Jan 11, 2016
    @gvanrossum
    Copy link
    Member

    gvanrossum commented Jul 3, 2016

    We should really try to get this into 3.6.

    --Guido (mobile)

    @methane
    Copy link
    Member

    methane commented Jul 9, 2016

    I'm working on this. Some bugs are fixed, but doesn't pass tests for now.
    methane#5

    @methane
    Copy link
    Member

    methane commented Jul 9, 2016

    Passing all tests now.
    Yury, could you explain what the comment "This isn't a Future class; it's a BaseFuture" means?

    Should it be "_futures.Future" or "_futures.BaseFuture"?

    @methane
    Copy link
    Member

    methane commented Jul 10, 2016

    Should I send pull request to github.com/python/asyncio?
    Or should I post patch here?

    @1st1
    Copy link
    Member Author

    1st1 commented Jul 10, 2016

    I'm working on this. Some bugs are fixed, but doesn't pass tests for now.

    Thanks a lot! I couldn't find time to finish this myself. I can definitely help you and review the patch once it's ready.

    Yury, could you explain what the comment "This isn't a Future class; it's a BaseFuture" means?

    Unfortunately I don't remember :(

    Should I send pull request to github.com/python/asyncio?
    Or should I post patch here?

    Please post it here. AFAIK we haven't yet transitioned to the GitHub.

    @methane
    Copy link
    Member

    methane commented Jul 10, 2016

    OK. Here is current version.

    @methane
    Copy link
    Member

    methane commented Jul 11, 2016

    In my patch, test_asyncio runs against C version Future.

    I saw how test_json tests against C version and pure Python version.
    But test_asyncio is more larger than test_json.

    Before working on it, could someone give me idea to run whole test_asyncio
    with and without C version Future easily?

    And, which is master repository of asyncio? github? or hg.python.org?
    If github, can I send separated pull request changing test_asyncio after
    this patch is merged?

    @1st1
    Copy link
    Member Author

    1st1 commented Jul 11, 2016

    Before working on it, could someone give me idea to run whole test_asyncio
    with and without C version Future easily?

    asyncio uses loop.create_future() to create sockets. I'd suggest you to create two base test classes: one that monkeypatches loop.create_future to return pure python Future in its setUp method; an another, that makes create_future to return a C version of the Future.

    The derive some unittests from those base classes (which will effectively double the number of tests).

    And, which is master repository of asyncio? github? or hg.python.org?
    If github, can I send separated pull request changing test_asyncio after
    this patch is merged?

    The master repo for asyncio is github, but since the C version won't be a part of asyncio (it will be checked in only in CPython source tree), I think it's fine to continue the work here, on bugs.python.org.

    @methane
    Copy link
    Member

    methane commented Jul 12, 2016

    Thanks. I'll working on test_asyncio in next few days.

    @methane
    Copy link
    Member

    methane commented Jul 12, 2016

    asyncio uses loop.create_future() to create sockets. I'd suggest you to create two base test classes: one that monkeypatches loop.create_future to return pure python Future in its setUp method; an another, that makes create_future to return a C version of the Future.

    windows_events.py has some classes extends futures.Future.
    Task extends Future.
    There are some isinstance(future, futures.Future).

    So monkeypatching baseevent.create_future seems not enough.
    I want a way to completely reload asyncio and test_asyncio packages with and without C future.

    @mpaolini
    Copy link
    Mannequin

    mpaolini mannequin commented Jul 21, 2016

    THe guys developing uvloop say their implementation is already quite fast [1]. Is it worth integrating it?

    [1] https://github.com/MagicStack/uvloop

    @1st1
    Copy link
    Member Author

    1st1 commented Jul 21, 2016

    Yes. Most people will use vanilla asyncio anyways.

    @methane
    Copy link
    Member

    methane commented Aug 9, 2016

    Yury, could you review this before 3.6a4?

    @1st1
    Copy link
    Member Author

    1st1 commented Aug 9, 2016

    Yury, could you review this before 3.6a4?

    Left a couple of comments; the important one -- Future.__await__ (and Future.__iter__) should always return a *new* instance of a generator-like object (tied to the Future object).

    @serhiy-storchaka
    Copy link
    Member

    serhiy-storchaka commented Aug 9, 2016

    See also a discussion on Python-Dev about rewriting contextlib.contextmanager in C: https://mail.python.org/pipermail/python-dev/2016-August/thread.html#145786 .

    What parts of Future are performance critical? Maybe it is worth to implement in C only the most critical code.

    @1st1
    Copy link
    Member Author

    1st1 commented Aug 9, 2016

    What parts of Future are performance critical? Maybe it is worth to implement in C only the most critical code.

    Basically everything. Contrary to @contextmanager, Futures are the building blocks of asyncio, so instantiation + awaiting on them + setting results must be fast.

    To cover instantiation, I want to add a freelist for Futures, so this basically requires them to be implemented in C (and it's not a lot of C code actually).

    @1st1
    Copy link
    Member Author

    1st1 commented Aug 10, 2016

    I'd also think about implementing asyncio.Handle in C (with a freelist).

    @methane
    Copy link
    Member

    methane commented Aug 10, 2016

    Left a couple of comments; the important one -- Future.__await__ (and Future.__iter__) should always return a *new* instance of a generator-like object (tied to the Future object).

    Implementing full behavior of generator seems difficult to me.
    I'll implement minimum implementation in next patch.

    @1st1
    Copy link
    Member Author

    1st1 commented Aug 10, 2016

    Implementing full behavior of generator seems difficult to me.
    I'll implement minimum implementation in next patch.

    Sure, but you have to implement send() and throw().

    @methane
    Copy link
    Member

    methane commented Aug 12, 2016

    Implemented FutureIter

    @methane
    Copy link
    Member

    methane commented Aug 28, 2016

    There are only two weeks until 3.6 beta.
    Yury, could you review this again?

    Or should I implement freelist before review?
    Implementing freelist may be easy, but measuring the effect of freelist from
    realistic application is not easy.

    @gvanrossum
    Copy link
    Member

    gvanrossum commented Sep 10, 2016

    The actual _futures module appears missing from your latest patch -- what's up with that?

    @methane
    Copy link
    Member

    methane commented Sep 11, 2016

    Oh, I'm sorry.
    I usually working on git, and convert git diff to hg diff when posting patch.

    I've used patch -p1 instead of hg import --no-edit to apply git patch into hg workdir.
    I wonder if Rietveld accepts git diff format...

    @gvanrossum
    Copy link
    Member

    gvanrossum commented Sep 11, 2016

    Thanks! I can't review the whole thing, but I patched it in and tried running the asyncio/examples/crawl.py example, like so:

    $ ~/src/cpython36/python.exe examples/crawl.py xkcd.com -q
    Exception RuntimeError('yield was used instead of yield from in task <Task pending coro=<Crawler.fetch() running at examples/crawl.py:778>> with <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /Users/guido/src/cpython36/Lib/asyncio/futures.py:472]>',) for ('xkcd.com', 80)
    ERROR:asyncio:Task exception was never retrieved
    future: <Task finished coro=<Crawler.fetch() done, defined at examples/crawl.py:769> exception=RuntimeError('yield was used instead of yield from in task <Task pending coro=<Crawler.fetch() running at examples/crawl.py:778>> with <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /Users/guido/src/cpython36/Lib/asyncio/futures.py:472]>',)>
    Traceback (most recent call last):
      File "/Users/guido/src/cpython36/Lib/asyncio/tasks.py", line 241, in _step
        result = coro.throw(exc)
      File "examples/crawl.py", line 778, in fetch
        yield from fetcher.fetch()  # Fetcher gonna fetch.
      File "examples/crawl.py", line 507, in fetch
        yield from self.request.connect()
      File "examples/crawl.py", line 315, in connect
        self.port, self.ssl)
      File "examples/crawl.py", line 143, in get_connection
        ipaddrs = yield from self.loop.getaddrinfo(host, port)
    RuntimeError: yield was used instead of yield from in task <Task pending coro=<Crawler.fetch() running at examples/crawl.py:778>> with <Future pending cb=[_chain_future.<locals>._call_check_cancel() at /Users/guido/src/cpython36/Lib/asyncio/futures.py:472]>
    *** Report ***
    http://xkcd.com no response object
    Finished 0 urls in 0.041 secs (max_tasks=100) (0.000 urls/sec/task)
    Todo: 0
    Busy: 1
    Done: 0
    Date: Sat Sep 10 21:50:08 2016 local time
    Traceback (most recent call last):
      File "examples/crawl.py", line 864, in <module>
        main()
      File "examples/crawl.py", line 852, in main
        loop.run_until_complete(crawler.crawl())  # Crawler gonna crawl.
      File "/Users/guido/src/cpython36/Lib/asyncio/base_events.py", line 438, in run_until_complete
        return future.result()
      File "/Users/guido/src/cpython36/Lib/asyncio/tasks.py", line 241, in _step
        result = coro.throw(exc)
      File "examples/crawl.py", line 766, in crawl
        yield from self.termination.wait()
      File "/Users/guido/src/cpython36/Lib/asyncio/locks.py", line 326, in wait
        yield from fut
    RuntimeError: yield was used instead of yield from in task <Task pending coro=<Crawler.crawl() running at examples/crawl.py:766> cb=[_run_until_complete_cb() at /Users/guido/src/cpython36/Lib/asyncio/base_events.py:164]> with <Future pending>

    Without your diff, that works, and the output includes this line:

    Finished 1786 urls in 7.105 secs (max_tasks=100) (2.514 urls/sec/task)

    @methane
    Copy link
    Member

    methane commented Sep 11, 2016

    Sorry, again. fixed.
    Now this passes ./python -m test.test_asyncio

    @gvanrossum
    Copy link
    Member

    gvanrossum commented Sep 12, 2016

    Yury: What do you think of the code? How solid is it? (The issue I found was due to my own very recent changes to _blocking.)

    Ned: Is it better to do this in 3.6b1 or to wait for 3.6b2?

    @1st1
    Copy link
    Member Author

    1st1 commented Sep 12, 2016

    Yury: What do you think of the code? How solid is it? (The issue I found was due to my own very recent changes to _blocking.)

    The code looks fine, I can fix the remaining nits myself. I've left a couple of comments in review.

    Ned: Is it better to do this in 3.6b1 or to wait for 3.6b2?

    TBH it would be way more convenient if we could push this into b2. I can push this on Tuesday without rushing things, and we'll have plenty of time to watch buildbots etc.

    @gvanrossum
    Copy link
    Member

    gvanrossum commented Sep 12, 2016

    Yeah, let's do this in 3.6b2.

    @ned-deily
    Copy link
    Member

    ned-deily commented Sep 12, 2016

    This change touches a lot of files and affect both the unix* and Windows build processes so, yeah, I think it's too risky to go in to b1. Let's get it in as soon as possible after b1.

    @1st1
    Copy link
    Member Author

    1st1 commented Sep 14, 2016

    INADA, would you be able to address my last review comments? Also, I'm wondering what if we could implement __del__ and __repr__ in C too, so that we could drop BaseFuture class?

    @methane
    Copy link
    Member

    methane commented Sep 15, 2016

    I'm working on fixing points you commented. Wait a minute.

    Implementing __del__ and __repr__ in C is bit hard task to me.
    I can't do it in this week. (maybe I can't do it in this month too.)

    On Thu, Sep 15, 2016 at 7:37 AM, Yury Selivanov <report@bugs.python.org> wrote:

    Yury Selivanov added the comment:

    INADA, would you be able to address my last review comments? Also, I'm wondering what if we could implement __del__ and __repr__ in C too, so that we could drop BaseFuture class?

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue26081\>


    @1st1
    Copy link
    Member Author

    1st1 commented Sep 15, 2016

    Implementing __del__ and __repr__ in C is bit hard task to me.
    I can't do it in this week. (maybe I can't do it in this month too.)

    NP. I'll take a look myself after you upload the next iteration of the patch...

    @methane
    Copy link
    Member

    methane commented Sep 15, 2016

    This is the patch.
    And git branch is here methane#5

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 6, 2016

    The most recent patch segfaults... Will try to debug.

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 6, 2016

    INADA, would you be able to take a look?

    @methane
    Copy link
    Member

    methane commented Oct 7, 2016

    FutureIter_throw is wrong, maybe.
    Removing FutureIter_send and FutureIter_throw from FutureIter_methods solves the segv and test passed.

    @methane
    Copy link
    Member

    methane commented Oct 7, 2016

    fixed

    @methane
    Copy link
    Member

    methane commented Oct 7, 2016

    fastfuture3-wip.patch is work in progress implementation of
    implementing __repr__ and __del__ in C.
    I post it to avoid duplicated works.

    Known TODOs:

    • Support overriding Future._repr_info()
    • Fix __del__ is not called (Research how tp_del, tp_finalize, and tp_deallocate works)

    I hope I have enough time to finish in next week, but I'm not sure.

    @methane
    Copy link
    Member

    methane commented Oct 8, 2016

    Fixed overriding Future._repr_info().
    But I failed to implement overridable Future.__del__ in C yet.

    (FYI, fastfuture2.patch passes tests by mix-in __del__ and __repr__)

    @methane
    Copy link
    Member

    methane commented Oct 8, 2016

    Now I understand tp_dealloc, tp_finalize and subtype_dealloc.
    Attached patch passes tests.

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 9, 2016

    I quickly looked over the patch and I think it's good. If anything we still have time to hunt down any bugs or even revert this before 3.6 final.

    INADA, feel free to commit it before Monday to 3.6 and default branches.

    @methane
    Copy link
    Member

    methane commented Oct 9, 2016

    I've committed the patch with trivial fixes (adding curly braces to if statements).
    And I'm sorry, I committed with wrong issue number.

    https://hg.python.org/cpython/rev/678424183b38 (3.6)
    https://hg.python.org/cpython/rev/f8815001a390 (default)

    I fixed NEWS entry already.

    @methane
    Copy link
    Member

    methane commented Oct 9, 2016

    I close this issue for now.
    Further improvements can be new issue.

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 9, 2016

    Thank you, INADA! Next task -- optimize asyncio.Task in C in 3.7. Another 10-15% performance improvement.

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 9, 2016

    I mean another optimization possibility.

    @methane
    Copy link
    Member

    methane commented Oct 9, 2016

    How about changing module name?
    _asyncio_speedup for example.

    @1st1
    Copy link
    Member Author

    1st1 commented Oct 9, 2016

    Yes, I think it's a good idea.

    @1st1
    Copy link
    Member Author

    1st1 commented Nov 8, 2016

    This patch introduced multiple refleaks in test_asyncgen.

    @1st1 1st1 reopened this Nov 8, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 9, 2016

    New changeset 345904bd0456 by Yury Selivanov in branch '3.6':
    Issue bpo-26081: Fix refleak in _asyncio.Future.__iter__().throw.
    https://hg.python.org/cpython/rev/345904bd0456

    New changeset b977775aa07d by Yury Selivanov in branch 'default':
    Merge 3.6 (issue bpo-26081)
    https://hg.python.org/cpython/rev/b977775aa07d

    @1st1 1st1 closed this as completed Nov 9, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 expert-asyncio performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants