Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized @async_unsafe. #14911

Merged
merged 1 commit into from Sep 30, 2021

Conversation

adamchainz
Copy link
Sponsor Member

Switched the order of the checks to reduce the overhead. Async unsafe methods are normally called syncrhonously, so we can avoid the overhead of checking the environment variable in the regular path.

@adamchainz
Copy link
Sponsor Member Author

Before:

In [1]: from django.utils.asyncio import async_unsafe

In [2]: ai = async_unsafe(int)

In [3]: %timeit ai()
1.37 µs ± 13.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

After:

In [1]: from django.utils.asyncio import async_unsafe

In [2]: ai = async_unsafe(int)

In [3]: %timeit ai()
395 ns ± 1.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Calling int() takes ~50ns, so we can take that from each number, giving ~1300ns before and ~350ns after, ~4 times faster.

A small boost but it multiplies as async_unsafe decorated functions are called several times per query.

Copy link
Contributor

@kezabelle kezabelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm. I never thought fetching from os.environ would be the slow part, especially compared to try/except (which won't be zero-cost until .... 3.11?).

Nevertheless, I can replicate the difference.

In [1]: import os
In [2]: %timeit os.environ.get('DJANGO_ALLOW_ASYNC_UNSAFE') # pssst, it's not set...
1.06 µs ± 2.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Before:

[... prelude]
In [3]: %timeit ai()
1.79 µs ± 23.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

After:

[...prelude]
In [3]: %timeit ai()
454 ns ± 2.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

On that basis, the proposed change makes sense to me, but I'm not well-versed enough in the async story in general to say it's correct (beyond the tests passing), so I'll leave it without approval.

(eg: is there any semantic difference in asking for a loop regardless of the DJANGO_ALLOW_ASYNC_UNSAFE var? is that side-effect free? I'm not qualified to say tbh)

@adamchainz
Copy link
Sponsor Member Author

Good question on the semantics. I think it's side effect free to ask for the running loop, but this is obscured a bit by some thread local and caching logic in https://github.com/python/cpython/blob/main/Modules/_asynciomodule.c

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree get_running_loop should be side-effect free. (Ultimately it just looks up the pointer set in set_event_loop().)

I never thought fetching from os.environ would be the slow part...

Indeed 🧐

Are the import changes related to the optimisation? 🤔

@adamchainz
Copy link
Sponsor Member Author

Are the import changes related to the optimisation? 🤔

They're another micro-optimization to require fewer attribute accesses.

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fine, seems sensible 👍

(I'll process this in a bit.)

@tim-mccurrach
Copy link
Contributor

tim-mccurrach commented Sep 30, 2021

Whilst we're micro-optimising could we use _get_running_loop to shave a few more microseconds off. Then inner would become:

def inner(*args, **kwargs):
    if _get_running_loop() and not os.environ.get('DJANGO_ALLOW_ASYNC_UNSAFE'):
        raise SynchronousOnlyOperation(message)
    return func(*args, **kwargs)

Or would the use of the private function be too subject-to-change for us to use?
Unfortunately, I'm not able to test this for time right now, but I suspect it would make it marginally quicker (famous last words...).

Switched the order of the checks to reduce the overhead. Async unsafe methods
are *normally* called syncrhonously, so we can avoid the overhead of checking
the environment variable in the regular path.
@carltongibson
Copy link
Member

@tim-mccurrach I think that would be a bridge too far.

@tim-mccurrach
Copy link
Contributor

@tim-mccurrach I think that would be a bridge too far.

haha. Fair enough :)

@tim-mccurrach
Copy link
Contributor

@carltongibson
Although having said that, quickly testing just now, it does seem to cut the time in half again:

>>> timeit.repeat("ai()",repeat=7, number=100000, setup="from __main__ import ai")
[0.0292515020000792, 0.023990991999653488, 0.020490372999574902, 0.020217061000039394, 0.0210146330000498, 0.020252889999937906, 0.020329448000211414]
>>> timeit.repeat("ai2()",repeat=7, number=100000, setup="from __main__ import ai2")
[0.05905059799988521, 0.04216317100008382, 0.042790552000042226, 0.04461314699983632, 0.04397789999984525, 0.04326298099977066, 0.04322846100012612]

(ai2 is the function as it stands in this PR).

It's also (perhaps more importantly) more readable (IMO).

But I agree, it may well be a bridge too far 🤷

@carltongibson
Copy link
Member

@tim-mccurrach given that Django 4.1 will support 4 or 5 versions of Python using private APIs is not a good idea.

@tim-mccurrach
Copy link
Contributor

@tim-mccurrach given that Django 4.1 will support 4 or 5 versions of Python using private APIs is not a good idea.

👍 Yes, that seems sensible.

@carltongibson carltongibson merged commit 37d9ea5 into django:main Sep 30, 2021
@adamchainz adamchainz deleted the micro_optimize_async_unsafe branch September 30, 2021 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants