Improve Detection of Scripts that Require Top-Level Await #715

JeffersGlass · 2022-08-23T19:55:37Z

Issue

As noted in #714, testing whether a script needs to run with top-level-await enabled by looking for 'asyncio' in the source code misses some cases where top-level-await is necessary.

Solution

This PR changes the detection method. Instead of looking for the string 'asyncio' in the source, we use ast.Parse() to parse the source file, then walk the syntax tree to determine if any await, await for, or await with statements are used outside of async def functions. (This is precisely the situation that requires code to be compiled with the PyCF_ALLOW_TOP_LEVEL_AWAIT flag, which is what runPythonAsync() achieves.)

Testing

I've created a separate repository for the tests for the included code (TopLevelAsyncFinder and the is_async function). These tests demonstrate that is_async correctly differentiates between the three top-level-await statements inside and outside of async def statements. I'd be happy to include that code in this codebase somewhere, but I wasn't sure how it would fit into the testing regimen.

Timing

Of course, parsing and walking the AST is slowing than just looking for a string in the source, so I did some benchmarking. The gist is: it's slower, yes, but compared to the full pyodide/pyscript loading time, it's negligible. Running on the examples in this repo, the longest time to parse and walk the AST is roughly 6.5 milliseconds for the WebGL example, compared to 2.5 seconds from page-load to first-PyScript-execution in the same circumstances. (Testing for 'asyncio' in the source text takes <10 microseconds.)

I've compiled the test data for all the examples, for the curious.

Efficiency

The AST walking isn't as efficient as it could be - we always walk the entire tree instead of returning as soon as a top-level-await statement is found. This is a possible improvement for down the road.

…cript into async-detection

JeffersGlass · 2022-08-24T17:25:58Z

Given that this intersects with both #659 and #698, I won't be at all offended if this takes a long time to merge, or its work gets subsumed into some other initiative.

fpliger · 2022-08-24T18:21:35Z

Good point @JeffersGlass :) @madhur-tandon , what do you think of merging this soon and you solve conflicts in your PR?

Before we merge though, I wanted to check with @hoodmane @rth since I recall a discussion we had a few weeks ago where I think one of you guys hinted at "you could just always use runPythonAsync (that has been on my mental backlog for some time now :) ).

Independently of that, thank you @JeffersGlass this is great work

hoodmane · 2022-08-24T18:45:31Z

I think one of you guys hinted at "you could just always use runPythonAsync"

Yes. If you are ever checking something and picking either runPython and runPythonAsync it's better to always use runPythonAsync. The only disadvantage of runPythonAsync is that since it's async it can't be used in a synchronous way.

madhur-tandon · 2022-08-25T09:06:48Z

Hi @JeffersGlass Thank you for the PR. Based on the standardising effort for different runtimes (and based on @hoodmane's comment above), we have decided to use runPythonAsync for the Pyodide runtime. As such, the need for determination of asyncio is removed since runPythonAsync is used everywhere underneath now.

But the determination of asyncio in this PR is an optimisation indeed, albeit not being used right now (maybe in future?) and I think we can still merge it, after adding some tests of course (and removing conflicts due to Runtime Refactor being merged yesterday).

JeffersGlass · 2022-08-25T19:44:20Z

Thanks @madhur-tandon and all - makes sense to me, seems always using runPythonAsync provides what this PR was trying to achieve in terms of always correctly executing scripts with top level await, which is swell. I've been trying to put together a blog post on concurrent programming in PyScript (as it currently exists), and, selfishly, not having to explain why # asyncio makes your code work will be wonderful. 👍

I'll resolve the conflicts with the new runtime refactoring and add tests to this PR.

hoodmane · 2022-08-25T19:53:25Z

Even just checking for await<space> would be a lot better than the current situation...

JeffersGlass · 2022-08-25T20:20:12Z

@hoodmane Definitely, though it would miss ugly cases like await(foo()).

hoodmane · 2022-08-25T20:40:56Z

await(foo())

I guess that's valid syntax though formatters will convert it to await (foo()). I once worked briefly with someone who wrote all of their return statements like return(x) which really triggered me, but I was unable to convince them to stop.

antocuni · 2022-08-30T08:42:56Z

FWIW, in case it's needed in the future, another alternative which is better than the current logic but more lightweight than full AST parsing is to use the tokenize module, e.g. something like this:

import io
import tokenize

def is_async(code):
    f = io.BytesIO(code)
    for token in tokenize.tokenize(f.readline):
        if token.type == tokenize.NAME and token.string in ('async', 'await'):
            return True
    return False

assert is_async(b"""
await foo()
""")

assert not is_async(b"""
print("I am not async and I don't use asyncio")
""")

JeffersGlass · 2022-08-31T13:01:10Z

@antocuni That's true, though that will look for any instances of using async/await, whereas the goal of this PR was to look for situations that require top-level await.

For example, the following code uses async/await, but doesn't require compiling with top level await enabled.

import asyncio
from js import console

async def clock():
  count = 0
  while(True):
    console.log(count := count + 1)
    await asyncio.sleep(1)

asyncio.create_task(clock())

However, the following does require top-level await, and would have to be run with pyodide.runPythonAsync or similar:

import asyncio
from js import console

count = 0
while(True):
  console.log(count := count + 1)
  await asyncio.sleep(1)

JeffersGlass · 2022-09-30T20:30:47Z

For what it's worth, this PR is not necessarily dead, but I've been deprioritizing it since it didn't have an immediate impact on the codebase for the 2022.09.1 release.

I do intend to continue (and hopefully merge) it though, since we're still having ongoing conversations about lifecycle, sync/async, and the webloop, and I think this functionality may have some use in there.

tedpatrick · 2022-10-03T09:25:44Z

If we were explicit about denoting a <py-script> element instance as sync or async, would we need this detection?

<py-script mode="sync"></py-script>
<py-script mode="async"></py-script>

JeffersGlass · 2022-10-03T12:23:06Z

@tedpatrick That would depend exactly what the 'sync' and 'async' modes do, which I don't think we're clear on yet, or at least I'm not. Let's get together a proposal around that, and we can revive/merge/kill this PR if the code becomes useful in that endeavor.

JeffersGlass · 2022-11-09T03:58:15Z

The code in this PR has been folded into #928 - with the deprecation of Implicit Coroutine Scheduling/top-level-await, the code developed here is being used to show a useful error message if the user still tries to use Top-Level Await abilities.

Closing in favor of that PR.

JeffersGlass added 12 commits August 18, 2022 14:20

Add ast-based async detection

f451a82

Move async node tuple outside check loop

d637a18

Define is_async function at runtime init

6348f51

Only check ast for top level await if asyncio not in source text

43177ad

Await statements inside async defs are do not require top level await

7ee2660

Add ast-based async detection

691ec36

Move async node tuple outside check loop

faaa388

Define is_async function at runtime init

789d474

Only check ast for top level await if asyncio not in source text

e6b9143

Await statements inside async defs are do not require top level await

4b38b79

Merge branch 'async-detection' of https://github.com/JeffersGlass/pys…

6caa916

…cript into async-detection

Remove asyncio text in source detection

57c059a

antocuni mentioned this pull request Sep 6, 2022

async <py-script> tags are blocking the execution flow #751

Closed

marimeireles added the waiting on feedback Issue or PR waiting on feedback from core team label Sep 12, 2022

JeffersGlass mentioned this pull request Nov 1, 2022

PyodideRuntime.run should not await #879

Closed

JeffersGlass closed this Nov 9, 2022

JeffersGlass added status: superseded PR that has been superseded by another PR and removed waiting on feedback Issue or PR waiting on feedback from core team labels Nov 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Detection of Scripts that Require Top-Level Await #715

Improve Detection of Scripts that Require Top-Level Await #715

JeffersGlass commented Aug 23, 2022 •

edited

Loading

JeffersGlass commented Aug 24, 2022

fpliger commented Aug 24, 2022

hoodmane commented Aug 24, 2022

madhur-tandon commented Aug 25, 2022

JeffersGlass commented Aug 25, 2022

hoodmane commented Aug 25, 2022

JeffersGlass commented Aug 25, 2022 •

edited

Loading

hoodmane commented Aug 25, 2022

antocuni commented Aug 30, 2022

JeffersGlass commented Aug 31, 2022

JeffersGlass commented Sep 30, 2022

tedpatrick commented Oct 3, 2022

JeffersGlass commented Oct 3, 2022

JeffersGlass commented Nov 9, 2022

Improve Detection of Scripts that Require Top-Level Await #715

Improve Detection of Scripts that Require Top-Level Await #715

Conversation

JeffersGlass commented Aug 23, 2022 • edited Loading

Issue

Solution

Testing

Timing

Efficiency

JeffersGlass commented Aug 24, 2022

fpliger commented Aug 24, 2022

hoodmane commented Aug 24, 2022

madhur-tandon commented Aug 25, 2022

JeffersGlass commented Aug 25, 2022

hoodmane commented Aug 25, 2022

JeffersGlass commented Aug 25, 2022 • edited Loading

hoodmane commented Aug 25, 2022

antocuni commented Aug 30, 2022

JeffersGlass commented Aug 31, 2022

JeffersGlass commented Sep 30, 2022

tedpatrick commented Oct 3, 2022

JeffersGlass commented Oct 3, 2022

JeffersGlass commented Nov 9, 2022

JeffersGlass commented Aug 23, 2022 •

edited

Loading

JeffersGlass commented Aug 25, 2022 •

edited

Loading