Add filename argument to runPython and use linecache for better tracebacks #3993

hoodmane · 2023-07-12T18:57:36Z

As suggested by @antocuni.

Add a CHANGELOG entry
Add / update tests
Add new / update outdated documentation

…backs This adds a filename argument to runPython and runPythonAsync. Also, in eval_code and eval_code_async if the filename is not wrapped in <lt-gt-signs> it will put the source into the linecache so that the tracebacks can show the lines in them.

src/py/_pyodide/_base.py

antocuni

LGTM and it's a BIG improvement over the current situation 🎉.
I have two extra comments, though:

it would be nice to have "good tracebacks by default" for all the people randomly using runPython() without knowing that they need to specify a filename. You could use the same approach used by the good old py.code.Source.compile(): invent a name and stick the lines into linecache. This is the relevant code:
https://github.com/pytest-dev/py/blob/6b219734fcd8d2f6489c2f50585a435b34c029c2/py/_code/source.py#L160-L198

Example of usage (note the filename <0-codegen ...> -- maybe you could use something like <0-runPython ...> or similar):

import py
src = py.code.Source("""
def foo():
    0/0
""")
d = {}
exec(src.compile(), d)
d['foo']()

$ python /tmp/foo.py 
Traceback (most recent call last):
  File "/tmp/foo.py", line 8, in <module>
    d['foo']()
  File "<0-codegen /tmp/foo.py:7>", line 3, in foo
ZeroDivisionError: division by zero

I know that this makes the logic harder so maybe it's worth a separate PR, but I think that for pyscript we want the ability of specifying the same filename with different line ranges: e.g.:

<script type="py"> ... </script>
...
<script type="py"> ... </script>

Ideally, we want to use runPython(..., {filename: "index.html", lineno: 10}) for the first, and runPython(..., {filename: "index.html", lineno: 20}) for the second, I think?

src/py/_pyodide/_base.py

ryanking13

Thanks, @hoodmane! It generally looks good to me.

I also generally agree with Antonio's review (thanks for the review by the way). I think this PR is okay as-is, then we can add more fine-grained control of tracebacks in a separate PRs.

hoodmane · 2023-07-14T00:29:48Z

@antocuni Thanks for the suggestions. I agree we should address these in followups. I was trying to start with something really simple. =)

I think if we want to use something wrapped in <> for the temporary names, we'll need to patch out these two lines of linecache:
https://github.com/python/cpython/blob/main/Lib/linecache.py#L88-L89
Though I have nothing against making a patch like that.

antocuni · 2023-07-14T08:23:11Z

I think if we want to use something wrapped in <> for the temporary names, we'll need to patch out these two lines of linecache:

uh? No, you can just set linecache.cache directly, and bypass updatecache() entirely:

import linecache

def set_linecache_lines(filename, src):
    lines = src.splitlines()
    linecache.cache[filename] = (1, None, lines, filename)

def magic_compile(filename, src):
    codeobj = compile(src, filename, 'exec')
    set_linecache_lines(filename, src)
    return codeobj

src = """
def foo():
    0/0
"""
codeobj = magic_compile('<0-runPython>', src)
d = {}
exec(codeobj, d)
d['foo']()

hoodmane · 2023-07-14T15:16:10Z

You can but if you put a print statement you'll see that the function is never called.

hoodmane · 2023-07-14T15:19:43Z

Unless I was messing something up, let me try your code...

antocuni · 2023-07-14T15:23:12Z

You can but if you put a print statement you'll see that the function is never called.

which function?
Note that I'm not using the "lazy cache" functionality here, I'm just putting lines inside the cache, bypassing all the rest :)

hoodmane · 2023-07-14T15:25:20Z

Ah that makes sense.

hoodmane · 2023-07-14T15:26:00Z

Will update this PR to use that approach soon, thanks @antocuni.

hoodmane · 2023-07-16T04:00:48Z

Two possible problems:

I don't know that we want to store all strings that we "runPython" by default. Probably this should be opt in.
If filenames are reused, this will cause incorrect tracebacks

A zany scheme to solve 2: by experiment, if a character is unprintable then it is left out from the filename in the traceback. Every character from 0x10 to 0x20 is unprintable. So we could use:
"<eval>0x11", "<eval>0x12", ... "<eval>0x1F", "<eval>0x110x10", etc.

Where the pattern is on the nth time we use a certain file name we encode n in hexadecimal and then append to the string one byte of value 0x1v for each hex digit of n. You'll probably hate me for suggesting this though @antocuni.

hoodmane · 2023-07-16T04:04:35Z

So explicitly the code would be for the nth use of filename:

garbage =  bytes(16 + int(b, 16) for b in hex(num_uses_so_far)[2:]).decode()
filename_with_appended_garbage = filename + garbage

ryanking13 · 2023-07-17T12:30:06Z

I don't know that we want to store all strings that we "runPython" by default. Probably this should be opt in.

Makes sense to me.

If filenames are reused, this will cause incorrect tracebacks

IMO, this is not a big problem. I think people who use this option should know what they are doing and should not reuse the filename.

A zany scheme to solve 2: by experiment, if a character is unprintable then it is left out from the filename in the traceback. Every character from 0x10 to 0x20 is unprintable. So we could use:
"0x11", "0x12", ... "0x1F", "0x110x10", etc.

no... 😫

hoodmane commented Jul 12, 2023

View reviewed changes

src/py/_pyodide/_base.py Outdated Show resolved Hide resolved

antocuni reviewed Jul 13, 2023

View reviewed changes

src/py/_pyodide/_base.py Outdated Show resolved Hide resolved

ryanking13 approved these changes Jul 13, 2023

View reviewed changes

Apply antocuni's suggestion

904719c

hoodmane added 2 commits July 13, 2023 17:30

Update changelog

3704604

Merge branch 'main' into filename

847898b

Merge branch 'main' into filename

bb6598c

hoodmane merged commit 311fa90 into pyodide:main Aug 28, 2023
5 of 8 checks passed

hoodmane deleted the filename branch August 28, 2023 11:18

antocuni mentioned this pull request Sep 13, 2023

Use the new pyodide feature for better tracebacks, as soon as it's available pyscript/polyscript#36

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add filename argument to runPython and use linecache for better tracebacks #3993

Add filename argument to runPython and use linecache for better tracebacks #3993

hoodmane commented Jul 12, 2023 •

edited

antocuni left a comment •

edited

ryanking13 left a comment

hoodmane commented Jul 14, 2023

antocuni commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 14, 2023

antocuni commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 16, 2023

hoodmane commented Jul 16, 2023

ryanking13 commented Jul 17, 2023 •

edited

Add filename argument to runPython and use linecache for better tracebacks #3993

Add filename argument to runPython and use linecache for better tracebacks #3993

Conversation

hoodmane commented Jul 12, 2023 • edited

antocuni left a comment • edited

Choose a reason for hiding this comment

ryanking13 left a comment

Choose a reason for hiding this comment

hoodmane commented Jul 14, 2023

antocuni commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 14, 2023

antocuni commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 14, 2023

hoodmane commented Jul 16, 2023

hoodmane commented Jul 16, 2023

ryanking13 commented Jul 17, 2023 • edited

hoodmane commented Jul 12, 2023 •

edited

antocuni left a comment •

edited

ryanking13 commented Jul 17, 2023 •

edited