Interim Transpiler #20
Comments
Cool, and good that you've already thought through some alternatives in the design space. How were you planning to trigger the transpiler? As an import hook, or as a codec? |
My plan was to use an import hook, but I wasn't aware codecs could be used for this purpose, so that could be a good alternative as well. |
See https://peps.python.org/pep-0263/; you can register a codec with an arbitrary name. This would take the place of your "marker import". IIRC there are some issues with getting the codec registered when your package is installed though. |
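For anyone who hasn't seen the codec trick, here is a minimal sketch of registering a codec under an arbitrary name. The codec name `tagstr_demo` and the regex-based `transform` are assumptions made up for illustration (they are not the API of any published package), and the regex only handles single-line, double-quoted f-strings.

```python
import codecs
import re

# Hypothetical stand-in for the transpiler: rewrites `name @ f"..."` into
# `name(f"...")` for simple single-line, double-quoted f-strings only.
TAG_CALL = re.compile(r'(\b\w+) @ (f"[^"\n]*")')


def transform(source: str) -> str:
    return TAG_CALL.sub(r"\1(\2)", source)


def decode(data, errors="strict"):
    # Decode as UTF-8 first, then rewrite the resulting text.
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return transform(text), consumed


def search(name):
    # Called by the codec registry with the normalized encoding name.
    if name != "tagstr_demo":
        return None
    return codecs.CodecInfo(
        name="tagstr_demo",
        encode=codecs.utf_8_encode,
        decode=decode,
    )


codecs.register(search)

decoded = b'print @ f"hello {name}!"\n'.decode("tagstr_demo")
```

This only covers one-shot decoding. As far as I know, a codec meant to be picked up from a `# coding:` line would also need incremental and stream decoders, which are left out here.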
I think a codec could be better for a lot of reasons:
Would the approach be to use a |
Yeah, if the .py file and the .pyc file match, the source is never read, so the codec isn't run. Pure win! I suspect that there are some problems with We used the codec trick at Dropbox for pyxl3, and IIRC the rewrite was done very differently, to ensure that the line numbers matched. I think the "parsing" was probably done with a regular expression. That should work here too. |
What would be the story for tracebacks and getting back to lines in source? |
The traceback code looks up the line number in the untranslated source (linecache.py just opens the file in text mode, no encoding parameter). So pyxl ensures that the translated line numbers match the original line numbers (but the column offsets don't). I think the translation Ryan proposes should be able to preserve line numbers as well. |
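To make the line-number point concrete, here is a small sketch under the assumption that the rewrite is a single-line regex substitution (the `transform` function and the `shout` tag are invented for the example). Because the substitution never adds or removes newlines, the line reported in a traceback from the translated code still points at the right line of the original source.

```python
import re
import traceback

# Hypothetical rewrite: `name @ f"..."` -> `name(f"...")`, single-line only,
# so the number of lines is unchanged.
TAG_CALL = re.compile(r'(\b\w+) @ (f"[^"\n]*")')


def transform(source: str) -> str:
    return TAG_CALL.sub(r"\1(\2)", source)


original = (
    "def shout(text):\n"
    "    return text.upper()\n"
    "\n"
    "x = 1\n"
    'result = shout @ f"value {x / 0}"\n'
)
translated = transform(original)


def failing_line(source: str) -> int:
    # Run the translated code and report the line of the innermost frame.
    try:
        exec(compile(source, "<demo>", "exec"), {})
    except ZeroDivisionError as error:
        return traceback.extract_tb(error.__traceback__)[-1].lineno
    raise AssertionError("expected a ZeroDivisionError")


reported = failing_line(translated)
original_line = original.splitlines()[reported - 1]
```

Column offsets do shift (the rewritten line is one character shorter here), which matches the caveat above.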
Crazy idea, but what if...
my_tag @ f"my {super} {custom} string"
became
my_tag((super, custom), raw=("super", "custom"), conv=(None, None), formatspec=(None, None), strings=("my ", " ", " string"))
Where the
my_tag @ f"""
my
extra {super}
{custom}
string
"""
would become:
my_tag((
super,
custom), raw=("super", "custom"), conv=(None, None), formatspec=(None, None), strings=("my ", " ", " string")) |
Shoot! The evaluation of the expressions would no longer be lazy. |
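A quick sketch of why laziness is lost. The names `eager_tag`, `lazy_tag`, and `expensive` are made up for the illustration. Passing the values in a tuple evaluates every expression before the tag function runs, whereas wrapping each one in a lambda lets the tag decide whether to evaluate it at all.

```python
calls = []


def expensive():
    # Records that the interpolated expression was actually evaluated.
    calls.append("evaluated")
    return 42


def eager_tag(values, strings):
    # Ignores the values, but they were already computed by the caller.
    return "".join(strings)


def lazy_tag(*args):
    # Keeps only the literal string parts; thunks are never called.
    return "".join(a for a in args if isinstance(a, str))


eager_tag((expensive(),), strings=("a", "b"))
eager_calls = len(calls)

lazy_tag("a", (lambda: expensive(), "expensive()", None, None), "b")
lazy_calls = len(calls) - eager_calls
```

In the eager form the call to `expensive()` happens once even though the tag never looks at the value; in the thunk form it never happens.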
The last way I can think of to preserve column offsets in tracebacks is by passing information about the location of expressions in the original source and using that to modify tracebacks which arise within the tag function itself.
my_tag("my ", (lambda: custom, "custom", None, None, 1, 16), " string")
Where the
def tag(func):
    def wrapper(*args):
        new_args = []
        for a in args:
            match a:
                case str():
                    new_args.append(a)
                case getvalue, src, conv, spec, *src_info:
                    # src_info holds the optional (lineno, col_offset) pair
                    getvalue = modify_tracebacks(getvalue, *src_info)
                    new_args.append((getvalue, src, conv, spec))
        # pass the rewrapped thunks on, not the original args
        return func(*new_args)
    return wrapper

def modify_tracebacks(getvalue, lineno=None, col_offset=None):
    if lineno is None or col_offset is None:
        return getvalue
    def wrapper():
        try:
            return getvalue()
        except Exception as error:
            # modify the traceback with the appropriate lineno and col_offset somehow
            raise error.with_traceback(error.__traceback__)
    return wrapper
If this worked, it'd mean that the transpiler wouldn't even need to worry about preserving line numbers in the areas of code it modified. |
IMO it's not worth worrying about column offsets for the initial prototype. |
@rmorshea If there's a way for me to join in with what you're doing and re-parent my stuff on your interim transpiler, let me know. |
Will do. I might have time to create a repo for this tonight, but otherwise I won't be able to do much until next week. |
@rmorshea while I like the syntax, it's problematic as I mention here (#3 (comment)) - we need to preserve thunks because they give the control on interpolation. |
So, I managed to create a custom tagstr encoding. Unfortunately though, this doesn't play well with Black since it decodes the file before reformatting it. Thus, the version that gets saved is the transformed version, not the one the user authored. Anyone have ideas on how this could be avoided? |
Ok, the hack I came up with involves stuffing the original source at the end of the file. The tagstr encoder then searches for the original source and returns that. This solves the problem of It would be nice if there were a way to tell if the codec was running while formatting code so users didn't have to set the environment variable, but this works for now I suppose. |
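Roughly, the idea could be sketched like this. To be clear, the marker comment and the base64 payload are assumptions made for the example, not the format the published package actually uses: the decoder appends the original source as a trailing comment, and the encoder looks for that comment and hands the original back.

```python
import base64

# Hypothetical marker; the real package may store the source differently.
MARKER = "# tagstr-demo-original: "


def embed_original(transformed: str, original: str) -> str:
    # Decode side: keep the authored source as a one-line trailing comment.
    payload = base64.b64encode(original.encode("utf-8")).decode("ascii")
    if not transformed.endswith("\n"):
        transformed += "\n"
    return f"{transformed}{MARKER}{payload}\n"


def recover_original(text: str) -> str:
    # Encode side: search from the end for the stashed source.
    for line in reversed(text.splitlines()):
        if line.startswith(MARKER):
            return base64.b64decode(line[len(MARKER):]).decode("utf-8")
    # Nothing stashed, so write the text back unchanged.
    return text


authored = 'print @ f"hello {name}!"\n'
decoded = embed_original('print(f"hello {name}!")\n', authored)
written_back = recover_original(decoded)
```

Encoding the payload keeps it on a single comment line, so a formatter that preserves comments should leave it intact.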
Wow, the last time I used the encoding hack, things like Black weren't an issue... I guess an import hook might be better. |
Welp, it's published.
The hook expects there to be an
I'll work on adding an IPython cell magic so this can be used in Jupyter Notebooks/Lab. Not really sure if there's a similar way to inject the transformer into the standard Python REPL though. |
I threw this together pretty quickly so there's definitely gonna be some bugs and rough edges. |
Quite interesting @rmorshea any chance you're at PyCon? I'm sprinting the first day. |
Unfortunately I am not. Would love to participate remotely if that's possible. Feel free to email me: ryan.morshead@gmail.com |
I will also be in person at the sprints through Monday afternoon. This will be a chance for me to get back into this work - I have been very busy with other things. Fortunately I feel like discussing another issue has started to page back into my mind what we have been trying to do here 😁 |
Published another release of the
I still feel like I'm doing something wrong in the import hook so I suspect there are probably other latent issues to be fixed. It's also worth noting that the # tagstr: on
name = "world"
print @ f"hello {name}!"
|
Now that @rmorshea has published the transpiler, can this ticket get closed? |
I think so |
I've been wishing I could use tag strings lately and so to satisfy that craving I thought it would be cool to create an import-time transpiler that would rewrite:
To be:
The syntax seems clever in a few different ways:
Something like this seems like a rather natural extension of @pauleveritt's work in viewdom.
Implementation Details
Some potential issues I've thought of and ways to solve them.
Static Typing
To deal with static type analyzers complaining about the unsupported @ operator, tag functions can be decorated with:
An alternate syntax of my_tag(f"...") would not require this typing hack since tag functions must already accept *args: str | Thunk. The main problem here is that there are probably many times where some_function(f"...") would show up in a normal code-base. Thus it would be hard to determine whether any given instance of that syntax ought to be transpiled. Solving this would require users to mark which identifiers should be treated as tags - perhaps with a line like set_tags("my_tag"). This seems more inconvenient than having the tag author add the aforementioned decorator though.
Performance
To avoid transpiling every module, users would need to indicate which ones should be rewritten at import-time. This could be done by having the user import this library somewhere at the top of their module. At import time, the transpiler, before parsing the file, would then scan it for a line like import <this_library> or from <this_library> import ....
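The marker scan could be as cheap as one regex over the raw text. This is only a sketch: `tagstr_demo` stands in for `<this_library>` and `should_transpile` is a hypothetical helper name.

```python
import re

# Matches `import tagstr_demo` or `from tagstr_demo import ...` at the start
# of a line (leading whitespace allowed). The library name is a placeholder.
MARKER = re.compile(
    r"^\s*(?:import\s+tagstr_demo\b|from\s+tagstr_demo\s+import\b)",
    re.MULTILINE,
)


def should_transpile(source: str) -> bool:
    # Cheap pre-check run before any parsing of the module.
    return MARKER.search(source) is not None


marked = 'import tagstr_demo\nprint @ f"hello"\n'
unmarked = 'import os\nprint("hello")\n'
```

A plain text scan like this can be fooled by a matching line inside a string literal or docstring, so it trades some accuracy for not having to parse every module.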