This repository has been archived by the owner on Jul 27, 2023. It is now read-only.

Lex Jupyter Magic in assignment value position #30

Merged 5 commits on Jul 24, 2023

@dhruvmanila dhruvmanila commented Jul 20, 2023

Emit a MagicCommand token when it is the assignment value [1], i.e., on the right side of an assignment statement.

Examples:

pwd = !pwd
foo = %timeit a = b
bar = %timeit a % 3
baz = %matplotlib \
        inline

Footnotes

  1. Only % and ! are valid in that position; other magic kinds are not.
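
The footnote's rule can be written down as a small predicate. This is only a sketch: the `MagicPrefix` enum and its method names are hypothetical and are not taken from the lexer in this PR; only the prefix characters themselves come from the description and the tests.

```rust
/// Prefix characters that can start a Jupyter magic command.
/// The variant names are hypothetical, chosen for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MagicPrefix {
    Percent,   // `%`
    Bang,      // `!`
    Slash,     // `/`
    Semicolon, // `;`
    Comma,     // `,`
    Question,  // `?`
}

impl MagicPrefix {
    /// Map a character to a prefix, or `None` if it does not start a magic.
    fn from_char(c: char) -> Option<Self> {
        match c {
            '%' => Some(Self::Percent),
            '!' => Some(Self::Bang),
            '/' => Some(Self::Slash),
            ';' => Some(Self::Semicolon),
            ',' => Some(Self::Comma),
            '?' => Some(Self::Question),
            _ => None,
        }
    }

    /// Only `%` and `!` may start a magic command in assignment value position.
    fn allowed_as_assignment_value(self) -> bool {
        matches!(self, Self::Percent | Self::Bang)
    }
}
```

With this, `foo = /func` is rejected because `/` maps to a prefix that is not allowed on the right side of an assignment.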


@@ -1292,6 +1309,7 @@ where

// Helper function to emit a lexed token to the queue of tokens.
fn emit(&mut self, spanned: Spanned) {
    self.last_emitted = Some(spanned.0.clone());
Member Author

This seems a bit expensive as we'll be cloning every single token for the entire source code. Can we maybe look at the pending token instead?

Member

Yeah, that's very expensive because it even has to clone the heap allocations of the tokens.

Some optimisations:

- Only store the token kind (see TokenKind in ruff-python-ast), which is only a u8.
- Track the context explicitly, as described in https://github.com/mozilla/sweet.js/wiki/design and implemented in https://github.com/swc-project/swc/blob/026101b71e942ad2d7ff906cb68bf46345d77712/crates/swc_ecma_parser/src/lexer/state.rs#L756-L797

In your case it seems all you really care about is whether you've seen an equal token. But it would be nice if the chosen approach could eventually replace the SoftKeywordTokenizer as well.
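
To make the "only store the token kind" suggestion concrete, here is a minimal sketch. All types here (`Token`, `TokenKind`, `Lexer`) are simplified stand-ins invented for illustration, not the real definitions from the lexer or from ruff-python-ast. The point is that `emit` records a one-byte, `Copy` discriminant instead of cloning a token that may own heap data.

```rust
/// A full token that may own heap-allocated data (simplified stand-in).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Name(String),
    Equal,
    Percent,
}

/// A payload-free discriminant that is cheap to copy (simplified stand-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum TokenKind {
    Name,
    Equal,
    Percent,
}

impl Token {
    /// Drop the payload and keep only which kind of token this is.
    fn kind(&self) -> TokenKind {
        match self {
            Token::Name(_) => TokenKind::Name,
            Token::Equal => TokenKind::Equal,
            Token::Percent => TokenKind::Percent,
        }
    }
}

#[derive(Default)]
struct Lexer {
    pending: Vec<Token>,
    last_emitted_kind: Option<TokenKind>,
}

impl Lexer {
    /// Queue a token and remember its kind without cloning the token itself.
    fn emit(&mut self, token: Token) {
        self.last_emitted_kind = Some(token.kind());
        self.pending.push(token);
    }
}
```

Because `TokenKind` is `#[repr(u8)]` and carries no payload, remembering it never touches the `String` inside `Token::Name`.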

Member Author

> In your case it seems all you really care about is whether you've seen an equal token.

Yes, I think having a boolean stating "Was the last token an Equal?" seems like the simplest option. I'll explore a bit around lexer context and state after this as it's pretty interesting :)
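
A rough sketch of the boolean approach, using a toy single-line lexer. Everything here (`Tok`, `lex_line`, the small set of operators it knows) is made up for illustration and is much simpler than the real lexer: it ignores line continuations and bracket nesting, and it treats identifiers and numbers alike as `Word`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Word(String), // identifier or number; the toy lexer does not distinguish
    Equal,
    EqEqual,
    Percent,
    /// `prefix` is `%` or `!`; `value` is the rest of the line.
    MagicCommand { prefix: char, value: String },
    Other(String),
}

/// Lex one line, tracking only "was the last token an `Equal`?".
fn lex_line(line: &str) -> Vec<Tok> {
    let mut tokens = Vec::new();
    let mut last_was_equal = false;
    let mut chars = line.char_indices().peekable();

    while let Some((start, c)) = chars.next() {
        if c.is_whitespace() {
            continue; // whitespace is not a token, so the flag is unchanged
        }
        let tok = match c {
            // Right after `=`, `%` and `!` start a magic command that
            // swallows the remainder of the line.
            '%' | '!' if last_was_equal => {
                let value = line[start + 1..].trim().to_string();
                tokens.push(Tok::MagicCommand { prefix: c, value });
                break;
            }
            '=' if matches!(chars.peek(), Some((_, '='))) => {
                chars.next();
                Tok::EqEqual
            }
            // Two-character operators ending in `=` (`:=`, `+=`, `!=`, ...)
            // are not `Equal`.
            ':' | '+' | '-' | '*' | '/' | '<' | '>' | '!' | '%'
                if matches!(chars.peek(), Some((_, '='))) =>
            {
                chars.next();
                Tok::Other(format!("{}=", c))
            }
            '=' => Tok::Equal,
            '%' => Tok::Percent,
            c if c.is_alphanumeric() || c == '_' => {
                let mut end = start + c.len_utf8();
                while let Some(&(i, next)) = chars.peek() {
                    if next.is_alphanumeric() || next == '_' {
                        end = i + next.len_utf8();
                        chars.next();
                    } else {
                        break;
                    }
                }
                Tok::Word(line[start..end].to_string())
            }
            other => Tok::Other(other.to_string()),
        };
        last_was_equal = tok == Tok::Equal;
        tokens.push(tok);
    }
    tokens
}
```

In this sketch, `bar = %timeit a % 3` yields a single `MagicCommand` after the `Equal`, while `a % 3` still lexes `%` as the `Percent` operator because the flag is not set.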

@dhruvmanila dhruvmanila marked this pull request as ready for review July 20, 2023 05:47
Comment on lines +1687 to +1691
pwd = !pwd
foo = %timeit a = b
bar = %timeit a % 3
baz = %matplotlib \
inline"
Member Author

These are all of the valid positions for a magic command token in an assignment statement.

Comment on lines +1748 to +1758
let source = r"
# Other magic kinds are not valid here (can't test `foo = ?str` because '?' is not a valid token)
foo = /func
foo = ;func
foo = ,func

(foo == %timeit a = b)
(foo := %timeit a = b)
def f(arg=%timeit a = b):
    pass"
.trim();
Member Author

None of these are valid: a parenthesized magic command, a magic command after any token other than Equal, or a nested one.
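
One way to reject all three cases, sketched under the assumption that the lexer tracks bracket nesting in addition to the last token kind. The `Kind` enum, `LexState`, and `state_after` are hypothetical names for this illustration and do not come from the PR's diff.

```rust
/// Token kinds needed for this illustration only (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Name,
    Def,
    Equal,
    EqEqual,
    ColonEqual,
    Lpar,
    Rpar,
}

/// State the lexer would carry between tokens (hypothetical).
#[derive(Debug, Default)]
struct LexState {
    last_kind: Option<Kind>,
    nesting: u32,
}

impl LexState {
    /// Update the state after a token has been emitted.
    fn record(&mut self, kind: Kind) {
        match kind {
            Kind::Lpar => self.nesting += 1,
            Kind::Rpar => self.nesting = self.nesting.saturating_sub(1),
            _ => {}
        }
        self.last_kind = Some(kind);
    }

    /// A magic command may start here only if the prefix is `%` or `!`,
    /// we are not inside brackets, and the previous token was `Equal`.
    fn magic_allowed(&self, prefix: char) -> bool {
        matches!(prefix, '%' | '!')
            && self.nesting == 0
            && self.last_kind == Some(Kind::Equal)
    }
}

/// Build the state reached after lexing the given token kinds.
fn state_after(kinds: &[Kind]) -> LexState {
    let mut state = LexState::default();
    for &kind in kinds {
        state.record(kind);
    }
    state
}
```

Under these assumptions, `def f(arg=%timeit a = b):` is rejected even though the previous token is `Equal`, because the nesting depth is 1 at that point.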

@dhruvmanila dhruvmanila changed the base branch from main to dhruv/trailing-whitespace July 21, 2023 03:37
@dhruvmanila dhruvmanila changed the base branch from dhruv/trailing-whitespace to main July 24, 2023 04:06
@dhruvmanila dhruvmanila merged commit e363fb8 into main Jul 24, 2023
3 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/assign-line-magic branch July 24, 2023 12:14