
Basic after-parse tokenization interface #221

Merged
merged 1 commit into from Mar 17, 2023

Conversation

c42f
Member

@c42f c42f commented Mar 16, 2023

Implement a `tokenize()` function which retrieves the tokens *after* parsing.

Going through the parser isn't hugely more expensive than plain tokenization, and allows us to be more precise and complete.

For example it automatically:

  • Determines when contextual keywords are keywords vs. identifiers. For example, the `outer` in `outer = 1` is an identifier, but a keyword in `for outer i = 1:10`.
  • Validates numeric literals (e.g. detecting overflow cases like `10e1000` and flagging them as errors).
  • Splits or combines ambiguous tokens. For example, splitting the `...` in `import ...A` into three separate `.` tokens.
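As a rough sketch of how this interface might be used (assuming `tokenize` is exported from `JuliaSyntax` and that tokens expose their kind via `kind` and their source text via `untokenize`, as in this package's token API — exact names are an assumption here, check the package docs):

```julia
using JuliaSyntax

src = "for outer i = 1:10 end"

# Tokenize by running the full parser, then recovering the token
# stream. Contextual keywords like `outer` should come back with a
# keyword kind here, rather than as plain identifiers.
for tok in JuliaSyntax.tokenize(src)
    println(kind(tok), " => ", repr(untokenize(tok, src)))
end
```

The same call on `"outer = 1"` should instead report `outer` as an identifier, since the parser knows it is not in `for`-loop position.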

@codecov

codecov bot commented Mar 16, 2023

Codecov Report

Merging #221 (654128f) into main (2720980) will decrease coverage by 0.04%.
The diff coverage is 86.20%.

❗ Current head 654128f differs from pull request most recent head 8b71bbd. Consider uploading reports for the commit 8b71bbd to get more accurate results

@@            Coverage Diff             @@
##             main     #221      +/-   ##
==========================================
- Coverage   96.30%   96.27%   -0.04%     
==========================================
  Files          15       15              
  Lines        3869     3888      +19     
==========================================
+ Hits         3726     3743      +17     
- Misses        143      145       +2     
Impacted Files       Coverage Δ
src/JuliaSyntax.jl   100.00% <ø> (ø)
src/tokenize.jl      98.35% <80.00%> (ø)
src/parser_api.jl    89.39% <89.47%> (+0.03%) ⬆️


@c42f c42f merged commit 36909cd into main Mar 17, 2023
20 of 21 checks passed
@c42f c42f deleted the c42f/tokens-API branch March 17, 2023 10:59
c42f added a commit that referenced this pull request Mar 17, 2023