Stateful Lexing Support #706

solomon-b opened this issue Feb 17, 2023 · 1 comment

Comments

@solomon-b

The custom lexer page of the Lalrpop tutorial mentions:

> Tokens that require tracking internal lexer state

But it does not say whether Lalrpop supports stateful lexers.

I'm coming from Haskell's Alex lexer, which has a feature called "start codes" for this. You can annotate lexer rules with start codes so that they only apply when the lexer's state matches that start code. You can then push and pop start codes on a stack to set the state of the lexer.

This is really helpful in certain situations, such as lexing string templates. Does Lalrpop support anything like this?
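
To make the start-code idea concrete, here is a minimal hand-written Rust sketch of a lexer driven by a mode stack. It is not LALRPOP's API and not how Alex is implemented: `Mode`, `Token`, `Lexer`, and `lex` are hypothetical names, and the `"... ${name} ..."` template syntax is an assumption chosen for illustration. A lexer fed to a LALRPOP grammar would also have to yield `(start, token, end)` triples; spans are left out here to keep the sketch short.

```rust
/// Lexer states, playing the role of Alex-style start codes (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Code,     // ordinary tokens: identifiers, quotes, closing braces
    Template, // inside a "..." string template
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Token {
    Ident(String),
    TemplateStart, // opening `"`
    TemplateEnd,   // closing `"`
    Text(String),  // literal text inside a template
    InterpStart,   // `${`
    InterpEnd,     // `}`
}

struct Lexer<'a> {
    rest: &'a str,
    modes: Vec<Mode>, // the start-code stack; the last element is the active mode
}

impl<'a> Lexer<'a> {
    fn new(input: &'a str) -> Self {
        Lexer { rest: input, modes: vec![Mode::Code] }
    }

    fn mode(&self) -> Mode {
        *self.modes.last().expect("mode stack is never empty")
    }

    fn advance(&mut self, n: usize) {
        self.rest = &self.rest[n..];
    }
}

impl<'a> Iterator for Lexer<'a> {
    type Item = Result<Token, String>;

    fn next(&mut self) -> Option<Self::Item> {
        match self.mode() {
            Mode::Code => {
                self.rest = self.rest.trim_start();
                let c = self.rest.chars().next()?; // end of input ends the stream
                if c == '"' {
                    self.advance(1);
                    self.modes.push(Mode::Template);
                    Some(Ok(Token::TemplateStart))
                } else if c == '}' && self.modes.len() > 1 {
                    // Only valid inside an interpolation: return to the template.
                    self.advance(1);
                    self.modes.pop();
                    Some(Ok(Token::InterpEnd))
                } else if c.is_alphabetic() {
                    let end = self
                        .rest
                        .find(|ch: char| !ch.is_alphanumeric())
                        .unwrap_or(self.rest.len());
                    let ident = self.rest[..end].to_string();
                    self.advance(end);
                    Some(Ok(Token::Ident(ident)))
                } else {
                    self.advance(c.len_utf8());
                    Some(Err(format!("unexpected character {:?}", c)))
                }
            }
            Mode::Template => {
                if self.rest.is_empty() {
                    // Reset the stack so the next call returns `None`.
                    self.modes.clear();
                    self.modes.push(Mode::Code);
                    return Some(Err("unterminated template".to_string()));
                }
                if self.rest.starts_with('"') {
                    self.advance(1);
                    self.modes.pop();
                    Some(Ok(Token::TemplateEnd))
                } else if self.rest.starts_with("${") {
                    self.advance(2);
                    self.modes.push(Mode::Code);
                    Some(Ok(Token::InterpStart))
                } else {
                    // Literal text runs up to the next `"` or `${`.
                    let end = [self.rest.find('"'), self.rest.find("${")]
                        .iter()
                        .flatten()
                        .copied()
                        .min()
                        .unwrap_or(self.rest.len());
                    let text = self.rest[..end].to_string();
                    self.advance(end);
                    Some(Ok(Token::Text(text)))
                }
            }
        }
    }
}

/// Convenience wrapper: collect all tokens, stopping at the first error.
fn lex(input: &str) -> Result<Vec<Token>, String> {
    Lexer::new(input).collect()
}
```

Calling `lex` on `greet "hi ${name}!"` yields `Ident("greet")`, `TemplateStart`, `Text("hi ")`, `InterpStart`, `Ident("name")`, `InterpEnd`, `Text("!")`, `TemplateEnd`. The same `}` character is a token only while an interpolation is on the stack, which is exactly the kind of rule a start code gates.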

@Marwes
Contributor

Marwes commented Feb 17, 2023

See #195. LALRPOP does not support this right now. I prototyped a feature for this in #673, but I no longer plan to finish it.

nwalfield added a commit to nwalfield/lalrpop that referenced this issue Jun 26, 2024
In some languages, it is necessary for the parser to control the
lexer's mode.  This is possible, but non-obvious.

This commit adds a simple example that shows how to parse the "list of
length-value" language in which values are encoded as a length,
followed by a colon, and then the literal number of bytes, e.g.,
`2:hi`.  To parse this, the parser needs to tell the lexer to return
the next `n` literal bytes when it sees the length prefix.

See lalrpop#802, lalrpop#706.
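
For readers who want the gist without opening the commit, the following is a rough Rust sketch of the technique described above. It is not the example added to the repository: it uses no LALRPOP-generated code at all, the hand-written `parse_values` function merely stands in for the parser, and `Token`, `Mode`, `Lexer`, and `set_raw` are hypothetical names. The point it illustrates is that the consumer of the tokens switches the lexer into a raw mode once it knows the length.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Token<'a> {
    Length(usize),
    Colon,
    Bytes(&'a [u8]), // exactly `n` literal bytes, requested by the parser
}

enum Mode {
    Normal,     // lex digits and ':'
    Raw(usize), // return the next `n` bytes verbatim
}

struct Lexer<'a> {
    input: &'a [u8],
    pos: usize,
    mode: Mode,
}

impl<'a> Lexer<'a> {
    fn new(input: &'a [u8]) -> Self {
        Lexer { input, pos: 0, mode: Mode::Normal }
    }

    /// Called by the parser: the next token is exactly `n` literal bytes.
    fn set_raw(&mut self, n: usize) {
        self.mode = Mode::Raw(n);
    }

    fn next_token(&mut self) -> Option<Result<Token<'a>, String>> {
        // Raw mode applies to a single token, so reset to Normal while reading it.
        match std::mem::replace(&mut self.mode, Mode::Normal) {
            Mode::Raw(n) => {
                let end = match self.pos.checked_add(n) {
                    Some(end) if end <= self.input.len() => end,
                    _ => return Some(Err(format!("expected {} bytes, input too short", n))),
                };
                let bytes = &self.input[self.pos..end];
                self.pos = end;
                Some(Ok(Token::Bytes(bytes)))
            }
            Mode::Normal => {
                let &b = self.input.get(self.pos)?; // end of input
                if b == b':' {
                    self.pos += 1;
                    Some(Ok(Token::Colon))
                } else if b.is_ascii_digit() {
                    let start = self.pos;
                    while self.input.get(self.pos).map_or(false, |b| b.is_ascii_digit()) {
                        self.pos += 1;
                    }
                    let digits = std::str::from_utf8(&self.input[start..self.pos])
                        .expect("ASCII digits are valid UTF-8");
                    match digits.parse::<usize>() {
                        Ok(n) => Some(Ok(Token::Length(n))),
                        Err(e) => Some(Err(e.to_string())),
                    }
                } else {
                    self.pos += 1;
                    Some(Err(format!("unexpected byte {:#04x}", b)))
                }
            }
        }
    }
}

/// Stand-in for the parser: recognises `(length ':' bytes)*`.
fn parse_values(input: &[u8]) -> Result<Vec<&[u8]>, String> {
    let mut lexer = Lexer::new(input);
    let mut values = Vec::new();
    while let Some(tok) = lexer.next_token() {
        let n = match tok? {
            Token::Length(n) => n,
            other => return Err(format!("expected a length, found {:?}", other)),
        };
        match lexer.next_token() {
            Some(Ok(Token::Colon)) => {}
            other => return Err(format!("expected ':', found {:?}", other)),
        }
        // The parser now knows the length, so it switches the lexer's mode.
        lexer.set_raw(n);
        match lexer.next_token() {
            Some(Ok(Token::Bytes(bytes))) => values.push(bytes),
            other => return Err(format!("expected {} bytes, found {:?}", n, other)),
        }
    }
    Ok(values)
}
```

With this sketch, `parse_values(b"2:hi3:a:b")` returns the two values `hi` and `a:b`. The second value contains a colon, which a lexer without the raw mode would have split into separate tokens.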
github-merge-queue bot pushed a commit that referenced this issue Jul 8, 2024
…916)

* Add an example showing how the parser can control the lexer's mode.

In some languages, it is necessary for the parser to control the
lexer's mode.  This is possible, but non-obvious.

This commit adds a simple example that shows how to parse the "list of
length-value" language in which values are encoded as a length,
followed by a colon, and then the literal number of bytes, e.g.,
`2:hi`.  To parse this, the parser needs to tell the lexer to return
the next `n` literal bytes when it sees the length prefix.

See #802, #706.

* run clippy --fix

* remove explicit into_iter

---------

Co-authored-by: Patrick LaFontaine <32135464+Pat-Lafon@users.noreply.github.com>