lx lookahead isn't always necessary #112

Open
jameysharp opened this issue Feb 20, 2019 · 0 comments

`lx -l c` doesn't generate exactly the code I'd prefer for accepting states which have no further out-transitions.

For example, my sample language specification in #111 generates four states. In state S2, we've already seen `..`, so if we see a third `.` then we've matched an `$ellipsis` token. Currently, in that case lx generates a transition to a state S3 and continues with a new iteration of the loop, which calls `lx_getc` and then unconditionally passes the result to `lx_ungetc` before returning `TOK_ELLIPSIS`.

But instead of transitioning to a new state and reading then unreading a new character, when S2 matches a third `.`, it could immediately return `TOK_ELLIPSIS`.
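
To make the difference concrete, here is a hand-written sketch of the two shapes. It is not lx's actual output: the `cursor` type, the `get`/`unget` helpers, `TOK_RANGE`, and the function names are all made up for illustration, and I'm assuming a grammar where `..` and `...` are the only tokens.

```c
#include <stdio.h>

/* Hypothetical token set; only TOK_ELLIPSIS is named in this issue. */
enum token { TOK_ERROR, TOK_EOF, TOK_RANGE, TOK_ELLIPSIS };

/* Hypothetical input cursor that counts reads, standing in for lx_getc/lx_ungetc. */
struct cursor { const char *p; int reads; };

static int get(struct cursor *c) {
    c->reads++;
    return *c->p ? (unsigned char)*c->p++ : EOF;
}

static void unget(struct cursor *c, int ch) {
    if (ch != EOF) c->p--;
}

/* Current shape: S3 exists only to read one character and push it back. */
static enum token lex_lookahead(struct cursor *c) {
    enum { S0, S1, S2, S3 } state = S0;
    for (;;) {
        int ch = get(c);
        switch (state) {
        case S0: if (ch == '.') { state = S1; continue; } return ch == EOF ? TOK_EOF : TOK_ERROR;
        case S1: if (ch == '.') { state = S2; continue; } unget(c, ch); return TOK_ERROR;
        case S2: if (ch == '.') { state = S3; continue; } unget(c, ch); return TOK_RANGE;
        case S3: unget(c, ch); return TOK_ELLIPSIS;
        }
    }
}

/* Proposed shape: S2 returns as soon as it sees the third '.'. */
static enum token lex_direct(struct cursor *c) {
    enum { S0, S1, S2 } state = S0;
    for (;;) {
        int ch = get(c);
        switch (state) {
        case S0: if (ch == '.') { state = S1; continue; } return ch == EOF ? TOK_EOF : TOK_ERROR;
        case S1: if (ch == '.') { state = S2; continue; } unget(c, ch); return TOK_ERROR;
        case S2: if (ch == '.') return TOK_ELLIPSIS; unget(c, ch); return TOK_RANGE;
        }
    }
}
```

On the input `...x` both functions return `TOK_ELLIPSIS` and leave the cursor at `x`, but `lex_lookahead` performs four reads where `lex_direct` performs three.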

This applies to any accepting state that has no out-transitions. From such a state, reading any additional character will always trigger an error transition. At that point the state machine must roll back to the most recent accepting state and return that state's token, and we know statically which state that was: the one we are already in.
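
The check itself could be as simple as the sketch below. The `dfa` and `edge` types and the function names are hypothetical and are not lx's or libfsm's internal representation; I'm assuming only that each state has an optional token and a list of outgoing edges.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical edge-list DFA; not the data structure lx actually uses. */
struct edge { int from; char sym; int to; };

struct dfa {
    const int *accept;        /* accept[s] is a token id, or 0 if s is not accepting */
    size_t nstates;
    const struct edge *edges;
    size_t nedges;
};

/* True if any edge leaves state s. */
static bool has_out_edge(const struct dfa *d, int s) {
    for (size_t i = 0; i < d->nedges; i++)
        if (d->edges[i].from == s) return true;
    return false;
}

/* An accepting state with no out-transitions: any further input is an error. */
static bool is_final_accept(const struct dfa *d, int s) {
    return d->accept[s] != 0 && !has_out_edge(d, s);
}

/* Token to return directly when taking edge e, or 0 if the generated
 * code still has to enter the target state and keep reading. */
static int immediate_token(const struct dfa *d, const struct edge *e) {
    return is_final_accept(d, e->to) ? d->accept[e->to] : 0;
}
```

A code generator could call something like `immediate_token` for each edge and emit a plain `return` instead of a state change whenever the result is nonzero; the target state then needs no case of its own.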

This is a minor optimization, but I mostly care because the generated code would be slightly easier to understand if these unnecessary extra states were removed.
