Implement basic Lexer #102
Comments
@plietar I think we should reuse, or at least heavily borrow from your prototype.
The prototype doesn’t really have a lexer, it just feeds the source input into pegmatite. It’s actually quite broken, where things like
Pegmatite doesn't really have a lexer vs parser separation. Everything is a sequence of characters. Rules either refer to expressions of characters or to other rules. Though I strongly recommend against reusing any of the Pegmatite code, I actually like this model. The distinction makes sense in a push model, where the lexer produces a stream of tokens and drives a state machine to produce reductions. Both Clang and ponyc have a pull model (at least, if I understood @sylvanc's explanation of ponyc correctly), where the parser asks the lexer to interpret the next part of the stream as a particular token. This is necessary for any complex grammar (e.g. one with context-dependent keywords). In this model, the 'parser' is really just a set of helpful routines for consuming one or more characters from the stream. It probably is useful to have something like Clang's
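To make the pull model above concrete, here is a minimal sketch in Python. It is not taken from Clang, ponyc, or the Verona prototype: the `PullLexer` class, its methods, and the toy `let NAME` grammar are all hypothetical names invented for illustration. The point is only that the parser decides what kind of token to ask for, so a word like `let` can be a keyword in one position and an ordinary identifier in the next.

```python
import re


class PullLexer:
    """Hypothetical pull-model lexer: the parser asks for a specific token kind."""

    def __init__(self, source):
        self.source = source
        self.pos = 0

    def _skip_whitespace(self):
        while self.pos < len(self.source) and self.source[self.pos].isspace():
            self.pos += 1

    def try_consume(self, pattern):
        """Try to read the upcoming characters as `pattern`.

        Returns the matched text and advances, or returns None and
        leaves the position unchanged (apart from skipped whitespace).
        """
        self._skip_whitespace()
        match = re.compile(pattern).match(self.source, self.pos)
        if match is None:
            return None
        self.pos = match.end()
        return match.group(0)

    def keyword(self, word):
        # Only succeeds if `word` is followed by a word boundary,
        # so "letter" is not read as the keyword "let".
        return self.try_consume(re.escape(word) + r"\b") is not None

    def identifier(self):
        return self.try_consume(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_let(lexer):
    """Parse the toy form `let NAME` (an assumed grammar, for illustration).

    `let` is a keyword only in the leading position; in the name
    position the parser asks for an identifier, so `let` is accepted there.
    """
    if not lexer.keyword("let"):
        return None
    name = lexer.identifier()
    if name is None:
        raise SyntaxError("expected a name after 'let'")
    return ("let", name)
```

With this sketch, `parse_let(PullLexer("let let"))` treats the first `let` as the keyword and the second as a name, which a lexer that classifies tokens up front could not do without extra context from the parser.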
I like the pull model, so would be keen to keep that route.
We just had a long discussion about the grammar, lexing and parsing, and @sylvanc has a prototype that can produce an AST from Verona code, which we want to use instead of writing a whole new one. This is in the interest of a better division of work, as @mjp41 put it, allowing us to work on different levels of the project at the same time. It is not a statement of what we want in the end. The pull model really is a great idea and I think we should pursue it, but that can be done in parallel with other parts of the project if we already have a working AST producer. Sylvan's prototype also allows us to change the grammar on the fly, which is really handy at this stage of development. So, I'll close this ticket for now. Once we have a better idea of how we'll do this, we should create a new one, with more specific tasks.
Who is 'we' in this context?
@davidchisnall sorry for not inviting you. It started out as a ten minute chat about the prefix/infix operators, which snowballed into over an hour.
Tokenise the source in a simple structure. Examine the trade-offs of adding lexical blocks {} as lists of tokens instead of using a simple stream and keeping the context in the parser.
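As a rough illustration of the two shapes being compared, the sketch below tokenises a string into a flat stream and then, as a separate step, folds `{}` blocks into nested lists. The token rules and the `tokenize` and `nest_blocks` names are assumptions made for this example, not the Verona lexer.

```python
import re

# Assumed token rules for illustration: identifiers, integers,
# braces, and runs of other punctuation.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[{}]|[^\s\w{}]+")


def tokenize(source):
    """Flat stream: every token, including braces, in source order."""
    return TOKEN_RE.findall(source)


def nest_blocks(tokens):
    """Nested form: each `{ ... }` becomes a list of its inner tokens."""
    stack = [[]]
    for tok in tokens:
        if tok == "{":
            stack.append([])          # open a new block
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unmatched '}'")
            block = stack.pop()       # close the current block
            stack[-1].append(block)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unmatched '{'")
    return stack[0]
```

In the nested form, brace mismatches are reported before parsing starts and the parser can skip or recurse into a block as a unit. In the flat form the lexer stays simpler and the parser tracks block context itself.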
Acceptance criteria:
Not included: