Auto-generating lexer.rs and parser.rs #17

Closed
tcr opened this issue Jul 9, 2017 · 3 comments

Comments

@tcr
Owner

tcr commented Jul 9, 2017

It's important that parser-c be able to port fixes from upstream (language-c) even though they weren't written in Rust. For the most part, source code changes can be ported over manually, assuming that most patches will be small.

But there are exceptions. lexer.rs and parser.rs are converted from Lexer.x and Parser.y, which are inputs to Haskell-specific lexer and parser generators (Alex and Happy, respectively). Because these generate a large amount of Haskell code, the corresponding changes to the Rust code would also be massive, so porting them over manually is prohibitive.

Corollary will not be capable of a conversion this complex any time soon. (The current lexer.rs and parser.rs were heavily edited by hand.) These are the solutions I came up with instead:

Short term solution: We use Haskell's Happy and Alex libraries to do the code generation, but modify their code generation modules to output Rust instead: ProduceCode.lhs (Happy) and Output.hs (Alex). Inline Haskell code in the source .y and .x files must also be changed to Rust.

The benefit of this setup is that it's simple, and we can keep using it indefinitely. Requiring Haskell for the build step will not mean Haskell is required for consumers of the library—they'll only get the generated Rust code.

Long term solution: After this, I see three obvious choices:

  1. Do nothing, just leave our Happy/Alex Haskell binaries with output==rust as part of the source tree. This introduces a Haskell dependency, but only when these files need to be regenerated.
  2. Pursue a wholly Rust solution. Convert Happy and Alex into Rust libraries, with a mixture of Corollary and manual editing. This is a lot of work to support just a single-use toolchain though.
  3. Convert the parser and lexer source files into an equivalent parser and lexer generator in the Rust ecosystem (think nom, LALRPOP, etc.) This is the dream solution as it allows us to leverage tools inside the Rust community, and allows the source files to be editable by anyone in that community. But on the other hand, we then cannot easily pull updates from upstream when they are made, creating a lot of additional work (and potential for more bugs!) for possibly little gain.

There may be better solutions than any of the above!

@tcr
Owner Author

tcr commented Jul 9, 2017

To get started on the short term solution:

  • Clone both Happy and Alex into a directory, then install them.
  • Write a script that uses Alex to generate a Lexer.hs from the source Lexer.x (in reference/language-c/src/parser) and uses Happy to generate a Parser.hs from the source Parser.y.
  • Rename these files lexer.rs and parser.rs, and have them saved to src/parser/.
  • Modify any Haskell source code in Lexer.x and Parser.y to be the equivalent Rust code that exists in lexer.rs and parser.rs on master.
  • Edit ProduceCode.lhs and Output.hs to produce code similar to what exists on master.
  • Try to build parser-c with the regenerated code; keep repeating the cycle of editing the source files or the generator code until it works.
  • Move the modified Happy and Alex libraries in-tree, then write a top-level script that locally invokes these Haskell libs to generate src/parser/lexer.rs and src/parser/parser.rs.
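
As a rough illustration of the top-level script in the last step, here is a minimal sketch in Rust. It assumes the modified generators are on PATH under their usual names, alex and happy, and that they keep the stock `-o FILE` output flag; the `Step` type and both helper functions are hypothetical names invented for this example, not code from the repository. By default it only prints the planned invocations, and it runs the tools when passed `--run`.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// One generator invocation: which tool reads which spec and writes which file.
/// (Hypothetical helper type, not part of parser-c.)
#[derive(Debug, PartialEq)]
struct Step {
    tool: &'static str,
    input: PathBuf,
    output: PathBuf,
}

/// Alex consumes the `.x` lexer spec; Happy consumes the `.y` grammar.
fn generation_steps(grammar_dir: &Path, out_dir: &Path) -> Vec<Step> {
    vec![
        Step {
            tool: "alex", // assumed binary name of the Rust-emitting fork
            input: grammar_dir.join("Lexer.x"),
            output: out_dir.join("lexer.rs"),
        },
        Step {
            tool: "happy", // assumed binary name of the Rust-emitting fork
            input: grammar_dir.join("Parser.y"),
            output: out_dir.join("parser.rs"),
        },
    ]
}

/// Build the command line `<tool> <input> -o <output>` for one step.
fn to_command(step: &Step) -> Command {
    let mut cmd = Command::new(step.tool);
    cmd.arg(&step.input).arg("-o").arg(&step.output);
    cmd
}

fn main() {
    // Dry run unless `--run` is given, so the plan can be inspected first.
    let run = std::env::args().any(|a| a == "--run");
    let steps = generation_steps(
        Path::new("reference/language-c/src/parser"),
        Path::new("src/parser"),
    );
    for step in &steps {
        println!(
            "{} {} -o {}",
            step.tool,
            step.input.display(),
            step.output.display()
        );
        if run {
            let status = to_command(step)
                .status()
                .expect("failed to start generator");
            assert!(status.success(), "{} exited with {}", step.tool, status);
        }
    }
}
```

Keeping the plan separate from its execution makes it easy to check which file each tool will overwrite before regenerating anything.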

birkenfeld added a commit that referenced this issue Jul 9, 2017
Fork can be found here: github.com:birkenfeld/alex-rust

ref #17
@birkenfeld
Collaborator

Work started with Alex in the "lexer" branch. Now I'll try my luck with Happy.

birkenfeld added a commit that referenced this issue Jul 9, 2017
Fork can be found here: github.com:birkenfeld/alex-rust

ref #17
@birkenfeld
Collaborator

Ok, parser is done in the "parser" branch. Sneaky proc-macro making clones behind my back!

birkenfeld added a commit that referenced this issue Jul 9, 2017
Fork can be found here: github.com:birkenfeld/happy-rust

Fixes #17
tcr pushed a commit that referenced this issue Jul 10, 2017
Fork can be found here: github.com:birkenfeld/alex-rust

ref #17
@tcr tcr closed this as completed in 174d4c5 Jul 10, 2017