Shared Parser Library #10765

matklad · 2021-11-14T13:27:12Z

So... I've been trying to make a shared rustc/rust-analyzer parser library for the past four years (rust-analyzer was originally intended to be just a parser library), and the results have been meagre -- we share the lexer, and that's it. My theory is that this is due to org stuff -- the parser/AST has a wide API, so extracting it is a whole lot of poorly factorable work. As such, other, more immediate things always tend to get higher priority. But today rust-analyzer feels like it is on a relatively stable footing, so it seems like a good opportunity to try to move the giant ship for real.

Let's see what we need to do to achieve that:

- Have an isolated, IDE-friendly Rust parsing library.
  - separate repo from ra? Or at least a /libs folder?
  - get rid of the source/sink traits (`TokenSource`, `TreeSink`), use a more direct API
- Figure out the best way to integrate with rustc.
  - Tree -> Tree transformation (there was a PR to rustc proving feasibility)
    - need to stabilize & finalize rowan for that
    - perf?
  - Parser -> (Tree1, Tree2)
    - how to emit a typed AST out of an untyped parser?
      - ungrammar
  - Concede that sharing a "nice" library is infeasible, and just hack today's parser to emit a CST via cfg flags
- Do cleanups on the rustc side.
  - harmonize the token tree model (always use split tokens)
  - reduce dependencies on global state (remove code-map from the parser, allow for shared-nothing parallel parsing)
  - how to handle Interpolated tokens?
- Implement the merge
  - ??? and lots of work
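
For the "how to emit a typed AST out of an untyped parser" question, one possible shape is a thin typed layer over the untyped tree: each typed node is a wrapper around an untyped node, and a `cast` function checks the kind. The sketch below is only an illustration of that idea. All names in it (`SyntaxNode`, `FnDef`, `Name`, `example_tree`) are made up for this example and are not the rowan or rust-analyzer API; in practice wrappers like these would presumably be generated from a grammar description rather than written by hand.

```rust
// Illustrative sketch only: every type and function here is hypothetical.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyntaxKind {
    Fn,
    Name,
    FnKw,
    Ident,
}

/// Untyped tree: every node has a kind, optional text, and children.
#[derive(Debug)]
pub struct SyntaxNode {
    pub kind: SyntaxKind,
    pub text: String, // only meaningful for leaf tokens
    pub children: Vec<SyntaxNode>,
}

impl SyntaxNode {
    pub fn leaf(kind: SyntaxKind, text: &str) -> Self {
        SyntaxNode { kind, text: text.to_string(), children: Vec::new() }
    }
    pub fn branch(kind: SyntaxKind, children: Vec<SyntaxNode>) -> Self {
        SyntaxNode { kind, text: String::new(), children }
    }
}

/// Typed views: zero-cost wrappers that only borrow the untyped node.
pub struct FnDef<'a>(&'a SyntaxNode);
pub struct Name<'a>(&'a SyntaxNode);

impl<'a> FnDef<'a> {
    /// Succeeds only if the untyped node has the right kind.
    pub fn cast(node: &'a SyntaxNode) -> Option<Self> {
        (node.kind == SyntaxKind::Fn).then(|| FnDef(node))
    }
    /// First child that casts to `Name`, if any.
    pub fn name(&self) -> Option<Name<'a>> {
        self.0.children.iter().find_map(Name::cast)
    }
}

impl<'a> Name<'a> {
    pub fn cast(node: &'a SyntaxNode) -> Option<Self> {
        (node.kind == SyntaxKind::Name).then(|| Name(node))
    }
    /// Text of the identifier token inside the name node.
    pub fn text(&self) -> Option<&'a str> {
        self.0
            .children
            .iter()
            .find(|c| c.kind == SyntaxKind::Ident)
            .map(|c| c.text.as_str())
    }
}

/// Untyped tree for the fragment `fn foo`.
pub fn example_tree() -> SyntaxNode {
    SyntaxNode::branch(
        SyntaxKind::Fn,
        vec![
            SyntaxNode::leaf(SyntaxKind::FnKw, "fn"),
            SyntaxNode::branch(SyntaxKind::Name, vec![SyntaxNode::leaf(SyntaxKind::Ident, "foo")]),
        ],
    )
}
```

With this layout the parser itself never needs to know about the typed layer: `FnDef::cast(&example_tree())` gives a typed view, and `.name()` walks the untyped children on demand.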

Tasks:


10995: internal: switch from trait-based TokenSource to simple struct of arrays r=matklad a=matklad

cc #10765

The idea here is to simplify the interface as best we can. The original trait-based approach is a bit over-engineered and hard to debug. Here, we replace callbacks with just data. The next PR in the series will replace the output `TreeSink` trait with data as well.

The biggest drawback is that we now have to materialize all of the parser's input up-front. This is a bad fit for macro by example: when you parse `$e:expr`, you might consume only part of the input. However, today's trait-based solution doesn't really help -- we were already materializing the whole thing! So, let's keep it simple!

Co-authored-by: Aleksey Kladov <aleksey.kladov@gmail.com>
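
To make "replace callbacks with just data" a bit more concrete, here is a rough sketch of what a struct-of-arrays parser input could look like. This is not the code from the PR: the type, field, and method names (`Input`, `kinds`, `joint`, `example_input`) are assumptions made for illustration, and the jointness flag is just one plausible attribute to store per token.

```rust
// Illustrative sketch only: every name here is hypothetical,
// not the real rust-analyzer parser API.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyntaxKind {
    Ident,
    Gt,
    Eof,
}

/// Parser input as plain data: one array per token attribute,
/// all indexed by token position and filled in before parsing starts.
#[derive(Debug, Default)]
pub struct Input {
    kinds: Vec<SyntaxKind>,
    /// `joint[i]` is true when token `i` is immediately followed by
    /// token `i + 1` with nothing in between (e.g. the two `>` of `>>`).
    joint: Vec<bool>,
}

impl Input {
    pub fn push(&mut self, kind: SyntaxKind, joint_with_next: bool) {
        self.kinds.push(kind);
        self.joint.push(joint_with_next);
    }

    /// Kind of the token at `idx`; reads past the end yield `Eof`.
    pub fn kind(&self, idx: usize) -> SyntaxKind {
        self.kinds.get(idx).copied().unwrap_or(SyntaxKind::Eof)
    }

    pub fn is_joint(&self, idx: usize) -> bool {
        self.joint.get(idx).copied().unwrap_or(false)
    }

    pub fn len(&self) -> usize {
        self.kinds.len()
    }
}

/// Tokens for `a >> b`, with `>>` kept as two split `>` tokens.
pub fn example_input() -> Input {
    let mut input = Input::default();
    input.push(SyntaxKind::Ident, false);
    input.push(SyntaxKind::Gt, true); // glued to the next `>`
    input.push(SyntaxKind::Gt, false);
    input.push(SyntaxKind::Ident, false);
    input
}
```

Since the input is just a value, it can be printed with `{:?}` or built by hand in a test, which is the kind of debuggability the trait-based version made harder. The cost is the one named above: everything has to be materialized before the parser runs.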

11117: internal: replace TreeSink with a data structure r=matklad a=matklad

The general theme of this is to make the parser a better independent library.

The specific thing we do here is replace the callback-based TreeSink with a data structure. That is, rather than calling user-provided tree construction methods, the parser now spits out a very bare-bones tree, effectively a log of a DFS traversal.

This makes the parser usable without any *specific* tree sink, and allows us to, e.g., move tests into this crate.

Now, it's also true that this is a distinction without a difference, as the old and the new interfaces are equivalent in expressiveness. Still, this new thing seems somewhat simpler. But yeah, I admit I don't have a suuper strong motivation here, just a hunch that this is better.

cc #10765

Co-authored-by: Aleksey Kladov <aleksey.kladov@gmail.com>
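
As a rough illustration of "a log of a DFS traversal", the sketch below models parser output as a flat list of enter/token/exit steps and shows one possible consumer that turns the log into an indented dump. The `Step` enum, `render`, and `example_steps` are hypothetical names for this example, not the types introduced by the PR.

```rust
// Illustrative sketch only: the names below are hypothetical.

/// One entry in the parser's output log.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    /// Start of a composite node of the given kind.
    Enter { kind: &'static str },
    /// A leaf token.
    Token { kind: &'static str, text: String },
    /// End of the most recently entered node.
    Exit,
}

/// One possible consumer: render the log as an indented tree dump.
/// A real consumer could build any tree representation the same way.
pub fn render(steps: &[Step]) -> String {
    let mut out = String::new();
    let mut depth = 0usize;
    for step in steps {
        match step {
            Step::Enter { kind } => {
                out.push_str(&"  ".repeat(depth));
                out.push_str(kind);
                out.push('\n');
                depth += 1;
            }
            Step::Token { kind, text } => {
                out.push_str(&"  ".repeat(depth));
                out.push_str(&format!("{} {:?}\n", kind, text));
            }
            // Saturating so that an unbalanced log does not panic.
            Step::Exit => depth = depth.saturating_sub(1),
        }
    }
    out
}

/// Hand-written log for the expression `1 + 2`.
pub fn example_steps() -> Vec<Step> {
    vec![
        Step::Enter { kind: "BIN_EXPR" },
        Step::Enter { kind: "LITERAL" },
        Step::Token { kind: "INT", text: "1".to_string() },
        Step::Exit,
        Step::Token { kind: "PLUS", text: "+".to_string() },
        Step::Enter { kind: "LITERAL" },
        Step::Token { kind: "INT", text: "2".to_string() },
        Step::Exit,
        Step::Exit,
    ]
}
```

Because the log is plain data, tests can compare it (or its rendered form) directly, without wiring up any particular tree builder.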

…lnicola Try to update parser/event doc

`TokenSource` and `TreeSink` have been refactored as part of #10765 and no longer exist in the code repo. This PR tries to remove them from the event module-level comment to prevent confusion.

matklad mentioned this issue Dec 12, 2021: internal: switch from trait-based TokenSource to simple struct of arrays #10995 (Merged)

matklad mentioned this issue Dec 25, 2021: internal: replace TreeSink with a data structure #11117 (Merged)

oxalica mentioned this issue May 8, 2023: AST library to let nix frontend reusable (for format, lsp, tooling, ... etc) NixOS/nix#8304 (Closed)

Kangaxx-0 mentioned this issue Nov 9, 2023: minor: Try to update parser/event doc #15853 (Merged)
