RFC: libsyntax2.0 #2256
Conversation
The fact that Rust has neither a stable official correct parser nor an official correct grammar is very annoying. I think this is a major step towards making tooling easier to contribute to and more complete/stable.
Note that the RFC explicitly does not propose to create an official grammar. There's https://github.com/nox/rust-rfcs/blob/master/text/1331-grammar-is-canonical.md for that. However, I do hope to produce a comprehensive, progressive test suite for Rust parsers.
eddyb
reviewed
Dec 23, 2017
> and other tools would require massive refactorings.
>
> - Proposed syntax tree requires to keep the original source code
>   available, which might increase memory usage of the
eddyb
Dec 23, 2017
Member
We already do that, and Span refers to the source - we even load source of other crates on-demand (ignoring the contents if their hash has changed).
In general, I'd recommend discussing the existing implementation details of rustc with @rust-lang/compiler.
While this model doesn't use ADTs, as long as those 3 tasks can be performed without much difficulty, the actual representation of the AST doesn't matter much to the compiler itself. My preference would be something auto-generated from an official grammar, such that everything matches the names of rules and whatnot from there. I am biased towards schemes which allow reuse of parse results, e.g. for when a macro or derive uses an input expression / type multiple times.
Manishearth
reviewed
Dec 23, 2017
So, firstly, I was under the impression that
That said, syn doesn't handle comments. But still, having the library live in the crates ecosystem sounds like the best way to deal with this. For one, we don't have to tiptoe around having a typed API, because it's fine to do major version updates. For another, this way it doesn't impact the compiler; I'm concerned that this will lead to performance and readability issues, since we're losing a lot of the typed AST if merged into the compiler. As an "officially maintained crate" this makes sense, but less so as a replacement for libsyntax IMO.
> }
> }
> pub const ERROR: NodeKind = NodeKind(0);
Manishearth
Dec 23, 2017
Member
I feel like we should do this with extensible enums instead. Those don't exist in Rust, but they can be added as an internal-only feature.
matklad
Dec 24, 2017
•
Author
Member
Yeah, one of the open implementation questions is how to store the error payload. There are several workable approaches, but none of them screams "this is the correct one".
However, I do think that there should be only one error node kind in the AST, whose purpose is to contain tokens which are skipped by the parser during error recovery.
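A minimal sketch of this single-ERROR-kind scheme (all names here are illustrative, not the RFC's API): tokens skipped during recovery become children of an `ERROR` node, so no input text is lost.

```rust
// Sketch: a single ERROR node kind whose payload is the raw text of
// the tokens the parser skipped while recovering (names are invented).

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct NodeKind(pub u32);

pub const ERROR: NodeKind = NodeKind(0);
pub const FN_DEF: NodeKind = NodeKind(1);

#[derive(Debug)]
pub struct Node {
    pub kind: NodeKind,
    /// For ERROR nodes: the raw text of the skipped tokens.
    pub text: String,
    pub children: Vec<Node>,
}

impl Node {
    /// Collect the text of all ERROR nodes, i.e. everything the
    /// parser skipped during error recovery.
    pub fn skipped_text(&self) -> String {
        let mut out = String::new();
        if self.kind == ERROR {
            out.push_str(&self.text);
        }
        for child in &self.children {
            out.push_str(&child.skipped_text());
        }
        out
    }
}
```

Because the payload is just text under a single well-known kind, tooling can always reconstruct what the user actually typed.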
cc @mystor
mystor
commented
Dec 23, 2017
For example,
It's likely that at some point the ecosystem will either grow another Rust parser, or syn will grow enough feature flags that it can start to pick up these sorts of use cases. I'm not sure which solution is preferable. In general, I think I support the idea of making a feature-complete alternative to libsyntax outside of the compiler, and I think it fills a different niche than
mystor
commented
Dec 23, 2017
Also, cc @dtolnay if he hasn't been pinged already, as he wrote most of
scottmcm
added
the
T-compiler
label
Dec 24, 2017
I think we could and should generate the typed AST layer, but the parser itself is better hand-written. Both IntelliJ and fall generate the parsing code as well, which is very useful because it makes maintaining the parser relatively low-effort. But generated parsers have suboptimal error reporting, performance and correctness (Rust already has some funny aspects of the grammar which are not natural to express in a declarative way). While hand-writing a parser takes more effort, I think it's a good investment long-term. And given that we already have a hand-written parser, it may actually save time :) One problem with a generated AST and a hand-written parser is that there's no guarantee that they match, but this is not a big problem in practice: such bugs are easy to notice, fix and test for.
By the way, I am still looking for a good term describing such non-Abstract Syntax Trees. So if anyone knows one, please let me know!
Reusing subtrees is also important for incremental reparsing, so I am very interested in allowing this as well. To do this, we should store lengths instead of offsets in syntax tree nodes, and store (lazily calculated) offsets at the file level.
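The length-based scheme could be sketched like this (hypothetical types, not the RFC's API): a node knows only its own width, and an absolute offset is derived by summing the lengths of preceding siblings, so a reused subtree needs no fix-ups when the text before it changes.

```rust
// Sketch: nodes store only their length; absolute offsets are
// derived on demand, which is what makes subtree reuse cheap.

struct Node {
    len: usize,
    children: Vec<Node>,
}

/// Absolute offset of `node.children[idx]`, given the node's own
/// absolute offset: the sum of the lengths of the preceding siblings.
fn child_offset(node: &Node, node_offset: usize, idx: usize) -> usize {
    node_offset + node.children[..idx].iter().map(|c| c.len).sum::<usize>()
}
```

The same subtree placed after a longer prefix simply reports a larger derived offset; nothing inside it is mutated.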
@Manishearth please take a closer look at the Typed Tree section. I claim that it's possible to regain full type safety by layering newtype wrappers on top of the raw syntax tree. It's easier to see this in action, so here's some example code from fall. This is the code for the "add impl" action, which turns this:

```rust
struct Foo<X, Y: Clone> {}
```

into this:

```rust
struct Foo<X, Y: Clone> {}

impl<X, Y: Clone> Foo<X, Y> {
}
```

Note how precisely the function in question characterizes the set of applicable syntactical constructs. It's true though that there are some performance considerations: there's no identifier interning built in, and accessing a child of a particular type is a linear scan of all children. However, this tree should be converted to HIR early on, and it's always possible to create a side table which maps syntax tree nodes to arbitrary cached data, because the node is an integer index.
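To illustrate the claim, here is a hand-rolled sketch with invented names (not fall's actual API): a typed wrapper is a zero-cost newtype over the raw node, `cast` re-establishes type safety at the boundary, and accessors scan the untyped children.

```rust
// Sketch of "typed wrappers over an untyped tree"; identifiers are
// illustrative, not the RFC's.

#[derive(Clone, Copy, PartialEq, Eq)]
enum SyntaxKind { StructDef, Name, Whitespace }

struct SyntaxNode {
    kind: SyntaxKind,
    text: String,
    children: Vec<SyntaxNode>,
}

/// Zero-cost typed view: just a newtype over a raw node reference.
struct StructDef<'a>(&'a SyntaxNode);

impl<'a> StructDef<'a> {
    /// Type safety is regained at the boundary: `cast` checks the kind once.
    fn cast(node: &'a SyntaxNode) -> Option<StructDef<'a>> {
        if node.kind == SyntaxKind::StructDef { Some(StructDef(node)) } else { None }
    }

    /// Accessors do a (linear) scan over the untyped children.
    fn name(&self) -> Option<&'a str> {
        self.0.children.iter()
            .find(|c| c.kind == SyntaxKind::Name)
            .map(|c| c.text.as_str())
    }
}
```

After a successful `cast`, client code works with a fully typed API even though the underlying tree is homogeneous.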
@Manishearth I'd like to reiterate that a "stable libsyntax" is an explicit non-goal of this RFC. The intended clients of the library (if everything goes well, of which there's no certainty) are
I also agree with @mystor that
To expand on this a bit, I think the ideal would be to have two parsers in the compiler.
The command-line compiler would then use the LR parser, and only the LR parser, for compilation. However, if the LR parser fails to parse some file, the compiler would reparse it with libsyntax2.0 to give a nice error report. The IDE stuff would use libsyntax2.0.
@matklad That sounds like a jolly good idea! An added benefit is that you then may define an
Oh, I saw that. I'm saying we don't need that weird intermediate layer if we have extensible enums (or if we keep it out of tree so that it can be versioned independently).
Well, if we're not trying to stabilize it, then why the major changes? Your main point against the current libsyntax is that it's not a pure function -- but this can be fixed in a focused RFC! I think this RFC as currently stated needs a lot more work clarifying what axes we're evaluating things against, and why the new proposal is better along those axes. I also feel like the parser that IDEs and rustfmt need is different from the parser that the compiler needs (IDEs and rustfmt want parsers that are more lax in dealing with errors, and preserve comments), which only further makes me feel that having a separate officially-maintained library, of which rustc is not a client, would help. You yourself put forth two parsers -- only one of these is necessary in the compiler -- why not maintain this outside of tree? TL;DR, before moving forward here I'd like to see:
@Centril this is a little bit off topic, but the property-based testing should proceed in a slightly different manner :) First, note that the proposed tree structure does not really have a
So the better property would be
And to generate arbitrary text, you can take existing Rust code for valid inputs, and for invalid inputs you can take some valid code and cut&paste fragments of it over itself. This is how fall checks that the incremental and full reparsers are equivalent (1, 2).
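As a toy illustration of the kind of property involved (a hand-rolled sketch, not fall's harness): any lossless tokenizer must satisfy `concat(tokens(text)) == text` for valid and invalid inputs alike, because the tree/token stream is defined to preserve every byte.

```rust
// Sketch: a trivially lossless "parser" (it just splits text into
// word / non-word tokens) and the round-trip property over it.

fn tokenize(text: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut cur = String::new();
    let mut cur_is_word = false;
    for ch in text.chars() {
        let is_word = ch.is_alphanumeric() || ch == '_';
        // Flush the current token when the character class changes.
        if !cur.is_empty() && is_word != cur_is_word {
            tokens.push(std::mem::take(&mut cur));
        }
        cur_is_word = is_word;
        cur.push(ch);
    }
    if !cur.is_empty() {
        tokens.push(cur);
    }
    tokens
}

/// The property under test: losslessness of the token stream.
/// It must hold for broken input just as well as for valid input.
fn round_trips(text: &str) -> bool {
    tokenize(text).concat() == text
}
```

A property-based test would feed `round_trips` with real Rust files and with deliberately mangled variants of them.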
Yeah, I totally agree that the motivation section needs more work! My main mistake is that I overemphasized the "pure function" part, while my main point is actually that it's difficult to build great tooling on top of the current lossy libsyntax. However, I don't want to dive too deep into a discussion of why libsyntax is worse for tooling than the proposed solution, simply because libsyntax actually exists while libsyntax2.0 is little more than a pipe dream at the moment. And essentially what this RFC proposes is "let's agree that building a prototype of libsyntax2.0 using the proposed tree structure is a good idea". No harm will be done to the compiler as a part of this RFC :) I am posting this as an RFC, and not as a discussion on IRC or internals, because it's a rather ambitious project anyway, and I would like to gather community consensus.
Right, but it's not clear why the proposed structure is better :) (It would be worth making it very explicit that this RFC does not try to replace libsyntax, just pave a path for something which may eventually replace it, but through a different RFC, since otherwise there will be two separate proposals being discussed in tandem.)
The one question I want answered before building a prototype is whether it is at all feasible to use the proposed implementation in the compiler, or will macros throw a wrench into the works? So I am eagerly waiting for @nrc or @jseyfried to clarify the following points
So, on to the actual discussion of why the proposed structure is better, and what metrics are used to measure this "better" :) The crucial point here is that, to make writing good tooling possible, the syntax tree must be lossless: it must explicitly account for comment nodes and whitespace. It also must be capable of representing partially-parsed code. For example, for
the parser must produce a function call node for
However, I don't think I can back this claim up with something other than "it's obvious that it is supposed to work this way": all the IDE stuff I've written was using lossless trees. Maybe I am wrong and it is actually quite possible to create perfect IDE support on top of a conventional AST. But, if we assume "losslessness" as an axiom, then representing syntax as a generic tree which points to ranges in the original string seems to be the minimal and natural approach. The trick with typed wrappers seems weirder of course, but, in my personal experience, it's a great way to represent ASTs. @Manishearth, I think I don't fully understand your point about extensible enums; I thought it was about making the error specifically an enum like
The problem with an AST is that AST nodes don't naturally form a hierarchy, and you need all of structs, enums and traits to work with them in a type-safe manner. For example, you need a concrete
So here are the axes which differentiate libsyntax and the proposed libsyntax2
I am not entirely sure if the IDE and the compiler should use separate parsers or a single one (some discussion on internals 1). Originally I was of the opinion that their requirements differ too much, and that macros make this whole idea impossible, but now I think that it is in fact feasible to use a single parser. The obvious benefit is of course code reuse. The less obvious benefit is that it's not entirely clear where the compiler ends and the IDE starts, and the syntax tree seems to be one natural boundary. It's useful to share the actual tree data structure between the compiler and the IDE, because the IDE can reparse a file incrementally, while the compiler has to do a full reparse (or it has to learn about text editing). It's also interesting to think about how the compiler/IDE reports errors. For sure the error must be detected by the compiler, but the IDE must be able to suggest a quick fix for it, and quick fixes are all about text editing, file formatting and interacting with the user. So it would be nice if the compiler could collect some context during checks, and then pass this context information to some layer which either prints it to stdout as an error message, or suggests a quick fix in the IDE. It would be nice if this context information could use the language of the syntax tree! Making macro expansion work with the IDE tree also allows some nifty features, like live debugging of macro expansions, for example :)
Nah, this is still useful data. And I see why lossless trees work well here! My preferred way of representing this would be to use a complete AST as much as possible, and have nodes like ItemKind::PartialItem and PartialExpr, which are more like lex trees with some parsed info, for when things are partial. But I can see why that's problematic. Still, worth putting this other proposal up there; I don't know if it's actually better -- you're far more experienced here than I.
It's not really relevant anymore, but I was under the impression you were doing the untyped tree so that you wouldn't have a stability problem -- and that is better tackled by having a typed tree where you are never allowed to have an exhaustive match on an enum: you must have a wildcard branch. Stability was not the actual motivation here, as you clarified, so this is a moot point.
I disagree, but this is pretty subjective anyway. Reading through the RFC again with the motivations in mind, I think it's much clearer why we should do this. I like the design! I still feel like the "full AST where possible" design might be better, but I'm not sure. Might leave more specific comments later.
Manishearth
approved these changes
Dec 25, 2017
> whitespaces and desugar some syntactic constructs in terms of the
> simpler ones.
>
> In contrast, for IDEs it is crucial to have a lossless view of the
Manishearth
Dec 25, 2017
Member
Best to include explicit examples; "lossless view" wasn't clear to me till you explained it in the comments.
So something like:
IDEs need to be able to handle partial code like and still help you with autocompletion while continuing to work with the surrounding, non-partial, code. For this to work we need to be able to represent this losslessly, ....
> FILE
Manishearth
Dec 25, 2017
Member
I'm wondering: How do we figure out the best way to interpret a partial tree? Should we rely on indentation as a hint? Are there well-known solutions here?
matklad
Dec 25, 2017
Author
Member
Here's the array of tricks I am aware of:
- Code is typically typed left-to-right, so it's possible to have "commit points" in grammar/parser. Here's an example from fall:
```
pub rule fn_def {
  'const'? 'unsafe'? linkage?
  'fn' <commit> ident
  type_parameters?
  value_parameters
  ret_type?
  where_clause?
  {block_expr | ';'}
}
```

This `<commit>` means that as soon as the parser sees the `fn` keyword, it produces a function node, even if there are parsing errors after this point. Commit effectively makes the trailing part optional. In other words, one way to deal with partial parses is to treat certain prefixes of parses as parses. These commit points mesh nicely with the LL-ness of the parser: you commit just after the necessary lookahead.
- Code usually has block structure, so it makes sense, when parsing a block expression, to first parse it approximately as a token tree, and then try to parse the internals of the block. That way, parse errors stay isolated to blocks. Of course, you can do this with all kinds of lists and whatnot, as long as you can invent a robust covering grammar. This trick also makes incremental parsers more efficient (changes are isolated to one block, unless the block structure itself changes) and allows one to parse stuff lazily. There are variations of this trick: for example, you can lex string literals as just anything between `"` characters, and then lex the contents of each literal with a second lexer which properly deals with escape sequences.
- When parsing repeated constructs, like `item*`, a useful error-recovery strategy is, after each (partially) parsed item, to skip tokens until a token from `FIRST(item)` appears. (example from fall)
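That last strategy might look like this in a hand-written parser (a toy grammar with an invented FIRST set, not fall's code): unparseable tokens are collected for an error node, and parsing resynchronizes at the next token that can start an item.

```rust
// Sketch of recovery-by-resynchronization for an `item*` loop:
// skip junk until a token from FIRST(item) shows up again.

fn parse_items(tokens: &[&str]) -> (Vec<String>, Vec<String>) {
    // FIRST(item) for this toy grammar: the tokens an item can start with.
    const FIRST: &[&str] = &["fn", "struct"];
    let mut items = Vec::new();
    let mut skipped = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        if FIRST.contains(&tokens[i]) {
            // "Parse" an item: keyword followed by a name.
            let name = tokens.get(i + 1).copied().unwrap_or("");
            items.push(format!("{} {}", tokens[i], name));
            i += 2;
        } else {
            // Error recovery: this token would go into an ERROR node,
            // and we resynchronize at the next FIRST(item) token.
            skipped.push(tokens[i].to_string());
            i += 1;
        }
    }
    (items, skipped)
}
```

Everything between two recognizable items is preserved (as an error payload) rather than silently dropped.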
> field.
>
> ## Typed Tree
Manishearth
Dec 25, 2017
Member
FWIW this is kinda reminiscent of Go's AST, however Go's AST is designed that way more because Go doesn't have ADTs. (I've always found it annoying to work with because of that, but this isn't an inherent problem in this approach)
matklad
Dec 25, 2017
Author
Member
Yeah, in this representation you could use enums and pattern-match, for example, against all kinds of expressions, but you won't be able to destructure structs.
Might be interesting to look at how Roslyn solved a bunch of these same challenges for C# parsing and ASTs. There are probably better resources, but here's one I could find quickly: https://blogs.msdn.microsoft.com/ericlippert/2012/06/08/persistence-facades-and-roslyns-red-green-trees/
A most reasonable question! I've investigated a little how trees are represented in the Dart SDK and in Roslyn. For both Dart and C#, an object-oriented approach is used, where there's a hierarchy of classes corresponding to syntactical constructs, and each class stores its children as typed fields. However, it is possible to view the AST as untyped, because all nodes share a supertype, are linked via
But they use different strategies to represent whitespace and comments. In Roslyn, each token class has two fields for leading and trailing trivia. The Dart one looks weird (or maybe I am reading the wrong thing?): seems like they build an explicit linked list of non-whitespace, non-comment tokens, and additionally attach comments (but not whitespace?) to the following token: https://github.com/dart-lang/sdk/blob/0c48a1163577a1157ea16c00b2fe914c1759357b/pkg/front_end/lib/src/scanner/token.dart#L477
This all makes me less sure that the representation I propose is good, though I still think it's better than the alternatives. One of its worst parts is that accessing a node's child is linear in the number of children, and not constant. However, this can be fixed in a couple of ways: first, you can store the tree in such a way that all children of a node are stored in a contiguous slice, which makes linear time pretty fast. Second, because a node is essentially a file-local index, you can store side tables like
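The side-table idea could look like this (hypothetical names): since a node is just a small file-local index, any expensive derived data can live outside the tree and be computed lazily on first access.

```rust
use std::collections::HashMap;

// Sketch: a node is an integer index, so derived data can live in
// side tables instead of bloating the tree itself.

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(u32);

/// A lazily-populated side table, e.g. mapping nodes to interned
/// symbols or resolved types.
struct SideTable<T> {
    data: HashMap<NodeId, T>,
}

impl<T: Clone> SideTable<T> {
    fn new() -> Self {
        SideTable { data: HashMap::new() }
    }

    /// Compute-on-first-use: subsequent lookups hit the cache.
    fn get_or_compute(&mut self, node: NodeId, compute: impl FnOnce() -> T) -> T {
        self.data.entry(node).or_insert_with(compute).clone()
    }
}
```

Because the key is a plain index, neither the tree nor its nodes need to know the side table exists.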
The compiler itself is already an ECS; this is just pushing stuff down to the parser too.
One more aspect of the internal representation is that the compiler should not need all these whitespaces and comments most of the time. So, at least in theory, it should be possible to parse the source into a lossless tree, then condense the tree into a more compact representation, forgetting various bits of information, and then restore the full tree by reparsing when the need arises. Specifically, I think only the following cases need access to the full syntax tree:
@matklad That compact representation is HIR in the current implementation, which also desugars things like
@eddyb right, the idea is not about "having a compact representation", but about loading and unloading the full representation on demand. Nested
@matklad You have the original
I am tentatively starting to put some code in here: https://github.com/matklad/libsyntax2 :-)
Pzixel
commented
Jan 13, 2018
@matklad I have been using Roslyn for a while as a user, and I can say that it's damn good at these things. You can play with this page to see what tree Roslyn actually generates: https://roslynquoter.azurewebsites.net/ . I think you can learn something from it :)
So, this has been quiet for some time. I'd love to receive more "official" feedback from the @rust-lang/compiler team (an FCP maybe?) to better understand how to reach the goals I want to reach :-) So, here are some core questions about this RFC I would love to hear feedback on:
@matklad d'oh. I've been carrying around a printed copy of this RFC for about 2 weeks, but haven't gotten around to actually reading it and giving feedback. I will make a point to do so!
@matklad ok, so I read this last night. I'm definitely
I was thinking that it might be profitable to discuss these matters "live", at the upcoming Rust All Hands gathering, presuming that many of the stakeholders will be there?
One thing I would like to note: I've been wanting for some time to add a mode to LALRPOP where it generates values of some pre-defined type, much like the trees you define here. The idea would be that you just write a grammar with no actions and we'll build up a tree; we could then layer tree transformers on top of that (a bit like what ANTLR does, iirc). I'd love for this "tree representation" we are discussing here to be an independent standard that we could use for that -- this would in turn allow us to have both hand-written and LALRPOP-generated parsers that are compatible (I'm not sure how much hand-writing buys us compared against LALRPOP's existing error recovery mechanisms, but it's really hard to tell).
I think -- strategically -- it's a mistake to start out with too big a goal. We should probably start by iterating on actual code and putting it to use. Put another way, I feel like replacing libsyntax would be the "final step", not the first one. But it's good to have that goal in mind as we plan our steps, to make sure we're not accidentally building up things with critical flaws.
My understanding from the discussion above was that we gained a lot in terms of custom error messages. For example, would it be possible to do something like rust-lang/rust#48858 with an auto-generated parser? Also, is Rust's grammar even known to be LALR? |
nikomatsakis
referenced this pull request
Mar 29, 2018
Open
Canonical Rust grammar distinct from parser (tracking issue for RFC #1331) #30942
pickfire
reviewed
Apr 15, 2018
> [drawbacks]: #drawbacks
>
> - No harm will be done as long as the new libsyntax exists as an
>   experiemt on crates.io. However, actually using it in the compiler
pickfire
reviewed
Apr 15, 2018
> * It is minimal: it stores small amount of data and has no
>   dependencies. For instance, it does not need compiler's string
>   interner or literal data representation.
rpjohnst
Apr 15, 2018
No, a "string interner" is a data structure that combines all copies of a string into one so they can be shared, for lower memory usage and fast comparison.
pickfire
reviewed
Apr 15, 2018
> new tree.
>
> * A prototype implementation of the macro expansion on top of the new
>   sytnax tree.
Out of curiosity, what were the results of the All Hands meeting on this?
@mark-i-m argh, I should have written it here ages ago, thanks for the ping! The conclusion from the all-hands discussion was that a lot here depends on the actual implementation details, and that it makes sense to experiment with the parse-tree approach. For experimenting, we've decided that it would be more interesting to add a parse-tree mode to LALRPOP (which differs from the "let's write another Rust parser by hand" approach I proposed in the RFC). The current work on LALRPOP is tracked in this issue: lalrpop/lalrpop#354
I've just studied the implementation of Swift's libsyntax, and it has some nice ideas. It is also surprisingly easy to understand.
They use a three-layered representation:
Now, what makes their representation really interesting is that they store child nodes in an array, and not in a linked list (as proposed in the RFC). This allows for an O(1) child access operation:

```cpp
class ClassDeclSyntax final : public DeclSyntax {
public:
  enum Cursor : uint32_t {
    Attributes,
    Modifiers,
    ClassKeyword,
    Identifier,
    GenericParameterClause,
    InheritanceClause,
    GenericWhereClause,
    Members,
  };
};

llvm::Optional<GenericParameterClauseSyntax>
ClassDeclSyntax::getGenericParameterClause() {
  auto ChildData = Data->getChild(Cursor::GenericParameterClause);
  if (!ChildData)
    return llvm::None;
  return GenericParameterClauseSyntax{Root, ChildData.get()};
}
```

What I find suboptimal about the Swift representation is that whitespace is attached directly to tokens, which makes it harder to implement "declaration owns all preceding comments" logic. This can be fixed using the following representation:

```rust
struct Trivia {
    kind: SyntaxKind,
    text: InternedString,
}

type Trivias = SmallVec<Arc<Trivia>>;

struct GreenNode {
    kind: SyntaxKind,
    len: TextUnit,
    leading_trivia: Trivias,
    children: [(Arc<GreenNode>, Trivias)], // DST
}
```

Another bit is that Arc-based
EDIT: a short video overview of libsyntax: https://www.youtube.com/watch?v=5ivuYGxW_3M
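For comparison, the fixed-cursor layout translates to Rust roughly as follows (a sketch mirroring the Swift example above, not actual libsyntax2 code): each child lives at a fixed slot, so access is an O(1) array index, with `None` marking absent optional children.

```rust
// Sketch of Swift-style fixed child slots in Rust; all names are
// illustrative. The cursor enum fixes each child's array position.

#[derive(Clone, Copy)]
enum ClassDeclCursor {
    ClassKeyword = 0,
    Identifier = 1,
    GenericParameterClause = 2,
    Members = 3,
}

struct RawNode {
    text: String,
}

struct ClassDecl {
    // One slot per cursor position; `None` means the child is absent.
    children: [Option<RawNode>; 4],
}

impl ClassDecl {
    /// O(1) child access: no scan over siblings.
    fn child(&self, cursor: ClassDeclCursor) -> Option<&RawNode> {
        self.children[cursor as usize].as_ref()
    }
}
```

The cost is a fixed slot per possible child, even when most optional children are absent; the benefit is constant-time access without a kind check.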
An interesting observation about Swift's tree: each node holds an Arc to the root of the tree and a raw pointer to a particular tree node. This is convenient, because you get owned syntax tree nodes with value semantics and parent pointers. However, that means that read-only traversal of the tree generates atomic refcount increments/decrements, which I expect might cause contention if a single file is processed concurrently. That is,
However, in Rust we can make the strong pointer to the root generic, and use either

```rust
fn as_borrowed<'a>(owned: &'a SyntaxNode<Arc<Root>>) -> SyntaxNode<&'a Root>
```

That is, it is possible at runtime to switch from an owned, Arc'ed version to a
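A compilable sketch of that trick (invented types; only the shape matters): the node is generic over its root pointer, so owned `Arc<Root>` and borrowed `&Root` views share one implementation, and `as_borrowed` involves no atomic refcount traffic.

```rust
use std::ops::Deref;
use std::sync::Arc;

// Sketch: the strong pointer to the root is a generic parameter, so
// the same node type works owned (Arc<Root>) or borrowed (&Root).

struct Root {
    text: String,
}

struct SyntaxNode<R: Deref<Target = Root>> {
    root: R,
    // Position of this node; just a text offset in this sketch.
    offset: usize,
}

impl<R: Deref<Target = Root>> SyntaxNode<R> {
    fn text(&self) -> &str {
        &self.root.text[self.offset..]
    }

    /// Read-only traversal can drop down to `&Root` and avoid
    /// touching the atomic reference count entirely.
    fn as_borrowed(&self) -> SyntaxNode<&Root> {
        SyntaxNode { root: &*self.root, offset: self.offset }
    }
}
```

Long-lived handles keep the `Arc` flavor; hot traversal loops use the borrowed flavor and pay nothing for ownership.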
lnicola
commented
Jul 31, 2018
•
@lnicola yes, I am aware of Roslyn's approach. Swift's libsyntax is indeed a realization of this red/green idea. Their implementation is much easier to read, and it also does not rely on GC.
maxbrunsfeld
commented
Aug 5, 2018
I've just been doing some work on tree-sitter-rust, the incremental Rust parser that Atom will soon ship with (and hopefully Xray will ship with at some point). I had one interesting realization related to macros: for syntax highlighting purposes, and probably other purposes as well, it's desirable, if possible, to parse the contents of token trees as expressions and items, as opposed to leaving them in the unstructured form that
For example, before I added this feature, code like this would not syntax highlight very nicely, because we wouldn't know that

```rust
assert_eq!(a::b::<T, U>(), c.d);
```

Of course, not all token trees have a regular structure like this, so we need to 'fall back' to parsing them in an unstructured way. With
I also might be overthinking this; I'm curious if you have thoughts on this issue.
That is true. Extend selection is next to useless if what looks to a human like an expression is represented as a token tree in the syntax tree. I also agree that non-deterministic parsing of a macro invocation body as an expression or an item is a good approximation, especially with GLR approaches, where you can cheaply try all variants of parsing. However, the proper solution here is indeed two-phase parsing. You need to know the difference between
So, for practical purposes, I would probably have used the following progression:
maxbrunsfeld
commented
Aug 5, 2018
Yeah, multi-phase parsing is definitely useful in general; we currently use it for things like JS in HTML, templating languages like EJS, etc. The hard parts of what you propose seem to be resolving macro calls correctly and interpreting macro definitions.
But I don't think you could do this based on some finite heuristic, because macro patterns can have arbitrary nesting and complexity. For example, in this macro from

```rust
declare_tests! {
    // ...
    test_result {
        Ok::<i32, i32>(0) => &[
            Token::Enum { name: "Result" },
            Token::Str("Ok"),
            Token::I32(0),
        ],
    }
    // ...
}
```

With my current approximate approach based on GLR, I'm able to determine on-the-fly that these inner token trees are expressions, but the outer one is not. How would a heuristic-based system deal with macros like this?
I mean something like "heuristically resolve macro call, and then interpret the macro definition". But this is probably a lot of effort for little gain in comparison to what GLR already gives you. |
Status update: I've implemented (approximately) a Swift-style syntax tree in libsyntax2, in both owned and borrowed variants: https://github.com/matklad/libsyntax2/blob/2fb854ccdae6f1f12b60441e5c3b283bdc81fb0a/src/yellow/syntax.rs
I've also hacked quite a bit on the parser itself, so that it now parses a majority of non-weird Rust constructs. Here's, for example, libsyntax2-based extend selection: https://www.youtube.com/watch?v=21NbnLhj-S4
Centril
referenced this pull request
Oct 8, 2018
Closed
Redesign the libsyntax and librustc APIs to make them more amenable to tooling #643
Centril
added
the
A-syntax
label
Nov 22, 2018
Status update:
matklad commented Dec 23, 2017 • edited
Hi!
This RFC proposes to change the AST and parser used by rustc, so as to create a solid base on top of which great IDE support can be developed. The RFC is largely informed by my experience developing IntelliJ Rust and by my experiments with an IDE-ready syntax tree in Rust in fall. I am not so knowledgeable about the internals of rustc, and especially about the macro expansion machinery, so input from the compiler team would be very valuable!
@rust-lang/compiler @rust-lang/dev-tools @nrc @jseyfried
Rendered