[Syntax] Swift libSyntax API #11320
woot!
I get the reference of your commit message 👀
@felix91gr libSyntax is a library in the Swift compiler that provides a full source-preserving Syntax tree that can be easily transformed and re-printed as a file. You can find more info about the C++ side of libSyntax here: https://github.com/apple/swift/tree/master/lib/Syntax
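To make "source-preserving" concrete, here is a minimal sketch of the idea, not the actual libSyntax types: every token keeps the whitespace and comments around it, so re-printing the tree reproduces the input exactly, and a transformation only changes what it touches. `Token`, `Node`, and `mapTokens` are hypothetical names used only for illustration.

```swift
// Hypothetical, simplified model of a lossless syntax tree (not the real API).
struct Token {
    var leadingTrivia: String   // whitespace/comments before the token text
    var text: String
    var trailingTrivia: String  // whitespace/comments after the token text

    var description: String { return leadingTrivia + text + trailingTrivia }
}

enum Node {
    case token(Token)
    case layout([Node])

    /// Re-prints the subtree by concatenating every token in source order.
    var description: String {
        switch self {
        case .token(let token):
            return token.description
        case .layout(let children):
            return children.map { $0.description }.joined()
        }
    }

    /// Returns a copy of the tree with every token passed through `transform`.
    func mapTokens(_ transform: (Token) -> Token) -> Node {
        switch self {
        case .token(let token):
            return .token(transform(token))
        case .layout(let children):
            return .layout(children.map { $0.mapTokens(transform) })
        }
    }
}

let tree = Node.layout([
    .token(Token(leadingTrivia: "", text: "let", trailingTrivia: " ")),
    .token(Token(leadingTrivia: "", text: "x", trailingTrivia: "  ")),
    .token(Token(leadingTrivia: "", text: "=", trailingTrivia: " ")),
    .token(Token(leadingTrivia: "", text: "1", trailingTrivia: " // one\n")),
])

// Rename `x` while leaving all spacing and the comment untouched.
let renamed = tree.mapTokens { token in
    var copy = token
    if copy.text == "x" { copy.text = "value" }
    return copy
}
```

Printing `tree.description` gives back the original text, including the double space and the trailing comment; `renamed.description` differs only in the identifier.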
That’s awesome! Thank you for explaining 😊
So, would I be able to use this to get an AST from a file, almost like Clang's libTooling?
@tiferrei You'll be able to call
Looks great! Some minor comments/questions inline.
Do we need to assert the validity of the index here?
Technically it'll fail in the stdlib but it wouldn't hurt to add a more descriptive error message here 😄
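A rough sketch of what a more descriptive failure could look like, assuming a hypothetical `ChildStorage` wrapper around an array (the real type in the patch is not shown here): a `precondition` with a message fires before the standard library's generic out-of-range trap.

```swift
// Hypothetical wrapper; only illustrates the descriptive bounds check.
struct ChildStorage {
    private let children: [String]

    init(_ children: [String]) { self.children = children }

    func child(at index: Int) -> String {
        // Without this check, the array subscript below would still trap,
        // but with a generic "Index out of range" message.
        precondition(children.indices.contains(index),
                     "child index \(index) is out of range 0..<\(children.count)")
        return children[index]
    }
}

let storage = ChildStorage(["return", "expr"])
```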
Could we add some doc-comment that this is the access path from the root to a specific node?
👍
Thinking more about this, in our design, should the meaning of the identifier be transparent to the end users? Do you think clients will misuse them?
This should be an implementation detail and not exposed to clients.
That's good. Thanks for explaining!
What does this if statement do?
(this should go in a comment above this if statement)
If the token doesn't have an exact text specified, then the case will have a single associated value of type String. This is for identifier, spacedBinaryOperator, unspacedBinaryOperator, prefixOperator, postfixOperator, integerLiteral, floatingLiteral, and stringLiteral
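As an illustration of the split described above, here is a small sketch with a hypothetical subset of cases: tokens with fixed spelling are plain cases, while tokens whose text varies carry a single `String`. The case names mirror the ones listed in the comment, but the enum itself is an assumption, not the generated code.

```swift
// Hypothetical subset of a token-kind enum.
enum TokenKind {
    // Exact text is known, so no associated value is needed.
    case letKeyword
    case equal
    // No exact text: the spelling is stored as an associated String.
    case identifier(String)
    case integerLiteral(String)

    var text: String {
        switch self {
        case .letKeyword:
            return "let"
        case .equal:
            return "="
        case .identifier(let text), .integerLiteral(let text):
            return text
        }
    }
}
```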
Can we gyb over TriviaPiece to shorten these above two switch statements?
I'm wary of gybing Trivia just because it's a static set of values and we don't anticipate it changing.
hmm, i think gybing reduces logic duplication and avoids some errors introduced by copy-n-paste.
That's fair! I can gyb this, no problem.
You could use a method that takes the key+count if you think it's worth factoring this. I don't think gyb'ing is the right answer for this. Gyb adds a large cognitive burden to any code that uses it, so it has to have a big benefit to be worth it IMO.
For simple logic like this, I don't think gybing will harm readability and it makes sure we cover all cases.
There is no default in this switch, so the language already ensures we cover all cases.
Conceptually, I agree that gyb can be harder to read than actual code; but given the context here, and since we are already using gyb for other enums in this patch, I still prefer that we generate the switch to be consistent. Readability won't suffer because the logic is simple.
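For reference, the non-gyb alternative raised in this thread might look roughly like the sketch below: a small helper takes the text and the count, and the `switch` has no `default`, so the compiler rejects it if a case is added and not handled. `TriviaPiece` here is a hypothetical subset, and the helper name `repeated` is an assumption.

```swift
// Hypothetical subset of trivia pieces, each carrying a repeat count.
enum TriviaPiece {
    case spaces(Int)
    case tabs(Int)
    case newlines(Int)
    case backticks(Int)
}

extension TriviaPiece {
    /// Shared helper so each case is a one-liner instead of duplicated logic.
    private static func repeated(_ text: String, _ count: Int) -> String {
        return String(repeating: text, count: count)
    }

    var text: String {
        // No `default`: adding a case without handling it here is a compile error.
        switch self {
        case .spaces(let count): return TriviaPiece.repeated(" ", count)
        case .tabs(let count): return TriviaPiece.repeated("\t", count)
        case .newlines(let count): return TriviaPiece.repeated("\n", count)
        case .backticks(let count): return TriviaPiece.repeated("`", count)
        }
    }
}
```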
Can we generate these factory methods too?
Similarly, I disagree with gyb'ing this. It's much easier to read as plain source code and the benefit is very small.
Using gyb here doesn't harm readability either. The logic is straightforward.
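To show what is being weighed here, this is a sketch of hand-written factory methods, assuming a hypothetical `Trivia` value that wraps a list of pieces. Each factory is one line, which is the pattern a generator would otherwise stamp out per case.

```swift
// Hypothetical types; the real Trivia in the patch may differ.
enum Piece: Equatable {
    case spaces(Int)
    case newlines(Int)
}

struct Trivia {
    let pieces: [Piece]

    // One factory per piece kind; these are the candidates for generation.
    static func spaces(_ count: Int) -> Trivia {
        return Trivia(pieces: [.spaces(count)])
    }

    static func newlines(_ count: Int) -> Trivia {
        return Trivia(pieces: [.newlines(count)])
    }

    /// Concatenates two trivia values, keeping the pieces in order.
    static func + (lhs: Trivia, rhs: Trivia) -> Trivia {
        return Trivia(pieces: lhs.pieces + rhs.pieces)
    }
}

let indentation = Trivia.newlines(1) + Trivia.spaces(4)
```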
What's the outcome of the validation? Should it throw or return error if validation fails?
This is just debug validation to ensure that the things generated from SyntaxFactory/SyntaxBuilders in the tests hold the right structure. This validation doesn't happen in Release builds because the only way to construct one of these is through the Factory and Builders, and they're meant to guarantee you a valid structure.
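A minimal sketch of debug-only validation, assuming hypothetical names (`RawLayout`, `expectedChildCounts`, `makeReturnStmt`): Swift's `assert` is only evaluated in debug builds, so the check costs nothing in Release, and the factory remains the single place that is trusted to build a valid structure.

```swift
// Hypothetical raw node: a kind plus child slots (nil marks a missing child).
struct RawLayout {
    let kind: String
    let children: [String?]
}

// Hypothetical table of how many child slots each node kind must have.
let expectedChildCounts = ["returnStmt": 2]

func validate(_ layout: RawLayout) {
    // `assert` is checked in debug builds only; optimized builds skip it.
    assert(expectedChildCounts[layout.kind] == layout.children.count,
           "\(layout.kind) has \(layout.children.count) children")
}

/// Factory that always produces the right number of slots for its kind.
func makeReturnStmt(expression: String?) -> RawLayout {
    let layout = RawLayout(kind: "returnStmt", children: ["return", expression])
    validate(layout)
    return layout
}
```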
This sounds like an Equatable conformance. Doesn't it?
Ah! Yeah, I had originally structured this with protocols, and didn't want to enforce Equatable conformance because of the Self requirement, but since this is a class hierarchy I can make this a conformance.
Would this work?
    extension Syntax : Equatable {
      public static func == (lhs: Self, rhs: Self) -> Bool {
        return lhs.data === rhs.data
      }
    }
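A self-contained sketch of identity-based equality on a class hierarchy, using hypothetical stand-in types (`NodeData`, `SyntaxNode`, `TokenNode`). It spells out the base class name in the operator signature instead of `Self`, on the assumption that a non-final class cannot use `Self` in parameter position.

```swift
// Hypothetical stand-in for the shared backing data of a node.
final class NodeData {}

class SyntaxNode: Equatable {
    let data: NodeData

    init(data: NodeData) { self.data = data }

    // Two wrappers are equal when they point at the same underlying data.
    static func == (lhs: SyntaxNode, rhs: SyntaxNode) -> Bool {
        return lhs.data === rhs.data
    }
}

// Subclasses inherit the conformance and compare through the base class.
final class TokenNode: SyntaxNode {}

let shared = NodeData()
let first = SyntaxNode(data: shared)
let second = TokenNode(data: shared)
let unrelated = SyntaxNode(data: NodeData())
```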
Overall looking great, thanks Harlan!
Is this trying to say that .missing does not imply "implicit", but instead might be a required syntax element that was omitted?
Yep, I can clarify the comment a bit.
I'm curious why not a random access collection?
A collection that is not random access would be something like a Stack or a Queue, where you can only work on a specific region of the data structure, in contrast with an Array or a List.
D'oh, edited my comment: I was trying to ask why this is not a random access collection, since it seems like the underlying abstraction would support it. That said, I'm really just curious, no need to change it.
The underlying abstraction totally does support it. It's currently bound to Collection but I'll conform it to RandomAccessCollection instead.
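For illustration, a sketch of what the stronger conformance involves when the storage is array-like. `NodeList` is a hypothetical type with `Int` indices; beyond what `Collection` needs, the main addition is `index(before:)`, and in return callers get constant-time `count`, `last`, and `reversed()`.

```swift
// Hypothetical array-backed collection of child descriptions.
struct NodeList: RandomAccessCollection {
    private let elements: [String]

    init(_ elements: [String]) { self.elements = elements }

    var startIndex: Int { return 0 }
    var endIndex: Int { return elements.count }

    func index(after i: Int) -> Int { return i + 1 }
    func index(before i: Int) -> Int { return i - 1 }

    subscript(position: Int) -> String {
        return elements[position]
    }
}

let list = NodeList(["a", "b", "c"])
```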
Can we end up in a situation where the parent goes away without the syntax data being "invalid" in some sense, or could this be unowned?
parent is optional -- nodes that are the root node have no parent. We can't have an unowned optional
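A small sketch of the ownership shape being discussed, with a hypothetical `TreeNode` class: parents own their children strongly, while the child's back-reference is `weak` and optional, so a root simply has `nil` and a child that outlives its parent sees the reference reset to `nil`.

```swift
// Hypothetical tree node; not the SyntaxData class from the patch.
final class TreeNode {
    let name: String
    private(set) var children: [TreeNode] = []
    // Optional because the root has no parent; weak to avoid a retain cycle.
    private(set) weak var parent: TreeNode?

    init(name: String) { self.name = name }

    func add(_ child: TreeNode) {
        child.parent = self
        children.append(child)
    }
}

/// Builds a parent and child, then returns only the child, so the parent
/// is released when the function returns.
func makeDetachedChild() -> TreeNode {
    let root = TreeNode(name: "root")
    let child = TreeNode(name: "child")
    root.add(child)
    return child
}
```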
Not sure how often you'd have more than one backtick in a row in source code, but I guess the Int is free in this representation 😉
See TextOutputStreamable protocol.
That's absolutely what I was missing here. Thanks!
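A minimal sketch of a `TextOutputStreamable` conformance, using a hypothetical `PrintableToken` type: the value writes its pieces straight into any `TextOutputStream` (a `String` here) instead of building an intermediate string first.

```swift
// Hypothetical token with surrounding trivia stored as plain strings.
struct PrintableToken: TextOutputStreamable {
    let leading: String
    let text: String
    let trailing: String

    // Streams each part directly into the target.
    func write<Target: TextOutputStream>(to target: inout Target) {
        target.write(leading)
        target.write(text)
        target.write(trailing)
    }
}

var output = ""
PrintableToken(leading: " ", text: "foo", trailing: "\n").write(to: &output)
PrintableToken(leading: "", text: "bar", trailing: "").write(to: &output)
```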
Oh, you don't want to do this ;-) (I looked inside AnyIterator and I saw things in there). Writing a special Iterator struct for SyntaxChildren is just a few more lines of code, but avoids an unnecessary allocation.
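A sketch of the suggested shape, with hypothetical names (`ChildrenView` stands in for `SyntaxChildren`): a small nested iterator struct walks the storage by index, so no type-erased `AnyIterator` wrapper is created.

```swift
// Hypothetical sequence of children backed by an array.
struct ChildrenView: Sequence {
    let storage: [String]

    // Dedicated iterator: just the storage and a cursor.
    struct Iterator: IteratorProtocol {
        private let storage: [String]
        private var index = 0

        init(storage: [String]) { self.storage = storage }

        mutating func next() -> String? {
            guard index < storage.count else { return nil }
            defer { index += 1 }
            return storage[index]
        }
    }

    func makeIterator() -> Iterator {
        return Iterator(storage: storage)
    }
}

let view = ChildrenView(storage: ["a", "b", "c"])
```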
Based on the methods in the class itself, it can also conform (I guess) to MutableCollection and RangeReplaceableCollection and potentially acquire a more familiar API for things like removingLast etc...
It can't conform to MutableCollection because the nodes themselves are immutable.
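One way to picture the constraint, as a sketch with a hypothetical `NodeCollection` type: because nodes are immutable, "mutating" operations return a new value and leave the original untouched, which is why the in-place setters required by `MutableCollection` do not fit. The method names below are assumptions.

```swift
// Hypothetical immutable collection; every edit yields a new value.
struct NodeCollection: Equatable {
    let elements: [String]

    func appending(_ element: String) -> NodeCollection {
        return NodeCollection(elements: elements + [element])
    }

    func removingLast() -> NodeCollection {
        return NodeCollection(elements: Array(elements.dropLast()))
    }

    func replacing(childAt index: Int, with element: String) -> NodeCollection {
        var copy = elements
        copy[index] = element
        return NodeCollection(elements: copy)
    }
}

let original = NodeCollection(elements: ["a", "b"])
let longer = original.appending("c")
```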
@moiseev This shouldn't live in
/cc @zisko for the CMake question above.
At the place it is now, the CMake LGTM.
d31a71e to e43a933
@swift-ci please smoke test
@swift-ci please python lint
@swift-ci please smoke test
9c96b21 to 4f019ba
Typo: noe -> node
Thanks!
34faf91 to 19c275b
This patch is an initial implementation of the Swift libSyntax API. It aims to provide all features of the C++ API but exposed to Swift. It currently resides in SwiftExperimental and will likely exist in a molten state for a while.
19c275b to 4a8b360
@swift-ci please test
Build failed
@swift-ci please test
Build failed
@swift-ci please test
Not sure what happened with that last failure.
@nkcsgexi @benlangmuir @moiseev Good to merge?
I've no more comments! LGTM!
Build failed
👍
That "Build failed" comment...doesn't make sense? It actually passed. 🤔
⛴
* Create Swift libSyntax API
  This patch is an initial implementation of the Swift libSyntax API. It aims to provide all features of the C++ API but exposed to Swift. It currently resides in SwiftExperimental and will likely exist in a molten state for a while.
* Only build SwiftSyntax on macOS
This patch is an initial implementation of the Swift libSyntax API. It aims to provide all features of the C++ API but exposed to Swift. It currently resides in tools/ and will likely exist in a molten state for a while.
Tasks remaining:
tools/)