Skip to content

Convert lit-based syntax classification test to XCTest #1953

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 3, 2023

Conversation

StevenWong12
Copy link
Contributor

@StevenWong12 StevenWong12 commented Jul 27, 2023

We should use the assertClassification to make these test cases look nicer and get rid of the dependency on FileCheck eventually.

I converted a subset of the original lit-tests since I think most of them are mutually covered. There are some tests not producing the expected results and I wrote down the reasons in XCTExpectFailure. Maybe we should determine whether changing the test cases or marking these as bugs to fix.

Comment on lines 297 to 303
XCTExpectFailure(
"""
'@available' is classified as @ and a typeIdentifier
'iOS' is classified as a typeIdentifier
'8.0' and '10.10' are classified as integerLiteral
"""
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is expected by the way the syntax tree gets represented because by now we always model the name of an attribute as a type. Also the lines between builtin attributes (like @availablable) with completely custom syntax, user-defined attributes that are types (like result builders and property wrappers) and user-defined types that aren’t types (attached macros) are blurring. So, what I would expect is available to be classified as an attribute.

You should be able to achieve that by changing \AttributeSyntax.attributeName to return (.attribute, false) in SyntaxClassification.classify(_ keyPath:)

Same for testAttribute2 where objc and IBOutlet should be classified as attribute.

Copy link
Contributor Author

@StevenWong12 StevenWong12 Jul 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to achieve that by changing \AttributeSyntax.attributeName to return (.attribute, false)

Do you mean we changing this to (.attribute, true)? Since \AttributeSyntax.attributeName already returns (.attribute, false) now.

If so, we may need to make some changes on handleLayout (as my new commits) since \AttributeSyntax.attributeName is not a TokenSyntax and the second return value of SyntaxClassification.classify(_ keyPath:) only has effects in handleToken.

But I can't find a proper workaround for classifying the trivia pieces in a layout node for now though.


On a second thought, what do you think about refactor ClassificationVisitor as a subclass of SyntaxAnyVisitor like ParseDiagnosticsGenerator. That should make this case handled easier. And the 8.0 and 10.10 can be specified as floatingLiteral rather that some integerLiterals in a version tuple.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean we changing this to (.attribute, true)? Since \AttributeSyntax.attributeName already returns (.attribute, false) now.

If so, we may need to make some changes on handleLayout (as my new commits) since \AttributeSyntax.attributeName is not a TokenSyntax and the second return value of SyntaxClassification.classify(_ keyPath:) only has effects in handleToken.

Yes, that’s what I meant. But I didn’t realize it only had an effect on tokens 😢

On a second thought, what do you think about refactor ClassificationVisitor as a subclass of SyntaxAnyVisitor like ParseDiagnosticsGenerator. That should make this case handled easier. And the 8.0 and 10.10 can be specified as floatingLiteral rather that some integerLiterals in a version tuple.

I think that would be a good idea and should make the classifier easier to maintain. If the SyntaxAnyVisitor-based implementation is significantly slower than the current one, we should look into why that is and see if we can make SyntaxAnyVisitor faster.

In any case, I would prefer to do this in a follow-up PR. We can just keep the test case disabled for now. One note though: XCTExpectFailure is not supported on Linux IIRC, so you would need to do try XCTSkipIf(true, <message>)

Comment on lines 205 to 232

if let classification, classification.force == true {
let range = SyntaxClassifiedRange(
kind: classification.classification,
range: ByteSourceRange(offset: byteOffset, length: child.byteLength - child.leadingTriviaByteLength - child.trailingTriviaByteLength)
)
report(range: range)
byteOffset += child.byteLength
continue
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, did you find a solution to #1953 (comment) after all? Or doesn’t this work?

Copy link
Contributor Author

@StevenWong12 StevenWong12 Jul 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could be a solution if we don't refactorClassificationVisitor, and that works.😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that’s a fine solution. Good idea 👍🏽

@StevenWong12 StevenWong12 force-pushed the convert_coloring_lit_test branch 2 times, most recently from 6f9c15b to 9d7c597 Compare July 30, 2023 15:39
@StevenWong12 StevenWong12 marked this pull request as ready for review July 30, 2023 15:40
@StevenWong12 StevenWong12 force-pushed the convert_coloring_lit_test branch from 9d7c597 to 54d3de5 Compare July 30, 2023 16:01
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Your idea with forcing the classification of an entire subtree sounds like a good solution for now. We would still be classifying the comment in the following as an attribute but that’s something we can fix in a rewrite of SyntaxClassifier.

@MyType /* abc */ . NestedResultBuilder
func foo() {}

Comment on lines 207 to 208
// Leading trivia.
report(range: SyntaxClassifiedRange(kind: .none, range: ByteSourceRange(offset: byteOffset, length: child.leadingTriviaByteLength)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also classify the trivia for doc comments (both leading and trailing). A test case to test that would be.

@MyAttribute // some comment
func foo() {}

To do that, I think you should factor out the following and call it to classify the trivia.

https://github.com/apple/swift-syntax/blob/e04c5c117bdf5a92ddc309ac4ce8e33fe302db41/Sources/SwiftIDEUtils/SyntaxClassifier.swift#L168-L173

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. But we also have to add two public properties leadingTriviaPieces and trailingTriviaPieces in RawSyntax to get the trivia pieces as in my new commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought we could have a function to classify the trivia pieces so we don’t need to repeat this loop everywhere.

Something like

/// Classifies `trivia` and returns the number of bytes the trivia took up in the source
func classify(trivia: [RawTriviaPiece]) -> Int {
  var classifiedBytes = 0
  for triviaPiece in triviaPieces {
    let range = triviaPiece.classify(offset: byteOffset)
    report(range: range)
    classifiedBytes += triviaPiece.byteLength
  }
  return classifiedBytes
}

And then here you can just do

if let triviaPieces = child.leadingTriviaPieces {
  byteOffset += classify(trivia: triviaPieces)
}

And you can also use the function in handleToken.

My thinking is that repeating an implementation twice is still OK but as soon as we reach 3 or more copies, we should really factor it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that looks nice. 👍

Comment on lines 205 to 232

if let classification, classification.force == true {
let range = SyntaxClassifiedRange(
kind: classification.classification,
range: ByteSourceRange(offset: byteOffset, length: child.byteLength - child.leadingTriviaByteLength - child.trailingTriviaByteLength)
)
report(range: range)
byteOffset += child.byteLength
continue
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that’s a fine solution. Good idea 👍🏽

@StevenWong12 StevenWong12 force-pushed the convert_coloring_lit_test branch from 54d3de5 to ad6358a Compare August 1, 2023 11:29
@StevenWong12 StevenWong12 force-pushed the convert_coloring_lit_test branch from ad6358a to 462b69b Compare August 2, 2023 16:41
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍🏽 Thanks

@ahoppen
Copy link
Member

ahoppen commented Aug 2, 2023

@swift-ci Please test

@ahoppen
Copy link
Member

ahoppen commented Aug 2, 2023

This breaks sourcekit-lsp because you are removing SyntaxClassification.objectLiteral. Could you create a corresponding sourcekit-lsp PR?

@StevenWong12
Copy link
Contributor Author

Sure, here is the link swiftlang/sourcekit-lsp#788

@ahoppen
Copy link
Member

ahoppen commented Aug 3, 2023

swiftlang/sourcekit-lsp#788

@swift-ci Please test

@ahoppen ahoppen merged commit 1cd6c22 into swiftlang:main Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants