Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Completion of function calls sometimes fails to include snippet #1785

Closed
HighCommander4 opened this issue Oct 15, 2023 · 8 comments · Fixed by llvm/llvm-project#69153
Closed
Assignees
Labels
bug Something isn't working

Comments

@HighCommander4
Copy link

HighCommander4 commented Oct 15, 2023

In the following code:

void waldo(int arg);

int main()
{
  int val;
  wal^

    // if (true)
  
}

Place the cursor at the location indicated by ^, invoke completion, and select the only result.

Expected result: waldo(int arg) is inserted.

Actual result: waldo is inserted.

The behaviour has something to do with the commented line.

@HighCommander4 HighCommander4 added the bug Something isn't working label Oct 15, 2023
@HighCommander4
Copy link
Author

This branch is being incorrectly taken, because NextTokenKind is incorrectly calculated to be tok::l_paren.

@HighCommander4
Copy link
Author

HighCommander4 commented Oct 15, 2023

NextTokenKind is computed here.

Going into Lexer::findNextToken(), the input Loc is the code completion location, which in the source buffer looks like a null character was inserted in between the source character before the cursor and the source character after the cursor when completion was invoked.

Things go downhill on this line, though:

  Loc = Lexer::getLocForEndOfToken(Loc, 0, SM, LangOpts);

After this, Loc now points to the space character before the ( character in the comment, which explains why the next token from that point onwards is identified as an l_paren.

@HighCommander4
Copy link
Author

HighCommander4 commented Oct 15, 2023

Having a bit of a "how did this ever work" moment here.

Inside getLocForEndOfToken(), we call MeasureTokenLength(), which lexes the next non-whitespace token after the code completion location, which is the tok::comment token // if (true), and returns its length, which is 12...

... but then adds that 12 to the code completion location (which differs from the start of the comment token by the intervening whitespace that was skipped), resulting in a location in the middle of the comment token 🤯

@HighCommander4
Copy link
Author

I guess the way the current code manages to successfully identify cases where the next token after the completion location really is a ( (which is what it's trying to do), is that in those cases, MeasureTokenLength() measures the length of that ( token, which is 1, and then adds that 1 to the code completion location (which points to the null character), resulting in the offset where we actually do want to start lexing for the next token.

It basically works by accident, but the mechanism is pretty badly broken.

@HighCommander4
Copy link
Author

One more thing I noticed while stepping through the lexer code that may be relevant: the lexer's inner loop does have a check for the code completion location, but checking for this requires the lexer to have a Preprocessor, and the temporary lexer created inside Lexer::getRawToken() (called from MeasureTokenLength()) does not have one.

@HighCommander4 HighCommander4 self-assigned this Oct 16, 2023
@HighCommander4
Copy link
Author

My best idea for how to fix this is to have clangd not use Lexer::findNextToken, but instead use code similar to the implementation of findNextToken, but which works for the completion point's SourceLocation.

It would be nicer to get findNextToken itself to handle the completion point, but I don't see how it can, given that the completion point is stored on the Preprocessor, and findNextToken's arguments do not include anything from which you can recover the Preprocessor; we'd have to modify the interfaces of findNextToken, getLocForEndOfToken, and MeasureTokenLength to take a Preprocessor as an optional parameter, or something like that.

@HighCommander4
Copy link
Author

Proposed fix: llvm/llvm-project#69153

@zyn0217
Copy link

zyn0217 commented Oct 16, 2023

Thanks for the fix. Looking around the changed lines, I remembered that once I encountered the case where the "don't add new parens & placeholders if arg list does exist." heuristic doesn't work when completing a setter function call generated by Protobuf. I'd like to have a try on this patch and see if this solves that problem. (I didn't get around to reducing the code snippet since I first ran into it. Looks like it is somehow particular to protobuf because I've never seen this issue for my hand-written functions.)

HighCommander4 added a commit to HighCommander4/llvm-project that referenced this issue Nov 12, 2023
The code was previously using Lexer::findNextToken() which does not
handle being passed the completion point as input.

Fixes clangd/clangd#1785
HighCommander4 added a commit to llvm/llvm-project that referenced this issue Nov 12, 2023
…#69153)

The code was previously using Lexer::findNextToken() which does not
handle being passed the completion point as input.

Fixes clangd/clangd#1785
zahiraam pushed a commit to zahiraam/llvm-project that referenced this issue Nov 20, 2023
…llvm#69153)

The code was previously using Lexer::findNextToken() which does not
handle being passed the completion point as input.

Fixes clangd/clangd#1785
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants