Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve autocompletion by looking on the type and name #3954

Merged
merged 16 commits into from Apr 23, 2020

Conversation

bnjjj
Copy link
Contributor

@bnjjj bnjjj commented Apr 11, 2020

This tweet (https://twitter.com/tjholowaychuk/status/1248918374731714560) gaves me the idea to implement that in rust-analyzer.

Basically for this first example I made some examples when we are in a function call definition. I look on the parameter list to prioritize autocompletions for the same types and if it's the same type + the same name then it's displayed first in the completion list.

So here is a draft, first step to open a discussion and know what you think about the implementation. It works (cf tests) but maybe I can make a better implementation at some places. Be careful the code needs some refactoring to be better and concise.

PS: It was lot of fun writing this haha

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj force-pushed the master branch 2 times, most recently from 6087c27 to f62fe8d Compare April 11, 2020 21:29
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@lnicola
Copy link
Member

lnicola commented Apr 12, 2020

I wonder if we could make a MatchInfo enum or struct and put these heuristics in its PartialEq implementation, since it looks like the comparer might end up unwieldy.

@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 12, 2020

Yes definitely we can find a better way to handle this sort. I also thought about a type which could implement PartialEq. An enum for each different cases (in a function call, in a struct literal expression, ...) with all useful data inside. For exemple the call_info and fields in this first case.

@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 12, 2020

I made a little change, I moved all the sort directly in the Completions struct. Then I added a sort_option field to Completions. I'm questioning about the possibility to have several sort options. Could it be possible or not ? To me it makes no sense but maybe I'm missing something

…entation, include sort in Completions struct

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@lnicola
Copy link
Member

lnicola commented Apr 12, 2020

Just for the record, I was thinking of something like:

#[derive(PartialEq)]
enum CompletionMatchType {
    Prefix,
    Infix, // nMat for CompletionMatchType
    Substring, // CMT for CompletionMatchType
}

enum CompletionItemType {
    Fn(FnDef),
    Field(FieldDef),
    Macro(MacroDef),
    // ...
}

struct CompletionItem {
    ref: CompletionItemType,
    type: CompletionMatchType,
    is_type_match: bool,
    // ...
}

impl Ord for CompletionItem {
    // heuristics here
}

@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 14, 2020

Oh yes, I can implement PartialEq for CompletionItem and move all my code inside the current sort function inside this implementation. But about the CompletionMatchType I'm not sure it's useful. Do you have a use case in mind ? I mean another use case than a prefix match.

@lnicola
Copy link
Member

lnicola commented Apr 14, 2020

Do you have a use case in mind ? I mean another use case than a prefix match.

Well, I'm not sure if it's the best approach, but we might want to show completions for all of those cases: prefix, literal substring (sequence) and non-contiguous substrings. Most IDEs can suggest CompletionMatchType if you type CMT, for example. Having that enum might be an easy way to support these in the future because we could just include it in the sorting key. Of course, it might be even better to keep track of "frecency" (frequency + recency) and include more scoring heuristics like the web browsers do, but that seems much harder to implement.

Anyway, it's fine to not add it now. It might not even fit well with the rest of the code.

PS: Ord, not PartialEq.

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 14, 2020

Ok I experimented a little bit. I was not very satisfied with the implementation of Ord on CompletionItem because I need to have data coming from context (call_info for example). And I'm not very confident to put this data multiple times in each CompletionItem. It's a lot of duplicate informations IMHO.

BTW as @matklad told me on Zulip, in fact even if we sort our completions in LSP server the client doesn't care about it. So I added an additional data called sort_text to sort completion given the current index in the vector.

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 15, 2020

I just added the support also for RecordField in a struct literal expression.

@matklad
Copy link
Member

matklad commented Apr 16, 2020

Has anyone found any docs about how exactly sortText and filterText interact? Am I correct that in the current PR we completely ignore the textual match of identifiers?

Implementation wise, I feel relying on the order here would be a wrong abstraction. A sorted completion list tells you that there's an unambiguous linear order from worst to best, but that's not actually true in practice. We, for example, can't generally compare unrelated completions, like a reference completion vs keyword completion.

I think the right abstraction is to add a

struct CompltionScore {
   ...
}

to each completion field, where score is something with PartialOrd. For example, it can be an enum only same variants of which are comparable. In the current impl, that would be just

struct CompltionScore {
    is_precise_type_match: bool
}

but it than could be extended with all kinds of heuristics. Then, it becomes the job of conv to convert those scores to a representation, suitable for the client.

How to do that given the current constraints of LSP is, of course, an option question (especially, how to marry fuzz-matching scores with semantic scores). The simplest first step seems to just mark items with fully matched types as preselect: true?

@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 16, 2020

About your first question to be honest I just checked the GoPLS codebase https://github.com/golang/tools/blob/master/internal/lsp/completion.go#L111

Ok then for the first iteration I will make a CompletionScore struct like this:

struct CompletionScore {
    is_precise_type_match: bool, // Only when it's the same type
    is_precise_type_name_match: bool, // Only when it's a perfect match and in that case I put a preselect: true
}

And then I will try to find some documentations about what the consequence of using sortText with filterText. Seems good for you ?

@matklad
Copy link
Member

matklad commented Apr 16, 2020 via email

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 16, 2020

Ok I made some changes with your advices. And here is screenshot in vscode to compare between current situation in RA and new situation with this PR.

Current:
Capture d’écran 2020-04-16 à 18 28 54

New:
Capture d’écran 2020-04-16 à 18 27 52

In fact I only reorder completions with a score. If not it keeps the same behavior as before (seems to be an alphabetical sort by default).

self.score_option = Some(score_option);
}

pub(crate) fn compute_score(&mut self, ctx: &CompletionContext) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to delay the computation of scores until the end of completion process? A simpler approach seems to be to add a fully-computed score to CompletionItem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I will move this part in completion_dot

CompletionScore::TypeAndNameMatch => res.preselect = Some(true),
CompletionScore::TypeMatch => {}
}
res.sort_text = Some(format!("{:02}", ctx.2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather start with just setting preselect and void the sort_text hack. I belive setitng sort_text only for some fields would be wrong, and we also need to take fuzzy mathcing into account, if we set sort text. I think just preselect would give a meaningful improvement.

Copy link
Contributor Author

@bnjjj bnjjj Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have an example in mind in which case sort text could cause an issue let me know I will try. The fact is as a rust analyzer user I feel better with my example in sorted items. But of course I don’t want to introduce a bug or non deterministic behavior. So if you have examples on which I can make some tests I will be happy to try and show you the result. I agree with you, preselect is good but if we can’t go a step further using sortText as gopls does. Developer experience will be improved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way when I go to the LSP specs from Microsoft, when I look about documentation about sortText here it is A string that should be used when comparing this item with other items. When falsy the label is used. it seems it's not a hack. I mean, today we have all completion items sorted in alphabetical, which in my opinion, is very bad. IMHO I think we can't do something more bad because you always have to scroll the entire autocompletions list to see all interesting parameters of a struct for example. I think it really worth to at least show to the user fields with right types first + preselect perfect match. If we look at my screenshots there are keywords like box, dbg, ... displayed and at this place we don't care about it. In my example it's not so annoying because I have a lot of fields beginning with a letter so it's in first position and I don't have to scroll so much. If you are worried I can write a lot of tests, make a lot of experiments even manually and show you the results because I really think it worth.

if line.chars().last().map(char::is_whitespace) == Some(true) {
panic!("Trailing whitespace in {}", path.display())
panic!("Trailing whitespace in {} at line {}", path.display(), line_number)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 , worth a separate PR

@kjeremy
Copy link
Contributor

kjeremy commented Apr 16, 2020

I believe that sortText is used to literally sort in the UI so if you had:

label sort
foo '0000'
bar '0001'

If would show up as foo then bar. If you had:

label sort
foo '0001'
bar '0000'

It would be presented as bar then foo.

filterText is used when the characters types might not match the label: ex. typing a '.' to auto fill something.

If either of those are missing label is used instead.

I think there are vscode settings though that can alter the behavior and I don't know how they interact with the LSP.

@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 16, 2020

Yes sortText is used to sort completionItem and with my experiments it seems that it’s sorted by ascending so 0000 first and then 0001.

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
bors bot added a commit that referenced this pull request Apr 17, 2020
4008: tests: add more info about what failed in tidy tests r=matklad a=bnjjj

Separate PR from #3954

Co-authored-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj marked this pull request as ready for review April 19, 2020 19:40
@bnjjj bnjjj requested a review from matklad April 19, 2020 19:40
CompletionItem,
};
use rustc_hash::FxHashSet;
// use std::cmp::Ordering;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

@@ -285,6 +304,51 @@ impl Builder {
self.deprecated = Some(deprecated);
self
}
#[allow(unused)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this allow unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at this time I don't really use it. I just wanted to keep the same way than it was done with other fields. To stay consistent

@@ -285,6 +304,51 @@ impl Builder {
self.deprecated = Some(deprecated);
self
}
#[allow(unused)]
pub(crate) fn compute_score(mut self, ctx: &CompletionContext) -> Builder {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function should live in presentation.rs, and here we only need set_score

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

items.into_iter().map(|item| item.conv_with((&line_index, line_endings))).collect();
let items: Vec<CompletionItem> = items
.into_iter()
.enumerate()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually sort variants anywhere? ctrl+f sort gives no results

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only use the index to order when there is a type match. I don't know if it answers your question

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I am not sure....

Like, we never sort completion items, so the indexes they get are arbitrary. So, when we set an index as a sort_text, we get arbitrary ordering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you're right. We don't care about the index here. In fact, what's is important is to have a kind of global counter. I also don't care about the order at this time. I just use this data as a counter. If it's weird to you I also can pass an &mut counter coming from this handle_completion function but I need to have a kind of global counter to not put the same number in sortText. And as for non scored item there isn't any sortText it's the default behavior as before then it doesn't change. Then in fact I could have a sortText set to 2 and the next one set to 4 but it's not so important. The only important thing is to have a incremental counter. I will change this to pass a &mut counter it should be easier to read I think :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the variable to use counter instead of arbitrary index

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj requested a review from matklad April 21, 2020 12:39
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 23, 2020

Is it good for you @matklad ?

@matklad
Copy link
Member

matklad commented Apr 23, 2020

bors r+

I want to merge this as it has been seeting in the queue for far too long. That said, I want to tweak this a quite a bit after it lands -- I have a strong suspicion that current usages of sortText, where we set it only for some variants, and don't look at the textual match at all, gives us essentially random completions. This suspicion is confirmde by the fact that a bona fide random sorting was only caught during code review. But the general idea here is definitelly solid

@bors
Copy link
Contributor

bors bot commented Apr 23, 2020

@bors bors bot merged commit e833e03 into rust-lang:master Apr 23, 2020
bors bot added a commit that referenced this pull request Mar 12, 2021
7904: Improved completion sorting r=JoshMcguigan a=JoshMcguigan

I was working on extending #3954 to apply completion scores in more places (I'll have another PR open for that soon) when I discovered that actually completion sorting was not working for me at all in `coc.nvim`. This led me down a bit of a rabbit hole of how coc and vs code each sort completion items.

Before this PR, rust-analyzer was setting the `sortText` field on completion items to `None` if we hadn't applied any completion score for that item, or to the label of the item with a leading whitespace character if we had applied any completion score. Completion score is defined in rust-analyzer as an enum with two variants, `TypeMatch` and `TypeAndNameMatch`. 

In vs code the above strategy works, because if `sortText` isn't set [they default it to the label](microsoft/vscode@b4ead4e). However, coc [does not do this](https://github.com/neoclide/coc.nvim/blob/e211e361475a38b146a903b9b02343551c6cd372/src/completion/complete.ts#L245).

I was going to file a bug report against coc, but I read the [LSP spec for the `sortText` field](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_completion) and I feel like it is ambiguous and coc could claim what they do is a valid interpretation of the spec.

Further, the existing rust-analyzer behavior of prepending a leading whitespace character for completion items with any completion score does not handle sorting `TypeAndNameMatch` completions above `TypeMatch` completions. They were both being treated the same.

The first change this PR makes is to set the `sortText` field to either "1" for `TypeAndNameMatch` completions, "2" for `TypeMatch` completions, or "3" for completions which are neither of those. This change works around the potential ambiguity in the LSP spec and fixes completion sorting for users of coc. It also allows `TypeAndNameMatch` items to be sorted above just `TypeMatch` items (of course both of these will be sorted above completion items without a score). 

The second change this PR makes is to use the actual completion scores for ref matches. The existing code ignored the actual score and always assumed these would be a high priority completion item.

#### Before

Here coc just sorts based on how close the items are in the file.

![image](https://user-images.githubusercontent.com/22216761/110249880-46063580-7f2d-11eb-9233-91a2bbd48238.png)

#### After

Here we correctly get `zzz` first, since that is both a type and name match. Then we get `ccc` which is just a type match.

![image](https://user-images.githubusercontent.com/22216761/110249883-4e5e7080-7f2d-11eb-9269-a3bc133fdee7.png)


Co-authored-by: Josh Mcguigan <joshmcg88@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants