Currently completion candidates are sorted by score, and then secondarily sorted lexicographically to provide a predictable order. I propose we instead do a secondary sort by candidate length, preferring shorter candidates. My intuition is that shorter names are used more frequently than longer names, so it is a somewhat better heuristic to put shorter items first, all else being equal.
Edit: Note that this proposal only comes into play when candidates have identical scores. Currently the tie-break sorting is alphabetical; I'm proposing we switch to a potentially better heuristic. This will have a very small impact in general.
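To make the proposal concrete, here is a minimal sketch of the ordering being discussed. It is not gopls's actual code: the `candidate` type, its `Label` and `Score` fields, and `rankCandidates` are hypothetical names chosen for illustration. Score is compared first, label length only breaks ties, and a final lexicographic comparison is assumed here so the order stays predictable.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical stand-in for a completion item.
// The real gopls type has different fields; this is only an illustration.
type candidate struct {
	Label string
	Score float64
}

// rankCandidates returns a sorted copy of cands:
//  1. higher score first,
//  2. among equal scores, shorter label first (the proposed tie-break),
//  3. among equal lengths, lexicographic order (assumed, for determinism).
func rankCandidates(cands []candidate) []candidate {
	out := append([]candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Score != b.Score {
			return a.Score > b.Score
		}
		if len(a.Label) != len(b.Label) {
			return len(a.Label) < len(b.Label)
		}
		return a.Label < b.Label
	})
	return out
}

func main() {
	cands := []candidate{
		{Label: "AardvarkID", Score: 1.0},
		{Label: "ID", Score: 1.0},
		{Label: "Name", Score: 2.0},
	}
	for _, c := range rankCandidates(cands) {
		fmt.Println(c.Label, c.Score)
	}
}
```

With these made-up scores, "Name" still ranks first because its score is higher; the length rule only decides that "ID" comes before "AardvarkID", which a purely alphabetical tie-break would order the other way.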
Based on my limited understanding of how this all works, I think we first need to look at whether fuzzy matching (and therefore scoring) matches people's intuition: #38380. My gut feeling is that, setting aside for a moment whether the fuzzy matching matches one's intuition, sorting by length or lexicographically on top of the fuzzy scoring skews the results further from an intuitive order.
I use completion to save keystrokes. Putting shorter candidates first counteracts that.
Can you provide an example where my proposal would make candidate ranking worse?
Please note that my proposed change only impacts candidates that already have identical scores, which is not that common. The existing tie-break sorting is alphabetical, which is effectively random from a ranking perspective.
changed the title
x/tools/gopls: consider sorting completion candidates by length (Jul 17, 2020)
Thinking about it more, for same-score completions that already have some portion filled – which was what I was thinking about in my previous comment – shortlex order is probably rarely different from the current lexicographic order. On the other hand, both seem ultimately arbitrary. Your intuition was that shorter names are used more frequently than longer ones, but I think the only place that is a convention in Go is for local variable names.
> Your intuition was that shorter names are used more frequently than longer ones, but I think the only place that is a convention in Go is for local variable names.
I would guess shorter names are used more frequently for functions, methods and struct fields as well. It is easy enough to write a program to analyze code and collect some numbers, so maybe I will do that.
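A rough sketch of what such an analysis could look like, using the standard `go/parser` and `go/ast` packages. The function name `selectorLengths` and the sample source are made up for illustration. It counts how often names of each length appear on the right-hand side of a selector (the `ID` in `f.ID`); it looks at syntax only, so package-qualified names such as `fmt.Println` are counted too, and a real study would need to run over a large corpus.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// selectorLengths parses one Go source file and returns a histogram mapping
// name length to the number of times a name of that length is selected.
// Only syntax is inspected; no type checking is performed.
func selectorLengths(src string) (map[int]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	hist := make(map[int]int)
	ast.Inspect(file, func(n ast.Node) bool {
		if sel, ok := n.(*ast.SelectorExpr); ok {
			hist[len(sel.Sel.Name)]++
		}
		return true
	})
	return hist, nil
}

// sample is a tiny illustrative input, not real-world data.
const sample = `package p

func use(f T) {
	_ = f.ID
	_ = f.ID
	_ = f.AardvarkID
}
`

func main() {
	hist, err := selectorLengths(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Print the histogram in order of increasing name length.
	lengths := make([]int, 0, len(hist))
	for n := range hist {
		lengths = append(lengths, n)
	}
	sort.Ints(lengths)
	for _, n := range lengths {
		fmt.Printf("length %d: %d uses\n", n, hist[n])
	}
}
```

Counting uses rather than declarations seems like the relevant measure here, since completion ranking is about what people type most often.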
If we "fix" the fuzzy completion algorithm, might this problem of same-score candidates go away?
My first example, `var _ string = f // want "f.ID.s" but got "f.AardvarkID.s"`, would not be fixed because it is not related to fuzzy matcher scoring. "f" is an exact prefix of both candidates, and there are no other "f"s later in the candidates.