Poor result ranking for workspaceSymbol #81

HighCommander4 · 2020-10-04T04:51:38Z

If I'm editing the LLVM codebase with vscode + clangd and I perform a workspaceSymbol search for TypeDecl, this is what I get:

Note that clang::TypeDecl is the very last result, and all sorts of identifiers whose leaf segment is only a partial match, rather than an exact one, are ranked above it.

I believe this is clearly suboptimal. Symbols where the leaf segment of the qualified name is an exact match should be ranked higher than ones where it's a partial match.
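
To make the expectation concrete, here is a minimal sketch of the ordering rule described above. The helper names (`leafSegment`, `rankExactLeafFirst`) are hypothetical and are not part of clangd or VS Code; the sketch only illustrates "exact leaf match before partial match" and ignores every other ranking signal.

```typescript
/** Returns the last `::`-separated segment of a qualified C++ name. */
function leafSegment(qualifiedName: string): string {
  const idx = qualifiedName.lastIndexOf("::");
  return idx === -1 ? qualifiedName : qualifiedName.slice(idx + 2);
}

/** Returns a copy in which exact leaf matches come before partial matches. */
function rankExactLeafFirst(names: string[], searchText: string): string[] {
  const isExact = (n: string): boolean => leafSegment(n) === searchText;
  // Array.prototype.sort is stable, so the incoming order survives within each group.
  return [...names].sort((a, b) => Number(isExact(b)) - Number(isExact(a)));
}

const ranked = rankExactLeafFirst(
  ["clang::TypedefDecl", "clang::ASTContext::getTypeDeclType", "clang::TypeDecl"],
  "TypeDecl",
);
console.log(ranked[0]); // clang::TypeDecl
```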

kadircet · 2020-10-05T10:35:18Z

This is unfortunate, and it looks like vscode is tripping us up somehow. SymbolInformation itself doesn't have any way to indicate the score of a symbol, but clangd does its best to return results in sorted order; see the attachment below for the top N results. But apparently vscode just throws away our ordering :/

Moving the issue into vscode-clangd.

{
   "id":341,
   "jsonrpc":"2.0",
   "result":[
      {
         "containerName":"clang",
         "kind":5,
         "location":{
            "range":{
               "end":{
                  "character":14,
                  "line":3063
               },
               "start":{
                  "character":6,
                  "line":3063
               }
            },
            "uri":"file:///usr/local/google/home/kadircet/repos/llvm/clang/include/clang/AST/Decl.h"
         },
         "name":"TypeDecl"
      },
      {
         "containerName":"clang::TypeDecl",
         "kind":9,
         "location":{
            "range":{
               "end":{
                  "character":10,
                  "line":3078
               },
               "start":{
                  "character":2,
                  "line":3078
               }
            },
            "uri":"file:///usr/local/google/home/kadircet/repos/llvm/clang/include/clang/AST/Decl.h"
         },
         "name":"TypeDecl"
      },
      {
         "containerName":"clang",
         "kind":5,
         "location":{
            "range":{
               "end":{
                  "character":21,
                  "line":3105
               },
               "start":{
                  "character":6,
                  "line":3105
               }
            },
            "uri":"file:///usr/local/google/home/kadircet/repos/llvm/clang/include/clang/AST/Decl.h"
         },
         "name":"TypedefNameDecl"
      },
      {
         "containerName":"clang",
         "kind":5,
         "location":{
            "range":{
               "end":{
                  "character":17,
                  "line":3207
               },
               "start":{
                  "character":6,
                  "line":3207
               }
            },
            "uri":"file:///usr/local/google/home/kadircet/repos/llvm/clang/include/clang/AST/Decl.h"
         },
         "name":"TypedefDecl"
      },
      {
         "containerName":"clang::ASTContext",
         "kind":6,
         "location":{
            "range":{
               "end":{
                  "character":26,
                  "line":1395
               },
               "start":{
                  "character":11,
                  "line":1395
               }
            },
            "uri":"file:///usr/local/google/home/kadircet/repos/llvm/clang/include/clang/AST/ASTContext.h"
         },
         "name":"getTypeDeclType"
      }
   ]
}
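
As a reading aid for the response above: each entry follows the LSP SymbolInformation shape, and `kind` is a numeric SymbolKind (5 is Class, 6 is Method, 9 is Constructor in the LSP specification). The sketch below prints the entries shown above in the order the server sent them. The trimmed-down interface and the `describeEntry` helper are illustrative only, not clangd code.

```typescript
// Trimmed-down shape of an LSP SymbolInformation entry (location omitted).
interface SymbolEntry {
  name: string;
  kind: number;
  containerName?: string;
}

// Subset of LSP SymbolKind values that appear in the response above.
const kindNames: Record<number, string> = { 5: "Class", 6: "Method", 9: "Constructor" };

// Hypothetical helper: renders one entry as "<kind> <qualified name>".
function describeEntry(entry: SymbolEntry): string {
  const qualified = entry.containerName ? `${entry.containerName}::${entry.name}` : entry.name;
  return `${kindNames[entry.kind] ?? "Unknown"} ${qualified}`;
}

// The entries shown above, in the order clangd returned them.
const serverOrder: SymbolEntry[] = [
  { containerName: "clang", kind: 5, name: "TypeDecl" },
  { containerName: "clang::TypeDecl", kind: 9, name: "TypeDecl" },
  { containerName: "clang", kind: 5, name: "TypedefNameDecl" },
  { containerName: "clang", kind: 5, name: "TypedefDecl" },
  { containerName: "clang::ASTContext", kind: 6, name: "getTypeDeclType" },
];

serverOrder.forEach((entry, i) => console.log(i + 1, describeEntry(entry)));
```

The exact matches (the class and its constructor) are the first two entries, which is the order one would hope the client keeps.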

sam-mccall · 2020-10-05T19:43:11Z

Yeah, I'm happy to add a score extension to our responses, but there's not that much we can do here; this is a bug somewhere between VSCode and LSP.
(LSP doesn't let us communicate ordering beyond providing an order, and VSCode ignores the order we provide).

Any ideas before we close this?

Trass3r · 2020-10-05T20:33:34Z

The question is where the order is lost. Can barely find the relevant code in vscode, only this

The protocol doesn't really incorporate ranking. As with code completion, most clients respect what the server sends, but VSCode re-ranks items, with predictable results. See clangd/vscode-clangd#81

There's no filterText field so we may be unable to construct a good workaround. But expose the score so we may be able to do this on the client in future.

Differential Revision: https://reviews.llvm.org/D88844
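
A rough sketch of what a client could do with an exposed score. This assumes each symbol carries a numeric `score` field where higher means more relevant; the field name, the scale, and the example values are all made up for illustration, since the commit message only says that a score is exposed. Whether VS Code would keep an order produced this way is a separate question.

```typescript
// Assumption: the server attaches a numeric `score`; higher means more relevant.
interface ScoredSymbol {
  name: string;
  containerName?: string;
  score?: number;
}

/** Returns a copy ordered by descending score; a missing score is treated as 0. */
function sortByScore(symbols: ScoredSymbol[]): ScoredSymbol[] {
  const scoreOf = (s: ScoredSymbol): number => s.score ?? 0;
  return [...symbols].sort((a, b) => scoreOf(b) - scoreOf(a));
}

// Made-up scores, purely to exercise the function.
const reordered = sortByScore([
  { name: "getTypeDeclType", containerName: "clang::ASTContext", score: 0.4 },
  { name: "TypeDecl", containerName: "clang", score: 1.2 },
  { name: "TypedefDecl", containerName: "clang", score: 0.7 },
]);
console.log(reordered.map((s) => s.name)); // [ 'TypeDecl', 'TypedefDecl', 'getTypeDeclType' ]
```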

HighCommander4 · 2020-12-14T06:18:16Z

It looks like the culprit is our fix for #31.

What VSCode appears to be doing is:

1. Compare symbol.name to the query string to discriminate between exact (prefix) matches and partial matches.
2. Rank exact matches above partial matches.
3. Within the group of partial matches, sort them alphabetically.

(I'm basing this on experimentation; I haven't tracked down the relevant VSCode source.)

So, if symbol.name contains the leaf name only, and symbol.containerName contains the qualifier, as the API intends, then all is well, and exact matches get ranked above partial matches.
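
The behaviour described above can be modelled roughly as follows. This is a model inferred from the description, not VS Code's actual code: it assumes "exact (prefix) match" means the name starts with the query, case-sensitively, and that prefix matches keep their incoming order. `modelRerank` is a hypothetical name.

```typescript
// Model only: prefix matches first, then partial matches in alphabetical order.
function modelRerank(names: string[], searchText: string): string[] {
  const prefixMatches = names.filter((n) => n.startsWith(searchText));
  const partialMatches = names
    .filter((n) => !n.startsWith(searchText))
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0)); // plain code-unit ordering
  return [...prefixMatches, ...partialMatches];
}

// Leaf names only, as the server sends them in symbol.name.
console.log(modelRerank(["getTypeDeclType", "TypedefDecl", "TypeDecl"], "TypeDecl"));
// [ 'TypeDecl', 'TypedefDecl', 'getTypeDeclType' ]
```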

This is indeed how the server provides the symbols. However, we added middleware code to the client to manipulate the server results in the following way:

        provideWorkspaceSymbols: async (query, token, next) => {
          let symbols = await next(query, token);
          return symbols.map(symbol => {
            if (symbol.containerName)
              symbol.name = `${symbol.containerName}::${symbol.name}`;
            // Always clean the containerName to avoid displaying it twice.
            symbol.containerName = '';
            return symbol;
          })
        },

By including the qualifier in symbol.name, every match becomes a partial match (for names with qualifiers where the full qualifier is not included in the query), and we just get an alphabetical ordering of all the matches. So, by doing this, we solved one problem (the one in #31, that qualified names could not be used as a search query) but created another (this one, the poor ordering).
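
A small sketch of that effect, using the leaf names and qualifiers from the server response earlier in the thread. It counts how many symbols still prefix-match the query before and after the qualifier is folded into the name. `qualify` mirrors the middleware rewrite, and prefix matching is again an assumption about how VS Code classifies matches.

```typescript
interface Sym {
  name: string;
  containerName: string;
}

// Same rewrite as the middleware: fold the qualifier into the name.
function qualify(sym: Sym): Sym {
  return sym.containerName
    ? { name: `${sym.containerName}::${sym.name}`, containerName: "" }
    : { ...sym };
}

const fromServer: Sym[] = [
  { name: "TypeDecl", containerName: "clang" },
  { name: "TypeDecl", containerName: "clang::TypeDecl" },
  { name: "TypedefNameDecl", containerName: "clang" },
  { name: "TypedefDecl", containerName: "clang" },
  { name: "getTypeDeclType", containerName: "clang::ASTContext" },
];

const searchText = "TypeDecl";
const countPrefixMatches = (syms: Sym[]): number =>
  syms.filter((s) => s.name.startsWith(searchText)).length;

console.log(countPrefixMatches(fromServer)); // 2
console.log(countPrefixMatches(fromServer.map(qualify))); // 0
```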

HighCommander4 · 2020-12-14T06:46:49Z

I proposed a fix that limits the manipulation added for #31 to just the cases where the query string is in fact qualified. That should solve the problem for the vast majority of cases. (Technically, the problem can still occur for partially-qualified query strings, but those are much less likely to have a mixture of exact and partial matches.)
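
In outline, the idea looks something like the sketch below. It is a standalone illustration rather than the merged patch: it assumes "qualified" means the query contains `::`, and `adjustSymbols` and `WorkspaceSym` are hypothetical names that stand in for the middleware callback and the symbol type.

```typescript
interface WorkspaceSym {
  name: string;
  containerName: string;
}

// Sketch only: fold the qualifier into the name solely for qualified queries.
function adjustSymbols(searchText: string, symbols: WorkspaceSym[]): WorkspaceSym[] {
  if (!searchText.includes("::")) {
    // Unqualified query: keep name and containerName exactly as the server sent them,
    // so leaf names can still be recognised as exact matches.
    return symbols;
  }
  return symbols.map((sym) => ({
    name: sym.containerName ? `${sym.containerName}::${sym.name}` : sym.name,
    containerName: "", // avoid displaying the qualifier twice
  }));
}

const sample: WorkspaceSym[] = [{ name: "TypeDecl", containerName: "clang" }];
console.log(adjustSymbols("TypeDecl", sample).map((s) => s.name)); // [ 'TypeDecl' ]
console.log(adjustSymbols("clang::TypeDecl", sample).map((s) => s.name)); // [ 'clang::TypeDecl' ]
```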

kadircet transferred this issue from clangd/clangd Oct 5, 2020

HighCommander4 added a commit to HighCommander4/vscode-clangd that referenced this issue Dec 14, 2020

Tweak workspaceSymbols middleware to avoid suboptimal ordering of results

524905a

Fixes clangd#81

HighCommander4 mentioned this issue Dec 14, 2020

Tweak workspaceSymbols middleware to avoid suboptimal ordering of results #118

Merged

HighCommander4 added a commit to HighCommander4/vscode-clangd that referenced this issue Dec 14, 2020

Tweak workspaceSymbols middleware to avoid suboptimal ordering of results

4344c1c

Fixes clangd#81

hokein closed this as completed in #118 Jan 8, 2021

hokein pushed a commit that referenced this issue Jan 8, 2021

Tweak workspaceSymbols middleware to avoid suboptimal ordering of results (#118)

e548f9e

Fixes #81
