Improve suggestion results #1168

cdce8p · 2020-11-13T14:23:00Z

A first attempt to improve the suggestion list. See: microsoft/pylance-release#608

Instead of the Levenshtein Distance I thought that Longest Common Subsequence might work better. The only drawback is that the user must only enter chars that appear in the symbol he/she is looking for, and they must be in the correct order.

An example result would look like this:

Please let me know if that's something worth exploring further.

--
Algorithm used: Similar to isPatternInWord from VS Code
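To make the in-order constraint described above concrete, here is a minimal sketch of a subsequence check. The helper name `isSubsequence` is hypothetical; this is neither the code from this PR nor the VS Code implementation of isPatternInWord, only an illustration of the general idea under the assumption that both strings are compared in lower case.

```typescript
// Hypothetical helper (not from the PR): returns true when every character
// of `pattern` occurs in `word`, in the same order, ignoring case.
function isSubsequence(pattern: string, word: string): boolean {
    const p = pattern.toLowerCase();
    const w = word.toLowerCase();
    let pi = 0; // index of the next pattern character still to be found
    for (let wi = 0; wi < w.length && pi < p.length; wi++) {
        if (p[pi] === w[wi]) {
            pi++;
        }
    }
    return pi === p.length;
}

console.log(isSubsequence('ordct', 'OrderedDict')); // true
console.log(isSubsequence('orderde', 'OrderedDict')); // false
```

The second call fails because no `e` remains in the symbol after the `d` has been matched, which is exactly the kind of small typo this approach rejects.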

ghost · 2020-11-13T14:23:13Z

All CLA requirements met.

erictraut · 2020-11-13T14:29:05Z

We've received feedback from many other pyright/pylance users that they don't want to be restricted to only characters that appear within the symbol, so I think we'd receive a significant backlash if we went with this approach. The approach needs to accommodate small typos (different order, different capitalization, small deviations in characters).
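For comparison, an edit-distance measure tolerates the typos listed above. The following is a hand-written sketch of the classic Levenshtein distance, not the `leven` package that Pyright depends on; the function name `editDistance` is an assumption made for illustration.

```typescript
// Hypothetical helper: classic Levenshtein distance using two rolling rows.
function editDistance(a: string, b: string): number {
    // prev[j] is the distance between the first i-1 chars of a and the first j chars of b.
    let prev: number[] = [];
    for (let j = 0; j <= b.length; j++) {
        prev.push(j);
    }
    for (let i = 1; i <= a.length; i++) {
        const curr: number[] = [i];
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(
                prev[j] + 1, // delete a character from a
                curr[j - 1] + 1, // insert a character into a
                prev[j - 1] + cost // substitute, or keep when equal
            );
        }
        prev = curr;
    }
    return prev[b.length];
}

console.log(editDistance('improt', 'import')); // 2
```

A transposed pair such as `improt` counts as two substitutions here, so a threshold-based comparison can still accept it, whereas an in-order check cannot.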

cdce8p · 2020-11-13T14:34:24Z

Capitalization is already taken care of, since it compares all lower case strings.
Regarding the other points: Maybe a combined approach would work. I'll think about it.

heejaechang · 2020-11-13T21:11:01Z

results that don't contain the given pattern will be filtered out by vscode, so regardless of whether the LS returns them, the user will never be able to see them?

you can try that by enabling LSP logging and see what shows up in completion. (microsoft/pylance-release#362 (comment))

by the way, I think we don't need to do anything more than what vscode recommended (https://github.com/microsoft/vscode/blob/21dc66054203ab742d36be9f0ef6ecb774ae62f2/src/vs/base/common/filters.ts#L506-L514) for vscode since it does its own filter over data returned.

not sure about other editors that support LSP, such as atom, vim-lsp, etc. for those, we might use a different algorithm?

cdce8p · 2020-11-14T01:00:00Z

I tried a few things and basically observed what @heejaechang mentioned already. VSC does exact pattern matching for the typed string. Setting "editor.suggest.filterGraceful": true (the default) allows for probably one permutation but that's it. Other results are filtered although Pyright provided them (using Levenshtein Distance). Additionally if the user provided chars that don't appear in the symbol, it is also filtered.

Having said that, I don't see much value in sticking with the current approach. Then again it's not my decision to make.
We could also use a combined method which I suggested previously. However this would probably come with at least a small performance hit, depending on the project size.
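One way such a combined method could look is sketched below. Everything in it is an assumption for illustration: the names `isInOrder`, `editDistance` and `matchesCombined`, the whole-string comparison in the fallback, and the default threshold of 2 are not taken from Pyright. The cheap in-order check runs first, and the more expensive edit-distance fallback only runs when it fails, which is where the performance cost on large projects would come from.

```typescript
// Hypothetical: true when the typed characters appear in the symbol in order.
function isInOrder(typed: string, symbol: string): boolean {
    let ti = 0;
    for (let si = 0; si < symbol.length && ti < typed.length; si++) {
        if (typed[ti] === symbol[si]) {
            ti++;
        }
    }
    return ti === typed.length;
}

// Hypothetical: Levenshtein distance, O(a.length * b.length) per call.
function editDistance(a: string, b: string): number {
    let prev: number[] = [];
    for (let j = 0; j <= b.length; j++) {
        prev.push(j);
    }
    for (let i = 1; i <= a.length; i++) {
        const curr: number[] = [i];
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
        }
        prev = curr;
    }
    return prev[b.length];
}

// Hypothetical combined matcher: in-order match first, edit distance as fallback.
function matchesCombined(typed: string, symbol: string, maxEdits = 2): boolean {
    const t = typed.toLowerCase();
    const s = symbol.toLowerCase();
    if (isInOrder(t, s)) {
        return true;
    }
    // Assumption: the fallback compares against the whole symbol name.
    return editDistance(t, s) <= maxEdits;
}

console.log(matchesCombined('ordct', 'OrderedDict')); // true (in-order match)
console.log(matchesCombined('improt', 'import')); // true (edit distance 2)
console.log(matchesCombined('xyz', 'import')); // false
```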

erictraut · 2020-11-14T04:37:03Z

Yeah, if that's the case, then I agree that sticking with the current approach doesn't provide much value.

cdce8p · 2020-11-14T21:58:16Z

I've changed the algorithm to the one @heejaechang suggested and VS Code uses: isPatternInWord (Link). Before that I also did some tests with a recursive one to allow for permutations. I'll include it below, however I'm not sure if it's worth it.

// Returns 1 when typedValue matches symbolName (case-insensitively), otherwise 0.
export function computeCompletionSimilarity(typedValue: string, symbolName: string): number {
    if (recurseSimilarity(typedValue.toLocaleLowerCase(), symbolName.toLocaleLowerCase())) {
        return 1;
    }
    return 0;
}

// Checks whether the typed characters appear in the symbol in order, also
// trying the second and third typed characters swapped at each step.
function recurseSimilarity(typedLower: string, symbolLower: string): boolean {
    if (typedLower.length === 0) {
        return true;
    }
    // The first typed character must occur somewhere in the remaining symbol.
    const index = symbolLower.indexOf(typedLower[0]);
    if (index === -1) {
        return false;
    }
    if (typedLower.length === 1) {
        return true;
    } else if (typedLower.length === 2) {
        symbolLower = symbolLower.slice(index + 1);
        return recurseSimilarity(typedLower.slice(1), symbolLower);
    } else {
        symbolLower = symbolLower.slice(index + 1);
        return (
            // Either continue in the typed order...
            recurseSimilarity(typedLower.slice(1), symbolLower) ||
            // ...or retry with the next two typed characters swapped.
            recurseSimilarity(typedLower[2] + typedLower[1] + typedLower.slice(3), symbolLower)
        );
    }
}

packages/pyright-internal/src/common/stringUtils.ts

heejaechang · 2020-11-16T21:13:50Z

I agree that we don't do permutation for now. for me, code LGTM, but I will leave it to @erictraut to decide.

cdce8p · 2020-11-16T22:20:45Z

I've pushed a new commit with all requested changes, see commit description.
Should I remove all references to similarityLimit, now that it isn't used anymore?

jakebailey · 2020-11-16T22:26:50Z

similarityLimit appears to be a local in a bunch of modules, I think, so would be safe to remove if truly not used. It's the exported code that we wouldn't want to remove (given we do reference things from pylance).

cdce8p · 2020-11-16T22:39:09Z

@jakebailey Should it remain in function signatures of exported classes?

jakebailey · 2020-11-16T22:40:49Z

Oh, my mistake. I see, it's a parameter. I would just leave it and we can perform the cleanup later as it's a part of the API.

packages/pyright-internal/src/tests/common.test.ts

heejaechang · 2020-11-16T23:46:36Z

it looks good to me? if there is no objection, I can accept the PR.

heejaechang · 2020-11-17T19:13:12Z

@cdce8p we will take the PR soon. but probably for the next release (Pylance) due to some change we need to make for add import code actions.

cdce8p · 2020-11-17T23:33:49Z

@heejaechang Thanks for letting me know. Looking forward to using it when it's released!

jakebailey · 2020-11-19T17:49:26Z

Thanks for this; it's not likely to come out in Pylance next week (holiday week), but should make it afterward. 🙂

cdce8p force-pushed the fuzzy-matching branch from 81a29d6 to 46fe787 Compare November 14, 2020 21:46

cdce8p changed the title ~~WIP: Improve suggestion results~~ Improve suggestion results Nov 15, 2020

cdce8p added 4 commits November 16, 2020 15:34

Implement Longest common subsequence for suggestion results

51f85c0

Fix style issus

1aa12eb

Change matching algroithm / use isPatternInWord from VSC

2ad0c94

Remove leven + alternative algorithm

40b9c60

cdce8p force-pushed the fuzzy-matching branch from 6dc5b23 to 40b9c60 Compare November 16, 2020 14:39

Add test case

f03e13a

heejaechang reviewed Nov 16, 2020

packages/pyright-internal/src/common/stringUtils.ts (outdated)

cdce8p added 2 commits November 16, 2020 23:08

Review changes

ed1ffc5

* Leave computeCompletionSimilarity in place
* Moved code to isPatternInSymbol
* Change function signature to return boolean
* Replaced references
* Moved tests for computeCompletionSimilarity and isPatternInWord to existing stringUtils.test.ts
* Re-added leven package

Fix missing reference

3b0704c

heejaechang reviewed Nov 16, 2020

packages/pyright-internal/src/tests/common.test.ts

heejaechang and others added 4 commits November 16, 2020 15:46

Merge branch 'master' into fuzzy-matching

9e310cc

Merge branch 'master' into fuzzy-matching

6daf76a

Merge branch 'master' into fuzzy-matching

f6aa3ce

Merge branch 'master' into fuzzy-matching

c8a08e3

cdce8p and others added 2 commits November 18, 2020 21:10

Merge branch 'master' into fuzzy-matching

69a4bd0

Merge branch 'master' into fuzzy-matching

e5b8d74

heejaechang merged commit a9d2528 into microsoft:master Nov 19, 2020

cdce8p deleted the fuzzy-matching branch November 19, 2020 06:15
