
Investigate eagerly returning more information about suggestions #36265

Open · mjbvz opened this issue Jan 17, 2020 · 24 comments
Labels: Awaiting More Feedback (we'd like to hear from more people who would be helped by this feature) · Suggestion (an idea for TypeScript)

@mjbvz (Contributor) commented Jan 17, 2020

Problem

Suggestions in VS Code have three broad UX parts:

  • Name/label. The primary identifier for a suggestion, such as foo.
  • Details. Type information or auto-import info, such as string.
  • Documentation. JSDoc info and additional details about the suggestion item.

Currently VS Code only shows the details of the currently active suggestion. This can be problematic in cases like auto imports, where multiple symbols may all have the same name:

[Screenshot: completion list showing multiple auto-import suggestions with the same name]

This UX also mirrors the design of the suggestion API:

  • completionInfo returns the entire list of suggestions, but only includes the name
  • completionEntryDetails returns the details and documentation for an individual suggestion
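
For reference, here is a simplified sketch of the shapes these two commands return today (field lists trimmed; see the tsserver protocol definitions for the full versions):

```ts
// Simplified sketch of the tsserver protocol shapes (fields trimmed).
interface CompletionEntry {
  // Returned by `completionInfo`, one per suggestion.
  name: string;       // e.g. "foo"
  kind: string;       // e.g. "const"
  sortText: string;
  source?: string;    // auto-import disambiguator (discussed later in this thread)
}

interface CompletionEntryDetails {
  // Returned by `completionEntryDetails`, on demand per item.
  name: string;
  displayParts: { text: string; kind: string }[];    // type info, e.g. "string"
  documentation?: { text: string; kind: string }[];  // JSDoc text
}
```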

Investigation

With microsoft/vscode#39441, VS Code is exploring letting languages/extensions return the details part of the suggestion eagerly. This would let VS Code show details for the entire list of suggestions instead of just for the active one. In the auto import case specifically, this would likely mean showing the file paths we would import from.
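
In VS Code API terms, this amounts to filling in detail eagerly for every item instead of lazily in resolveCompletionItem. A minimal sketch:

```ts
import * as vscode from 'vscode';

// Minimal sketch: `detail` is filled eagerly so the list can render it for
// every item; `documentation` can still be resolved lazily on focus.
const item = new vscode.CompletionItem('foo', vscode.CompletionItemKind.Variable);
item.detail = 'string'; // eager: type info, or the auto-import path
// item.documentation stays lazy via CompletionItemProvider.resolveCompletionItem
```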

We would like the TypeScript team's feedback on this idea and our current proposals:

  • What information would be most helpful to render in the main suggestion list?
  • How much overhead would returning this information add? Both in terms of message size and TS server computation time?

/cc @jrieken For VS Code apis
/cc @amcasey @minestarks for VS

@mjbvz (Contributor, Author) commented Jan 17, 2020

Here's the distribution of the number of completion items returned (bucketed in 100s):

[Screenshot: bar chart of completion item counts, bucketed by 100]

In summary: of all non-empty completions, about 40% return 1000 or more items, while 60% return between 1 and 999 items.

Query:

```kusto
RawEventsVSCodeExt
| where EventName == "typescript-language-features/completions.execute"
| where ServerTimestamp > ago(10d)
| extend num = toint(Properties['count'])
| extend num = iif(num > 0, num, 0)
| extend num = iif(num > 10000, 10000, num)
| summarize count() by num=bin(num, 100.0)
| order by num
| render barchart
```

@jrieken (Member) commented Jan 20, 2020

fyi - this is what we are aiming for: microsoft/vscode#39441 (comment).

@RyanCavanaugh added the Awaiting More Feedback and Suggestion labels on Feb 6, 2020
@mjbvz (Contributor, Author) commented Feb 20, 2020

I've measured the perf impact of the simplest approach possible: calling completionEntryDetails with the entire list of completion items (this is one TS server request but returns data for many items).

In an empty project, this call takes ~3 seconds for 1500 or so completion items. So clearly not a good solution!

A breakdown of sample timings for completionEntryDetails:

  • 1 item - 5ms
  • 10 items - 25ms
  • 50 items - 110ms
  • 100 items - 200ms
  • 200 items - 350ms
  • 500 items - 900ms

(Note that completionEntryDetails also returns JSDoc documentation and some additional data that our completion label proposal does not call for. What I was measuring here is the very worst possible case.)


@andrewbranch At the TS/Editor sync today, we thought you might be the person to investigate this on the TS side. Using the existing TS server APIs, would you be able to look into:

  • What easily accessible data TS currently has for the completions call that it does not send back to VS Code (such as the auto import source)

  • What would be the perf impact of sending back this data?

  • What would be the perf impact of returning all the information from the VS Code proposal (microsoft/vscode#39441 (comment), "Consider showing completion item detail if available for all list items") except the signature

  • And what would be the perf impact of returning the signature as well? (We expect this to be the most expensive part to generate properly.)

@andrewbranch (Member) commented:

(Cross-posting from microsoft/vscode#39441 to move discussion here)

Would the ideal response include a separate completion entry for every overload of the same method/function? Currently, we only send one entry per identifier, and then when we send the details, we show the first overload and a string indicating how many additional overloads (if any) are available:

[Screenshot: completion details showing the first overload and a "(+N overloads)" indicator]

@andrewbranch (Member) commented:

Very early investigation observations:

  • The good news is that for symbols that are syntactically known to be methods, we already resolve call signatures for them while generating this initial list. So that’s a good chunk of semantic work that we can take advantage of. Leveraging it will only incur the cost of serializing and sending it over the wire.
  • The bad news is that for most other symbols, we don’t get their type at all, so that’s a big chunk of semantic work that will need to be added.
  • If we were to limit that semantic work to symbols that are syntactically known to be functions (in addition to already doing it for methods), it would likely cut that work down significantly, at the cost of introducing a wide range of false negatives (not showing signatures on entries that do in fact have them). As an example, in const fnRef = fn where fn is an actual function declaration, we know “for free” that fn is going to have a signature that we should display, but we don’t know that about fnRef (see the snippet below). If we want to show the signature on both fn and fnRef, we’ll likely need to resolve the type and signatures for every member in the list. I’ll work on gathering numbers for how impactful these approaches might be, but I wanted to get an early read: do you consider this feature worthwhile if the perf cost of covering 100% of cases turns out to be prohibitive?
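
The fnRef case from the last bullet, as a minimal snippet:

```ts
function fn(x: number): void {}  // syntactically a function: signature known "for free"
const fnRef = fn;                // just a variable: its signature is only known
                                 // after resolving its type
```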

@andrewbranch (Member) commented:

I also have a UX question/consideration about the signature part of the VS Code proposal: how are you planning to represent objects that have both properties and call signatures? Common examples can be found in jest:

```js
test('something', () => {
  expect(0).toBe(1);
});

test.each(table)('thing %1', () => {
  expect.hasAssertions();
});
```

I’m wondering if a format like this mockup might obscure the properties of test and expect by rendering them as calls, and nothing else.
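
For concreteness, a hedged sketch of a jest-like shape (member list simplified for illustration) where the same object is both callable and has callable properties:

```ts
// Simplified, illustrative shape of jest's `test`: a call signature
// plus properties that are themselves callable.
interface TestFn {
  (name: string, fn: () => void): void;  // callable directly: test('…', …)
  each(table: unknown[][]): (name: string, fn: () => void) => void; // test.each(…)(…)
  skip(name: string, fn: () => void): void;
}
declare const test: TestFn;
```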

@mjbvz (Contributor, Author) commented Mar 20, 2020

Thanks for looking into this @andrewbranch!

For overloads, I don't think we should show a unique entry in the completion list for each overload. Unlike Java overloads, which are more like distinct methods, JS overloads all still map to a single implementation. Many JS library methods also have a ton of overloads (like document.createEvent) that are not actually all that helpful to users.

However, I'm not sure we have VS Code guidance right now on what to show in the UI for overloads.

Adding @jrieken @octref for the UX questions

@jrieken (Member) commented Mar 23, 2020

> For overloads, I don't think we should show a unique entry in the completion list for each overload.

Not so sure about the overloads. Yes, they are always the same function, but I see two benefits in having each overload be an entry:

  • It makes it easy to see all overloads at a glance
  • When using the javascript.suggest.completeFunctionCalls setting, it will help to get the correct template inserted quickly

Generally, I don't have strong feelings about this... It would be nice if TS had this information so that the extension can make a decision, but for me overloads are not P1 - at least not right now.

> how are you planning to represent objects that have both properties and call signatures

That's a question that we haven't asked ourselves yet. Today, we have completion item kinds for function and class, and your samples seem to be both... As of today, users have to know that something has properties. A similar case exists with number literals and this sequence: 42 -> 42.|<suggest> -> (42).toString(). One idea would be to add a new CompletionItemTag that tells you that a completion, once accepted, has properties that can also be completed (sketched below).
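
A hypothetical sketch of that tag idea against the current API surface (HasProperties is invented here; only Deprecated exists in the real enum):

```ts
// Hypothetical sketch; vscode.CompletionItemTag today only defines Deprecated.
export enum CompletionItemTag {
  Deprecated = 1,
  /** Invented for illustration: the accepted completion has members
   *  that can be completed next (e.g. `test.each`, `(42).toString`). */
  HasProperties = 2,
}
```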

> Very early investigation observations:

Big thanks for looking into this 👏 Generally, we aren't looking for an "all or nothing" approach but "more is more". Still, having eventually consistent information without false negatives should be the goal here, especially since I am afraid that everything else will confuse people.

It is not the first time that the "too much data, too little compute time" challenge has come up - other extensions are facing the same problem. @andrewbranch There is one idea that I would like to get expert feedback on: let's say that there are completions from "scope-items" (global, local and other scope variables) and "member-items" (class, namespace members). Now let's assume that the majority of completion items are "scope-items", e.g. TS returns ~1000 completions when triggered at the top of a file vs. 25 completions for console.|.

My idea: would it make sense to keep all those "global-scope-items" and only enable/disable them per request? We could add new API that allows extensions to return a reusable completion item list and later, per completion request, the extension only names those reusable lists (instead of recomputing and resending them). So basically this is some sort of caching for completion items, based on my assumption that the majority of items aren't changing (rooted in lib.d.ts and friends); a rough sketch follows. Now the questions: does that make sense? Do you see any problems? Does that fit into the completion internals of your language service? Generally, what do you think?
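
A hypothetical shape for that API, with all names invented here purely for illustration:

```ts
import type { CompletionItem } from 'vscode';

// Hypothetical sketch of the reusable-list idea; not a real VS Code API.
interface ReusableCompletionList {
  id: string;              // stable handle, e.g. "globals"
  items: CompletionItem[]; // the rarely-changing "scope-items"
}

interface CompletionResultWithReuse {
  reuse: string[];         // ids of previously registered lists to include as-is
  items: CompletionItem[]; // fresh per-request items ("member-items", locals)
}
```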

@andrewbranch (Member) commented:

Global caching

We briefly discussed caching global completions (cc @amcasey) in a recent editor sync (primarily as a way of reducing costs associated with data transfer, not computation). I think there is definitely an opportunity to do something like what you’re describing, but ironically, the rest of this completions proposal makes it much harder to do. Today, the global completion list doesn’t actually contain any type information, which makes it pretty easy to cache. But, if we encode type information (such as call signatures) into that list, it becomes much more volatile, and cache invalidation can become difficult. Because of declaration merging, any change in a non-module file or in a declare global block could change the shape of those global completion items. Example:

```ts
interface Document {
  (): void;
}
```

Adding this declaration in an otherwise empty file gives the global variable document (from lib.dom.d.ts) a call signature, when previously it had none, and it’s unclear to me how this should affect the completion item in your proposal. This is a contrived and not terribly useful example, but merges like this do occur, for example, when someone installs @types/node, @types/webpack-env, or other packages that (intentionally or unintentionally) overload or merge with existing globals.
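
A one-line illustration of the effect described above (assuming the interface Document merge is in a non-module file in the program):

```ts
// With the merged call signature in scope this now type-checks;
// without the merge it is a compile error.
document();
```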

While refreshing the cached list isn’t a huge problem, I don’t have a clear idea of how we’d detect this in an efficient way in order to recompute the list—certainly not in a way that’s generalizable for arbitrary sources of globals. Built-in libs are fairly self-sufficient, but a global declaration file that makes use of import types and globals declared elsewhere, combined with conditional or indexed access types, could change shape in response to changes in any file, in hard-to-track ways.

Further performance investigations

I recorded timings for a few scenarios on Friday: completions after the letter d, completions after document., and, with @types/lodash installed, completions after _.. For each scenario I measured the total response time for generating the list as it is today, with getting call signatures for methods and functions, and with getting call signatures for every member. It should be noted I didn’t do anything with these call signatures; I was measuring only the impact of resolving them.

I’d want to reproduce this experiment in a more controlled environment before reporting the full data here, but the summary is basically a huge “it depends.” For the global list of 1339 items (returned with the single character d; note this isn’t just filtered to symbols with the letter d in them, that happens on the VS Code side), there was a pretty wide spread in my measurements, but it looks like getting call signatures for every item resulted in about a 30% slowdown, while getting call signatures only for items that were methods and function declarations (25 methods, 158 functions) had no measurable impact (I have to assume we’d see something in a more controlled experiment, but the cost appears to be minimal).

For document., on the other hand, there was a comparatively huge cost to getting call signatures for everything (246 items)—between 3x and 4x slower, though the absolute times were small (went from 6–8 ms to 20–40 ms), so noise might be showing up stronger here. But there was literally zero cost to getting call signatures for methods and functions (67 of them), because a) we already get signatures for methods in the control trial, and b) there aren’t any function symbols—every member symbol is either a method or a property. (Keep in mind though, that property symbols can still resolve to a type that has call signatures, so taking shortcuts here could still result in false negatives.)

For _., there was barely any measurable impact to getting call signatures for everything (315 items), because 297 of them are methods, which we get call signatures for today.

Conclusions

I have real concerns about caching type information in global lists, although I think caching the non-type-related parts of built-in lib globals, at the least, is still possible and would be worth the savings in wire cost (at least for thin-client-like scenarios). But, I’m uncertain whether such a mechanism could be reliable for reducing the computation costs of generating call signature readouts.

I think the emerging insight from the numbers I’ve gathered is that getting call signatures usually isn’t catastrophically expensive, but it can add up quickly for long lists, and it depends greatly on the composition of the list. (I’m honestly puzzled by the discrepancy between the ~30% slowdown for globals and the ~300% slowdown for members of document—I’d need to do some more careful profiling to see what’s going on.) Aside from what I was able to measure, resolving type information opens us up to doing arbitrary work, and there could be pathological cases lurking in DefinitelyTyped definitions or user code.

Personally, I’d be much more comfortable with this proposal if these details were returned in a second pass after the bones of the list are returned. (One consequence of this, though, is that you wouldn’t be able to display overloads as separate items, since we don’t know how many overloads there are until resolving call signatures.) Is that something you all could consider?

@jrieken (Member) commented Mar 24, 2020

Thanks for the update. Curious to learn more when you have measured more.

Coincidentally, cached completions came up with Dart today as well (Dart-Code/Dart-Code#2290 (comment)). We are planning to check with the Java, C#, and C++ folks, but I do acknowledge the complexity with TS and "open types".

> Is that something you all could consider?

Yeah - we have considered that, and today's version of resolving the focused item is the mini variant of this. We could do this in batches, let's say resolve the whole viewport. But there will be flicker, because typing changes what's visible in the viewport by a lot (we don't filter just by prefix). E.g. take the 1339 global completions you mentioned and resolve the first 10 or so. Now, typing c will most certainly show completely different items (which means flicker). Typing another c reveals a few new items and keeps some existing items, which means more flicker and a mix of items that are resolved and those that aren't. An extreme version where we only resolve (and keep) the details of the current item can be seen here: microsoft/vscode#39441 (comment). We want to prevent that in the name of good UI. Though, we could try this in a more realistic scenario, using batch resolving and updating all items of the current viewport.

Another important note on the 2nd-pass resolve is that this pass cannot change the insert behaviour of an item, e.g. we cannot wait/block for resolving to happen, which means a user is free to insert a suggestion before its details are resolved.

Another suggestion that has been brought up is to differentiate between "full" and "partial" completions, e.g. pressing ctrl+space would be a request for full completions (no matter how slow) while suggest-as-you-type requests partial completions only - whatever that means; for instance, only locals vs. "everything". We don't like that option because it leaves too much room for interpretation in how to implement it, and it hinders fluent typing, e.g. when I am typing docu I want to see document and not think that I need to press ctrl+space first.

So, those are the current options, in our order of preference:

  1. have all the information with the first request
  2. incrementally resolve batches of completion items
  3. have full and partial completion requests

Then, there is always the option to artificially limit the number of completions that are returned by the extension and then signal that there are more via the isIncomplete flag. That has the very big downside that it requires server-side filtering, which will most likely conflict with our client-side filtering and ranking...

@andrewbranch (Member) commented:

> 1. have all the information with the first request

Currently I’m pretty convinced that this just isn’t viable. Even if the 300% slower measurement is a distant outlier, I don’t think we want to accept a 30% slowdown as the typical case, with the possibility of unpredictable pathological cases.

> incrementally resolve batches of completion items

@amcasey and I talked about this yesterday. Honestly, I think requesting the entire batch of details in a second pass, as long as it doesn’t block the list from showing up initially, would be an OK place to start. If the performance there seems to be a problem, we could look into requesting smaller batches, but the numbers I was seeing weren’t large in absolute terms; they were just bad compared to the baseline. For the global case, for example, the worst trial I recorded for resolving call signatures on everything was 138 ms, compared to the baseline average of 36 ms. So, displaying the initial list in 36 ms and getting the details for all 1339 items 100 ms later doesn’t sound terrible to me - then you wouldn’t need to worry about calculating and requesting new batches as the user types.
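
A hedged sketch of that two-pass flow using the existing protocol command names (the client, render, and merge helpers are invented stand-ins; the timings are the ones measured above):

```ts
// Illustrative only: `TsServerClient`, `render`, and `mergeDetails` are
// stand-ins, not real APIs; the two command names are real protocol commands.
interface TsServerClient {
  request(command: string, args: unknown): Promise<any>;
}
declare function render(entries: any[]): void;
declare function mergeDetails(entries: any[], details: any[]): void;

async function completionsTwoPass(server: TsServerClient, args: { file: string; line: number; offset: number }) {
  const list = await server.request('completionInfo', args); // first pass (~36 ms in the global case)
  render(list.entries);                                      // list shows up immediately
  const details = await server.request('completionEntryDetails', {
    ...args,
    entryNames: list.entries.map((e: any) => e.name),        // the whole batch
  });
  mergeDetails(list.entries, details);                       // arrives ~100 ms later
}
```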

@jrieken (Member) commented Apr 15, 2020

> For the global case, for example, the worst trial I recorded for resolving call signatures on everything was 138 ms, compared to the baseline average of 36 ms

Not sure with what version you have measured this? I just tried this with 3.9.0-dev.20200413 and see the following numbers when using an "empty project" (a single ts file, no project file). Values are medians of 5 runs: get completion list 15ms, resolve all details (via completionEntryDetails) 811ms. That gives me 50x, not the 3x you have seen. Also, when asking for completions in the vscode project I see that ts-server only responds after 76ms, which means resolving all details would be (didn't measure) either 50x76ms or 3x76ms. Bottom line: I am eager to see the 3x in resolving details, not what I currently see.

So, now comes the fun part: our TS extension and extension integration are quite expensive and add 3-4x to the ts-server time, e.g. empty project: ts-server 15ms, ts-ext 41ms; vscode project: ts-server 76ms, ts-ext 322ms. This happens after results come in, the total is the sum, and it is mostly due to data conversion and validation. Assuming we can minimise that, and assuming resolving all details actually only costs 3x (above), then this could add up to zero. Very curious.

On resolving batches: yes, we can play with that already today (and we did), but the UI will flicker because we then show details all the time, not just on focus. I will try to get some recordings that can be posted here.

@andrewbranch (Member) commented:

My numbers reflect only the body of the language service functions (excluding synchronizeHostData, which brings the program up to date), not the total round-trip time. I’m also not sure what you mean by:

> resolve all details (via completionEntryDetails) 811ms. That gives me 50x, not 3x as you have seen.

Do you mean you were calling the completion details request for every item in the list? That’s not going to give you anything close to realistic numbers because it gets a totally different set of information than what you’re asking for in this proposal, and by calling the function N times it repeats a lot of work that could be reused. There’s no nightly you can try out to get the numbers I reported; I simply made local changes against master to perform the semantic work I thought would be most expensive. To clarify my earlier comment:

> For each scenario I measured ~~the total response time~~ the LS function execution time for 1) a control case: generating the list as it is today, with getting call signatures for methods and functions; and 2) local changes implementing something resembling your proposal: getting call signatures for every member. It should be noted I didn’t do anything with these call signatures; I was measuring only the impact of resolving them.

Secondly:

> This is after results come in, the total is the sum, and mostly due to data conversion and validation. Assuming we can minimise that and assuming resolving all details actually only costs 3x (above) then this could add up to zero.

I mean, if we can minimize that overhead, we should do it independently of protocol changes and realize the performance gains. From our side (and thinking as a VS Code + TypeScript user myself), if you told me there's an opportunity to reduce the user-observed completions time by 3x, I think I would be pretty excited about the 3x speedup rather than seeing an opportunity to replace all that surplus time with other computation 😄. In other words, if there's a realistic opportunity to drastically improve the times for completions as they are today, that should be the baseline we measure against.

@jrieken (Member) commented Apr 17, 2020

> Do you mean you were calling the completion details request for every item in the list? That’s not going to give you anything close to realistic numbers because it gets a totally different set of information than what you’re asking for in this proposal,

In my experiment I have called completionEntryDetails once, with CompletionDetailsRequestArgs that lists all the items that I have just received. That makes 2 calls to ts-server, but I don't know what's happening under the hood...

> I mean, if we can minimize that overhead, we should do it independently of protocol changes and realize performance gains.

Sure thing. We are on it, independent of this.

@andrewbranch (Member) commented:

> In my experiment I have called completionEntryDetails once, with CompletionDetailsRequestArgs that lists all the items that I have just received.

Interesting. I didn’t realize that the protocol accepted an array of items, because the language service does not—in the server code we just map over this list and do the full languageService.getCompletionEntryDetails() call for each.
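
Roughly what that looks like in the server (paraphrased from memory, not verbatim source):

```ts
import * as ts from 'typescript';

// Paraphrased sketch: the protocol accepts an array of entry names, but each
// entry still pays for a full language-service details call.
function getDetailsForBatch(
  languageService: ts.LanguageService,
  file: string,
  position: number,
  entryNames: string[],
  formatOptions: ts.FormatCodeSettings,
  preferences: ts.UserPreferences,
) {
  return entryNames.map(entryName =>
    languageService.getCompletionEntryDetails(
      file, position, entryName, formatOptions, /*source*/ undefined, preferences
    )
  );
}
```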

@jrieken (Member) commented Apr 21, 2020

@andrewbranch Is there any chance that you can make available what you have prototyped, e.g. using the real fast implementation behind the existing protocol? Or using a smarter protocol that doesn't require us to send back all those name strings?

I am planning to run the following experiment:

  • ts-ext makes a completions request
  • ts-ext resolves the details of the first N (=100?) items
  • ts-ext marks the completion result as incomplete (which it only is in the "details" dimension, not w.r.t. the number of items)
  • due to the incompleteness, vscode will keep asking the ts-ext for "more" items, in which case it will simply fetch more details, not retrigger completions (a sketch of this flow follows)
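
A hedged sketch of that experiment in extension terms (fetchEntries and resolveDetailsBatch are invented stand-ins for the two tsserver calls; N is a tuning knob):

```ts
import * as vscode from 'vscode';

// Invented stand-ins for the completions and batched-details requests.
declare function fetchEntries(): Promise<vscode.CompletionItem[]>;
declare function resolveDetailsBatch(items: vscode.CompletionItem[]): Promise<vscode.CompletionItem[]>;

const N = 100;

async function provideCompletionItems(): Promise<vscode.CompletionList> {
  const entries = await fetchEntries();                            // 1) completions request
  const resolved = await resolveDetailsBatch(entries.slice(0, N)); // 2) details for the first N
  // 3) isIncomplete=true makes VS Code re-invoke the provider as the user
  // types; 4) each re-invocation fetches more details instead of
  // recomputing the completion list.
  return new vscode.CompletionList([...resolved, ...entries.slice(N)], /* isIncomplete */ true);
}
```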

@andrewbranch (Member) commented:

I don’t currently have anything prototyped that would work for that experiment. Essentially all I did was measure the impact of adding checker.getSignaturesOfType(checker.getTypeOfSymbol(completionSymbol)) for every completion item in getCompletionsAtPosition. I don’t think it would be very much work to drastically improve the performance of getting details (as “details” are defined in the current protocol) for batches of more than one item, just keep in mind that these details are a bit different from the proposal at microsoft/vscode#39441 (comment).
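
For reference, a rough sketch of the measurement change described above, approximated with public checker APIs (getTypeOfSymbol itself is internal to the checker):

```ts
import * as ts from 'typescript';

// Approximation of the experiment: resolve the type of each completion
// symbol and ask for its call signatures, discarding the result.
function resolveCallSignatures(checker: ts.TypeChecker, symbol: ts.Symbol, location: ts.Node) {
  const type = checker.getTypeOfSymbolAtLocation(symbol, location); // public stand-in for getTypeOfSymbol
  return checker.getSignaturesOfType(type, ts.SignatureKind.Call);
}
```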

Happy to throw something together, but I want to be sure we’re on the same page about what you’re getting in the response vs. what you would like to get in the response.

@mjbvz (Contributor, Author) commented May 6, 2020

@jrieken @andrewbranch I looked into using CompletionEntry.source so we could at least try the new VS Code API for auto imports; however, it looks like this field does not produce useful display strings. Usually they are absolute paths:

```json
{
    "name": "xyz",
    "kind": "const",
    "kindModifiers": "export",
    "sortText": "5",
    "hasAction": true,
    "source": "/Users/matb/projects/san/x"
}
```

So unfortunately I think we are blocked on adopting the new VS Code API for auto imports. I can open a new issue for this; I suspect CompletionEntry.source was only ever supposed to be an internal identifier.

@andrewbranch (Member) commented:

@mjbvz that’s correct, and we need an internal-only property like that not just for auto-imports, but also for disambiguating this-property completions from local variable completions and auto-imports when those names are the same. If I recall correctly, for this-property completions the source is populated with nonsense. I think it would be easy to return a display-friendly path for you, but we ideally need to keep source as-is.
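
A hypothetical shape for that, extending the earlier CompletionEntry sketch (the display field name is invented here; source stays as today's internal identifier):

```ts
// Hypothetical sketch; `sourceDisplay` is invented for illustration and is
// not part of today's protocol.
interface CompletionEntry {
  name: string;
  kind: string;
  sortText: string;
  hasAction?: boolean;
  source?: string;        // internal identifier (an absolute path today)
  sourceDisplay?: string; // invented: a human-readable module path for the UI
}
```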

@svipas commented Aug 12, 2020

@andrewbranch I often check this issue, and I see that work on implementing this from the TypeScript side has stopped. Is this still relevant for the TS team? This feature would improve DX very much!

@andrewbranch (Member) commented:

I think the status right now is that my investigation proved the initial proposal not viable for performance reasons. We’d be happy to investigate alternative proposals that don’t impact the completion list response time, like returning details for batches of completion items in a subsequent request.

@svipas commented Aug 13, 2020

@andrewbranch Understood, thanks for all your work. I hope we will have this feature available in the future.

@andrewbranch (Member) commented:

Unassigning myself since I’m not aware of any actionable next steps on our side. Feel free to ping me if things change 👍

@andrewbranch removed their assignment on Dec 1, 2020
@jrieken (Member) commented Dec 2, 2020

@andrewbranch that's a pity, but I kinda expected this. We will proceed with our API/UI changes without selfhost validation via TypeScript, though we have done this: microsoft/vscode#98228 (comment)
