Autocomplete: Only ever complete a single line in single-line mode and reduce the output token limit #344
Ensure that completions in single-line mode only ever return a single line.
Since single-line mode no longer returns two lines, I think it's also okay to decrease the response token limit a bit. There's still some variation, but in my testing this makes sub-800ms responses much more common (compared to ~1sec before). I haven't yet found a case where 60 tokens weren't enough for a single-line completion.
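As a rough illustration of the two changes described above, here is a minimal sketch, not the code from this PR. The names `SINGLE_LINE_MAX_TOKENS` and `truncateToSingleLine` are hypothetical; only the value 60 comes from the description.

```typescript
// Hypothetical constant: the description reports that 60 tokens were
// enough for single-line completions in local testing.
const SINGLE_LINE_MAX_TOKENS = 60

// Hypothetical helper: keep only the first line of a raw completion so
// that single-line mode can never surface more than one line.
function truncateToSingleLine(completion: string): string {
    // Split on LF or CRLF and keep only the first segment.
    const firstLine = completion.split(/\r?\n/, 1)[0] ?? ''
    // Drop trailing whitespace (including a stray carriage return).
    return firstLine.trimEnd()
}
```

For example, `truncateToSingleLine('return a + b\n}')` yields `'return a + b'`. A completion that begins with a newline yields an empty string under this sketch; whether to drop such completions or skip the leading newline is a separate design choice.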
Test plan
Updated test suite. Tested the latency locally.