Cody completions: Setup to Bring-your-own-LLM #53495
Conversation
Force-pushed from c5c3208 to aa0d816
Force-pushed from aa0d816 to 0bfcd3e
This behavior was forked off to reduce complexity in the other files. We'll update this to use the new providers interface as a follow-up.
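The thread doesn't show what the new providers interface looks like, so here is a minimal, purely illustrative sketch of a pluggable completion provider. Every name in it is an assumption, not the actual Cody API:

```typescript
// Hypothetical sketch only: none of these names are confirmed by the PR.
interface CompletionProvider {
    /** Identifier used in config, e.g. "anthropic" or "unstable-codegen". */
    readonly id: string

    /** Upper bound on tokens this provider can spend on shared context. */
    readonly maximumContextTokens: number

    /** Produce completion candidates for the text around the cursor. */
    generateCompletions(prefix: string, suffix: string, abortSignal: AbortSignal): Promise<string[]>
}
```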
undefined,
undefined,
Excellent, it's getting smaller!
}

CompletionLogger.noResponse(logId)
return []
}

public async fetchAndShowManualCompletions(): Promise<void> {
Noice!
Great stuff, Philipp! 🔥
Co-authored-by: Valery Bugakov <skymk1@gmail.com>
This PR restructures the way we generate completions so that it is easy and cheap for us to try out different completion providers. The existing behavior was kept mostly the same, but some minor tweaks were made to reduce complexity:

- The `getContext` function now gets a larger character budget. We previously calculated the _exact length_ available for context by generating an empty prompt containing only the prefix and suffix, which caused quite a bit of overhead and back and forth to negotiate the right value. The new behavior uses the total context length as an upper bound, so context is now slightly over-fetched, but in practice this should not be a big deal. The individual providers still need to make sure they only include context when it fits (logic that was already implemented); see the trimming sketch after this description.
- The Anthropic answer tokens were changed from `200` to `204`, which allows us to move from hardcoded numbers to a percentage: 10% of the total tokens are now used for the answer (see the token-split sketch below).

## Controversial

- The logic where an inline completion on a non-empty line caused 2 normal requests _and one request with a `\n` injected into the prompt_ was removed for now. Do we want it back? If so, it should be branched off in the Anthropic provider only.
- My personal take is that it was always confusing to me why we did this, and I have not _personally_ observed it causing any major gains. I'm fine with dropping it for now and adding it back if we see issues, but I suspect that a different model will bring us more gains.

## ToDo

(Will be addressed before merge but does not need to block a first code review.)

- [ ] Add proper context to the backend API (this will be added in another PR)
- [x] Increase the response token size for multi-line completions and maybe reduce it for single-line completions (to improve latency)
- [x] Add request logging so we know what the fetch is doing
- [x] Add logging on startup to show which backend is being used
- [x] Handle the abort case
- [x] Check why there are 4 requests

## Test plan

- Test like you normally would. Nothing should have changed!
- To test a new provider:
  - First start the mock server: `cd client/cody && pnpm ts-node ./scripts/mock-server.ts` (a minimal sketch of such a server follows at the end of this description)
  - Then add the following two options to your config:

    ```
    "cody.completions.advanced.provider": "unstable-codegen",
    "cody.completions.advanced.serverEndpoint": "http://localhost:3001/batch"
    ```

  - Observe that it works 😮
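The "over-fetch then trim" approach from the first bullet above could look roughly like the following sketch. `ContextSnippet`, `estimateTokens`, and `trimContextToBudget` are illustrative names, not the actual Cody APIs:

```typescript
// Sketch of over-fetching context and trimming it to a token budget.
interface ContextSnippet {
    fileName: string
    content: string
}

// Rough token estimate; ~4 characters per token is a common heuristic.
function estimateTokens(text: string): number {
    return Math.ceil(text.length / 4)
}

// Keep adding snippets only while they fit within the remaining budget,
// so over-fetched context is dropped instead of overflowing the prompt.
function trimContextToBudget(snippets: ContextSnippet[], budgetTokens: number): ContextSnippet[] {
    const kept: ContextSnippet[] = []
    let used = 0
    for (const snippet of snippets) {
        const cost = estimateTokens(snippet.content)
        if (used + cost > budgetTokens) {
            continue // this snippet would overflow the budget; skip it
        }
        used += cost
        kept.push(snippet)
    }
    return kept
}
```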
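The `200` to `204` tweak is just arithmetic; a minimal sketch, assuming the total token budget is 2048 (the figure implied by 204 being roughly 10% of the total):

```typescript
// Percentage-based token split. The 2048 total is an assumption based on
// 204 being described as 10% of the total tokens.
const TOTAL_TOKENS = 2048
const ANSWER_SHARE = 0.1

const answerTokens = Math.floor(TOTAL_TOKENS * ANSWER_SHARE) // 204
const promptAndContextTokens = TOTAL_TOKENS - answerTokens   // 1844
```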
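For context on the test plan, a mock server compatible with the `http://localhost:3001/batch` endpoint can be very small. This is a sketch, not the real `scripts/mock-server.ts`; in particular, the response shape (`completions: [...]`) is an assumption:

```typescript
import * as http from 'http'

// Sketch of a mock completion backend; the real scripts/mock-server.ts
// defines the actual request/response contract.
const server = http.createServer((req, res) => {
    if (req.method === 'POST' && req.url === '/batch') {
        let body = ''
        req.on('data', chunk => {
            body += chunk
        })
        req.on('end', () => {
            console.log('completion request received:', body.slice(0, 200))
            res.writeHead(200, { 'Content-Type': 'application/json' })
            // Canned response; the shape is an assumption, not the real contract.
            res.end(JSON.stringify({ completions: ['console.log("hello from the mock")'] }))
        })
        return
    }
    res.writeHead(404)
    res.end()
})

server.listen(3001, () => console.log('mock completion server on http://localhost:3001'))
```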