Autocomplete: Add support for Starcoder Enterprise virtual model identifier #2714
'starcoder-3b': 'fireworks/accounts/fireworks/models/starcoder-3b-w8a16',
'starcoder-1b': 'fireworks/accounts/fireworks/models/starcoder-1b-w8a16',
Removing these two models. They aren't being used right now and it's easy to add them back if we need to test them.
This PR is safe to land before we do any server-side changes since it will only run the new code path if […]
I'll merge this right now so we have this out in the next insider release.
Hi
@Detyzz greetings traveler :)
This PR adds a new `fireworks/starcoder` virtual model string for use with the licensing data (so that adding one concise `fireworks/starcoder` string allows us to give access to all needed Fireworks models) and the downstream clients (so that we can later propagate this as the default code completion model from the SG instance to the Cody clients).

Detailed list of changes after this PR:

- Creation of a new `fireworks/starcoder` virtual model string
  - This is the only starcoder-related model string that is now rendered as valid in the UI (existing configurations with the specific strings will still work but will be rendered with a red background). This is to ensure GTM will be able to only set the correct values.
- Cody Gateway can translate the `fireworks/starcoder` model string to a specific model
  - For now this is a 100% mapping to `accounts/fireworks/models/starcoder-16b-w8a16`
- When validating the model for a request, we now expand virtual model strings so that `fireworks/starcoder` in the list of allowed models is enough to allow all Fireworks-related models
- PLG users now report `fireworks/starcoder` as a valid coder model in their license data
  - This will allow us to remove the specific models listed in this payload in the future (since Cody Gateway knows that this is a virtual model tag and how to expand it). However, the existing list of specific models is not changed (small asterisk here that I'll discuss later).
- Clients can now use `fireworks/starcoder` as a custom model. The SG instance will simply forward this to Cody Gateway.
- Removal of more places that mentioned the old ST model name.
  - Here, it's important to know that the validation always happens with the _resolved_ model string. So after adding the `disableSingleTenant` flag, the resolved model will no longer be the old single-tenant model names. Thus we can remove support for it (as long as we keep the rewrite).

Here's a list of what changed from an actor perspective:

- PLG user
  - PLG traffic to dotcom will still send the same virtual model strings that we have added before (`fireworks/starcoder-16b`, `fireworks/starcoder-7b`). This is still going to be translated on the SG instance. However, that translation was also duplicated into Cody Gateway so that, after a Cody Gateway rollout, the SG instance can just pass through this model string.
  - We do not want to change the model strings PLG uses for now, as we still need to be able to separate between the faster 7b and the more accurate 16b model in multi-tenant environments.
  - The PLG user license ping will no longer report the old outdated single-tenant model as allowed (since Cody Gateway rewrites those anyway and checks against the resolved model string) and now includes the new `fireworks/starcoder` virtual model. After a Cody Gateway rollout, it is safe to remove the old specific model strings, as Cody Gateway can expand the new virtual one.
  - In the Cody extensions, nothing changes for PLG users. They will still opt into Fireworks using the feature flag on the dotcom instance. When this FF is true, they will use the `fireworks/starcoder-16b` and `fireworks/starcoder-7b` virtual model names just as they do now.
- Enterprise user
  - It is now possible to add `fireworks/starcoder` to the allowed model list in the license data. When this is enabled, Cody Gateway starts to accept all specific starcoder model strings and the virtual one (`fireworks/starcoder`) from the model. When these are used, Cody Gateway translates the request to a specific model.
  - ⚠️ It is not possible yet for clients to pick up when the default model in the site config changes. This is going to be a separate PR in the Cody repo. Thus, Enterprise users would not be able to use Fireworks yet (unless they also set the feature flag). This is done here: sourcegraph/cody#2714
  - ⚠️ This PR does not change the default models for licenses with no overwrite yet. I will do this in a follow-up PR after 2023-01-14 to be 100% safe. This is done here: #59586

## Test plan

### Enterprise instance with `fireworks/starcoder` and nothing else in the license data:

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
  -X POST \
  -H 'authorization: token TOKENTOKEN' \
  --data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder"}'

{"completion":"\\n\\tvar len = arr.length;\\n\\tvar temp;\\n\\tfor(var i = 0; i \u003c","stopReason":"length"}%
```

### Enterprise instance with no rate limit overwrites and no `fireworks/starcoder` in the license data

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
  -X POST \
  -H 'authorization: token TOKENTOKEN' \
  --data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder"}'

Sourcegraph Cody Gateway: unexpected status code 400: {"error":"model \"fireworks/accounts/fireworks/models/starcoder-16b-w8a16\" is not allowed, allowed: []"}
```

### Dotcom users with the same license data as our production dotcom license key

<img width="912" alt="Screenshot 2024-01-12 at 15 28 13" src="https://github.com/sourcegraph/sourcegraph/assets/458591/b6636c9e-a203-43fa-9f5f-6db2ea067f37">

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
  -X POST \
  -H 'authorization: token TOKENTOKEN' \
  --data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder-16b"}'
```

---------

Co-authored-by: Rafał Gajdulewicz <rafax@users.noreply.github.com>
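
For readers who want a concrete picture of the resolve-then-validate behaviour described in the commit message above, here is a minimal sketch. It is written in TypeScript to match this repo and is not the gateway's actual implementation. All function and constant names are hypothetical, and the specific 7b model path is an assumption made by analogy with the 16b path quoted in the message.

```typescript
// Hypothetical sketch: virtual model resolution plus allow-list expansion.
const VIRTUAL_STARCODER = 'fireworks/starcoder'
const STARCODER_16B = 'fireworks/accounts/fireworks/models/starcoder-16b-w8a16'
// Assumption: the 7b path mirrors the 16b one; it is not quoted in the PR.
const STARCODER_7B = 'fireworks/accounts/fireworks/models/starcoder-7b-w8a16'

// Translate a requested (possibly virtual) model string to a specific model.
function resolveModel(requested: string): string {
    switch (requested) {
        case VIRTUAL_STARCODER: // "for now this is a 100% mapping" to the 16b model
        case 'fireworks/starcoder-16b':
            return STARCODER_16B
        case 'fireworks/starcoder-7b':
            return STARCODER_7B
        default:
            return requested
    }
}

// Expand virtual entries in the allow list so that the single
// `fireworks/starcoder` string grants access to the specific models.
function expandAllowedModels(allowed: string[]): Set<string> {
    const expanded = new Set(allowed)
    if (expanded.has(VIRTUAL_STARCODER)) {
        expanded.add(STARCODER_16B)
        expanded.add(STARCODER_7B)
    }
    return expanded
}

// Validation always happens against the resolved model string.
function isModelAllowed(requested: string, allowed: string[]): boolean {
    return expandAllowedModels(allowed).has(resolveModel(requested))
}
```

With an empty allow list, `isModelAllowed('fireworks/starcoder', [])` is false, which corresponds to the 400 response in the second test-plan case; with `['fireworks/starcoder']` in the list, the same request passes.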
Depends on sourcegraph/sourcegraph#59522
This PR adds support for a new model identifier inside the Fireworks setup: `fireworks/starcoder`. Read more about the new flag here: sourcegraph/sourcegraph#59522

This PR makes it so that if an enterprise instance has `fireworks/starcoder` set as their default code completion model, the client will pick it up and use the Fireworks provider to build the prompt and send it with the `fireworks/starcoder` model identifier.

## Test plan
[…] `fireworks/starcoder`: […] `fireworks/starcoder` […] `cody-pro` feature flag nor any `cody-autocomplete` feature flags are enabled.
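
The client-side behaviour this PR describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `chooseProvider`, `ProviderChoice`, and the `'fallback'` branch are hypothetical names, and the sketch only models the single case the description mentions (the instance's default code completion model being `fireworks/starcoder`).

```typescript
// Hypothetical sketch of picking the autocomplete provider from the
// instance's default code completion model, without any feature flags.
type ProviderChoice =
    | { provider: 'fireworks'; model: string }
    | { provider: 'fallback'; model: null }

function chooseProvider(siteDefaultCompletionModel: string | null): ProviderChoice {
    if (siteDefaultCompletionModel === 'fireworks/starcoder') {
        // Use the Fireworks provider and send the virtual identifier unchanged;
        // per the linked server-side PR, Cody Gateway resolves it to a specific model.
        return { provider: 'fireworks', model: 'fireworks/starcoder' }
    }
    // Anything else keeps whatever the client would have done before (assumption).
    return { provider: 'fallback', model: null }
}
```

The sketch forwards the virtual identifier as-is instead of resolving it on the client, which matches the description's statement that the request is sent "with the `fireworks/starcoder` model identifier".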