Autocomplete: Add support for Starcoder Enterprise virtual model identifier #2714

Merged: philipp-spiess merged 3 commits into main from ps/starcoder-enterprise on Jan 16, 2024

@philipp-spiess philipp-spiess commented Jan 12, 2024

Depends on sourcegraph/sourcegraph#59522

This PR adds support for a new model identifier inside the Fireworks setup: fireworks/starcoder. Read more about the new model string in sourcegraph/sourcegraph#59522.

This PR makes it so that if an enterprise instance has fireworks/starcoder set as their default code completion model, the client will pick it up and use the Fireworks provider to build the prompt and send it with the fireworks/starcoder model identifier.
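
The selection described above can be sketched roughly as follows. This is a minimal illustration and not the extension's actual code: the type, the constant, and the `pickProviderFromSiteConfig` function are hypothetical names, and the input stands in for whatever the instance reports as its default code completion model.

```typescript
// All names here are hypothetical; this is not the Cody extension's actual code.

interface AutocompleteProviderChoice {
    provider: 'fireworks'
    // Model identifier sent along with the completion request.
    model: string
}

const STARCODER_ENTERPRISE_MODEL = 'fireworks/starcoder'

// `configuredModel` stands in for the instance's default code completion
// model (undefined when nothing is configured).
function pickProviderFromSiteConfig(
    configuredModel: string | undefined
): AutocompleteProviderChoice | null {
    if (configuredModel === STARCODER_ENTERPRISE_MODEL) {
        // Use the Fireworks provider and pass the virtual identifier through
        // unchanged; resolving it to a specific model happens server-side.
        return { provider: 'fireworks', model: STARCODER_ENTERPRISE_MODEL }
    }
    // Any other value falls through to the existing selection logic.
    return null
}
```

Returning `null` for every other value keeps the new code path inert unless the instance explicitly configures `fireworks/starcoder`.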

Test plan

  1. Run your backend on the branch from sourcegraph/sourcegraph#59522 (Add StarCoder enterprise virtual model string).
  2. Configure your Cody Gateway license to allow fireworks/starcoder (screenshot).
  3. Set your site config's completion model to fireworks/starcoder (screenshot).
  4. Make sure neither the cody-pro feature flag nor any cody-autocomplete feature flags are enabled.
  5. Open the VS Code extension and connect to your local instance.
  6. Observe that the right model identifier is picked and that you see completions (screenshot).

@philipp-spiess philipp-spiess requested a review from a team January 12, 2024 16:53
Comment on lines -46 to -47
'starcoder-3b': 'fireworks/accounts/fireworks/models/starcoder-3b-w8a16',
'starcoder-1b': 'fireworks/accounts/fireworks/models/starcoder-1b-w8a16',

Removing these two models. They aren't being used right now and it's easy to add them back if we need to test them.

@philipp-spiess

This PR is safe to land before we do any server-side changes, since it only runs the new code path if fireworks/starcoder is configured as the code completion model by the SG instance, which won't be the case unless you add the server-side changes.

I'll merge this right now so we have this out in the next insider release.

@philipp-spiess philipp-spiess merged commit a13ab38 into main Jan 16, 2024
18 checks passed
@philipp-spiess philipp-spiess deleted the ps/starcoder-enterprise branch January 16, 2024 12:41

Detyzz commented Jan 16, 2024

Hi

@philipp-spiess

@Detyzz greetings traveler :)

philipp-spiess added a commit to sourcegraph/sourcegraph that referenced this pull request Jan 16, 2024
This PR adds a new `fireworks/starcoder` virtual model string for use
with the licensing data (so that adding one concise
`fireworks/starcoder` string allows us to give access to all needed
Fireworks models) and the downstream clients (so that we can later
propagate this as the default code completion model from the SG instance
to the Cody clients).

Detailed list of changes after this PR:

- Creation of a new `fireworks/starcoder` virtual model string
  - This is the only starcoder related model string that is now rendered
    as valid in the UI (existing configurations with the specific
    strings will still work but will be rendered with a red background.
    This is to ensure GTM will be able to only set the correct values)
  - Cody Gateway can translate the fireworks/starcoder model string to a
    specific model
    - For now this is 100% mapping to
      accounts/fireworks/models/starcoder-16b-w8a16
  - When validating the model for a request, we now expand virtual model
    strings so that `fireworks/starcoder` in the list of allowed models
    is enough to allow all Fireworks related models
  - PLG users now report `fireworks/starcoder` as a valid code model in
    their license data
    - This will allow us to remove the specific models listed in this
      payload in the future (since Cody Gateway knows that this is a
      virtual model tag and how to expand it). However, the existing
      list of specific models is not changed (small asterisks here that
      I’ll discuss later)
  - Clients can now use `fireworks/starcoder` as a custom model. The SG
    instance will simply forward this to Cody Gateway
- Removal of more places that mentioned the old ST model name.
  - Here, it's important to know that the validation always happens with
    the _resolved_ model string. So after adding the
    `disableSingleTenant` flag, the resolved model will no longer be the
    old single-tenant model names. Thus we can remove support for it (as
    long as we keep the rewrite)
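
The expansion and validation rules listed above can be sketched as follows. This is a minimal TypeScript illustration, not Cody Gateway's implementation: the function names are hypothetical, and the expansion set contains only the one specific model named in this description.

```typescript
// Hypothetical sketch; names and structure are assumptions, not Cody Gateway code.

const STARCODER_VIRTUAL_MODEL = 'fireworks/starcoder'
// The only specific model this description names as the current target.
const STARCODER_16B_MODEL = 'fireworks/accounts/fireworks/models/starcoder-16b-w8a16'

// Translate a requested model string to the specific model that is called.
function resolveModel(requested: string): string {
    return requested === STARCODER_VIRTUAL_MODEL ? STARCODER_16B_MODEL : requested
}

// Expand virtual entries in the allow list so that one virtual string
// grants access to the specific models behind it.
function expandAllowedModels(allowed: string[]): string[] {
    const expanded: string[] = []
    for (const model of allowed) {
        if (expanded.indexOf(model) === -1) {
            expanded.push(model)
        }
        if (model === STARCODER_VIRTUAL_MODEL && expanded.indexOf(STARCODER_16B_MODEL) === -1) {
            expanded.push(STARCODER_16B_MODEL)
        }
    }
    return expanded
}

// Validation compares the resolved model against the expanded allow list.
function isModelAllowed(requested: string, allowed: string[]): boolean {
    return expandAllowedModels(allowed).indexOf(resolveModel(requested)) !== -1
}
```

With this shape, `isModelAllowed('fireworks/starcoder', ['fireworks/starcoder'])` is true, while an empty allow list rejects the same request because the resolved 16b model is not in it.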

Here's a list of what changed from an actor perspective:

- PLG user
  - PLG traffic to dotcom will still send the same virtual model strings
    that we have added before (`fireworks/starcoder-16b`,
    `fireworks/starcoder-7b`). This is still going to be translated on
    the SG instance. However, that translation was also duplicated into
    Cody Gateway so that, after a Cody Gateway rollout, the SG instance
    can just pass-through this model string. 
  - We do not want to change the model strings PLG uses for now as we
    still need to be able to separate between the faster 7b and the more
    accurate 16b model in multi-tenant environments. 
  - The PLG user license ping will no longer report the old outdated
    single-tenant model as allowed (since Cody Gateway rewrites those
    anyway and checks against the resolved model string) and now
    includes the new `fireworks/starcoder` virtual model. After a Cody
    Gateway rollout, it is safe to remove the old specific model strings
    as Cody Gateway can expand the new virtual one.
  - In the Cody extensions, nothing changes for PLG users. They will
    still opt-into Fireworks using the feature flag on the dotcom
    instance. When this FF is true, they will use the
    `fireworks/starcoder-16b` and `fireworks/starcoder-7b` virtual model
    names just as they are now.
- Enterprise user
  - It is now possible to add `fireworks/starcoder` to the allowed model
    list in the license data. When this is enabled, Cody Gateway starts
    to accept all specific starcoder model strings and the virtual one
    (`fireworks/starcoder`) from the client. When these are used, Cody
    Gateway translates the request to a specific model.
  - ⚠️ It is not possible yet for clients to pick up when the default
    model in the site config changes. This is going to be a separate PR
    in the Cody repo. Thus, Enterprise users would not be able to use
    Fireworks yet (unless they also set the feature flag). This is done
    here: sourcegraph/cody#2714
  - ⚠️ This PR does not change the default models for licenses with no
    overwrite yet. I will do this in a follow-up PR after 2024-01-14 to
    be 100% safe. This is done here:
    #59586
     
## Test plan

### Enterprise instance with `fireworks/starcoder` and nothing else in the license data:

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
-X POST \
-H 'authorization: token TOKENTOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder"}'
{"completion":"\\n\\tvar len = arr.length;\\n\\tvar temp;\\n\\tfor(var i = 0; i \u003c","stopReason":"length"}
```

### Enterprise instance with no rate limit overwrites and no `fireworks/starcoder` in the license data

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
-X POST \
-H 'authorization: token TOKENTOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder"}'
Sourcegraph Cody Gateway: unexpected status code 400: {"error":"model \"fireworks/accounts/fireworks/models/starcoder-16b-w8a16\" is not allowed, allowed: []"}
```

### Dotcom users with the same license data as our production dotcom license key

<img width="912" alt="Screenshot 2024-01-12 at 15 28 13"
src="https://github.com/sourcegraph/sourcegraph/assets/458591/b6636c9e-a203-43fa-9f5f-6db2ea067f37">

```
curl 'https://sourcegraph.test:3443/.api/completions/code' \
-X POST \
-H 'authorization: token TOKENTOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"function bubbleSort(){"}],"maxTokensToSample":30,"temperature":0.2,"stopSequences":[],"timeoutMs":5000,"stream":false,"model":"fireworks/starcoder-16b"}'
```
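
The curl commands in this test plan all send the same request shape and differ only in the `model` field. As a rough sketch, the body could be built like this in TypeScript; the interface and the `buildCodeCompletionRequest` helper are hypothetical names, while the field names and values are copied from the commands above.

```typescript
// Hypothetical helper; field names and values mirror the curl payloads above.

interface CodeCompletionRequest {
    messages: { speaker: string; text: string }[]
    maxTokensToSample: number
    temperature: number
    stopSequences: string[]
    timeoutMs: number
    stream: boolean
    model: string
}

function buildCodeCompletionRequest(prefix: string, model: string): CodeCompletionRequest {
    return {
        messages: [{ speaker: 'human', text: prefix }],
        maxTokensToSample: 30,
        temperature: 0.2,
        stopSequences: [],
        timeoutMs: 5000,
        stream: false,
        // Either the virtual string or one of the PLG virtual model names.
        model,
    }
}
```

The result can be serialized with `JSON.stringify` and sent as the POST body to `/.api/completions/code`, as the curl commands do.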
---------

Co-authored-by: Rafał Gajdulewicz <rafax@users.noreply.github.com>