21 changes: 11 additions & 10 deletions docs/cody/capabilities/supported-models.mdx
@@ -8,19 +8,20 @@ Cody supports a variety of cutting-edge large language models for use in chat an

| **Provider** | **Model** | **Free** | **Pro** | **Enterprise** | | | | |
| :------------ | :-------------------------------------------------------------------------------------------------------------------------------------------- | :----------- | :----------- | :------------- | --- | --- | --- | --- |
| OpenAI | [gpt-4 turbo](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo#:~:text=TRAINING%20DATA-,gpt%2D4%2D0125%2Dpreview,-New%20GPT%2D4) | - | ✅ | ✅ | | | | |
| OpenAI | [gpt-4o](https://platform.openai.com/docs/models#gpt-4o) | - | ✅ | ✅ | | | | |
| OpenAI | [gpt-4o-mini](https://platform.openai.com/docs/models#gpt-4o-mini) | ✅ | ✅ | ✅ | | | | |
| OpenAI | [GPT-4 Turbo](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo#:~:text=TRAINING%20DATA-,gpt%2D4%2D0125%2Dpreview,-New%20GPT%2D4) | - | ✅ | ✅ | | | | |
| OpenAI | [GPT-4o](https://platform.openai.com/docs/models#gpt-4o) | - | ✅ | ✅ | | | | |
| OpenAI | [GPT-4o-mini](https://platform.openai.com/docs/models#gpt-4o-mini) | ✅ | ✅ | ✅ | | | | |
| OpenAI | [o3-mini-medium](https://openai.com/index/openai-o3-mini/) (experimental) | ✅ | ✅ | ✅ | | | | |
| OpenAI | [o3-mini-high](https://openai.com/index/openai-o3-mini/) (experimental) | - | - | ✅ | | | | |
| OpenAI | [o1](https://platform.openai.com/docs/models#o1) | - | ✅ | ✅ | | | | |
| Anthropic | [claude-3.5 Haiku](https://docs.anthropic.com/claude/docs/models-overview#model-comparison) | ✅ | ✅ | ✅ | | | | |
| Anthropic | [claude-3.5 Sonnet](https://docs.anthropic.com/claude/docs/models-overview#model-comparison) | ✅ | ✅ | ✅ | | | | |
| Google Gemini | [1.5 Pro](https://deepmind.google/technologies/gemini/pro/) | ✅ | ✅ | ✅ (beta) | | | | |
| Google Gemini | [2.0 Flash](https://deepmind.google/technologies/gemini/flash/) | ✅ | ✅ | ✅ | | | | |
| Google Gemini | [2.0 Flash-Lite Preview](https://deepmind.google/technologies/gemini/flash/) (experimental) | ✅ | ✅ | ✅ | | | | |

<Callout type="note">To use Claude 3 Sonnet models with Cody Enterprise, make sure you've upgraded your Sourcegraph instance to the latest version.</Callout>
| Anthropic | [Claude 3.5 Haiku](https://docs.anthropic.com/claude/docs/models-overview#model-comparison) | ✅ | ✅ | ✅ | | | | |
| Anthropic | [Claude 3.5 Sonnet](https://docs.anthropic.com/claude/docs/models-overview#model-comparison) | ✅ | ✅ | ✅ | | | | |
| Anthropic | [Claude 3.7 Sonnet](https://docs.anthropic.com/claude/docs/models-overview#model-comparison) | - | ✅ | ✅ | | | | |
| Google | [Gemini 1.5 Pro](https://deepmind.google/technologies/gemini/pro/) | ✅ | ✅ | ✅ (beta) | | | | |
| Google | [Gemini 2.0 Flash](https://deepmind.google/technologies/gemini/flash/) | ✅ | ✅ | ✅ | | | | |
| Google | [Gemini 2.0 Flash-Lite Preview](https://deepmind.google/technologies/gemini/flash/) (experimental) | ✅ | ✅ | ✅ | | | | |

<Callout type="note">To use Claude 3 Sonnet models with Cody Enterprise, make sure you've upgraded your Sourcegraph instance to the latest version. Claude 3.7 Sonnet with thinking is not supported for BYOK deployments.</Callout>

## Autocomplete

68 changes: 35 additions & 33 deletions docs/cody/core-concepts/token-limits.mdx
@@ -13,53 +13,55 @@ Here's a detailed breakdown of the token limits by model:

<Tabs>
<Tab title="Free">
| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ------------------------------------ | ------------------------ | --------------------- | ---------- |
| gpt-4o-mini | 7,000 | shared | 4,000 |
| gpt-o3-mini-medium | 7,000 | shared | 4,000 |
| claude-3.5 Haiku | 7,000 | shared | 4,000 |
| **claude-3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| Google Gemini 1.5 Pro | 7,000 | shared | 4,000 |
| Google Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Google Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |
| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ----------------------------- | ------------------------ | --------------------- | ---------- |
| GPT-4o-mini                   | 7,000                    | shared                | 4,000      |
| o3-mini-medium                | 7,000                    | shared                | 4,000      |
| Claude 3.5 Haiku | 7,000 | shared | 4,000 |
| **Claude 3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| Gemini 1.5 Pro | 7,000 | shared | 4,000 |
| Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |

</Tab>

<Tab title="Pro">

The Pro tier supports the token limits of all Free-tier models, plus:

| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ------------------------------------ | ------------------------ | --------------------- | ---------- |
| gpt-4o-mini | 7,000 | shared | 4,000 |
| gpt-o3-mini-medium | 7,000 | shared | 4,000 |
| gpt-4-turbo | 7,000 | shared | 4,000 |
| gpt-4o | 7,000 | shared | 4,000 |
| o1 | 7,000 | shared | 4,000 |
| claude-3.5 Haiku | 7,000 | shared | 4,000 |
| **claude-3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| **Google Gemini 1.5 Pro** | **15,000** | **30,000** | **4,000** |
| Google Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Google Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |
| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ----------------------------- | ------------------------ | --------------------- | ---------- |
| GPT-4o-mini                   | 7,000                    | shared                | 4,000      |
| o3-mini-medium                | 7,000                    | shared                | 4,000      |
| GPT-4 Turbo                   | 7,000                    | shared                | 4,000      |
| GPT-4o                        | 7,000                    | shared                | 4,000      |
| o1 | 7,000 | shared | 4,000 |
| Claude 3.5 Haiku | 7,000 | shared | 4,000 |
| **Claude 3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| Claude 3.7 Sonnet | 15,000 | 30,000 | 4,000 |
| Gemini 1.5 Pro | 15,000 | 30,000 | 4,000 |
| Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |

</Tab>

<Tab title="Enterprise">

The Enterprise tier supports the token limits of all Free- and Pro-tier models, plus:

| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ------------------------------------ | ------------------------ | --------------------- | ---------- |
| gpt-4o-mini | 7,000 | shared | 4,000 |
| gpt-o3-mini-medium | 7,000 | shared | 4,000 |
| gpt-4-turbo | 7,000 | shared | 4,000 |
| gpt-4o | 7,000 | shared | 4,000 |
| o1 | 7,000 | shared | 4,000 |
| o3-mini-high | 7,000 | shared | 4,000 |
| claude-3.5 Haiku | 7,000 | shared | 4,000 |
| **claude-3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| Google Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Google Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |
| **Model** | **Conversation Context** | **@-mention Context** | **Output** |
| ----------------------------- | ------------------------ | --------------------- | ---------- |
| GPT-4o-mini                   | 7,000                    | shared                | 4,000      |
| o3-mini-medium                | 7,000                    | shared                | 4,000      |
| GPT-4 Turbo                   | 7,000                    | shared                | 4,000      |
| GPT-4o                        | 7,000                    | shared                | 4,000      |
| o1                            | 7,000                    | shared                | 4,000      |
| o3-mini-high                  | 7,000                    | shared                | 4,000      |
| Claude 3.5 Haiku | 7,000 | shared | 4,000 |
| **Claude 3.5 Sonnet (New)** | **15,000** | **30,000** | **4,000** |
| Claude 3.7 Sonnet | 15,000 | 30,000 | 4,000 |
| Gemini 2.0 Flash | 7,000 | shared | 4,000 |
| Gemini 2.0 Flash-Lite Preview | 7,000 | shared | 4,000 |

</Tab>
</Tabs>
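As a rough sketch, the per-model limits in the tables above could be represented as a lookup structure. This is a hypothetical illustration only, not Cody's actual API; the model names and the `"shared"` sentinel mirror the table, where `"shared"` means @-mention context draws from the conversation budget:

```typescript
// Hypothetical representation of the Free-tier token limits table (not Cody's actual API).
interface TokenLimits {
  conversationContext: number;
  // "shared" means @-mention context is drawn from the conversation budget.
  atMentionContext: number | "shared";
  output: number;
}

const freeTierLimits: Record<string, TokenLimits> = {
  "Claude 3.5 Haiku": { conversationContext: 7000, atMentionContext: "shared", output: 4000 },
  "Claude 3.5 Sonnet (New)": { conversationContext: 15000, atMentionContext: 30000, output: 4000 },
  "Gemini 2.0 Flash": { conversationContext: 7000, atMentionContext: "shared", output: 4000 },
};
```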
35 changes: 35 additions & 0 deletions docs/cody/enterprise/model-configuration.mdx
@@ -293,6 +293,41 @@ In the example above:
- Sourcegraph-provided models are used for `"chat"` and `"fastChat"` (accessed via Cody Gateway)
- The newly configured model, `"huggingface-codellama::v1::CodeLlama-7b-hf"`, is used for `"autocomplete"` (connecting directly to Hugging Face’s OpenAI-compatible API)

#### Example configuration with Claude 3.7 Sonnet

```json
{
"modelRef": "anthropic::2024-10-22::claude-3-7-sonnet-latest",
"displayName": "Claude 3.7 Sonnet",
"modelName": "claude-3-7-sonnet-latest",
"capabilities": [
"chat",
"reasoning"
],
"category": "accuracy",
"status": "stable",
"tier": "pro",
"contextWindow": {
"maxInputTokens": 45000,
"maxOutputTokens": 4000
},
"modelCost": {
"unit": "mtok",
"inputTokenPennies": 300,
"outputTokenPennies": 1500
},
"reasoningEffort": "high"
},
```

In this `modelOverrides` example:

- The model uses Claude 3.7 Sonnet via Cody Gateway
- The model declares the `"chat"` and `"reasoning"` capabilities
- `reasoningEffort` accepts three values: `high`, `medium`, and `low`; the default is `low`
- With `low` reasoning effort, the thinking budget is 1,024 tokens; with `medium` and `high`, the thinking budget is set to `max_tokens_to_sample / 2`
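The thinking-budget rule described above can be sketched as follows. The function name is illustrative, not part of Sourcegraph's code:

```typescript
// Sketch of the documented thinking-budget rule (illustrative, not Sourcegraph's implementation).
type ReasoningEffort = "low" | "medium" | "high";

function thinkingBudget(effort: ReasoningEffort, maxTokensToSample: number): number {
  // "low" uses a fixed 1024-token budget; "medium" and "high" use half of max_tokens_to_sample.
  return effort === "low" ? 1024 : Math.floor(maxTokensToSample / 2);
}
```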

Refer to the [examples page](/cody/enterprise/model-config-examples) for additional examples.

## View configuration
2 changes: 2 additions & 0 deletions src/utils/constants/supportedModelsEnt.ts
@@ -26,6 +26,7 @@ export const chatTableDataEnt: any = {
{ provider: 'Anthropic', model: 'claude-3 Sonnet', status: '✅ *(5.3.9104 and above)' },
{ provider: 'Anthropic', model: 'claude-3.5 Sonnet', status: '✅ *(5.5.0 and above)' },
{ provider: 'Anthropic', model: 'claude-3.5 Sonnet (Latest)', status: '✅ *(5.9 and above)' },
{ provider: 'Anthropic', model: 'claude-3.7 Sonnet', status: '✅ *(6.1.1295 and above)' },
{ provider: 'Google', model: 'Gemini 1.5 Pro', status: '✅ *(5.4.5099 and above)' },
],
'OpenAI': [
@@ -52,6 +53,7 @@
{ provider: 'Anthropic', model: 'claude-3 Sonnet', status: '✅ *(5.3.9104 and above)' },
{ provider: 'Anthropic', model: 'claude-3.5 Sonnet', status: '✅ *(5.5.0 and above)' },
{ provider: 'Anthropic', model: 'claude-3.5 Sonnet (Latest)', status: '✅ *(5.5.0 and above)' },
{ provider: 'Anthropic', model: 'claude-3.7 Sonnet', status: '✅ *(6.1.1295 and above)' },
{ provider: 'Google', model: 'Gemini 1.5 Pro', status: '❌' },
],
'Google': [