add an OpenAI-compatible provider as a generic Enterprise LLM adapter #3218
Merged
Commits on Mar 18, 2024
add an OpenAI-compatible provider as a generic Enterprise LLM adapter
Increasingly, LLM software is standardizing around the use of OpenAI-compatible endpoints. Some examples:

* [OpenLLM](https://github.com/bentoml/OpenLLM) (commonly used to self-host/deploy various LLMs in enterprises)
* [Huggingface TGI](huggingface/text-generation-inference#735) (and, by extension, [AWS SageMaker](https://aws.amazon.com/blogs/machine-learning/announcing-the-launch-of-new-hugging-face-llm-inference-containers-on-amazon-sagemaker/))
* [Ollama](https://github.com/ollama/ollama) (commonly used for running LLMs locally, useful for local testing)

All of these projects either have OpenAI-compatible API endpoints already, or are actively building out support for them. On strat, we regularly work with enterprise customers who self-host their own specific-model LLM via one of these methods and want Cody to consume an OpenAI endpoint, with the understanding that some specific model is on the other side and that Cody should optimize for / target that specific model.

Since Cody needs to tailor to a specific model (prompt generation, stop sequences, context limits, timeouts, etc.) and handle other provider-specific nuances, it is insufficient to simply assume that a customer-provided OpenAI-compatible endpoint is in fact 1:1 compatible with e.g. GPT-3.5 or GPT-4. We need to be able to configure/tune many of these aspects for the specific provider/model, even though it presents as an OpenAI endpoint.

In response to these needs, I am working on adding a proper 'OpenAI-compatible' provider: the ability for a Sourcegraph enterprise instance to advertise that, although it is connected to an OpenAI-compatible endpoint, there is in fact a specific model on the other side (starting with StarChat and StarCoder) and that Cody should target that configuration. This change is the _first step_ of that work.
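To illustrate the kind of per-model tailoring described above, here is a minimal sketch (not Cody's actual provider code; the template, stop token, and context limit are illustrative assumptions) of how model-specific prompt formats and stop sequences might be captured for a StarChat-style model sitting behind an OpenAI-compatible endpoint:

```python
# Illustrative sketch only: structure and values are assumptions, not
# Cody's real provider configuration.

# StarChat-style chat models expect their own special delimiter tokens in
# the prompt. An OpenAI-compatible endpoint does not convey this, so the
# client must know it per model.
MODEL_PROFILES = {
    "starchat-16b-beta": {
        # Dialogue turns are wrapped in role tokens and terminated by <|end|>.
        "prompt_template": (
            "<|system|>\n{system}<|end|>\n"
            "<|user|>\n{user}<|end|>\n"
            "<|assistant|>\n"
        ),
        "stop_sequences": ["<|end|>"],
        "context_window_tokens": 8192,  # assumed limit, for illustration
    },
}

def build_prompt(model: str, system: str, user: str) -> str:
    """Render a raw prompt for a model that presents as an OpenAI endpoint
    but expects its own chat template."""
    profile = MODEL_PROFILES[model]
    return profile["prompt_template"].format(system=system, user=user)
```

The point of the sketch is that none of this can be inferred from the endpoint alone, which is why the provider must carry per-model configuration.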
After this change, an existing (current-version) Sourcegraph enterprise instance can configure an OpenAI endpoint for completions via the site config, such as:

```json
"cody.enabled": true,
"completions": {
  "provider": "openai",
  "accessToken": "asdf",
  "endpoint": "http://openllm.foobar.com:3000",
  "completionModel": "gpt-4",
  "chatModel": "gpt-4",
  "fastChatModel": "gpt-4",
},
```

The `gpt-4` model parameters will be sent to the OpenAI-compatible endpoint specified, but will otherwise be unused today. Users may then specify in their VS Code configuration that Cody should treat the LLM on the other side as if it were e.g. StarChat:

```json
"cody.autocomplete.advanced.provider": "experimental-openaicompatible",
"cody.autocomplete.advanced.model": "starchat-16b-beta",
"cody.autocomplete.advanced.timeout.multiline": 10000,
"cody.autocomplete.advanced.timeout.singleline": 10000,
```

In the future, we will make it possible to configure the above options via the Sourcegraph site configuration instead, so that each user does not need to configure them explicitly in their VS Code settings.

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
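For reference, a configuration like the one above ultimately produces a standard OpenAI-style chat-completions request against the configured endpoint. The following is a hedged sketch that only builds the request payload (the endpoint URL and token are the placeholder values from the example config; this is not Cody's actual wire code, and the server behind the endpoint decides what model actually runs):

```python
import json

def make_chat_request(endpoint: str, access_token: str,
                      model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions request, as a server
    such as OpenLLM, TGI, or Ollama would receive it. Sketch only."""
    return {
        "url": f"{endpoint}/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            # "gpt-4" is passed through from the site config; the
            # OpenAI-compatible server maps it to whatever model it hosts.
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,  # completions are typically streamed
        }),
    }
```

This is why the wire format can stay plain OpenAI while all model-specific tuning (timeouts, stop sequences, prompt shape) lives on the client side.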
Full SHA: 23fa968
Commits on Mar 26, 2024
Full SHA: c93b032
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Full SHA: 80cc1c5
Commits on Mar 27, 2024
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Full SHA: b1bdcd7
Commits on Mar 28, 2024
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Full SHA: ce562cd
fix test / provider identifier
Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>
Full SHA: 755e44d
Full SHA: d06e752
Full SHA: fa792b0
Full SHA: a999d60
Full SHA: 89a9696