This project allows you to host your own GitHubCopilot-like model locally while using the official GitHubCopilot VSCode extension.
-
Download and install the oobabooga backend
-
Download a model
open the oobabooga UI, go to the models tab and download a code completion model. I'm using:
Deci/DeciCoder-1b
, paste that name, then click download, then click load once completeWhich model should I choose? Use smaller models for faster predictions, especially if you have a weaker PC. I tested DeciCoder-1b
size speed model name 125M superfast flax-community/gpt-neo-125M-code-clippy-dedup-2048 1B fast Deci/DeciCoder-1b 3B medium TheBloke/stablecode-instruct-alpha-3b-GGML 7B slow mlabonne/codellama-2-7b 15B slow TheBloke/WizardCoder-15B-1.0-GGML
Optional testing
A. (optional) Test the backend using curl
:
```sh
curl -X 'POST' 'http://localhost:5000/v1/engines/codegen/completions' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"prompt":"def hello_w","suffix":"","max_tokens":500,"temperature":0.4,"top_p":1,"n":10,"stop":["\ndef ","\nclass ","\nif ","\n\n#"],"logprobs":2,"stream":true}'
```
B. (optional) Test that the model is working by going to the "chat" tab and clicking "generate".
-
Go to VSCode and modify the settings and add the following:
"github.copilot.advanced": { "debug.overrideEngine": "codegen", "debug.testOverrideProxyUrl": "http://localhost:8000", // address:port of middleware "debug.overrideProxyUrl": "http://localhost:8000", },
-
(optional for authentication) Update
~/.vscode/extensions/github.copilot-*/dist/extension.js
with the following:- Replace
https://api.github.com/copilot_internal
withhttp://127.0.0.1:8000/copilot_internal
- replace
https://copilot-proxy.githubusercontent.com
withhttp://127.0.0.1:8000
- Replace
-
Run the proxy:
pip install git+https://github.com/FarisHijazi/localCopilot localCopilot --port 7000
If you have oobabooga running on a separate server use the --backend argument {hostname:port}
pip install git+https://github.com/FarisHijazi/localCopilot localCopilot --port 8000 --backend http://10.0.0.1:5002
(Optional): testing the middleware
curl -X 'POST' 'http://localhost:8000/v1/engines/codegen/completions' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"prompt":"def hello_w","suffix":"","max_tokens":500,"temperature":0.4,"top_p":1,"n":2,"stop":["\ndef ","\nclass ","\nif ","\n\n#"],"logprobs":2,"stream":true}'
expected output
data: {"id": "conv-1692741316942825472", "object": "text_completion.chunk", "created": 1692741316, "model": "Deci_DeciCoder-1b", "choices": [{"index": 0, "finish_reason": "stop", "text": "", "logprobs": {"top_logprobs": [{"<|endoftext|>": -0.4215908944606781, "<fim_middle>": -1.2965909242630005, "\n": -3.0741329193115234}]}}], "usage": {"prompt_tokens": 4, "completion_tokens": 13, "total_tokens": 17}}
data: [DONE]
-
install the official GitHub copilot extension
-
HAPPY CODING!
To test that the copilot extension is working, either type some code and hope for a completion or use the command pallet (
Ctrl+Shift+P
) and search forGitHub Copilot: Open Completions Panel
This is done using a single script: localCopilot/middleware.py
(only 90 lines of code), which is a compatibility layer between the official GitHub copilot VSCode extension and oobabooga as a backend.
Credit: I learned about the traffic redirecting from the Fauxpilot project here.
Cloud | |
Self-hosted |
Advanced experimental hacks
The tokenizers used by Copilot are not the same, so you can overwrite them. However, I'm not sure how useful this actually is as I don't notice any change in performance
COPILOTPATH=$HOME/.vscode/extensions/github.copilot-1.105.353
MODELPATH=$HOME/Projects/oobabooga_linux/text-generation-webui/models/Deci_DeciCoder-1b
mv $COPILOTPATH/dist/resources $COPILOTPATH/dist/resources.backup
mkdir -p $COPILOTPATH/dist/resources/cushman001
mkdir -p $COPILOTPATH/dist/resources/cushman002
cp $MODELPATH/tokenizer.json $COPILOTPATH/dist/resources/cushman001/tokenizer_cushman001.json
cp $MODELPATH/merges.txt $COPILOTPATH/dist/resources/cushman001/vocab_cushman001.bpe
cp $MODELPATH/tokenizer.json $COPILOTPATH/dist/resources/cushman002/tokenizer_cushman002.json
cp $MODELPATH/merges.txt $COPILOTPATH/dist/resources/cushman002/vocab_cushman002.bpe
And to revert your changes, just uninstall and reinstall the extension.
OR:
rm -rf $COPILOTPATH/dist/resources
mv $COPILOTPATH/dist/resources.backup $COPILOTPATH/dist/resources
- 🔏 Privacy: No more sending your code to the cloud! This is the main benefit especially for enterprise. No code is sent to the cloud when self-hosting since everything runs on your machine(s).
- 🌐 Works without internet: use it on the plane!
✈️ - 💰 Free: No need to pay for your monthly subscription
- GitHub copilot looks at multiple files for context. The current hack only looks at the current file
- Quality Open source models might not have suggestions as good as copilot, but still as good most of the time
- GitHub copilot gives 10 suggestions, while this hack gives only 1 suggestion per completion
- 🐛 There's a bug where the first space in autocompletion is skipped, this is due to the oobabooga backend, not the model
There are many other projects for having an open source alternative for copilot, but they all need so much maintenance, I tried to use an existing large project that is well maintained: oobabooga, since it supports almost all open source LLMs and is commonly used, and is well maintained
I know that the middleware method might not be optimal, but this is a minimal hack that's easy to run, and this repository should be really easy to maintain.
Once oobabooga supports multiple requests in a single call, then the middleware should no longer be needed.
Here are some helpful open source projects I found while doing my research:
Project URL | description | actively maintained (as of Aug 2023) |
---|---|---|
https://github.com/CodedotAl/gpt-code-clippy | Frontend + models | ❌ |
https://github.com/Venthe/vscode-fauxpilot | this is a FauxPilot frontend | ✅ |
https://github.com/hieunc229/copilot-clone | frontend which uses Google/StackOverflow search as a backend | ✅ |
https://github.com/fauxpilot/fauxpilot | FauxPilot backend | ✅ |
https://github.com/ravenscroftj/turbopilot | A backend that runs models | ✅ |