Using GPT-Pilot with Local LLMs
GPT Pilot is developed primarily with OpenAI GPT-4 in mind, because it is currently the best model on the market for general programming tasks (not just writing code, but also planning, debugging, and everything else related to development).
This means that GPT Pilot uses the OpenAI API (to access GPT-4), and that the prompts are all optimized for GPT-4.
While it is possible to run GPT Pilot with another LLM, you will need to modify the prompts to make it work - expect a lot of trial and error there.
That said, here's how you can use the command-line version of GPT Pilot with your local LLM of choice:
- Set up GPT-Pilot
- Install a local API proxy (see below for choices)
- Edit the `.env` file in the `gpt-pilot/pilot/` directory (the same file you set up with your OpenAI keys in step 1) to set `OPENAI_ENDPOINT` and `OPENAI_API_KEY` to whatever the local proxy requires, for example:
  ```
  # This differs between local LLM proxies; this is just an example
  OPENAI_ENDPOINT=http://0.0.0.0:8000/chat/completions
  OPENAI_API_KEY=dummy
  ```
- Start the API proxy in a separate terminal, then start GPT Pilot as usual:
  ```
  cd /path/to/gpt-pilot
  source pilot-env/bin/activate  # (or pilot-env\Scripts\activate on Windows)
  cd pilot/
  python main.py
  ```
- As you're using GPT Pilot, watch the output the LLM produces. It will probably get stuck in a loop or produce nonsense output, and you'll need to go to `pilot/prompts` and start hacking. No programming knowledge needed!
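The `.env` edit above can also be scripted instead of done by hand. A minimal sketch, using a stand-in directory (`/tmp/gpt-pilot/pilot`) and the example endpoint from the step above; point `PILOT_DIR` at your real `gpt-pilot/pilot/` checkout:

```shell
# Append local-proxy settings to GPT Pilot's .env file.
# PILOT_DIR is a stand-in path; set it to your real gpt-pilot/pilot/ directory.
PILOT_DIR="${PILOT_DIR:-/tmp/gpt-pilot/pilot}"
mkdir -p "$PILOT_DIR"
cat >> "$PILOT_DIR/.env" <<'EOF'
# This differs between local LLM proxies; this is just an example
OPENAI_ENDPOINT=http://0.0.0.0:8000/chat/completions
OPENAI_API_KEY=dummy
EOF
# Confirm the settings were written
grep '^OPENAI_' "$PILOT_DIR/.env"
```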
The extension currently doesn't allow changing the endpoint/key settings, so it can't be used out of the box. However, it uses the command-line GPT Pilot under the hood, so you can configure these settings in the same way.
- Install the VSCode GPT Pilot extension
- Start the extension. On the first run, you will need to select an empty folder where GPT Pilot will be downloaded and configured. This will take a few minutes.
- Open a terminal and go to that folder. The `gpt-pilot/pilot/.env` file should already have been created, and you can proceed with the same steps as for the command-line version (see above).
LM Studio is an easy way to discover, download and run local LLMs, and is available for Windows, Mac and Linux. After selecting and downloading an LLM, you can go to the Local Inference Server tab, select the model and then start the server.
Then edit the GPT Pilot `.env` file to set:
```
OPENAI_ENDPOINT=http://localhost:1234/v1/chat/completions
OPENAI_API_KEY=dummy
```
(Port 1234 is the LM Studio default; you can use any free port, just make sure it's the same in LM Studio and the `.env` file.)
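Before pointing GPT Pilot at it, you can check that the server actually answers. A quick sketch, assuming the default port 1234 (the exact JSON reply depends on the model you loaded):

```shell
# Send one chat request to LM Studio's local server; prints the JSON reply,
# or "server unreachable" if the server isn't running yet.
RESPONSE=$(curl -s -m 5 http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say hello"}]}' \
  || echo "server unreachable")
echo "$RESPONSE"
```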
LiteLLM can proxy for many remote or local LLMs, including `ollama`, `vllm` and `huggingface` (meaning it can run most of the models those programs can run). Here is the full list of supported LLM providers, with instructions on how to set them up.
The full documentation for setting up LiteLLM with a local proxy server is here, but in a nutshell:
```
pip install litellm[proxy]
```
Test it with `litellm --test`. After you set it up (as per the quickstart above), run it with:
```
litellm --model yourmodel --drop_params
```
(The `--drop_params` flag is there to ignore OpenAI-specific parameters that wouldn't work for other LLMs.)
Then edit the `.env` file in the GPT Pilot directory (see above) with:
```
OPENAI_ENDPOINT=http://0.0.0.0:8000/chat/completions
OPENAI_API_KEY=dummywhatever
```
(If you change the default port in the litellm configuration, be sure to also update your `.env`.)
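As with LM Studio, you can verify that the proxy answers before launching GPT Pilot. A sketch assuming LiteLLM's default address (`0.0.0.0:8000`); the reply format depends on the model behind the proxy:

```shell
# Send one chat request through the LiteLLM proxy; prints the JSON reply,
# or "proxy unreachable" if litellm isn't running yet.
REPLY=$(curl -s -m 5 http://0.0.0.0:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}' \
  || echo "proxy unreachable")
echo "$REPLY"
```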
Thanks to our Discord community members @HeliosPrime, @Kujila and @Limenarity for diving into, researching, and hacking GPT Pilot to make it communicate with local LLMs. This tutorial is mostly based on the work they shared in the GPT Pilot Discord channels.