Add the possibility to use offline models (maybe via ollama) #4424
Comments
This would be awesome. This project seems like it could serve as inspiration for the feature |
It should be easy to add support to any service that provides an OpenAI-compatible API, such as Perplexity, or LiteLLM for local models. |
ollama is now OpenAI-compatible as well: https://ollama.ai/blog/openai-compatibility |
A quantized version of CodeLlama would work well locally on Macs: https://huggingface.co/TheBloke/CodeLlama-34B-GGUF. Ollama has a way of interacting with a quantized CodeLlama, but it's up to the Zed team whether they'd rather use ollama or run llama.cpp within Zed (ollama runs llama.cpp under the hood). IMO this should be more generic than "offline vs. online", and more about giving users choice in which Copilot model they'd like to use. There's a balance for sure! |
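For reference, pulling and chatting with a quantized CodeLlama through ollama is one command per step; the tag below is illustrative, so check the ollama library for the exact quantization you want:

    ollama pull codellama:34b
    ollama run codellama:34b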
A low effort approach to include this feature is allowing the configuration of a custom proxy for Copilot:

    "github.copilot.advanced": {
        "debug.testOverrideProxyUrl": "http://localhost:5001",
        "debug.overrideProxyUrl": "http://localhost:5001"
    }

It would be really cool to have this tweak available in Zed too |
This is how it's working for me. I couldn't add my custom model to the "default_open_ai_model" setting; for now, Zed only allows the OpenAI model names ("gpt-3.5-turbo-0613", "gpt-4-0613", "gpt-4-1106-preview"), so I had to clone my model under one of those names to proxy it.
1. Pull and run the Mistral model from the Ollama library
2. Add this to my Zed settings (~/.config/zed/settings.json)
3. Restart Zed |
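Pieced together from the settings shown later in this thread, the setup was likely along these lines; the exact key names depend on the Zed version, and the gpt-4 alias is only an illustration of the "clone it" trick:

    ollama pull mistral
    ollama cp mistral gpt-4-1106-preview

Then in ~/.config/zed/settings.json, point the assistant at Ollama's OpenAI-compatible endpoint:

    "assistant": {
        "provider": {
            "type": "openai",
            "api_url": "http://localhost:11434/v1"
        }
    }

After restarting, select the cloned model name in the assistant panel so requests hit the local Mistral.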
@sumanmichael Thank you for the tip. Can you confirm that this only works for "Assistant Panel" (chat) and "Inline Assist"? Is there a way to bypass Zed login requirement to use Copilot? |
Can this also be made to work with any local server running an API similar to OpenAI's API? Specifically, I'm interested in using LM Studio. |
Just integrate continue.dev, please. This will exponentially increase adoption, as Continue solves all of the LLM worries and works with all possible providers, both local and cloud. |
Same here, I'm using LiteLLM which presents an OpenAI-compatible API, and integrates with a bunch of model loaders on the back end (ollama, tgi, etc.). Would be nice to be able to just set a URL and token and have it use my server. |
Yeah, Continue is very flexible. This is what the Continue config looks like:

    "models": [
        {
            "title": "mixtral",
            "provider": "openai",
            "model": "mixtral:8x7b",
            "apiBase": "https://skynet.becomes.self.aware.io:444",
            "apiKey": "sk-somethingsomething"
        }
    ]

I suppose if Zed only supports the OpenAI API it wouldn't need |
So I suppose a couple of QoL improvements can be made here, assuming all the LLMs below are compliant with OpenAI's spec: |
As a starting point, even the ability to configure one model hosted locally or on my server would be great. |
Also, since OpenAI is so proprietary, I do not really feel comfortable with the idea that all these open source/weight models are copying the OpenAI API spec. It would not surprise me if, in the future, an open standard is created rather than relying on OpenAI to set the standard. I'm not saying we should decide on an open spec right here right now, but just wanted to point this out and emphasize a need for simplicity. |
The OpenAI API spec is already the norm for many libraries. I would propose to keep it simple and allow custom model names if the default API is anything other than OpenAI. |
I have tried this solution with Mistral running locally with ollama. It doesn't work for me. Did anybody else actually make this work? |
Works for me as described, using codellama:7b-instruct or mistral. Just did a short test with a small test repo. |
How would Zed know what a model's token limit is? Also, as a side note, some models use different tokenizers. Some well-known ones are BPE, SentencePiece, and CodeGen. Counting tokens using the wrong tokenizer would produce inaccurate counts. |
@janerikmai |
If you want to use another model available on Hugging Face that's not native to Ollama (DeepSeek-Coder, WizardCoder, etc.), you can explicitly name it to be compatible with Zed when creating it from a GGUF via the Modelfile. |
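For example, with a GGUF downloaded from Hugging Face, a Modelfile plus ollama create lets you register the model under a name Zed accepts; the file path and alias below are illustrative:

    # Modelfile
    FROM ./deepseek-coder-6.7b-instruct.Q4_K_M.gguf

    # register and run it under a Zed-compatible name
    ollama create gpt-4-1106-preview -f Modelfile
    ollama run gpt-4-1106-preview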
After #8646, to make a local LLM work you need to add this to the Zed settings (~/.config/zed/settings.json):

    "assistant": {
        "provider": {
            "type": "openai",
            "api_url": "http://localhost:11434/v1"
        }
    }

At least it works for me |
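A quick way to sanity-check that Ollama's OpenAI-compatible endpoint is reachable before pointing Zed at it (this assumes you have already pulled the mistral model):

    curl http://localhost:11434/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{"model": "mistral", "messages": [{"role": "user", "content": "Say hello"}]}'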
How is this working with the OpenAI calls for ada embeddings? Or is that just dysfunctional?

    impl OpenAiEmbeddingProvider {
        pub async fn new(client: Arc<dyn HttpClient>, executor: BackgroundExecutor) -> Self {
            let (rate_limit_count_tx, rate_limit_count_rx) = watch::channel_with(None);
            let rate_limit_count_tx = Arc::new(Mutex::new(rate_limit_count_tx));
            // Loading the model is expensive, so ensure this runs off the main thread.
            let model = executor
                .spawn(async move { OpenAiLanguageModel::load("text-embedding-ada-002") })
                .await;
            let credential = Arc::new(RwLock::new(ProviderCredential::NoCredentials));
            OpenAiEmbeddingProvider {
                model,
                credential,
                client,
                executor,
                rate_limit_count_rx,
                rate_limit_count_tx,
            }
        }
        // ... additional code
    } |
For what it's worth, this is what I needed to do to make it work locally:

    "assistant": {
        "version": "1",
        "provider": {
            "name": "openai",
            "api_url": "http://localhost:11434/v1"
        }
    }

Then pick a model: |
I tried this and I get a prompt to enter an OpenAI API key. I seem blocked even though I have the assistant config mentioned earlier. Entering a junk key doesn't help either; then I get errors about not being able to connect to OpenAI. So the settings apparently aren't working? My $HOME/.config/zed/settings.json:
After I add a valid OpenAI API key, things seem to work. I choose gpt-4-turbo, which I set up with ollama. With that I try a prompt in a file, and it seems my local LLM is too slow; Zed says: "request or operation took longer than the configured timeout time". I don't get any auto-complete hints for a timeout entry in the assistant or provider JSON... I see this timeout is a known issue: #9913 |
Edit: maybe I just lost the API key after a restart 😅 |
Related issues:
zed-industries#9913  # assistant timeout
zed-industries#4424  # add the possibility to use offline models (maybe via ollama)

I am using ollama with the mistral model. My local settings.json:

    {
        "theme": "One Dark",
        "ui_font_size": 16,
        "buffer_font_size": 16,
        "assistant": {
            "version": "1",
            "provider": {
                "name": "openai",
                "api_url": "http://localhost:11434/v1"
            }
        }
    }
Here is a complete rundown of how I got it to work after collecting all the pieces of information in this thread:
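(The steps below are a sketch assembled from the configs posted earlier in this thread; the model name and gpt-4 alias are illustrative, and key names vary slightly between Zed versions.)

    ollama pull mistral
    ollama cp mistral gpt-4-1106-preview    # only needed on Zed versions that restrict model names

Then in ~/.config/zed/settings.json:

    "assistant": {
        "version": "1",
        "provider": {
            "name": "openai",
            "api_url": "http://localhost:11434/v1"
        }
    }

Restart Zed and pick the model in the assistant panel.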
I hope this helps! |
I wish Zed could provide an easy way to point to a server API like the Continue extension. Here is an example in my config file in VSCodium.
LM Studio supports multiple endpoints for different contexts:
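The OpenAI-compatible routes LM Studio's local server exposes (on port 1234 by default) are roughly the following; check the LM Studio docs for the authoritative list:

    GET  http://localhost:1234/v1/models
    POST http://localhost:1234/v1/chat/completions
    POST http://localhost:1234/v1/completions
    POST http://localhost:1234/v1/embeddings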
I followed @sumanmichael's steps (thank you so much) to make my local Mistral work, but while the chat box works flawlessly, the code completion is still messy compared to the exact same model running through Continue in VSCodium. Being able to manually set an API endpoint, instead of letting Zed concatenate '/completion' onto an alleged OpenAI server HTTP address, would allow any type of installation. |
FWIW, I managed to get Continue to generate this very useful config fragment for using Ollama:

    "models": [
        {
            "model": "AUTODETECT",
            "title": "Ollama",
            "completionOptions": {},
            "apiBase": "http://localhost:11434",
            "provider": "ollama"
        }
    ],

You need to restart VSCode when you add a model to Ollama, but at least you don't need to add another config fragment... very nice. |
Describe the feature
Hi,
Having the possibility to use other models, for example Llama (most likely via ollama), instead of being forced to use the proprietary and unethical ChatGPT would be really amazing.
Here's a link to their API docs: https://github.com/jmorganca/ollama/blob/main/docs/api.md
Since an API is also used for ChatGPT, it shouldn't be too much work.
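From the linked docs, a generation request is a single HTTP call along these lines (the model name here is just an example):

    curl http://localhost:11434/api/generate -d '{
        "model": "mistral",
        "prompt": "Why is the sky blue?"
    }'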
If applicable, add mockups / screenshots to help present your vision of the feature
No response