Trying Local OpenAI api failed (ollama / LM Studio) #1886

Closed · 2 tasks done
Namec999 opened this issue Jan 3, 2024 · 11 comments

Labels
💪 enhancement New feature or request

Comments

Namec999 commented Jan 3, 2024

Self Checks

Description of the new feature / enhancement

Hello, I'm trying the new 0.4.1 with local Ollama and LiteLLM, using both the "Add OpenAI" Dify models and the new OpenAI-API-compatible tab, but each time I get an error: not connected and/or no model found, even though my Ollama is running and serving fine.

Is there any way to use tools like Ollama and/or LM Studio for local inference?

Best

Scenario when this would be used?

Is there any way to use tools like Ollama and/or LM Studio for local inference?

Supporting information

No response

Namec999 added the 💪 enhancement (New feature or request) label Jan 3, 2024
takatost (Collaborator) commented Jan 4, 2024

Ollama doesn't seem to have implemented the OpenAI-compatible API yet.
ollama/ollama#305
Duplicate of #1725

takatost (Collaborator) commented Jan 4, 2024

And I tried LM Studio's Local Inference Server and it worked well after configuring it to be OpenAI-API-compatible.

takatost self-assigned this Jan 5, 2024

ewebgh33 commented Jan 10, 2024

I tried the OpenAI-compatible API from text-generation-webui (Oobabooga) and this also did not work.

I get:
An error occurred during credentials validation: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /v2/api/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f26a912f940>: Failed to establish a new connection: [Errno 111] Connection refused'))

Supposedly this API mimics the OpenAI API standard though?
It should be running on http://127.0.0.1:5000 (or http://127.0.0.1:5000/v1, I tried both).

So how can we run this with Ollama or text-generation-webui please?

Their docs:
https://github.com/oobabooga/text-generation-webui/blob/main/docs/12%20-%20OpenAI%20API.md
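
For anyone hitting the same "connection refused" error, a minimal reachability check may help (a Python sketch only, assuming the default 127.0.0.1:5000 base URL from the text-generation-webui docs; adjust to your setup). Run it from wherever Dify actually executes, e.g. inside the Dify API container if you deployed with Docker:

```python
# Quick check: can the process that runs Dify actually reach the
# OpenAI-compatible endpoint? A "connection refused" here usually points to
# networking (e.g. 127.0.0.1 inside a container), not to the API itself.
import requests

BASE_URL = "http://127.0.0.1:5000/v1"  # adjust to your server's address

try:
    # Most OpenAI-compatible servers expose GET /v1/models for discovery.
    resp = requests.get(f"{BASE_URL}/models", timeout=5)
    resp.raise_for_status()
    print("Reachable, models:", resp.json())
except requests.exceptions.RequestException as err:
    print("Not reachable:", err)
```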

@StreamlinedStartup

Using LiteLLM with Ollama is working.

takatost (Collaborator) commented Jan 11, 2024

How did you deploy Dify? If you're using Docker Compose, the BaseURL cannot be 127.0.0.1; it needs to be the IP address of the host machine, which is usually 172.17.0.1.

@ewebgh33

Yes, my local version is the Docker version.

Why is the IP of the host machine 172.17.0.1? The host machine is this machine. Sorry, honest question.

Oobabooga is not in Docker, so the IP is the IP. Why would running Dify in Docker turn the Ooba API URL into something else?
My Ollama is via WSL on Windows, so that's yet another thing.

So I've got one "regular" app (conda environment), one WSL app, and one Docker app. The Docker one (Dify) needs the backend running the LLM from either text-generation-webui (conda) or Ollama (WSL).

Background: I'm not a Docker pro; I started using Docker about a month ago because I've been testing LLM apps and a bunch of them use Docker.
I've done some coding - personally I've used Python for a couple of years on and off, and I know other web stuff (HTML, CSS, JS, PHP, before the web went to React and stuff) but literally never had to use Docker until LLMs.

But situations like this, with all these LLM apps running in completely different architectures, but still needing to talk to each other, is new territory for me.

We may be getting off topic here though :)

@takatost (Collaborator)

Docker creates a virtual network for containers to enable isolation and easy communication between them. When a container runs, it gets its own IP address on this virtual network. The address 172.17.0.1 is typically the default gateway for Docker containers, which routes the traffic to the Docker host.

Using 172.17.0.1 instead of 127.0.0.1 (localhost) is crucial because 127.0.0.1 inside a container refers to the container itself, not the Docker host. So, to access services running on the Docker host from a container, 172.17.0.1 is used to correctly route the traffic outside the container to the host.
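
As a rough illustration of that distinction, here is a sketch (assuming the backend listens on port 5000 on the host and that 172.17.0.1 really is your bridge gateway; both can differ per setup), run from inside the Dify container:

```python
# From inside a container, 127.0.0.1 is the container itself, while the bridge
# gateway (commonly 172.17.0.1) routes back to the Docker host.
import requests

URLS = [
    "http://127.0.0.1:5000/v1/models",   # the container itself -> usually refused
    "http://172.17.0.1:5000/v1/models",  # Docker's default bridge gateway -> the host
]

for url in URLS:
    try:
        requests.get(url, timeout=5).raise_for_status()
        print("OK     ", url)
    except requests.exceptions.RequestException as err:
        print("FAILED ", url, "->", type(err).__name__)
```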

@takatost (Collaborator)

By the way, v0.4.6 has been released, which supports Ollama.

@ewebgh33

Very awesome, thank you!

@FarVision2

Depending on the age of your Windows workstation development environment, other virtualization products installed, and a few Docker upgrades along the way, the virtual Ethernet adapter may not be the default one. You can type ipconfig at the command prompt to get the vEthernet listings. I don't see a way of getting this from Docker itself, but it's easy enough to suss out.

There should be the default one and then another for WSL.

Mine were
172.21.208.1
172.24.240.1

Both of them worked in the settings window.

http://172.24.240.1:11434/
Ollama is running
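
A small Python sketch of the same check (the two addresses below are just the examples from this comment; substitute whatever ipconfig lists for your vEthernet adapters):

```python
# Probe candidate vEthernet addresses to see which one reaches Ollama.
import requests

CANDIDATES = ["172.21.208.1", "172.24.240.1"]  # replace with your ipconfig output

for host in CANDIDATES:
    base = f"http://{host}:11434"
    try:
        banner = requests.get(base, timeout=3).text.strip()        # "Ollama is running"
        tags = requests.get(f"{base}/api/tags", timeout=3).json()  # locally pulled models
        names = [m["name"] for m in tags.get("models", [])]
        print(f"{host}: {banner} | models: {names}")
    except requests.exceptions.RequestException as err:
        print(f"{host}: unreachable ({type(err).__name__})")
```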

csningli commented Apr 2, 2024

I ran into the same issue with Dify hosted in Docker and have now solved it. The solution is easy, and it has already been mentioned: inside Docker you have to connect to host.docker.internal in order to reach the Docker host. In summary, add an OpenAI-API-compatible model provider under Dify's model providers and set the API address to something like

http://host.docker.internal:1234/v1

Here I use LM Studio to provide local access to llama2-7b-chat, and its default port is 1234.
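
To double-check the same base URL that goes into Dify, here is a minimal sketch (the model name llama2-7b-chat is just the example above; use whatever identifier LM Studio shows for your loaded model):

```python
# Call LM Studio's OpenAI-compatible chat completions endpoint via the same
# host.docker.internal base URL that Dify will use from inside its container.
import requests

BASE_URL = "http://host.docker.internal:1234/v1"

payload = {
    "model": "llama2-7b-chat",  # example name; match your loaded model in LM Studio
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}
resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Note that host.docker.internal resolves out of the box on Docker Desktop (Windows/macOS); on plain Linux Docker you typically need to add extra_hosts: ["host.docker.internal:host-gateway"] to the service in docker-compose, or fall back to the bridge gateway IP mentioned earlier.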
