
💡 idea: Ollama support #127

Closed
Twilek-de opened this issue Jan 25, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@Twilek-de

Provide direct Ollama support for a quick and easy self hosted setup.

@Twilek-de added the enhancement label Jan 25, 2024
@crspeller
Member

@Twilek-de Just noticed that Ollama added OpenAI compatibility: https://ollama.ai/blog/openai-compatibility
Does that work for you?

@Twilek-de
Author

Twilek-de commented Feb 9, 2024

Oh, good catch. I just fired up Ollama with their latest Docker container. The OpenAI-compatible endpoint works fine when invoking it with curl, but with the Mattermost AI plugin nothing happens. There is no error message, and the ollama container does not ramp up CPU time as it does when it is running a query. I don't know what is happening.

I have checked that Mattermost can actually see the ollama container on the Docker network, and it can. My settings look like this:
[Screenshot of the plugin settings, 2024-02-09]

Running the following inside the Mattermost container works just fine, so it is not a connectivity issue:

```sh
curl http://ollama:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"}]}'
```

Mattermost config.json looks like this:

"mattermost-ai": { "config": { "allowPrivateChannels": true, "allowedTeamIds": "chat", "enableLLMTrace": false, "enableUserRestrictions": false, "imageGeneratorBackend": "Ollama", "llmBackend": "Ollama", "onlyUsersOnTeam": "", "services": [ { "apiKey": "", "defaultModel": "llama2", "id": "j5dr1bnste", "name": "Ollama", "password": "", "serviceName": "openaicompatible", "url": "http://ollama:11434", "username": "" } ], "transcriptBackend": "Ollama" } },

@Twilek-de
Author

Twilek-de commented Feb 9, 2024

Hmm, Ollama supports only a subset of the OpenAI API, namely chat completions. There is no support for streaming yet, and I seem to remember that the MM AI plugin uses the streaming endpoint...

Update:
Scratch that, the Ollama docs say streaming is supported as well...
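
For reference, streaming can be checked directly with curl by adding "stream": true to the request body. A quick sketch, using the same container and model as in my earlier test:

```sh
# Same ollama container and llama2 model as in the earlier curl test.
# With "stream": true the endpoint answers with a stream of
# "data: {...}" chunks instead of a single JSON object.
curl http://ollama:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "stream": true, "messages": [{"role": "user", "content": "Hello!"}]}'
```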

@crspeller
Member

@Twilek-de It works for me if I set the API URL to http://localhost:11434/v1. Maybe just add the /v1?
If that doesn't work for you, please post the Mattermost server logs and any logs from Ollama.
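
Against the config.json posted above, that would mean changing only the url field of the service entry, something like this (other fields unchanged; sketched from the earlier snippet):

```json
{
    "defaultModel": "llama2",
    "name": "Ollama",
    "serviceName": "openaicompatible",
    "url": "http://ollama:11434/v1"
}
```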

@Twilek-de
Author

Yes! That was it. Works now for me too. Thanks!

@Twilek-de
Author

Twilek-de commented Mar 11, 2024

Hmm, somehow the new version times out. Is there any way to edit the timeout...
```json
{
    "caller": "app/plugin_api.go:1003",
    "error": "timeout streaming",
    "level": "error",
    "msg": "Streaming result to post failed",
    "plugin_id": "mattermost-ai",
    "timestamp": "2024-03-12 22:22:33.939 Z"
}
```

@Twilek-de reopened this Mar 11, 2024
@Twilek-de
Author

Found the "streamingTimeoutSeconds" setting. Is working again now.
