
Add ollama model server example #46

Merged
1 commit merged into containers:main on Apr 1, 2024

Conversation

MichaelClifford
Collaborator

This PR is intended as an example of how we could integrate ollama into our project. Open to any questions or discussions that arise from this PR 😄

  • Added a new directory model_services where we can store information about the different model services available.
  • Added model_services/ollama, which includes an extremely simple Containerfile that builds from the existing ollama/ollama:latest image (a minimal sketch follows this list), as well as a README.md describing how this model service can be used.
  • Updated the chatbot-langchain/chabout_ui.py so that it works well with either the llamacpp or the ollama model service.
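
For reference, a Containerfile along these lines would do the job (a minimal sketch; the port and entrypoint shown are just ollama's defaults, not necessarily the exact contents shipped here):

```Containerfile
# Minimal sketch: build the model service image on top of the upstream ollama image.
FROM docker.io/ollama/ollama:latest

# ollama serves its REST API on 11434 by default
EXPOSE 11434

# the upstream image already defaults to running "ollama serve";
# restated here only to make the service's behavior explicit
ENTRYPOINT ["/bin/ollama"]
CMD ["serve"]
```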

@MichaelClifford
Collaborator Author

cc @slemeur

@slemeur

slemeur commented Feb 16, 2024

Thanks for the ping @MichaelClifford!
Looks very promising!

@MichaelClifford
Collaborator Author

One caveat worth mentioning with this approach is that it has the user download the models to their host machine using ollama's CLI tooling. That means they already have ollama running on their machine, so why use the containerized version instead?
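
Concretely, the flow I'm describing looks roughly like this (model name and paths are illustrative, not the exact ones from the README):

```bash
# pull the model on the host with ollama's own CLI
ollama pull mistral

# then run the containerized server with the host's model store mounted in
podman run -d --name ollama \
  -p 11434:11434 \
  -v ~/.ollama:/root/.ollama \
  docker.io/ollama/ollama:latest
```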

I think this experience could be improved by baking the model-pulling steps into the AI studio somewhere.

@rhatdan
Member

rhatdan commented Mar 27, 2024

Needs a rebase.

@MichaelClifford changed the title from "Add ollama model server example" to "[WIP] Add ollama model server example" on Mar 29, 2024
@MichaelClifford marked this pull request as draft on Mar 29, 2024 13:16
@MichaelClifford marked this pull request as ready for review on Mar 29, 2024 18:42
@MichaelClifford changed the title from "[WIP] Add ollama model server example" to "Add ollama model server example" on Mar 29, 2024
@rhatdan
Member

rhatdan commented Mar 29, 2024

LGTM

Signed-off-by: Michael Clifford <mcliffor@redhat.com>
@rhatdan merged commit 942df22 into containers:main on Apr 1, 2024
4 checks passed
@ericcurtin
Contributor

ericcurtin commented Apr 22, 2024

> One caveat worth mentioning with this approach is that it has the user download the models to their host machine using ollama's CLI tooling. That means they already have ollama running on their machine, so why use the containerized version instead?

If you reorder the steps (run "podman run" and then do an "ollama pull" inside the container), you can just use the ollama binary inside the container for everything. It's probably better to avoid having to update two separate binaries.
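
Roughly, the reordered flow would be (model and volume names are just illustrative):

```bash
# start the containerized server first, keeping models in a named volume
podman run -d --name ollama \
  -p 11434:11434 \
  -v ollama-models:/root/.ollama \
  docker.io/ollama/ollama:latest

# then pull models with the ollama binary that's already inside the container
podman exec ollama ollama pull mistral
```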

You don't need to install ollama directly on your host machine to use it (and in fact that can be kind of nice: you can do things rootless, etc.). The only things you need to install are the Nvidia drivers, container toolkit, etc., if you are using Nvidia.

I do have some ideas in this space, partially related to the podman-ollama project on my GitHub account, that I'd like to share with you guys sometime.

Another thing to check out is Universal Blue's integration; they start the ollama container as a quadlet.
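
For anyone who hasn't used quadlets: dropping a unit file along these lines into ~/.config/containers/systemd/ lets systemd manage the container (a rough sketch, not Universal Blue's actual unit; the image, port, and volume here are assumptions):

```ini
# ollama.container -- illustrative quadlet sketch
[Container]
ContainerName=ollama
Image=docker.io/ollama/ollama:latest
PublishPort=11434:11434
Volume=ollama-models:/root/.ollama

[Service]
Restart=always

[Install]
WantedBy=default.target
```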

I think there are some advantages to trying to do this in a daemonless way, though.

> I think this experience could be improved by baking the model-pulling steps into the AI studio somewhere.

@ericcurtin
Contributor

I also have some PRs open in Ollama upstream that help install Ollama in a containerized way:

https://github.com/ollama/ollama/pulls/ericcurtin

There's a PR backlog in Ollama, though.

@MichaelClifford
Collaborator Author

> If you reorder the steps (run "podman run" and then do an "ollama pull" inside the container), you can just use the ollama binary inside the container for everything. It's probably better to avoid having to update two separate binaries.

Fully agree. My comment was mainly an artifact of how we were (are) managing models: via volume mounts onto the host machine without write permissions. That means there would need to be a unique model management path to deal with ollama's registry. Mainly, I think we can probably come up with a better way to manage model files in general.
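
For context, the existing services get their model via something like a read-only volume mount from the host (the image name and paths here are purely illustrative):

```bash
# illustrative only: the model file lives on the host and is mounted
# read-only, so the service cannot write into its own model store
podman run -d -p 8001:8001 \
  -v ./models/model.gguf:/models/model.gguf:ro \
  playground-model-service
```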

> I do have some ideas in this space, partially related to the podman-ollama project on my GitHub account, that I'd like to share with you guys sometime.

Sounds good. Want to open a new issue to start that discussion?

@ericcurtin
Contributor

Linked discussion:

#349
