
deprecate modelfile embed command #759

Merged
2 commits merged into main on Oct 16, 2023

Conversation

BruceMacD
Contributor

@BruceMacD BruceMacD commented Oct 11, 2023

Embeddings in Modelfiles are a convenient idea: they allow a model to be packaged together with embeddings created specifically for it. However, the user experience of this implementation isn't up to par.

This change leaves the /embed endpoint in place, but deprecates EMBED in the Modelfile.

  • Ollama doesn't have any models designed for embedding generation
  • Generating embeddings from a Modelfile is slow, error-prone, and only supports single-line text inputs
  • Retrieving embeddings was too slow to be useful, and there was no mechanism to connect an external vector database

Instead of using the Modelfile, the right way to do this is with an external tool, such as PrivateGPT or LlamaIndex, that uses Ollama as the runner.
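The external-tool flow described here boils down to retrieval over embeddings: embed the documents once, then rank them against a query vector. A minimal sketch of that ranking step in Go, using toy vectors in place of real embeddings (the struct and function names here are illustrative, not any tool's actual API):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

type doc struct {
	Text string
	Vec  []float64
}

// rank returns docs ordered by similarity to the query vector, most similar first.
func rank(query []float64, docs []doc) []doc {
	out := append([]doc(nil), docs...)
	sort.Slice(out, func(i, j int) bool {
		return cosine(query, out[i].Vec) > cosine(query, out[j].Vec)
	})
	return out
}

func main() {
	// Toy 3-dimensional vectors stand in for embeddings an external tool
	// would fetch from a running model; real embeddings have hundreds of dims.
	docs := []doc{
		{"about cats", []float64{0.9, 0.1, 0.0}},
		{"about code", []float64{0.1, 0.9, 0.2}},
	}
	query := []float64{0.05, 0.95, 0.1} // pretend embedding of a coding question
	fmt.Println(rank(query, docs)[0].Text)
}
```

An external tool layers document chunking, storage, and prompt assembly on top of this step, which is why the PR argues it belongs outside the Modelfile.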

New behavior:
On creating a model from a Modelfile with the EMBED command:

$ ollama create embed-test -f /Users/bruce/modelfiles/embedded/Modelfile
⠋ parsing modelfile  Error: deprecated command: EMBED

On running a model built from a Modelfile with the EMBED command:

2023/10/11 15:46:52 images.go:190: WARNING: model contains embeddings, but embeddings in modelfiles have been deprecated and will be ignored.

On running a model whose template references .Embed:

$ ollama run embed-test
⠦   Error: template: :5:7: executing "" at <.Embed>: can't evaluate field Embed in type struct { First bool; System string; Prompt string; Context []int }
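The create-time failure above suggests the parser now rejects the command before a model is built. A rough sketch of how such a check might look, assuming a simple command/arguments line format (this is hypothetical illustration, not the actual code in parser/parser.go):

```go
package main

import (
	"fmt"
	"strings"
)

// deprecated lists Modelfile commands that now fail at parse time.
// (Hypothetical; the real parser may track this differently.)
var deprecated = map[string]bool{"EMBED": true}

// parseCommand splits one Modelfile line into a command and its arguments,
// returning an error for deprecated commands.
func parseCommand(line string) (cmd, args string, err error) {
	parts := strings.SplitN(strings.TrimSpace(line), " ", 2)
	cmd = strings.ToUpper(parts[0])
	if deprecated[cmd] {
		return "", "", fmt.Errorf("deprecated command: %s", cmd)
	}
	if len(parts) == 2 {
		args = parts[1]
	}
	return cmd, args, nil
}

func main() {
	if _, _, err := parseCommand("EMBED ./notes.txt"); err != nil {
		fmt.Println("Error:", err) // matches the error shown at create time
	}
}
```

Failing at parse time rather than at run time means existing models with baked-in embeddings still load (with the warning shown above), while new Modelfiles cannot introduce the command.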

@BruceMacD BruceMacD changed the title deprecate modelfile embed deprecate modelfile embed command Oct 11, 2023
Contributor

@pdevine pdevine left a comment


This looks good to me. It might be good to leave it open for a few days to see if anyone is directly impacted by this. Even though this is a breaking change, I think it's OK given we're still only at 0.1.x.

Review comment on parser/parser.go (outdated, resolved)
- go mod tidy
- better deprecation message
@jmorganca
Member

Good cleanup. @technovangelist, we should couple this with some good embedding examples under examples/

@chigkim

chigkim commented Oct 16, 2023

Embedding in the prompt would be useful for multimodal models. Llama.cpp now supports LLaVA 1.5 for image+text.
Something like being able to type /embed test.jpg.
Should I open this as a separate issue?

@BruceMacD
Contributor Author

Hi @chigkim, thanks for the feedback. That does sound like a useful feature, let's open it as a separate issue for now.

@BruceMacD BruceMacD merged commit a0c3e98 into main Oct 16, 2023
@BruceMacD BruceMacD deleted the brucemacd/deprecate-modelfile-embed branch October 16, 2023 15:07
@vividfog

I appreciate the effort keeping the codebase simple; Ollama is second to none in its elegance. But this was quick work: the feature was removed outright within a week, without much debate about whether and how people use it, whether it's really not valuable, or whether on second thought it's a fantastic feature. I am going to miss this feature a lot; I was highlighting it to others as an Ollama special treat, and it was in daily use.

I'd like to bring some more viewpoints to this, as a heavy user who's tried everything I've gotten my hands on:

  1. The user experience compared to the alternatives was great. Ollama comes with an ecosystem of APIs and chatbots. With nothing else to install, Ollama was a one-liner RAG chatbot with multi-line support, and upstream clients needed zero configuration to get these benefits for free.
  2. The alternatives are not good without plenty of developer effort that regular people can't invest. Now users need to stand up a client for this, and each one is poor in its user experience in its own way; none match Ollama out of the box. UX doesn't happen in a vacuum; it's judged in comparison to others. Ollama + any chatbot GUI + a dropdown to select a RAG model was all that was needed, but that's no longer possible.
  3. The PrivateGPT example is not even close to a match. I tried it, and I've tried them all, having built my own RAG routines at some scale for others. All else being equal, Ollama was actually the best no-bells-and-whistles RAG routine out there, ready to run in minutes with zero extra things to install and very few to learn. "Don't make me install new things" is an important UX perspective here.
  4. Creating embeddings was a bit of extra work, but that's unavoidable if it's generic. Again, comparing with the alternatives, every other method needs some work to create the embeddings too. Ollama's was easy, even if one can argue that "one line per embedding isn't elegant". Well, it is, in its simplicity; the rest is string manipulation.
  5. It was instantly fast at runtime. Embeddings took a while to create, but at runtime there is no delay; it's just as instant as without embeddings.
  6. It turns out LLMs create totally usable embeddings. Even if Llama2 or Mistral aren't embedding models on paper, they worked great in practice. I was using this daily with esoteric documents and it was fine. This was an issue in theory only.
  7. Instead of outright deletion, it really needed just some cleanup, and not immediately: find the root cause of why longer ingestions didn't work as a single batch, and write better documentation. That's it. Then it would have been fine to park it for a long time. Even without changes it was usable, and there are always issues in a sufficiently large codebase.

I'll write this up as a new issue so it can be tracked; maybe there's more feedback. Please consider bringing it back. I'm going to stay parked on the v0.1.3 tag until new killer features come along. Thanks a lot for the great work! Please ask for community opinion, with a clear issue headline, before deprecating powerful capabilities in a breaking change, and give it a few weeks if it's not urgent.
