
deprecate modelfile embed command #759

Merged
2 commits merged into main on Oct 16, 2023

Conversation

BruceMacD
Contributor

@BruceMacD BruceMacD commented Oct 11, 2023

Embeddings in Modelfiles are a convenient idea: they allow a model to be packaged together with embeddings created specifically for it. However, the user experience of this implementation isn't up to par.

This change leaves the /embed endpoint in place, but deprecates EMBED in the Modelfile.

  • Ollama doesn't have any models designed for embedding generation
  • Generating embeddings from a Modelfile is slow, error-prone, and only supports single-line text inputs
  • Retrieving embeddings was too slow to be useful, and there was no mechanism to connect an external vector database

Instead of using the Modelfile, the right way to do this is with an external tool, such as PrivateGPT or LlamaIndex, that uses Ollama as the runner.
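The external-tool flow described here boils down to retrieval over embeddings: embed the documents once, then rank them against a query vector. A minimal sketch of that ranking step in Go, using toy vectors in place of real embeddings (the struct and function names here are illustrative, not any tool's actual API):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

type doc struct {
	Text string
	Vec  []float64
}

// rank returns docs ordered by similarity to the query vector, most similar first.
func rank(query []float64, docs []doc) []doc {
	out := append([]doc(nil), docs...)
	sort.Slice(out, func(i, j int) bool {
		return cosine(query, out[i].Vec) > cosine(query, out[j].Vec)
	})
	return out
}

func main() {
	// Toy 3-dimensional vectors stand in for embeddings an external tool
	// would fetch from a running model; real embeddings have hundreds of dims.
	docs := []doc{
		{"about cats", []float64{0.9, 0.1, 0.0}},
		{"about code", []float64{0.1, 0.9, 0.2}},
	}
	query := []float64{0.05, 0.95, 0.1} // pretend embedding of a coding question
	fmt.Println(rank(query, docs)[0].Text)
}
```

An external tool layers document chunking, storage, and prompt assembly on top of this step, which is why the PR argues it belongs outside the Modelfile.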

New behavior:
On creating a model from a Modelfile with the EMBED command:

$ ollama create embed-test -f /Users/bruce/modelfiles/embedded/Modelfile
⠋ parsing modelfile  Error: deprecated command: EMBED

On running a model built from a Modelfile with the EMBED command:

2023/10/11 15:46:52 images.go:190: WARNING: model contains embeddings, but embeddings in modelfiles have been deprecated and will be ignored.

On running a model whose template references .Embed:

$ ollama run embed-test
⠦   Error: template: :5:7: executing "" at <.Embed>: can't evaluate field Embed in type struct { First bool; System string; Prompt string; Context []int }
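The create-time failure above suggests the parser now rejects the command before a model is built. A rough sketch of how such a check might look, assuming a simple command/arguments line format (this is hypothetical illustration, not the actual code in parser/parser.go):

```go
package main

import (
	"fmt"
	"strings"
)

// deprecated lists Modelfile commands that now fail at parse time.
// (Hypothetical; the real parser may track this differently.)
var deprecated = map[string]bool{"EMBED": true}

// parseCommand splits one Modelfile line into a command and its arguments,
// returning an error for deprecated commands.
func parseCommand(line string) (cmd, args string, err error) {
	parts := strings.SplitN(strings.TrimSpace(line), " ", 2)
	cmd = strings.ToUpper(parts[0])
	if deprecated[cmd] {
		return "", "", fmt.Errorf("deprecated command: %s", cmd)
	}
	if len(parts) == 2 {
		args = parts[1]
	}
	return cmd, args, nil
}

func main() {
	if _, _, err := parseCommand("EMBED ./notes.txt"); err != nil {
		fmt.Println("Error:", err) // matches the error shown at create time
	}
}
```

Failing at parse time rather than at run time means existing models with baked-in embeddings still load (with the warning shown above), while new Modelfiles cannot introduce the command.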

@BruceMacD BruceMacD changed the title deprecate modelfile embed deprecate modelfile embed command Oct 11, 2023
Contributor

@pdevine pdevine left a comment


This looks good to me. It might be good to leave it open for a few days to see if anyone is directly impacted by this. Even though this is a breaking change, I think it's OK given we're still only at 0.1.x.

Review comment on parser/parser.go (outdated, resolved)
- go mod tidy
- better deprecation message
@jmorganca
Member

Good cleanup. @technovangelist, we should couple this with some good embedding examples under examples/

@chigkim

chigkim commented Oct 16, 2023

Embedding in the prompt would be useful for multimodal models. Llama.cpp now supports LLaVA 1.5 for image+text.
Something like being able to type /embed test.jpg.
Should I open this as a separate issue?

@BruceMacD
Contributor Author

Hi @chigkim, thanks for the feedback. That does sound like a useful feature, let's open it as a separate issue for now.

@BruceMacD BruceMacD merged commit a0c3e98 into main Oct 16, 2023
@BruceMacD BruceMacD deleted the brucemacd/deprecate-modelfile-embed branch October 16, 2023 15:07
@vividfog

I appreciate the effort keeping the codebase simple; Ollama is second to none in its elegance. But this was quick work: the feature was removed outright within a week, without much debate about whether and how people use it, whether it's really not valuable, or whether on second thought it's a fantastic feature. I am going to miss this feature a lot; I was highlighting it to others as an Ollama special treat, and it was in daily use.

I'd like to bring some more viewpoints to this, as a heavy user who's tried everything I've gotten my hands on:

  1. The user experience compared to the alternatives was great. Ollama comes with an ecosystem of APIs and chatbots. With nothing else to install, Ollama was a one-liner RAG chatbot with multi-line support, and upstream clients needed zero configuration to get these benefits for free.
  2. The alternatives are not good without plenty of developer effort that regular people can't invest. Now users need to stand up a client for this, and each one is poor in its user experience in its own way; none match Ollama out of the box. UX doesn't happen in a vacuum; it's judged in comparison to others. Ollama + any chatbot GUI + a dropdown to select a RAG model was all that was needed, but that's no longer possible.
  3. The PrivateGPT example is not even close to a match. I tried it, and I've tried them all, having built my own RAG routines at some scale for others. All else being equal, Ollama was actually the best no-bells-and-whistles RAG routine out there, ready to run in minutes with zero extra things to install and very few to learn. "Don't make me install new things" is an important UX perspective here.
  4. Creating embeddings was a bit of extra work, but that's unavoidable if it's generic. Again, comparing with the alternatives, every other method needs some work to create the embeddings too. Ollama's was easy, even if one can argue that "one line per embedding isn't elegant". Well, it is, in its simplicity; the rest is string manipulation.
  5. It was instantly fast at runtime. Embeddings took a while to create, but at runtime there is no delay; it's just as instant as without embeddings.
  6. It turns out LLMs create totally usable embeddings. Even if Llama2 or Mistral aren't embedding models on paper, they worked great in practice. I was using this daily with esoteric documents and it was fine. This was an issue in theory only.
  7. Instead of outright deletion, it really needed just some cleanup, and not immediately: find the root cause of why longer ingestions didn't work as a single batch, and write better documentation. That's it. Then it would have been fine to park it for a long time. Even without changes it was usable, and there are always issues in a sufficiently large codebase.

I'll write this up as a new issue so it can be tracked; maybe there's more feedback. Please consider bringing it back. I'm going to stay parked on the v0.1.3 tag until new killer features come along. Thanks a lot for the great work! Please ask for community opinion, with a clear issue headline, before deprecating powerful capabilities in a breaking change, and give it a few weeks if it's not urgent.
