
Embedding model support #327

Closed
jmorganca opened this issue Aug 11, 2023 · 18 comments · Fixed by #2604
Assignees
Labels
feature request New feature or request model request Model requests

Comments

@jmorganca
Member

jmorganca commented Aug 11, 2023

Add embedding models to use primarily with /api/embeddings

  • instructor-xl
  • bge-large
  • all-MiniLM-L6-v2

See the full leaderboard
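For reference, the endpoint mentioned above takes a model name and a prompt and returns a single vector. A minimal client sketch in Python (the default model name is illustrative, and the request/response shape reflects the /api/embeddings API as discussed in this thread):

```python
import json
from urllib import request

def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/embeddings endpoint."""
    return json.dumps({"model": model, "prompt": prompt}).encode()

def get_embedding(prompt: str,
                  model: str = "all-minilm",           # illustrative model name
                  host: str = "http://localhost:11434") -> list:
    """POST the prompt and return the embedding vector from the response."""
    req = request.Request(f"{host}/api/embeddings",
                          data=build_payload(model, prompt),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```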

@jmorganca jmorganca added model request Model requests feature request New feature or request labels Aug 11, 2023
@brunnolou

Yes, please! Any of these embedding models above text-embedding-ada-002 would be a great addition.

I've tried the Llama 2 and Mistral models with /api/embeddings as is, and I'm getting poor-quality similarity scores.
Even with almost identical queries, it fails to retrieve results. Are there prompting techniques to improve the embedding quality?

Anyway, for comparison, I've tried Xenova/gte-small with transformers and it is much faster and yields better results.
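The similarity comparison described above boils down to cosine similarity between two embedding vectors; a minimal sketch (the vectors here are toy stand-ins for real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Near-identical queries should score close to 1.0 with a good embedder;
# unrelated ones should score noticeably lower.
```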

@jmorganca jmorganca changed the title from Embedding models to BERT model support Nov 14, 2023
@corani

corani commented Nov 17, 2023

jinaai/jina-embeddings-v2-base-en (and other variants) also look promising.

@sandangel

Hi, is there an update on this issue? I would love to contribute.

@corani

corani commented Dec 8, 2023

I've been playing with https://github.com/nlpodyssey/cybertron which is pure Go (but I guess CPU only?) and at least supports all-MiniLM-L6-v2, e5-*-v2, bge-*-en-v1.5 and ember-v1.

I did some testing with the STS-2016 dataset and got the accuracies below compared to llama2 and mistral:instruct (Pearson correlation with the gold answers):

  • Ollama
    • llama2: 0.23431
    • mistral:instruct: 0.5656
  • Cybertron
    • all-MiniLM-L6-v2: 0.80344
    • e5-small-v2: 0.82318
    • e5-base-v2: 0.83845
    • bge-small-en-v1.5: 0.84514
    • bge-base-en-v1.5: 0.85297

So I agree with the previous comment that the embeddings generated by the completion models are pretty bad!
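The Pearson correlations reported above can be reproduced with a few lines of pure Python; a minimal sketch, where xs would be the model's similarity scores and ys the gold answers:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```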

@sandangel

That is interesting. For GPU support, I guess we will need to use: https://github.com/skeskinen/bert.cpp ?
I think the implementation would be something similar to llama.cpp?

@sandangel

I found this issue: ggerganov/llama.cpp#2872
I think they plan to implement it in llama.cpp. So maybe we will just need to update llama.cpp when it's done?

@sandangel

I also found this https://github.com/ml-explore/mlx-examples/blob/main/bert/README.md, which we can use to run inference on an M1 Mac. Is it possible to support MLX with Ollama?

@CodeWithKyrian

Any update on this, or plans to allow BERT models?

@tjohnson4

Any update on this issue?

@ymohamed08

Do you have any updates so far? Very interested in contributing.

@ill-yes

ill-yes commented Feb 2, 2024

Any updates here?

@easp
Contributor

easp commented Feb 3, 2024

Plans to support BERT models in llama.cpp stalled out when the dev who had assumed the task ended up focusing on something else. In the last few days it looks like the project management artifacts were updated to acknowledge this, so maybe there will be some action soon. Actually, it looks like there has been some activity. Maybe there will be working code soon:
ggerganov/llama.cpp#2872

@AndreBerzun

BERT support was merged into llama.cpp 3 days ago.

@easp
Contributor

easp commented Feb 15, 2024

Looks like there are still kinks being worked out.

@s-kostyaev

Looks like there are still kinks being worked out.

Link to check the progress ggerganov/llama.cpp#5500

@s-kostyaev

Looks like there are still kinks being worked out.

Link to check the progress ggerganov/llama.cpp#5500

It is merged now

@AndreBerzun

@jmorganca just wanted to follow up and see if this topic is on your roadmap. Since llama.cpp added support for BERT models, this seems like a great low-hanging fruit, no?

Initial support for BERT models has been merged with ggerganov/llama.cpp#5423 and released with b2127. Some kinks related to embedding pooling were fixed with ggerganov/llama.cpp#5500. Batch embedding is supported as well.

There has been a new bug related to the tokenizer implementation, but that's it as far as I can tell.
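For context, the pooling fix referenced above concerns how per-token model outputs are collapsed into a single sentence vector; mean pooling, the common default for BERT-style embedders, can be sketched as:

```python
def mean_pool(token_embeddings):
    """Average per-token vectors into one sentence embedding.

    token_embeddings: list of equal-length float lists, one per token.
    """
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n for i in range(dim)]
```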

@jmorganca jmorganca self-assigned this Feb 20, 2024
@jmorganca
Member Author

@AndreBerzun it absolutely is – working on it!

@jmorganca jmorganca changed the title from BERT model support to Embedding model support Feb 20, 2024