
Ollama now supports embedding models #5

Open
kasperwelbers opened this issue Mar 15, 2024 · 6 comments

Comments

@kasperwelbers

kasperwelbers commented Mar 15, 2024

Ollama added support for embedding models like BERT. This is much faster than using a generative model, such as llama2, which is currently the default in embed_text.

Changing this default, and perhaps adding documentation to help people pick good embedding models, could make rollama super useful for all sorts of downstream tasks in R!
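For reference, a minimal sketch of calling an embedding model directly through Ollama's HTTP API (endpoint and payload as in Ollama's API docs; this assumes a local server on the default port and that the model has already been pulled — it is not a tested command):

```sh
# Assumes `ollama pull nomic-embed-text` has been run and the server is
# listening on the default port. The response is a JSON object containing
# an "embedding" array of floats.
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "The sky is blue because of Rayleigh scattering"
}'
```

In rollama, this would presumably come down to passing an embedding model name to `embed_text()` instead of the llama2 default.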

@JBGruber
Owner

JBGruber commented Mar 15, 2024

Nice! It didn't work until I updated to v0.1.29 (0.1.26 is apparently the minimum). But then nomic-embed-text was about 4 times faster than the default llama2 model in the embedding vignette example (and the F-measure of the resulting model was 0.05 better 😉).

I'm thinking about the best approach for this. Having one default throughout the package is neat, but models meant for embedding are definitely faster and make more sense for a lot of people. I will at least add it to the vignette and the examples.

@JBGruber
Owner

It would also be good to document how to use arbitrary embedding models from Hugging Face. I'm not sure if the process for these models is the same as what is documented here: https://github.com/ollama/ollama/blob/main/docs/import.md

@JBGruber
Owner

JBGruber commented Mar 15, 2024

[Post removed]

This only worked because I grabbed the wrong modelfile. It's actually more complicated...

@kasperwelbers
Author

Nice, that's really cool!

What is the purpose of Python here? Is it only used to download the model? If so, it might also be done with hfhub, which seems to be an effort by the Posit team to bring Hugging Face to R.

@JBGruber
Owner

That's exactly what I was looking for! For some reason it didn't show up in my searches and I assumed I had dreamed it 😅. Yes, the Python part was only for downloading the file. Now all we need is a good heuristic to identify which file Ollama wants.

@JBGruber
Owner

Ok, I was a bit too quick with the post above and couldn't reproduce it with the files downloaded through hfhub. I eventually noticed I had accidentally grabbed the wrong model file.

You do indeed need to first follow the steps to convert the model using convert-hf-to-gguf.py, and then move the converted bin file to a directory Ollama has access to (in my case, inside the container).
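Roughly, the steps described above could look like this (a sketch following Ollama's import docs; paths, the model name, and the conversion script's exact arguments are placeholders, not tested commands):

```sh
# 1. Convert the Hugging Face model to GGUF with the script from the
#    llama.cpp sources (placeholder paths):
python convert-hf-to-gguf.py path/to/hf-model --outfile model.gguf

# 2. Move model.gguf to a directory Ollama can read (e.g. into the
#    container), and describe it in a Modelfile:
#      FROM ./model.gguf

# 3. Register it with Ollama under a name of your choice:
ollama create my-embed-model -f Modelfile
```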

So for now, I would tell people to rely on either nomic-embed-text or all-minilm, and keep an eye on what gets added in the future.
