
feature: embedding support #70

Closed
1 of 3 tasks
mudler opened this issue Apr 23, 2023 · 6 comments · Fixed by #222
Comments

@mudler
Owner

mudler commented Apr 23, 2023

Add support to embeddings to the API and the llama backend: https://github.com/ggerganov/llama.cpp/blob/e4422e299c10c7e84c8e987770ef40d31905a76b/llama.cpp#L2160

  • go-llama.cpp
  • go-gpt4all-j.cpp
  • go-gpt2.cpp
@limcheekin

limcheekin commented May 3, 2023

Just curious: what is the use/purpose of the embeddings above?

For the following use case of Retrieval Augmented Data QA:
https://blog.langchain.dev/tutorial-chatgpt-over-your-data/

Can't we use the following embedding models? I plan to use gpt4all-j with one of the following embedding models.

Please advise. Thank you.
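For context on what embeddings are for in that tutorial's Retrieval Augmented QA flow: documents and the user's question are mapped to vectors, and the most similar document chunks are retrieved and passed to the LLM as context. A minimal sketch in plain Python, using hypothetical hand-written vectors in place of real model output:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embedding vectors; in practice these come from the
# embedding model exposed by the API.
doc_vectors = {
    "doc_about_cats": [0.9, 0.1, 0.0],
    "doc_about_llms": [0.1, 0.8, 0.3],
}
query_vector = [0.2, 0.7, 0.4]  # embedding of the user's question

# Retrieve the document most similar to the query; this chunk would
# then be injected into the LLM prompt as context.
best = max(doc_vectors, key=lambda d: cosine_similarity(doc_vectors[d], query_vector))
print(best)  # doc_about_llms
```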

@mudler
Owner Author

mudler commented May 5, 2023

Embeddings support has been merged to master. It is experimental and currently available only for llama.cpp-based models, so any feedback is more than welcome!

To enable it, set `embeddings: true` in the model's YAML config file.
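For reference, a minimal model config with the flag enabled might look like the sketch below; only the `embeddings: true` flag comes from the comment above, and the model name and file name are placeholders:

```yaml
# Hypothetical model YAML config; names are placeholders.
name: my-embedding-model       # name the API will expose for this model
parameters:
  model: ggml-model.bin        # placeholder model file
embeddings: true               # enable embeddings for this model
```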

@mudler
Owner Author

mudler commented May 6, 2023

I've published a sample using embeddings over here: https://github.com/go-skynet/LocalAI/tree/master/examples/query_data
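Since LocalAI exposes an OpenAI-compatible API, the example above ultimately issues requests against the embeddings endpoint. A sketch of the request shape follows; the host, port, and model name are placeholders, and the snippet only builds the JSON payload rather than calling a live server:

```python
import json

# LocalAI's OpenAI-compatible embeddings endpoint (placeholder host/port).
endpoint = "http://localhost:8080/v1/embeddings"

# Request body in the OpenAI embeddings format; the model name must
# match whatever the YAML config exposes (placeholder here).
payload = {
    "model": "my-embedding-model",
    "input": "What is LocalAI?",
}
body = json.dumps(payload)
print(body)

# Equivalent invocation from the shell:
#   curl http://localhost:8080/v1/embeddings \
#     -H "Content-Type: application/json" \
#     -d '{"model": "my-embedding-model", "input": "What is LocalAI?"}'
```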

@mudler
Owner Author

mudler commented May 10, 2023

Further optimizations landed in #222: embeddings can now be used with bert alongside any model, and there is also a huge performance improvement!
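If I understand #222 correctly, this means a dedicated bert model can serve the embeddings endpoint independently of the chat model. A guessed config sketch, assuming a `bert-embeddings` backend name (treat both the backend name and file name as unverified placeholders):

```yaml
# Hypothetical config for a dedicated bert embeddings model.
name: bert-embeddings          # placeholder name exposed by the API
backend: bert-embeddings      # assumed backend identifier from #222
parameters:
  model: bert.bin              # placeholder model file
embeddings: true
```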

@v4rm3t

v4rm3t commented May 24, 2023

Hello! I am trying to run a gpt4all-j model to build a local chatbot. How can I use BERT embeddings and wire them into the chat completions endpoint?

Currently I am running it on a Mac Mini i7 with 32 GB of RAM, and I plan to upgrade to a cloud server with more resources (vRAM) in the future. Is it possible to build a fast chatbot API using my own document embeddings?

@michelec1000

https://github.com/go-skynet/LocalAI/tree/master/examples/query_data

Thank you for the example! But can't it be included in the API? Currently I think you run those commands inside the container, right? Is there already a way for a request to a certain path to execute the query over the documents?
