not (only) rely on openai #8

Open
hpvd opened this issue Jun 22, 2023 · 5 comments

Comments

@hpvd

hpvd commented Jun 22, 2023

If this very interesting project moves on and becomes a success (i.e. gets used),
there are some good reasons not to rely only on OpenAI for text embeddings:

  • you have to trust openai if you give them your data
  • you have to trust that they keep their service compatible and available
  • you have to trust that they will always have a fair price...

There seem to be some alternatives around (OpenAI does not lead in embeddings the way it does with ChatGPT).
One starting point is the Massive Text Embedding Benchmark (MTEB), a benchmark for text embeddings:
https://arxiv.org/abs/2210.07316
https://github.com/embeddings-benchmark/mteb
https://huggingface.co/spaces/mteb/leaderboard (takes some time to load)

@GavinMendelGleason
Contributor

We've intended OpenAI to be merely our first implementation and hope that we and the community can provide other connectors to obtain the embeddings over time.
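To make the "other connectors" idea concrete, here is a minimal sketch of what a provider-agnostic embedding interface could look like. It is not taken from this project: `Embedder`, `HashingEmbedder`, and `index_documents` are hypothetical names, and the hashing embedder is only an offline stand-in for a real model (it carries no semantic meaning).

```python
import hashlib
import math
from typing import Dict, List, Protocol


class Embedder(Protocol):
    """Hypothetical connector interface: texts in, fixed-size vectors out."""

    dimension: int

    def embed(self, texts: List[str]) -> List[List[float]]:
        ...


class HashingEmbedder:
    """Toy offline connector: hashes tokens into buckets, then L2-normalizes.

    A placeholder for a real local model, not a meaningful embedding.
    """

    def __init__(self, dimension: int = 64) -> None:
        self.dimension = dimension

    def embed(self, texts: List[str]) -> List[List[float]]:
        return [self._embed_one(text) for text in texts]

    def _embed_one(self, text: str) -> List[float]:
        vec = [0.0] * self.dimension
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self.dimension
            vec[bucket] += 1.0
        norm = math.sqrt(sum(x * x for x in vec))
        # Empty input has norm 0; return the zero vector unchanged.
        return [x / norm for x in vec] if norm else vec


def index_documents(embedder: Embedder, docs: List[str]) -> Dict[str, List[float]]:
    """Calling code depends only on the interface, not on a specific provider."""
    return dict(zip(docs, embedder.embed(docs)))
```

With this shape, an OpenAI-backed class and a self-hosted one would both just implement `embed`, and the indexing code would not need to change when the provider does.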

@hpvd
Author

hpvd commented Jun 22, 2023

> We've intended OpenAI to be merely our first implementation and hope that we and the community can provide other connectors to obtain the embeddings over time.

good to hear! This was my guess/hope ;-)

@hpvd
Author

hpvd commented Oct 7, 2023

Maybe this is a good approach to keep your content secure (in your own cloud or even on premises):
Using Llama 2 models for text embedding with LangChain
https://medium.com/@liusimao8/using-llama-2-models-for-text-embedding-with-langchain-79183350593d

@hpvd
Author

hpvd commented Jan 10, 2024

I just looked a little deeper into this topic.
This is a great article on self-hosting a text embedding model for cost, speed, and privacy:
https://medium.com/@kelvin.lu.au/hosting-a-text-embedding-model-that-is-better-cheaper-and-faster-than-openais-solution-7675d8e7cab2

@huyhandes

Custom embedding backends such as Ollama or LocalAI should be supported. It would be great to see that in the future.
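For illustration, a rough sketch of what a call to a locally running Ollama server might look like. It assumes Ollama's default local address (port 11434) and its `/api/embeddings` endpoint taking `model` and `prompt` and returning an `embedding` list; the exact endpoint and response shape may differ between Ollama versions, so check the current API docs. The model name and the helper names (`build_request`, `parse_response`, `embed`) are only examples, not part of this project.

```python
import json
import urllib.request
from typing import List

# Assumed default address of a local Ollama server; adjust as needed.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_request(text: str, model: str = "nomic-embed-text",
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build the POST request; kept separate so it can be inspected offline."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(body: bytes) -> List[float]:
    """Extract the vector, assuming a JSON body of the form {"embedding": [...]}."""
    data = json.loads(body)
    return [float(x) for x in data["embedding"]]


def embed(text: str, model: str = "nomic-embed-text",
          url: str = OLLAMA_URL) -> List[float]:
    """Send the text to the local server; the data never leaves the machine."""
    request = build_request(text, model, url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_response(response.read())
```

Calling `embed("some document text")` requires the server to be running with the chosen model already pulled.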


3 participants