Feature Request: Integration of OpenAI Embeddings #101

Open · argen666 opened this issue Apr 27, 2023 · 11 comments

@argen666

I would like to request the integration of OpenAI embeddings into the project. As OpenAI offers powerful language models, incorporating their embeddings could significantly improve the performance and capabilities of our project.
Please let me know if there are any concerns or additional requirements for implementing this feature. I am more than happy to contribute to the development and testing process.

@enricoros
Owner

Hi @argen666, welcome! Please let me know in which ways you're thinking of integrating embeddings; they could be used at a per-chat, per-message, or per-chunk level, and they enable many use cases: search, memory, context injection.
First I'd like to hear from you: what would be the use case? How would you use embeddings, and where would they show up in the user interface?

@argen666
Author

argen666 commented Apr 28, 2023

Hi @enricoros, I guess the basic use case is to build a more complete research assistant trained on multiple custom documents.

The basic step-by-step guide to using embeddings:

  1. Retrieve the custom papers related to the subjects you're interested in (for example, academic papers).
  2. Compute the embeddings for each of the papers; the LangChain framework can be used (see the sketch after this list).
  3. Choose a platform to store the embeddings, for example Pinecone or any of the vector databases recommended by OpenAI:
     https://platform.openai.com/docs/guides/embeddings/how-can-i-retrieve-k-nearest-embedding-vectors-quickly
  4. Create an index for the embeddings. An index is a data structure used to organize and search the embeddings.
  5. Upload the embeddings to the database.
  6. Once the embeddings are uploaded, we can start asking questions about the topics covered in the papers.
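A minimal sketch of step 2, calling OpenAI's /v1/embeddings endpoint directly with fetch (LangChain wraps the same call); the model name and error handling here are assumptions:

```ts
// Sketch: compute embeddings for a batch of text chunks via the OpenAI REST API.
// Assumes OPENAI_API_KEY is set in the environment; "text-embedding-ada-002" was
// the current embeddings model at the time of this discussion.
async function embedChunks(chunks: string[]): Promise<number[][]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: chunks }),
  });
  if (!res.ok) throw new Error(`Embeddings request failed: ${res.status}`);
  const json = await res.json();
  // The API returns one embedding object per input, in the same order.
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}
```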

In our case, I think we need to add support for the vector databases listed above and add configuration for connecting to them in the application settings.
This way, users will be able to connect their own knowledge base. So we only need to implement step 6 of the guide above; a rough sketch follows.
Please share your thoughts on this matter.
Thanks
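
For step 6, the query flow could look roughly like this, reusing the embedChunks helper sketched above. queryVectorStore and callChatCompletion are hypothetical stand-ins for the configured vector database and the app's existing chat call, not real APIs:

```ts
// Hypothetical adapters; real implementations depend on the configured
// vector database (Pinecone, etc.) and the app's existing chat pipeline.
declare function queryVectorStore(vector: number[], k: number): Promise<string[]>;
declare function callChatCompletion(prompt: string): Promise<string>;

// Sketch of step 6: answer a question against the uploaded embeddings,
// reusing the embedChunks helper sketched in step 2 above.
async function askQuestion(question: string): Promise<string> {
  // Embed the question with the same model used for the documents.
  const [queryVector] = await embedChunks([question]);

  // Retrieve the k nearest chunks from the vector store.
  const topChunks = await queryVectorStore(queryVector, 4);

  // Inject the retrieved chunks into the chat prompt as grounding context.
  const prompt =
    "Answer using only the context below.\n\n" +
    "Context:\n" + topChunks.join("\n---\n") +
    "\n\nQuestion: " + question;
  return callChatCompletion(prompt);
}
```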

@michaelcreatesstuff

@argen666 @enricoros I have made a PR for this here; it's a decent start functionality-wise, as a proof of concept.

I know it could be better integrated into the current codebase, and it could certainly have a better UI.

@argen666
Author

argen666 commented May 23, 2023

@michaelcreatesstuff @enricoros Great work! I also implemented this functionality in parallel with you. I'm not creating a PR yet because I'm waiting for langchainJS to add support for Redis and other vector databases. At the moment, I also have to use Pinecone because of these limitations.

@michaelcreatesstuff

@argen666 thanks. Agreed, langchainJS seems a bit behind LangChain Python. I'm going to try Python + FastAPI.

Have you tried this? It was on my list of concepts to explore: https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/memory
but I will try Python for a bit first.
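
For reference, that in-memory integration looks roughly like this in langchainJS (API as of mid-2023; import paths have since moved between releases, so treat this as a sketch):

```ts
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Build an in-memory vector store from raw texts; nothing is persisted,
// the vectors live only as long as the process.
const store = await MemoryVectorStore.fromTexts(
  ["paper one abstract", "paper two abstract"],
  [{ id: 1 }, { id: 2 }], // one metadata object per text
  new OpenAIEmbeddings()  // reads OPENAI_API_KEY from the environment
);

// Retrieve the closest stored document to a query.
const [match] = await store.similaritySearch("query about paper one", 1);
console.log(match.pageContent);
```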

@argen666
Author

@michaelcreatesstuff Thanks. I haven't tried that, since I decided to focus on external vector stores to have an independent knowledge base.

@argen666
Author

@enricoros @michaelcreatesstuff Hi Team, I have made a pull request for this feature
#122
I would appreciate any feedback.
Thank you!

@bbaaxx

bbaaxx commented Jan 18, 2024

I believe big-AGI could benefit greatly from embeddings, as they would allow for new use cases and extended functionality for the code assistant and for textual contexts.

Here is an attempt at a proper request description, using the repo template, to help continue the discussion.
It was, of course, generated with some help from big-AGI running GPT-4 (preview) and vetted by me:

Why
Integrating textual embeddings into Big AGI will transform the way users interact with uploaded text files by providing a more efficient and semantically rich processing method. Instead of directly inserting text into the context window, the new feature will create embeddings that capture the essence of the text. This will enable users to perform complex language tasks on larger documents without being constrained by the context window size, leading to more accurate and context-aware responses from Big AGI.

Description
This enhancement to Big-AGI will involve a transparent shift in handling uploaded text files. Upon upload, instead of placing the text into the context window, the system will generate text embeddings using a selected embedding service. These embeddings will then be used within the current conversation to maintain the flow and context. The system will be designed to support a variety of embedding services and vector databases, ensuring flexibility and extensibility. The initial implementation will focus on an in-browser vector database to provide immediate, client-side functionality without the need for server-side processing.

Requirements

  • Design and implement an abstraction layer that allows for interaction with various embedding services (e.g., OpenAI, Hugging Face, LocalAI).
  • Create a service adapter interface that defines the contract for integrating new embedding service providers (see the sketch at the end of this comment).
  • Modify the text file upload workflow to include an embedding generation step using the selected service provider through the abstraction layer.
  • Ensure the new embedding workflow is transparent to the user, mimicking the current user experience as closely as possible.
  • Integrate the embedding process with the existing conversation context management, replacing the direct text insertion mechanism.
  • Design a scalable architecture that allows for the future addition of external vector databases (e.g., FAISS, Chroma, Vectara, Pinecone).
  • Implement a configuration system that enables users to select their preferred embedding service and vector database.
  • Make the current flow a fallback strategy for situations where embedding generation fails or when users prefer the traditional text insertion method.
  • Create documentation and user guides explaining the benefits of the new feature and how to use it effectively.

(Generated with big-AGI using GPT-4 (1106) and vetted by the author of this post.)
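
A possible shape for the abstraction layer and service adapter from the first two requirements; every name below is hypothetical, not an existing big-AGI API:

```ts
// Hypothetical contract for pluggable embedding providers.
interface EmbeddingService {
  readonly id: string;         // e.g. "openai", "huggingface", "localai"
  readonly dimensions: number; // vector size, needed by the vector store
  embed(texts: string[]): Promise<number[][]>; // one vector per input text
}

// Example adapter for OpenAI; other providers would implement the same contract.
class OpenAIEmbeddingService implements EmbeddingService {
  readonly id = "openai";
  readonly dimensions = 1536; // text-embedding-ada-002 output size

  constructor(private apiKey: string) {}

  async embed(texts: string[]): Promise<number[][]> {
    const res = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${this.apiKey}`,
      },
      body: JSON.stringify({ model: "text-embedding-ada-002", input: texts }),
    });
    if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
    const json = await res.json();
    return json.data.map((d: { embedding: number[] }) => d.embedding);
  }
}
```

The upload workflow would then depend only on EmbeddingService, so swapping providers, or falling back to plain text insertion, stays a configuration concern.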

@enricoros
Owner

Thanks for the description; clearly made by GPT-4, because it sounds good but is low on details.

I read when to generate the embeddings and where to store them. But how are the embeddings being used? Just storing them is not enough.

Is the objective a RAG use case? Embeddings can be used for many purposes, and I'd be curious about the top ways to use them (RAG, MemGPT-like, etc.).

enricoros added the RAG? label Jan 23, 2024
@bbaaxx

bbaaxx commented Apr 25, 2024

I can share my use cases here:

  • Semantic search for relevant data over a collection of documents (see the sketch at the end of this comment)
    -- I would like to have multiple collections of documents and to select which collection to chat with (for example, all my financial documents in one collection or "workspace", my health-related documents in another, and so on).

  • Semantic search over previous chat conversations, or summarization of memory, to have a bot "learn" as we interact.
    -- A personas-generated agent could be the "narrator" responsible for summarizing chat conversations, which would then be stored as vector embeddings and used in future conversations.

  • As to where to store the vectors: that is a difficult one, because there are only a couple of vector DBs for the browser (this is the only one I know about).
    -- IMHO, vectors should be stored in a more persistent, user-governed database, which could probably be enabled by adding integrations. The fun part is that the list of vector DB services is quite large (maybe leverage langchain-js?).
    -- (Maybe we can start with some open-source integrations like Chroma, and we can ask @lunamidori5 to add that to their installer 😉)

I hope this adds to the conversation. I would love to lend a hand to make this land on big-AGI.
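
A sketch of the first use case, collection-scoped semantic search over stored vectors. The record shape and the brute-force cosine ranking are assumptions (a real vector DB would index instead of scanning):

```ts
// Hypothetical record shape for a chunk stored with its embedding.
interface StoredChunk {
  collection: string; // e.g. "financial", "health"
  text: string;
  vector: number[];
}

// Standard cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the chunks of one collection ("workspace") against a query vector
// and return the top k matches.
function searchCollection(
  chunks: StoredChunk[], collection: string, queryVector: number[], k: number
): StoredChunk[] {
  return chunks
    .filter((c) => c.collection === collection)
    .map((c) => ({ c, score: cosineSimilarity(queryVector, c.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((x) => x.c);
}
```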

@lunamidori5
Contributor

@bbaaxx it's next on my list! Just need to get WSL working.
