Feature Request: Integration of OpenAI Embeddings #101

Open · argen666 opened this issue Apr 27, 2023 · 11 comments

@argen666

I would like to request the integration of OpenAI embeddings into the project. As OpenAI offers powerful language models, incorporating their embeddings could significantly improve the performance and capabilities of our project.
Please let me know if there are any concerns or additional requirements for implementing this feature. I am more than happy to contribute to the development and testing process.

@enricoros
Owner

Hi @argen666, welcome! Please let me know in which ways you're thinking of integrating embeddings; they could be used at a per-chat, per-message, or per-chunk level, and they enable many use cases: search, memory, context injection.
First I'd like to hear from you: what would be the use case? How would you use embeddings, and where would they show up in the user interface?

@argen666
Author

argen666 commented Apr 28, 2023

Hi @enricoros, I guess the basic use case is to build a more complete research assistant trained on multiple custom documents.

The basic step-by-step guide to using embeddings:

  1. Retrieve the custom papers related to the subjects you're interested in (for example, academic papers).
  2. Compute the embeddings for each of the papers; the LangChain framework can be used (see the sketch after this list).
  3. Choose a platform to store the embeddings, for example Pinecone or any of the vector databases recommended by OpenAI:
     https://platform.openai.com/docs/guides/embeddings/how-can-i-retrieve-k-nearest-embedding-vectors-quickly
  4. Create an index for the embeddings. An index is a data structure used to organize and search the embeddings.
  5. Upload the embeddings to the database.
  6. Once the embeddings are uploaded, we can start asking questions about the topics covered in the papers.
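A minimal sketch of step 2, calling OpenAI's /v1/embeddings endpoint directly with fetch (LangChain wraps the same call); the model name and error handling here are assumptions:

```ts
// Sketch: compute embeddings for a batch of text chunks via the OpenAI REST API.
// Assumes OPENAI_API_KEY is set in the environment; "text-embedding-ada-002" was
// the current embeddings model at the time of this discussion.
async function embedChunks(chunks: string[]): Promise<number[][]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: chunks }),
  });
  if (!res.ok) throw new Error(`Embeddings request failed: ${res.status}`);
  const json = await res.json();
  // The API returns one embedding object per input, in the same order.
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}
```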

In our case, I think we need to add support for the vector databases listed above and add configuration for connecting to them in the application settings.
This way, users will be able to connect their own knowledge base. So we only need to implement step 6 of the guide above; a rough sketch follows.
Please share your thoughts on this matter.
Thanks
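
For step 6, the query flow could look roughly like this, reusing the embedChunks helper sketched above. queryVectorStore and callChatCompletion are hypothetical stand-ins for the configured vector database and the app's existing chat call, not real APIs:

```ts
// Hypothetical adapters; real implementations depend on the configured
// vector database (Pinecone, etc.) and the app's existing chat pipeline.
declare function queryVectorStore(vector: number[], k: number): Promise<string[]>;
declare function callChatCompletion(prompt: string): Promise<string>;

// Sketch of step 6: answer a question against the uploaded embeddings,
// reusing the embedChunks helper sketched in step 2 above.
async function askQuestion(question: string): Promise<string> {
  // Embed the question with the same model used for the documents.
  const [queryVector] = await embedChunks([question]);

  // Retrieve the k nearest chunks from the vector store.
  const topChunks = await queryVectorStore(queryVector, 4);

  // Inject the retrieved chunks into the chat prompt as grounding context.
  const prompt =
    "Answer using only the context below.\n\n" +
    "Context:\n" + topChunks.join("\n---\n") +
    "\n\nQuestion: " + question;
  return callChatCompletion(prompt);
}
```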

@michaelcreatesstuff

@argen666 @enricoros I have made a PR for this here; it's a decent start functionality-wise, as a proof of concept.

I know it could be better integrated into the current codebase, and it could certainly have a better UI.

@argen666
Author

argen666 commented May 23, 2023

@michaelcreatesstuff @enricoros Great work! I also implemented this functionality in parallel with you. I'm not creating a PR yet because I'm waiting for langchainJS to add support for Redis and other vector databases. At the moment, I also have to use Pinecone because of these limitations.

@michaelcreatesstuff

@argen666 thanks. Agreed, langchainJS seems a bit behind LangChain Python. I'm going to try Python + FastAPI.

Have you tried this? It was on my list of concepts to explore: https://js.langchain.com/docs/modules/indexes/vector_stores/integrations/memory
but I will try Python for a bit first.
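
For reference, that in-memory integration looks roughly like this in langchainJS (API as of mid-2023; import paths have since moved between releases, so treat this as a sketch):

```ts
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Build an in-memory vector store from raw texts; nothing is persisted,
// the vectors live only as long as the process.
const store = await MemoryVectorStore.fromTexts(
  ["paper one abstract", "paper two abstract"],
  [{ id: 1 }, { id: 2 }], // one metadata object per text
  new OpenAIEmbeddings()  // reads OPENAI_API_KEY from the environment
);

// Retrieve the closest stored document to a query.
const [match] = await store.similaritySearch("query about paper one", 1);
console.log(match.pageContent);
```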

@argen666
Author

@michaelcreatesstuff Thanks. I haven't tried that, since I decided to focus on external vector stores to have an independent knowledge base.

@argen666
Author

@enricoros @michaelcreatesstuff Hi Team, I have made a pull request for this feature
#122
I would appreciate any feedback.
Thank you!

@bbaaxx

bbaaxx commented Jan 18, 2024

I believe big-AGI could benefit greatly from embeddings, as they would allow for new use cases and extended functionality for the code assistant and for textual contexts.

Here is an attempt at a proper request description, using the repo template, to help continue the discussion.
It was, of course, generated with some help from big-AGI running GPT-4 (preview) and vetted by me:

Why
Integrating textual embeddings into Big AGI will transform the way users interact with uploaded text files by providing a more efficient and semantically rich processing method. Instead of directly inserting text into the context window, the new feature will create embeddings that capture the essence of the text. This will enable users to perform complex language tasks on larger documents without being constrained by the context window size, leading to more accurate and context-aware responses from Big AGI.

Description
This enhancement to Big-AGI will involve a transparent shift in handling uploaded text files. Upon upload, instead of placing the text into the context window, the system will generate text embeddings using a selected embedding service. These embeddings will then be used within the current conversation to maintain the flow and context. The system will be designed to support a variety of embedding services and vector databases, ensuring flexibility and extensibility. The initial implementation will focus on an in-browser vector database to provide immediate, client-side functionality without the need for server-side processing.

Requirements

  • Design and implement an abstraction layer that allows for interaction with various embedding services (e.g., OpenAI, Hugging Face, LocalAI).
  • Create a service adapter interface that defines the contract for integrating new embedding service providers (see the sketch at the end of this comment).
  • Modify the text file upload workflow to include an embedding generation step using the selected service provider through the abstraction layer.
  • Ensure the new embedding workflow is transparent to the user, mimicking the current user experience as closely as possible.
  • Integrate the embedding process with the existing conversation context management, replacing the direct text insertion mechanism.
  • Design a scalable architecture that allows for the future addition of external vector databases (e.g., FAISS, Chroma, Vectara, Pinecone).
  • Implement a configuration system that enables users to select their preferred embedding service and vector database.
  • Make the current flow a fallback strategy for situations where embedding generation fails or when users prefer the traditional text insertion method.
  • Create documentation and user guides explaining the benefits of the new feature and how to use it effectively.

(Generated with big-AGI using GPT-4 (1106) and vetted by the author of this post.)
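
A possible shape for the abstraction layer and service adapter from the first two requirements; every name below is hypothetical, not an existing big-AGI API:

```ts
// Hypothetical contract for pluggable embedding providers.
interface EmbeddingService {
  readonly id: string;         // e.g. "openai", "huggingface", "localai"
  readonly dimensions: number; // vector size, needed by the vector store
  embed(texts: string[]): Promise<number[][]>; // one vector per input text
}

// Example adapter for OpenAI; other providers would implement the same contract.
class OpenAIEmbeddingService implements EmbeddingService {
  readonly id = "openai";
  readonly dimensions = 1536; // text-embedding-ada-002 output size

  constructor(private apiKey: string) {}

  async embed(texts: string[]): Promise<number[][]> {
    const res = await fetch("https://api.openai.com/v1/embeddings", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${this.apiKey}`,
      },
      body: JSON.stringify({ model: "text-embedding-ada-002", input: texts }),
    });
    if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
    const json = await res.json();
    return json.data.map((d: { embedding: number[] }) => d.embedding);
  }
}
```

The upload workflow would then depend only on EmbeddingService, so swapping providers, or falling back to plain text insertion, stays a configuration concern.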

@enricoros
Owner

Thanks for the description; clearly made by GPT-4, because it sounds good but is low on details.

I read when to generate the embeddings and where to store them. But how are the embeddings being used? Just storing them is not enough.

Is the objective a RAG use case? Embeddings can be used for many purposes, and I'd be curious about the top ways to use them (RAG, MemGPT-like, etc.).

enricoros added the RAG? label Jan 23, 2024
@bbaaxx

bbaaxx commented Apr 25, 2024

I can share my use cases here:

  • Semantic search for relevant data over a collection of documents (see the sketch at the end of this comment)
    -- I would like to have multiple collections of documents and to select which collection to chat with (for example, all my financial documents in one collection or "workspace", my health-related documents in another, and so on).

  • Semantic search over previous chat conversations, or summarization of memory, to have a bot "learn" as we interact.
    -- A personas-generated agent could be the "narrator" responsible for summarizing chat conversations, which would then be stored as vector embeddings and used in future conversations.

  • As to where to store the vectors: that is a difficult one, because there are only a couple of vector DBs for the browser (this is the only one I know about).
    -- IMHO, vectors should be stored in a more persistent, user-governed database, which could probably be enabled by adding integrations. The fun part is that the list of vector DB services is quite large (maybe leverage langchain-js?).
    -- (Maybe we can start with some open-source integrations like Chroma, and we can ask @lunamidori5 to add that to their installer 😉)

I hope this adds to the conversation. I would love to lend a hand to make this land on big-AGI.
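
A sketch of the first use case, collection-scoped semantic search over stored vectors. The record shape and the brute-force cosine ranking are assumptions (a real vector DB would index instead of scanning):

```ts
// Hypothetical record shape for a chunk stored with its embedding.
interface StoredChunk {
  collection: string; // e.g. "financial", "health"
  text: string;
  vector: number[];
}

// Standard cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the chunks of one collection ("workspace") against a query vector
// and return the top k matches.
function searchCollection(
  chunks: StoredChunk[], collection: string, queryVector: number[], k: number
): StoredChunk[] {
  return chunks
    .filter((c) => c.collection === collection)
    .map((c) => ({ c, score: cosineSimilarity(queryVector, c.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((x) => x.c);
}
```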

@lunamidori5
Contributor

@bbaaxx it's next on my list! Just need to get WSL working.
