
Setup a vector database with search API to assist RAG #71

Open
josephjclark opened this issue Jun 17, 2024 · 1 comment · May be fixed by #86
Labels
project A project proposal

Comments

@josephjclark
Collaborator

Retrieval Augmented Generation (RAG) is a process by which relevant documentation is selected from a corpus and appended to the prompt. This enables specialised and highly focused context to be added to the model's input.

We have a couple of services coming up which could benefit from this.

RAG works by converting, or embedding, strings into vectors. The source corpus or knowledgebase is encoded into vectors, as is the search query. These vectors can then be compared to find the best (or worst) matches, depending on what you want.
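To make the idea concrete, here is a minimal embed-and-compare sketch. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 model, which are illustrative choices only - nothing in this issue fixes the library or model:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Milvus is a vector database for similarity search.",
    "Docker images are built from a Dockerfile.",
]
query = "which database handles vector search?"

# Encode the knowledgebase and the query into vectors.
corpus_vecs = model.encode(corpus)   # shape: (len(corpus), 384)
query_vec = model.encode(query)      # shape: (384,)

# Cosine similarity ranks the corpus by relevance to the query.
scores = util.cos_sim(query_vec, corpus_vecs)[0]
print(corpus[int(scores.argmax())])
```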

The first step is to add support for a vector database to apollo. Here is a rough spec:

  • Add a vector database like Milvus to apollo
  • Create a service which allows the database to be searched and returns relevant strings. The search API is something like search(corpus_name, search_string): the service converts the search string into an embedding and runs it against the database corpus_name (see the sketch after this list).
  • When the docker image is built, take a collection of corpora (this can be simplified in this step: the corpus can be a hard-coded list of strings) and embed them into the database. The database should be built into the docker image.
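The search service could look roughly like the sketch below, assuming Milvus accessed via pymilvus and sentence-transformers for query embedding. The collection field names (embedding, text) and the cosine metric are placeholders, not decisions:

```python
from pymilvus import Collection, connections
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
connections.connect(host="localhost", port="19530")

def search(corpus_name: str, search_string: str, limit: int = 5) -> list[str]:
    """Embed the query and return the closest strings in the named corpus."""
    collection = Collection(corpus_name)
    collection.load()
    query_vec = model.encode([search_string]).tolist()
    hits = collection.search(
        data=query_vec,
        anns_field="embedding",
        param={"metric_type": "COSINE", "params": {}},
        limit=limit,
        output_fields=["text"],
    )
    return [hit.entity.get("text") for hit in hits[0]]
```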

The runtime embeddings database is basically read-only. I don't see any need to extend the embeddings on the fly.

Note that the embedding function requires a pretrained model which will likely be 50-100MB in size. This will have to be bundled into the docker image and may have performance and storage implications for our deployment.

Embedding the knowledgebase can be done offline, when we build the image, but search queries must be embedded at runtime.
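A build-time ingestion step might look like this sketch, run once while the image is built. The schema, field names, dimension and index parameters are all assumptions for illustration:

```python
from pymilvus import (
    Collection, CollectionSchema, DataType, FieldSchema, connections,
)
from sentence_transformers import SentenceTransformer

CORPUS = [
    "First hard-coded document for the test corpus.",
    "Second hard-coded document for the test corpus.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
connections.connect(host="localhost", port="19530")

schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema("text", DataType.VARCHAR, max_length=2048),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=384),
])
collection = Collection("test_corpus", schema)

# Embed the corpus once, at build time, and store text + vectors together.
collection.insert([CORPUS, model.encode(CORPUS).tolist()])
collection.create_index(
    "embedding",
    {"metric_type": "COSINE", "index_type": "IVF_FLAT", "params": {"nlist": 128}},
)
collection.flush()
```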

Note that this issue only requires a test corpus to run against - the problem of ingesting a real knowledge base (i.e. embedding docs.openfn.org) is handled in a different issue.

Useful resources

@SatyamMattoo
Collaborator

Hello @josephjclark,

I hope you are doing great.

I have been working on this issue and ran into a problem: the embedding model is too large, which significantly increases the time required to build the Docker image. We can overcome this by sending an API request to a Hugging Face inference endpoint, which can use any model depending on our case. This approach returns the embeddings for our dataset and eliminates the need to include models like sentence-transformers in our dependencies (adding them to the -ft command isn't possible, I guess, as we will need the same model to embed the search queries).
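Roughly, the remote-embedding approach would look like the sketch below. The feature-extraction endpoint pattern is Hugging Face's hosted Inference API; the model choice and the HF_TOKEN environment variable are assumptions:

```python
import os
import requests

API_URL = (
    "https://api-inference.huggingface.co/pipeline/feature-extraction/"
    "sentence-transformers/all-MiniLM-L6-v2"
)
HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

def embed(texts: list[str]) -> list[list[float]]:
    """Return one embedding per input string, computed remotely."""
    response = requests.post(API_URL, headers=HEADERS, json={"inputs": texts})
    response.raise_for_status()
    return response.json()
```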

I have tried using the sentence-transformers model in a local environment, and it works fine. However, the Docker build takes several minutes because dependencies like torch are also being installed during the build.

What are your thoughts on this approach?

I am almost done with setting up the embedding of a hard-coded corpus and adding it to a managed vector database like Zilliz (the cloud service for Milvus) during the Docker build. Only the search service needs a bit more work; it should be completed in a couple of days.
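For reference, pointing pymilvus at a Zilliz Cloud cluster instead of a local Milvus instance is just a matter of connecting with a URI and token (the environment variable names below are placeholders):

```python
import os
from pymilvus import connections

# Placeholder URI and token, read from the environment at build time.
connections.connect(
    uri=os.environ["ZILLIZ_URI"],      # e.g. the cluster endpoint from the Zilliz console
    token=os.environ["ZILLIZ_TOKEN"],  # cluster API key
)
```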

Best regards
