<a href="https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/rag_with_milvus_and_haystack.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Retrieval-Augmented Generation (RAG) with Milvus and Haystack

This guide demonstrates how to build a Retrieval-Augmented Generation (RAG) system using Haystack and Milvus.

A RAG system pairs a retrieval system with a generative model. It first retrieves the documents most relevant to a prompt from a corpus, here using Milvus for vector similarity search, and then passes those documents to the generative model so that the generated text is grounded in them.
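As a rough illustration of that two-step flow, here is a toy sketch in plain Python with no external libraries. The word-overlap scoring and the `generate` stub are stand-ins invented for this example; in the actual pipeline, Milvus performs the similarity search and an LLM produces the text.

```python
# Toy illustration of retrieve-then-generate. Everything here is hypothetical:
# word overlap stands in for vector similarity, and generate() stands in for an LLM.

def tokenize(text):
    """Lowercase a string, drop basic punctuation, and return its set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, corpus, top_k=1):
    """Return the top_k corpus entries that share the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def generate(query, documents):
    """Stub generator: a real system would send this prompt to an LLM."""
    context = "\n".join(documents)
    return f"Context:\n{context}\nQuestion: {query}"

corpus = [
    "Milvus is a vector database.",
    "Haystack is a Python framework for LLM applications.",
]
query = "What is Milvus?"
docs = retrieve(query, corpus)
prompt = generate(query, docs)
print(prompt)
```

The point of the sketch is only the ordering: retrieval narrows the corpus first, and generation sees nothing but the query and the retrieved text.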

[Haystack](https://haystack.deepset.ai/) is an open-source Python framework by deepset for building custom applications with large language models (LLMs). [Milvus](https://milvus.io/) is an open-source vector database built to power embedding similarity search and AI applications.



## Prerequisites

Before running this notebook, make sure you have the following dependencies installed:

In [1]:
# ! pip install --upgrade --quiet pymilvus milvus-haystack

This guide uses models from OpenAI. Set your [API key](https://platform.openai.com/docs/quickstart) as the `OPENAI_API_KEY` environment variable.

In [2]:
import os
os.environ["OPENAI_API_KEY"] = "sk-***********"
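If you would rather export the key in your shell than paste it into the notebook, a small check like the one below can fail early with a clear message when the variable is missing. The helper name `require_env` is hypothetical and is not part of Haystack or Milvus.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper for illustration only.
    """
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return value
```

Calling `require_env("OPENAI_API_KEY")` at the top of the notebook would then surface a missing key before any pipeline is built.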

## Prepare the data

We use a popular online [essay](https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt) as the private knowledge for our RAG system. It is a good data source for a simple RAG pipeline.

Download it and save it as a local text file.

In [3]:
import requests
import os

url = "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"
file_path = "./paul_graham_essay.txt"

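# Download the essay only if it is not already cached locally.
if not os.path.exists(file_path):
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail early instead of saving an error page
    with open(file_path, "wb") as file:
        file.write(response.content)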

## Create the indexing pipeline

Create an indexing pipeline that converts the text file into documents, splits them into two-sentence chunks, and embeds each chunk. The embedded documents are then written to the Milvus document store.

In [4]:
from haystack import Pipeline
from haystack.components.converters import MarkdownToDocument
from haystack.components.embedders import OpenAIDocumentEmbedder, OpenAITextEmbedder
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter

from milvus_haystack import MilvusDocumentStore
from milvus_haystack.milvus_embedding_retriever import MilvusEmbeddingRetriever

document_store = MilvusDocumentStore(
    connection_args={"uri": "./milvus.db"},
    drop_old=True,
)
indexing_pipeline = Pipeline()
indexing_pipeline.add_component("converter", MarkdownToDocument())
indexing_pipeline.add_component("splitter", DocumentSplitter(split_by="sentence", split_length=2))
indexing_pipeline.add_component("embedder", OpenAIDocumentEmbedder())
indexing_pipeline.add_component("writer", DocumentWriter(document_store))
indexing_pipeline.connect("converter", "splitter")
indexing_pipeline.connect("splitter", "embedder")
indexing_pipeline.connect("embedder", "writer")
indexing_pipeline.run({"converter": {"sources": [file_path]}})

print("Number of documents:", document_store.count_documents())

Converting markdown files to Documents: 100%|█| 1/
Calculating embeddings: 100%|█| 12/12 [00:07<00:00
E20240515 20:52:54.001194 4751750 milvus_local.cpp:189] [SERVER][GetCollection][] Collecton HaystackCollection not existed
E20240515 20:52:54.001967 4751750 milvus_local.cpp:189] [SERVER][GetCollection][] Collecton HaystackCollection not existed
E20240515 20:52:54.002050 4751750 milvus_local.cpp:189] [SERVER][GetCollection][] Collecton HaystackCollection not existed
E20240515 20:52:54.002069 4751750 milvus_local.cpp:189] [SERVER][GetCollection][] Collecton HaystackCollection not existed


Number of documents: 368
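To get a feel for what `split_by="sentence"` with `split_length=2` means, the sketch below groups sentences into pairs using only the standard library. It is a rough approximation written for this guide, not Haystack's implementation: the regex-based sentence detection is an assumption, the sample text is made up, and the real `DocumentSplitter` may draw sentence boundaries differently.

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

def chunk_sentences(text, split_length=2):
    """Group consecutive sentences into chunks of split_length sentences each."""
    sentences = split_sentences(text)
    return [
        " ".join(sentences[i:i + split_length])
        for i in range(0, len(sentences), split_length)
    ]

# Made-up sample text, used only to show the grouping.
sample = (
    "I wrote simple games. I also wrote a word processor. "
    "My father used it. It beat a typewriter. That was fun."
)
chunks = chunk_sentences(sample)
for chunk in chunks:
    print(chunk)
```

Five sentences yield three chunks here, the last one holding the single leftover sentence. Small chunks like these keep each embedding focused on one idea, at the cost of storing more documents.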


## Create the retrieval pipeline

Create a retrieval pipeline that embeds the question and retrieves the most similar documents (here the top 3) from the Milvus document store using vector similarity search.

In [5]:
question = "what is the first programs the author tried writing?"

retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component("embedder", OpenAITextEmbedder())
retrieval_pipeline.add_component("retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3))
retrieval_pipeline.connect("embedder", "retriever")

retrieval_results = retrieval_pipeline.run({"embedder": {"text": question}})

for doc in retrieval_results["retriever"]["documents"]:
    print(doc.content)
    print("-" * 10)

 I wrote simple games, a program to predict how high my model rockets would fly, and a word processor that my father used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.
----------
 They had hardly any plot, just characters with strong feelings, which I imagined made them deep.

The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing.
----------


So in the summer of 1995, after I submitted the camera-ready copy of ANSI Common Lisp to the publishers, we started trying to write software to build online stores. At first this was going to be normal desktop software, which in those days meant Windows software.
----------
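Conceptually, the retriever ranks stored embeddings by their similarity to the query embedding and keeps the best `top_k`. The brute-force sketch below shows that idea with tiny hand-written vectors. Cosine similarity is assumed purely for illustration; the metric and index that Milvus actually uses depend on its configuration, and a vector database typically relies on approximate indexes rather than a full scan. All vectors and labels here are invented.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_embedding, stored, k=3):
    """Return the texts of the k stored entries most similar to the query."""
    scored = [(cosine_similarity(query_embedding, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Invented 3-dimensional "embeddings"; real embeddings have many more dimensions.
stored = [
    ("chunk about programming", [0.9, 0.1, 0.0]),
    ("chunk about painting", [0.0, 0.2, 0.9]),
    ("chunk about startups", [0.1, 0.9, 0.1]),
]
query_embedding = [1.0, 0.2, 0.0]
nearest = top_k(query_embedding, stored, k=2)
print(nearest)
```

This also explains why the third result above is only loosely related to the question: the retriever always returns `top_k` documents, even when fewer than that are a good match.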


## Create the RAG pipeline

Create a RAG pipeline that combines the `MilvusEmbeddingRetriever` and the `OpenAIGenerator` to answer the question using the retrieved documents.

In [6]:
from haystack.utils import Secret
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

prompt_template = """Answer the following query based on the provided context. If the context does
                     not include an answer, reply with 'I don't know'.\n
                     Query: {{query}}
                     Documents:
                     {% for doc in documents %}
                        {{ doc.content }}
                     {% endfor %}
                     Answer:
                  """

rag_pipeline = Pipeline()
rag_pipeline.add_component("text_embedder", OpenAITextEmbedder())
rag_pipeline.add_component("retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3))
rag_pipeline.add_component("prompt_builder", PromptBuilder(template=prompt_template))
rag_pipeline.add_component("generator", OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"),
                                                        generation_kwargs={"temperature": 0}))
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")

results = rag_pipeline.run(
    {
        "text_embedder": {"text": question},
        "prompt_builder": {"query": question},
    }
)
print('RAG answer:', results["generator"]["replies"][0])

RAG answer: The first programs the author tried writing were on the IBM 1401 used by the school district for data processing.
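To see roughly what the generator receives, the sketch below assembles a prompt with the same sections as the Jinja template above, using plain Python instead of `PromptBuilder`. The function `build_prompt` is a hypothetical stand-in, and its whitespace will not match the rendered template exactly.

```python
def build_prompt(query, documents):
    """Assemble a prompt with the same sections as the tutorial's template.

    Hypothetical stand-in for PromptBuilder; documents is a list of strings.
    """
    lines = [
        "Answer the following query based on the provided context. "
        "If the context does not include an answer, reply with 'I don't know'.",
        f"Query: {query}",
        "Documents:",
    ]
    lines.extend(documents)  # one retrieved chunk per line
    lines.append("Answer:")
    return "\n".join(lines)

example_prompt = build_prompt(
    "what is the first programs the author tried writing?",
    ["The first programs I tried writing were on the IBM 1401."],
)
print(example_prompt)
```

Ending the prompt with `Answer:` nudges the model to respond directly, and the "I don't know" instruction gives it an explicit fallback when the retrieved chunks do not contain the answer.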
