# Building retriever pipelines with an LLM

The provided code implements a Retrieval-Augmented Generation (RAG) pipeline. It retrieves relevant documents from an in-memory document store and uses a language model (OpenAI's GPT) to generate an answer based on those documents. Here's a step-by-step breakdown of what this code does:

### Imports and Setup:
The cell below imports the required components from the `haystack` package.
The OpenAI API key is loaded from a local `.env` file with `load_dotenv` and read from the `OPENAI_API_KEY` environment variable. The key is required to call the OpenAI API for generating answers.

In [5]:
import os
from haystack import Document
from haystack import Pipeline
from haystack.components.builders.answer_builder import AnswerBuilder
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from dotenv import load_dotenv
from haystack.utils import Secret

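# Load variables from a local .env file into the process environment
load_dotenv('.env')

# Read the key for reference; the generator reads it again via Secret.from_env_var
openai_api_key = os.getenv("OPENAI_API_KEY")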
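
Because the key is only resolved from the environment when the generator needs it, a missing key can surface late. The helper below is a minimal sketch of an early check; `require_env` is a hypothetical name introduced here for illustration and is not part of Haystack or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of Haystack or python-dotenv.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to your .env file."
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv('.env')` would stop the notebook with a readable message before any pipeline component is built.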

### Prompt Template Definition:
The `prompt_template` is a string that defines how the retrieved documents and the query are formatted for the language model.
The template uses Jinja2 syntax to iterate over the documents and inject their content into the prompt, followed by the question. `PromptBuilder` fills in the `documents` and `question` variables when the pipeline runs.

In [2]:
prompt_template = """
    Given these documents, answer the question.\nDocuments:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    \nQuestion: {{question}}
    \nAnswer:
    """
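
To make the effect of the template concrete, the sketch below approximates its loop and substitutions in plain Python. It is a stand-in and not the Jinja2 renderer that `PromptBuilder` uses, so whitespace and indentation in the real prompt will differ; `render_prompt` is a hypothetical name used only here.

```python
def render_prompt(doc_contents: list[str], question: str) -> str:
    """Plain-Python approximation of the template's loop and substitutions."""
    lines = ["Given these documents, answer the question.", "Documents:"]
    # One line per document, mirroring the {% for doc in documents %} loop
    lines.extend(doc_contents)
    # The question and the answer cue follow the documents
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


prompt = render_prompt(
    ["There are over 7,000 languages spoken around the world today."],
    "How many languages are there?",
)
print(prompt)
```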


### Creating the RAG Pipeline:

* Pipeline creation: An instance of the Haystack Pipeline class is created to represent the full RAG flow.

Adding components:
* Retriever (`InMemoryBM25Retriever`): This component retrieves documents from the `InMemoryDocumentStore`, ranked by BM25 keyword relevance to the query.
* Prompt Builder (`PromptBuilder`): This component uses the prompt template to build a prompt from the documents retrieved and the query.
* Generator (`OpenAIGenerator`): This component sends the generated prompt to OpenAI's GPT model and retrieves the generated answer.
* Answer Builder (`AnswerBuilder`): This component processes the output from the generator to format and extract the final answer.

In [6]:
rag_pipeline = Pipeline()
rag_pipeline.add_component(instance=InMemoryBM25Retriever(document_store=InMemoryDocumentStore()), name="retriever")
rag_pipeline.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
rag_pipeline.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
rag_pipeline.add_component(instance=AnswerBuilder(), name="answer_builder")
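
For intuition about what a BM25 retriever computes, here is a minimal sketch of classic Okapi BM25 scoring in plain Python. It is an illustration under simplifying assumptions (a naive tokenizer, default-looking `k1` and `b` values chosen here), and Haystack may use a different BM25 variant and tokenization, so these scores will not match the scores the pipeline reports. The names `tokenize` and `bm25_scores` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (a simplification)."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with classic Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents that contain each term
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


docs = [
    "There are over 7,000 languages spoken around the world today.",
    "Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors.",
    "In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves.",
]
scores = bm25_scores("How many languages are there?", docs)
print(scores)
```

Only the first document shares terms with the query, so in this sketch it is the only one with a positive score.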


### Connecting Components:
The components are connected with `rag_pipeline.connect()`. The flow is mostly sequential, except that the retriever feeds two components:

* The retriever sends retrieved documents to the prompt builder.
* The prompt builder generates the formatted prompt and sends it to the language model.
* The language model's replies are sent to the answer builder, which extracts and returns the final answers.
* The retriever also sends its documents to the answer builder, so each answer carries the documents it was based on.

In [8]:
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")
rag_pipeline.connect("llm.replies", "answer_builder.replies")
rag_pipeline.connect("retriever", "answer_builder.documents")


<haystack.core.pipeline.pipeline.Pipeline object at 0x14141b200>
🚅 Components
  - retriever: InMemoryBM25Retriever
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
  - answer_builder: AnswerBuilder
🛤️ Connections
  - retriever.documents -> prompt_builder.documents (List[Document])
  - retriever.documents -> answer_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)
  - llm.replies -> answer_builder.replies (List[str])
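
The connections listed above form a small directed graph. As a rough illustration of the dependency order they imply, the sketch below uses Python's standard `graphlib` module (Python 3.9+); it is not a description of how Haystack schedules components internally.

```python
from graphlib import TopologicalSorter

# Each pair is (sender, receiver), mirroring the four connect() calls
connections = [
    ("retriever", "prompt_builder"),
    ("prompt_builder", "llm"),
    ("llm", "answer_builder"),
    ("retriever", "answer_builder"),
]

# TopologicalSorter expects a mapping of node -> set of predecessors
predecessors: dict[str, set[str]] = {}
for sender, receiver in connections:
    predecessors.setdefault(sender, set())
    predecessors.setdefault(receiver, set()).add(sender)

order = list(TopologicalSorter(predecessors).static_order())
print(order)
# → ['retriever', 'prompt_builder', 'llm', 'answer_builder']
```

The answer builder comes last because it waits for both the retriever's documents and the model's replies.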

### Drawing the Pipeline:
The `rag_pipeline.draw("./rag_pipeline.png")` call renders the pipeline graph and saves it as an image (`rag_pipeline.png`), which makes the structure and data flow easy to inspect.


In [9]:
rag_pipeline.draw("./rag_pipeline.png")


![](./rag_pipeline.png)

### Adding Documents:
Three example documents are created with some content:
* Document 1: "There are over 7,000 languages spoken around the world today."
* Document 2: "Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors."
* Document 3: "In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves."


These documents are then added to the `InMemoryDocumentStore` using the `write_documents()` method.

In [10]:
# Add Documents
documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(content="Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors."),
    Document(content="In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves."),
]
rag_pipeline.get_component("retriever").document_store.write_documents(documents)


3

### Running the Pipeline:
* The pipeline is executed with the query "How many languages are there?", which is passed to the retriever, the prompt builder, and the answer builder.
* The retriever component fetches documents relevant to the query.
* The prompt builder then formats the documents and query into a prompt that is sent to the OpenAI model for generating an answer.
* The answer builder extracts the answer from the model's response.

In [11]:
question = "How many languages are there?"
result = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
        "answer_builder": {"query": question},
    }
)
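
Since the same question is routed to three components, it can be convenient to wrap the call. The following is a minimal sketch; `build_run_inputs` and `ask` are hypothetical helper names, and `ask` assumes only that the object passed in has a `run(dict)` method that returns the result layout used in this notebook.

```python
def build_run_inputs(question: str) -> dict:
    """Build the input mapping that routes one question to three components."""
    return {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
        "answer_builder": {"query": question},
    }


def ask(pipeline, question: str):
    """Run the pipeline and return the first answer from the answer builder.

    `pipeline` is any object with a run(dict) method, such as rag_pipeline.
    """
    result = pipeline.run(build_run_inputs(question))
    return result["answer_builder"]["answers"][0]
```

With these helpers, `ask(rag_pipeline, "How many languages are there?")` would return the same object that the next cell prints.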


In [12]:
print(result['answer_builder']['answers'][0])


GeneratedAnswer(data='There are over 7,000 languages spoken around the world today.', query='How many languages are there?', documents=[Document(id=cfe93bc1c274908801e6670440bf2bbba54fad792770d57421f85ffa2a4fcc94, content: 'There are over 7,000 languages spoken around the world today.', score: 3.9351818820430142), Document(id=6f20658aeac3c102495b198401c1c0c2bd71d77b915820304d4fbc324b2f3cdb, content: 'Elephants have been observed to behave in a way that indicates a high level of self-awareness, such ...', score: 1.8390548493969865), Document(id=7f225626ad1019b273326fbaf11308edfca6d663308a4a3533ec7787367d59a2, content: 'In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the ph...', score: 1.8390548493969865)], meta={})
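
The printed `GeneratedAnswer` is dense, so a small formatter can make it easier to read. The sketch below relies only on the attribute names visible in the output above (`data`, `documents`, and each document's `content` and `score`); `summarize_answer` is a hypothetical helper, and the demo uses a stand-in object so it runs without an API call.

```python
from types import SimpleNamespace


def summarize_answer(answer) -> str:
    """Format an answer and its supporting documents as readable text."""
    lines = [f"Answer: {answer.data}", "Sources:"]
    # List supporting documents from highest to lowest retrieval score
    for doc in sorted(answer.documents, key=lambda d: d.score, reverse=True):
        lines.append(f"  [{doc.score:.2f}] {doc.content}")
    return "\n".join(lines)


# Stand-in with the same attribute names as the printed answer (illustrative values)
demo = SimpleNamespace(
    data="There are over 7,000 languages spoken around the world today.",
    documents=[
        SimpleNamespace(
            content="There are over 7,000 languages spoken around the world today.",
            score=3.94,
        )
    ],
)
print(summarize_answer(demo))
```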


## Summary of What Happens:
* Document Retrieval: The `InMemoryBM25Retriever` fetches relevant documents based on the question "How many languages are there?".
* Prompt Creation: The PromptBuilder formats the documents and the question into a prompt for the OpenAI model.
* Answer Generation: The OpenAIGenerator sends the prompt to OpenAI's GPT model, and the model generates an answer.
* Answer Extraction: The AnswerBuilder processes the model's response and returns the final answer.
* Output: The answer (e.g., "Over 7,000 languages are spoken around the world today") is printed as the final result.

This pipeline demonstrates how to implement a Retrieval-Augmented Generation (RAG) pipeline, which combines document retrieval and language model generation to answer questions using external information.