# RAG Movies Pipeline

Note: based on this [Haystack tutorial](https://haystack.deepset.ai/tutorials/27_first_rag_pipeline).

We will build a simple Haystack pipeline that implements retrieval-augmented generation (RAG): the data is embedded and stored, then searched and retrieved, and finally used by an LLM to answer questions. We use an OPEA text-embedding component and an OPEA generator, which represents the LLM.

Let's go, step by step!

In [None]:
!pip install datasets

## Downloading the Data

We download a [movie dataset](https://huggingface.co/datasets/facebook/wiki_movies).

In [2]:
from datasets import load_dataset
from haystack import Document

dataset = load_dataset("facebook/wiki_movies", split="train", trust_remote_code=True)

## Document Store

Initialize a `DocumentStore` to index the documents. Each document combines the question and answer of one dataset example; we use the first 1000 examples. We first create the documents from text, then compute their embeddings and write them to the store in the next section.

In [3]:
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()

In [5]:
docs = [Document(content=f"Q: {doc['question']} A: {doc['answer']}") for doc in dataset.select(range(1000))]
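
To make the document format concrete, here is a minimal sketch that applies the same string formatting to two made-up rows. The rows and the `to_content` helper are hypothetical and are not taken from the dataset; the real cell wraps each string in a Haystack `Document`.

```python
# Hypothetical rows shaped like the dataset examples (question/answer fields).
sample_rows = [
    {"question": "who directed Alien?", "answer": "Ridley Scott"},
    {"question": "what year was Alien released?", "answer": "1979"},
]


def to_content(row: dict) -> str:
    """Combine a question and its answer into one text block, as in the cell above."""
    return f"Q: {row['question']} A: {row['answer']}"


contents = [to_content(row) for row in sample_rows]
for text in contents:
    print(text)
```

Keeping the question and the answer in one text means a query that resembles the question can retrieve the answer along with it.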

## Document Embedding

We use the `OPEADocumentEmbedder` to convert each example to a vector and then store it in the document store.

In [6]:
from haystack_opea import OPEADocumentEmbedder

In [7]:
doc_embedder = OPEADocumentEmbedder("http://localhost:6006/v1")
doc_embedder.warm_up()

In [8]:
docs_with_embeddings = doc_embedder.run(docs)
document_store.write_documents(docs_with_embeddings["documents"])

Calculating embeddings: 100%|██████████| 32/32 [00:12<00:00,  2.51it/s]


1000
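
The progress bar reports 32 steps for 1000 documents, which is consistent with the embedder processing documents in batches of 32. That batch size is an assumption inferred from the output, not a documented setting. A minimal sketch of the arithmetic, with a hypothetical `make_batches` helper:

```python
import math


def make_batches(items: list, batch_size: int) -> list:
    """Split items into consecutive batches of at most batch_size elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


num_docs = 1000   # number of documents embedded above
batch_size = 32   # assumed; inferred from the 32/32 progress bar

batches = make_batches(list(range(num_docs)), batch_size)
print(len(batches))                       # number of batches
print(len(batches[-1]))                   # size of the final, partial batch
print(math.ceil(num_docs / batch_size))   # same count, computed directly
```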

## Building the RAG Pipeline

The pipeline starts with an `OPEATextEmbedder`, which converts the question into a vector used for retrieval.

In [9]:
from haystack_opea import OPEATextEmbedder

In [10]:
text_embedder = OPEATextEmbedder("http://localhost:6006/v1")

## Retriever

We use a simple in-memory retriever.

In [11]:
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

retriever = InMemoryEmbeddingRetriever(document_store)
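
Conceptually, an embedding retriever scores every stored vector against the query vector and returns the best matches. The sketch below ranks toy 3-dimensional vectors by cosine similarity. The vectors, the `top_k` helper and the choice of cosine similarity are illustrative assumptions, not the internals of `InMemoryEmbeddingRetriever`.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list, stored: dict, k: int = 2) -> list:
    """Return the k document texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy embeddings; real ones have hundreds of dimensions.
stored = {
    "doc about directors": [0.9, 0.1, 0.0],
    "doc about actors": [0.1, 0.9, 0.0],
    "doc about release years": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]

results = top_k(query, stored, k=2)
print(results)
```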

## Prompt

We define the prompt with the `PromptBuilder` component. Prompts are written as Jinja templates.

In [12]:
from haystack.components.builders.prompt_builder import PromptBuilder

In [13]:
prompt_template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{question}}
Answer:
"""

In [14]:
prompt_builder = PromptBuilder(template=prompt_template)
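
To see roughly what the LLM will receive, the sketch below assembles the same prompt in plain Python for two made-up documents. It approximates what the template renders to (exact whitespace may differ) and is not how `PromptBuilder` works internally; `render_prompt` and the sample texts are hypothetical.

```python
def render_prompt(documents: list, question: str) -> str:
    """Build a prompt with the same layout as the template above."""
    context = "\n".join(f"    {content}" for content in documents)
    return (
        "Given the following information, answer the question.\n\n"
        "Context:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up document contents in the "Q: ... A: ..." format used earlier.
sample_documents = [
    "Q: who directed Alien? A: Ridley Scott",
    "Q: who directed Gladiator? A: Ridley Scott",
]

prompt = render_prompt(sample_documents, "What are some movies by Ridley Scott?")
print(prompt)
```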

## LLM

We use the `OPEAGenerator`. It requires an endpoint, and optionally a model name and model options.

In [15]:
from haystack_opea import OPEAGenerator

In [16]:
llm = OPEAGenerator(
    "http://localhost:9009/v1",
    "Qwen/Qwen2.5-7B-Instruct",
    model_arguments={"max_tokens": 500}
)

## Building the Pipeline

A pipeline is built by adding the components and then connecting their outputs to the inputs of the next component.

In [17]:
from haystack import Pipeline

basic_rag_pipeline = Pipeline()

basic_rag_pipeline.add_component("text_embedder", text_embedder)
basic_rag_pipeline.add_component("retriever", retriever)
basic_rag_pipeline.add_component("prompt_builder", prompt_builder)
basic_rag_pipeline.add_component("llm", llm)

In [18]:
basic_rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
basic_rag_pipeline.connect("retriever", "prompt_builder")
basic_rag_pipeline.connect("prompt_builder.prompt", "llm.prompt")

<haystack.core.pipeline.pipeline.Pipeline object at 0x7fb56ef5bb60>
🚅 Components
  - text_embedder: OPEATextEmbedder
  - retriever: InMemoryEmbeddingRetriever
  - prompt_builder: PromptBuilder
  - llm: OPEAGenerator
🛤️ Connections
  - text_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)
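
The three connections listed above amount to a chain of function calls. The sketch below mirrors that data flow with stub functions in place of the real components; every function body here is a hypothetical stand-in, and no Haystack or OPEA code is involved.

```python
def embed_text(text: str) -> list:
    """Stub embedder: returns a fixed-size toy vector."""
    return [float(len(text)), 1.0]


def retrieve(query_embedding: list) -> list:
    """Stub retriever: ignores the vector and returns fixed documents."""
    return ["Q: who directed Alien? A: Ridley Scott"]


def build_prompt(documents: list, question: str) -> str:
    """Stub prompt builder: joins the context and the question."""
    context = "\n".join(documents)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def generate(prompt: str) -> dict:
    """Stub LLM: reports the prompt length instead of answering."""
    return {"replies": [f"stub reply for a prompt of {len(prompt)} characters"]}


def run_pipeline(question: str) -> dict:
    embedding = embed_text(question)            # text_embedder.embedding -> retriever.query_embedding
    documents = retrieve(embedding)             # retriever.documents -> prompt_builder.documents
    prompt = build_prompt(documents, question)  # prompt_builder.prompt -> llm.prompt
    return {"llm": generate(prompt)}


result = run_pipeline("What are some movies by Ridley Scott?")
print(result["llm"]["replies"][0])
```

The question enters the chain twice, once for embedding and once for the prompt, which is why both components need it as input when the pipeline runs.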

## Run

To ask a question, use the `run()` method of the pipeline. Provide the question to both the `text_embedder` and the `prompt_builder`: the embedder needs it for retrieval, and the prompt builder uses it to fill the `{{question}}` variable in the prompt template.

In [20]:
question = "What are some movies by Ridley Scott?"

In [21]:
response = basic_rag_pipeline.run({"text_embedder": {"text": question}, "prompt_builder": {"question": question}})

In [22]:
print(response['llm']["replies"][0])

Some movies by Ridley Scott include Gladiator, Alien, Prometheus, Blade Runner, American Gangster, Black Hawk Down, Kingdom of Heaven, Robin Hood, Hannibal, Body of Lies, Matchstick Men, The Counselor, A Good Year, G.I. Jane, Legend, Black Rain, White Squall, and The Duellists.
