# Tutorial: Creating Your First QA Pipeline with Retrieval-Augmentation

In [1]:
## Environment
# %%bash
# pip install haystack-ai
# pip install "datasets>=2.6.1"
# pip install "sentence-transformers>=2.2.0"

## Fetching and Indexing Documents


### Initializing the DocumentStore

- Initialize a **DocumentStore** to index your documents
- A DocumentStore stores the Documents that the question answering system uses to find answers to your questions

In [2]:
from haystack.document_stores.in_memory import InMemoryDocumentStore
document_store = InMemoryDocumentStore()



### Fetch the Data

In [3]:
from datasets import load_dataset
from haystack import Document

dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
docs = [Document(content=doc["content"], meta=doc["meta"]) for doc in dataset]

In [4]:
docs[0]

Document(id=75fd8474f2c88337f7e0dad69eba0f24ba293cb06693fb746ec403df01a1c0c5, content: 'The Colossus of Rhodes (Ancient Greek: ὁ Κολοσσὸς Ῥόδιος, romanized: ho Kolossòs Rhódios Greek: Κολο...', meta: {'url': 'https://en.wikipedia.org/wiki/Colossus_of_Rhodes', '_split_id': 0})

### Initialize a Document Embedder

To store your data in the `DocumentStore` with embeddings, initialize a `SentenceTransformersDocumentEmbedder` with the model name and call `warm_up()` to download the embedding model.

In [5]:
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

doc_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
doc_embedder.warm_up()




### Write Documents to the DocumentStore

In [6]:
docs_with_embeddings = doc_embedder.run(docs)

Batches: 100%|██████████| 5/5 [00:01<00:00,  3.64it/s]


In [7]:
docs_with_embeddings['documents'][0]

Document(id=75fd8474f2c88337f7e0dad69eba0f24ba293cb06693fb746ec403df01a1c0c5, content: 'The Colossus of Rhodes (Ancient Greek: ὁ Κολοσσὸς Ῥόδιος, romanized: ho Kolossòs Rhódios Greek: Κολο...', meta: {'url': 'https://en.wikipedia.org/wiki/Colossus_of_Rhodes', '_split_id': 0}, embedding: vector of size 384)

In [8]:
document_store.write_documents(docs_with_embeddings["documents"])

151

## Building the RAG Pipeline

The next step is to build a `Pipeline` to generate answers for the user query following the `RAG` approach. To create the pipeline, you first need to initialize *each component*, add them to your pipeline, and connect them.

### Initialize a Text Embedder

Initialize a text embedder to create an embedding for the user query. The created embedding will later be used by the Retriever to retrieve relevant documents from the DocumentStore.
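
To make the role of the query embedding concrete, here is a minimal plain-Python sketch of embedding-based retrieval. It is an illustration only, not Haystack's implementation: the three-dimensional vectors and the `toy_index` and `toy_query` names are hypothetical, and cosine similarity is assumed as the scoring function (the similarity the DocumentStore actually applies depends on how it is configured).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, indexed_docs, top_k=2):
    """Return the top_k (score, text) pairs, best match first."""
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in indexed_docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical 3-dimensional embeddings; the model used in this
# tutorial produces 384-dimensional vectors.
toy_index = [
    ("Colossus of Rhodes", [0.9, 0.1, 0.0]),
    ("Statue of Zeus at Olympia", [0.7, 0.3, 0.1]),
    ("Hanging Gardens of Babylon", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.0, 0.0]  # stands in for the embedded user question

results = retrieve(toy_query, toy_index, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The query and the documents must be embedded with the same model, which is why both embedders in this tutorial use `sentence-transformers/all-MiniLM-L6-v2`.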

In [9]:
from haystack.components.embedders import SentenceTransformersTextEmbedder
text_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")


### Initialize the Retriever

In [10]:
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
retriever = InMemoryEmbeddingRetriever(document_store)


### Define a Template Prompt

- Create a custom `prompt` for a general question answering task using the RAG approach.
- The `prompt` takes two parameters: `documents`, which are retrieved from the document store, and a `question` from the user.
- Use the `Jinja2` looping syntax to combine the content of the retrieved documents in the prompt.

- Next, initialize a `PromptBuilder` instance with your prompt template. When given the necessary values, the `PromptBuilder` automatically fills in the variables and generates a complete prompt. This approach allows for a more tailored and effective question answering experience.

In [11]:
from haystack.components.builders import PromptBuilder

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{question}}
Answer:
"""

prompt_builder = PromptBuilder(template=template)
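
To see roughly what the rendered prompt looks like, the sketch below mimics the template with plain Python string formatting. It is a stand-in for Jinja2 rendering, not what `PromptBuilder` runs internally; `render_prompt` and `toy_documents` are hypothetical names, the document strings are placeholders, and whitespace may differ slightly from the real output.

```python
def render_prompt(documents, question):
    """Plain-Python approximation of the tutorial's prompt template."""
    # One indented line per retrieved document, like the Jinja2 for-loop.
    context = "\n".join(f"    {content}" for content in documents)
    return (
        "Given the following information, answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder strings standing in for the content of retrieved Documents.
toy_documents = [
    "The Colossus of Rhodes was a statue of Helios.",
    "It was destroyed in an earthquake.",
]

prompt = render_prompt(toy_documents, "What does Rhodes Statue look like?")
print(prompt)
```

Ending the prompt with `Answer:` nudges the model to continue with the answer itself instead of restating the question.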


### Initialize a Generator

In [12]:
# !pip install ollama-haystack

In [13]:
from haystack_integrations.components.generators.ollama import OllamaGenerator

# Assumes an Ollama server is running locally and the "llama3" model has been pulled.
generator = OllamaGenerator(
    model="llama3",
    url="http://localhost:11434/api/generate",
    generation_kwargs={
        "num_predict": 100,  # maximum number of tokens to generate
        "temperature": 0.9,  # higher values give more varied output
    },
)


### Build the Pipeline

In [14]:
from haystack import Pipeline

basic_rag_pipeline = Pipeline()
# Add components to your pipeline
basic_rag_pipeline.add_component("text_embedder", text_embedder)
basic_rag_pipeline.add_component("retriever", retriever)
basic_rag_pipeline.add_component("prompt_builder", prompt_builder)
basic_rag_pipeline.add_component("llm", generator)

# Now, connect the components to each other
basic_rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
basic_rag_pipeline.connect("prompt_builder", "llm")


<haystack.core.pipeline.pipeline.Pipeline object at 0x34e7ebc70>
🚅 Components
  - text_embedder: SentenceTransformersTextEmbedder
  - retriever: InMemoryEmbeddingRetriever
  - prompt_builder: PromptBuilder
  - llm: OllamaGenerator
🛤️ Connections
  - text_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)
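
The three connections listed above amount to a simple chain of function calls. The sketch below traces that data flow with hypothetical stand-in functions (`embed_text`, `retrieve`, `build_prompt`, `generate`); none of them call Haystack, a real embedding model, or an LLM, so treat it as a mental model only.

```python
def embed_text(text):
    # Stand-in for the text embedder: vowel counts instead of a real model.
    return [float(text.lower().count(ch)) for ch in "aeiou"]

def retrieve(query_embedding, store, top_k=1):
    # Stand-in for the retriever: rank stored documents by dot product.
    def score(doc):
        return sum(q * d for q, d in zip(query_embedding, doc["embedding"]))
    return sorted(store, key=score, reverse=True)[:top_k]

def build_prompt(documents, question):
    # Stand-in for the prompt builder.
    context = "\n".join(doc["content"] for doc in documents)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def generate(prompt):
    # Stand-in for the generator: returns a stub instead of calling an LLM.
    return {"replies": [f"(stub reply to a {len(prompt)}-character prompt)"]}

def run_pipeline(question, store):
    query_embedding = embed_text(question)        # text_embedder.embedding -> retriever.query_embedding
    documents = retrieve(query_embedding, store)  # retriever.documents -> prompt_builder.documents
    prompt = build_prompt(documents, question)    # prompt_builder.prompt -> llm.prompt
    return {"llm": generate(prompt)}

# Placeholder documents, embedded with the same stand-in embedder.
toy_store = [
    {"content": text, "embedding": embed_text(text)}
    for text in ("placeholder text about a statue", "placeholder text about a garden")
]

response = run_pipeline("What does Rhodes Statue look like?", toy_store)
print(response["llm"]["replies"][0])
```

Note that the question enters the chain twice, once to be embedded and once to be placed in the prompt. That is why the real pipeline's `run()` call supplies it to both `text_embedder` and `prompt_builder`.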

### Asking a Question

In [15]:
question = "What does Rhodes Statue look like?"
response = basic_rag_pipeline.run({"text_embedder": {"text": question}, "prompt_builder": {"question": question}})
print(response["llm"]["replies"][0])

Batches: 100%|██████████| 1/1 [00:02<00:00,  2.84s/it]


According to the text, there is no description of what the Rhodes Colossus looked like, as it was a statue of Helios (the sun god) that stood for 54 years until it was destroyed in an earthquake. The text only discusses the construction and features of the Statue of Zeus at Olympia.
