```{contents}
```
## RetrievalQA Chain (LangChain)

### What a RetrievalQA Chain Is

A **RetrievalQA chain** is a LangChain abstraction that combines:

* **Retrieval** (searching relevant documents)
* **Generation** (answering using an LLM)

to answer questions **grounded in external data**.

> RetrievalQA = **Retriever + Prompt + LLM**

It is one of the earliest and most common **RAG (Retrieval-Augmented Generation)** patterns in LangChain.

---

### Why RetrievalQA Exists

LLMs alone:

* Hallucinate
* Have stale knowledge
* Cannot access private data

RetrievalQA addresses these problems by:

* Fetching relevant context at runtime
* Grounding answers in source documents
* Reducing hallucinations
* Enabling enterprise/private data QA

---

### Conceptual Flow

```
User Question
   ↓
Retriever (Vector DB / Search)
   ↓
Relevant Documents
   ↓
Prompt (stuffed with context)
   ↓
LLM
   ↓
Answer
```
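The flow above can be sketched in plain Python with no LangChain dependency. Everything here is hypothetical and for illustration only: `retrieve` ranks documents by naive word overlap, `build_prompt` stuffs them into a template, and `fake_llm` is a stand-in that echoes the first context line instead of calling a real model.

```python
import re

DOCS = [
    "LangChain is a framework for LLM apps",
    "FAISS is a library for vector similarity search",
    "Paris is the capital of France",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by the number of words shared with the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context_docs: list[str]) -> str:
    """Stuff the retrieved documents into a single prompt string."""
    context = "\n".join(context_docs)
    return (
        "Answer the question using the following context:\n"
        f"{context}\n\nQuestion: {question}"
    )

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: returns the first context line of the prompt."""
    return prompt.splitlines()[1]

def answer(question: str, docs: list[str] = DOCS) -> str:
    context_docs = retrieve(question, docs)          # Retriever
    prompt = build_prompt(question, context_docs)    # Prompt
    return fake_llm(prompt)                          # LLM

print(answer("What is LangChain?"))
```

In a real chain the retriever would query a vector store and `fake_llm` would be a model call; only the order of the steps carries over.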

---

### Core Components of RetrievalQA

#### Retriever

Responsible for fetching relevant documents.

Examples:

* VectorStore retriever (FAISS, Chroma, Pinecone)
* BM25 / keyword retriever
* Hybrid retriever
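A hybrid retriever merges the rankings from a keyword search and a vector search. One common way to merge them is reciprocal rank fusion (RRF). The sketch below assumes that approach, which this page does not prescribe. The two input rankings and the constant `k=60` are illustrative values, not output from a real retriever.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused scores rank first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

keyword_ranking = ["doc_a", "doc_b", "doc_c"]  # hypothetical BM25 result
vector_ranking = ["doc_b", "doc_d", "doc_a"]   # hypothetical vector-store result

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that appear near the top of both lists (`doc_b`, `doc_a`) end up ahead of documents that only one retriever found.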

---

#### Prompt

Defines how retrieved documents are injected into the LLM.

Typical pattern:

```
Answer the question using the following context:
{context}
```

---

#### LLM

Generates the final answer using the provided context.

---

### Basic RetrievalQA Demonstration

#### Step 1: Create a Retriever



```python
from langchain_classic.vectorstores import FAISS
from langchain_openai.embeddings import OpenAIEmbeddings

# Build a small in-memory FAISS index from a single text snippet
vectorstore = FAISS.from_texts(
    texts=["LangChain is a framework for LLM apps"],
    embedding=OpenAIEmbeddings()
)

# Expose the vector store through the retriever interface
retriever = vectorstore.as_retriever()
```



---

#### Step 2: Create the RetrievalQA Chain



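```python
from langchain_classic.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Temperature 0 keeps answers as deterministic as possible
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# "stuff" puts all retrieved documents into a single prompt
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"
)
```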







---

#### Step 3: Ask a Question



```python
result = qa_chain.invoke(
    {"query": "What is LangChain?"}
)

# The answer text is stored under the "result" key
print(result["result"])
```

Output (the exact wording varies between runs and models):

```text
LangChain is a framework designed for building applications that utilize large language models (LLMs). It provides tools and components to facilitate the development of LLM-based applications, making it easier for developers to integrate and manage these models in their projects.
```

---

### What RetrievalQA Returns

By default:

```python
{
  "query": "...",
  "result": "final answer"
}
```

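Optional:

* Source documents (under the `source_documents` key when `return_source_documents=True`)
* Metadata (attached to each returned document)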

---

### RetrievalQA with Source Documents



```python
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)

result = qa_chain.invoke({"query": "What is LangChain?"})

# The retrieved documents are returned alongside the answer
print(result["source_documents"])
```

Output:

```text
[Document(id='dedc0e08-e26e-481d-84ee-154d9279f09b', metadata={}, page_content='LangChain is a framework for LLM apps')]
```



Useful for:

* Debugging
* Citations
* Trust & explainability

---

### Chain Types in RetrievalQA

RetrievalQA internally uses **document combination chains**:

| chain_type | Description                                                |
| ---------- | ---------------------------------------------------------- |
| stuff      | Stuff all docs into one prompt                             |
| map_reduce | Process each doc independently, then combine the results   |
| refine     | Incrementally refine the answer, one doc at a time         |
| map_rerank | Answer from each doc, score the answers, and pick the best |
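The practical difference between the first three chain types is how many model calls they make and what each call sees. The sketch below illustrates that with a counting stub in place of a model. The prompts are simplified placeholders, not the prompts LangChain actually uses, and `map_rerank` is left out for brevity.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a model call; counts how often it is invoked."""
    fake_llm.calls += 1
    return f"<answer #{fake_llm.calls}>"

fake_llm.calls = 0

def stuff(question: str, docs: list[str]) -> str:
    # One call: every document goes into a single prompt.
    context = "\n\n".join(docs)
    return fake_llm(f"Context:\n{context}\n\nQuestion: {question}")

def map_reduce(question: str, docs: list[str]) -> str:
    # Map: one call per document. Reduce: one more call to combine the partial results.
    partials = [fake_llm(f"Context:\n{d}\n\nQuestion: {question}") for d in docs]
    return fake_llm("Combine these partial answers:\n" + "\n".join(partials))

def refine(question: str, docs: list[str]) -> str:
    # Answer from the first document, then revise once per remaining document.
    answer = fake_llm(f"Context:\n{docs[0]}\n\nQuestion: {question}")
    for d in docs[1:]:
        answer = fake_llm(
            f"Existing answer: {answer}\nNew context:\n{d}\nRefine the answer to: {question}"
        )
    return answer

def count_calls(strategy, question: str, docs: list[str]) -> int:
    """Run a strategy and report how many model calls it made."""
    fake_llm.calls = 0
    strategy(question, docs)
    return fake_llm.calls

docs = ["doc one", "doc two", "doc three"]
for strategy in (stuff, map_reduce, refine):
    print(strategy.__name__, count_calls(strategy, "What is LangChain?", docs))
```

With three documents, `stuff` makes one call, `refine` makes three, and `map_reduce` makes four. `stuff` is the cheapest option but only works while all documents fit in the context window.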

---

### RetrievalQA vs ConversationalRetrievalChain

| Aspect       | RetrievalQA | ConversationalRetrievalChain |
| ------------ | ----------- | ---------------------------- |
| Chat history | ❌           | ✅                            |
| Multi-turn   | ❌           | ✅                            |
| Memory       | ❌           | ✅                            |

Use RetrievalQA for **single-turn QA**.
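Chat history matters because follow-up questions often depend on earlier turns. A conversational chain typically asks the LLM to rewrite the follow-up as a standalone question before retrieval. The sketch below only builds such a rewrite prompt. The template wording and the helper names are hypothetical, not LangChain's built-in prompt.

```python
CONDENSE_TEMPLATE = (
    "Given the conversation below, rewrite the follow-up question "
    "as a standalone question.\n\n"
    "Chat history:\n{history}\n\n"
    "Follow-up question: {question}\n"
    "Standalone question:"
)

def format_history(chat_history: list[tuple[str, str]]) -> str:
    """Render (human, assistant) turns as plain text."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in chat_history)

def build_condense_prompt(chat_history: list[tuple[str, str]], follow_up: str) -> str:
    """Build the prompt an LLM would receive to rewrite the follow-up question."""
    return CONDENSE_TEMPLATE.format(
        history=format_history(chat_history), question=follow_up
    )

history = [("What is LangChain?", "LangChain is a framework for LLM apps.")]
follow_up = "Who maintains it?"

# On its own, "Who maintains it?" gives a retriever nothing to match on.
# The condense prompt carries the history needed to resolve "it".
print(build_condense_prompt(history, follow_up))
```

RetrievalQA has no such step, so each query must already be self-contained.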

---

### Limitations of RetrievalQA

* ❌ Legacy abstraction
* ❌ Limited customization
* ❌ Not fully Runnable-based
* ❌ Harder to debug
* ❌ Less control than LCEL

---

### RetrievalQA vs LCEL-based RAG (Modern)

#### RetrievalQA (Legacy)

```python
RetrievalQA.from_chain_type(...)
```

#### LCEL RAG (Recommended)

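```python
from langchain_core.runnables import RunnablePassthrough

# `retriever`, `prompt`, and `llm` are assumed to be defined already;
# `prompt` must accept the variables "context" and "input".
chain = (
    {"context": retriever, "input": RunnablePassthrough()}
    | prompt
    | llm
)
```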

LCEL provides:

* Better observability
* Streaming & async
* Custom guards
* Flexible composition
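The idea behind `|` composition can be shown with a toy class. This is a minimal sketch of the concept only: `Step` is a hypothetical class, not a LangChain type, and it does not reflect how LangChain implements its Runnables.

```python
class Step:
    """Minimal pipeable wrapper around a function (hypothetical, not a LangChain class)."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # `a | b` yields a new step that runs a, then feeds its output to b.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Each stage is a plain function wrapped in a Step.
retrieve = Step(lambda q: {"context": "LangChain is a framework for LLM apps", "input": q})
prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['input']}")
llm = Step(lambda p: p.splitlines()[0].removeprefix("Context: "))  # stub "model"

chain = retrieve | prompt | llm
print(chain.invoke("What is LangChain?"))
```

Because every stage is a separate object, any one of them can be swapped, wrapped, or inspected without touching the others, which is the flexibility the list above refers to.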

---

### When to Use RetrievalQA

* Learning RAG concepts
* Quick prototypes
* Legacy codebases
* Simple QA over documents

---

### When NOT to Use RetrievalQA

* Production RAG systems
* Multi-turn chat
* Agentic workflows
* Advanced reranking
* Streaming APIs

---

### Best Practices (If You Use It)

* Use low temperature (0–0.2)
* Limit retrieved documents
* Enable source documents
* Validate token size
* Plan migration to LCEL
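Limiting retrieved documents and validating token size can be combined into one guard that runs before the prompt is built. The sketch below is a rough illustration: the four-characters-per-token estimate is an assumption, and `fit_to_budget` is a hypothetical helper. A real check would use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def fit_to_budget(docs: list[str], max_docs: int = 4, token_budget: int = 50) -> list[str]:
    """Keep documents in ranked order until the doc cap or the token budget is reached."""
    kept, used = [], 0
    for doc in docs[:max_docs]:
        cost = estimate_tokens(doc)
        if used + cost > token_budget:
            break
        kept.append(doc)
        used += cost
    return kept

# Five dummy documents of 80 characters each (about 20 tokens apiece).
docs = ["a" * 80, "b" * 80, "c" * 80, "d" * 80, "e" * 80]
kept = fit_to_budget(docs, max_docs=4, token_budget=50)
print(len(kept))
```

On the retriever side, the number of documents can also be capped up front, for example with `vectorstore.as_retriever(search_kwargs={"k": 2})`.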

---

### Interview-Ready Summary

> “RetrievalQA is a LangChain chain that answers questions by retrieving relevant documents and passing them to an LLM. It implements a classic RAG pattern but is a legacy abstraction superseded by LCEL-based pipelines.”

---

### Rule of Thumb

* **Simple single-turn QA → RetrievalQA**
* **Conversational QA → ConversationalRetrievalChain**
* **Production RAG → LCEL or LangGraph**
