```{contents}
```

## ConversationalRetrievalChain

### What ConversationalRetrievalChain Is

A **ConversationalRetrievalChain** is a LangChain abstraction that extends the **RetrievalQA** pattern to **multi-turn conversations** by combining:

* **Chat history (memory)**
* **Question rewriting**
* **Document retrieval**
* **Answer generation**

> It enables **context-aware RAG**, where follow-up questions depend on previous turns.

---

### Why ConversationalRetrievalChain Exists

In real conversations, users ask:

* “What is LangChain?”
* “How does it work?”
* “Does it support agents?”

The last two questions are **ambiguous without context**: “it” only resolves to LangChain if the first turn is known.

ConversationalRetrievalChain:

* Rewrites follow-up questions into standalone queries
* Retrieves relevant documents using that rewritten query
* Produces answers grounded in the documents retrieved for that rewritten query

---

### Conceptual Flow

```
User Question + Chat History
   ↓
Question Condenser (LLM)
   ↓
Standalone Question
   ↓
Retriever
   ↓
Relevant Documents
   ↓
LLM
   ↓
Answer
```
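
The flow above can be sketched in plain Python with stand-in functions. This is a minimal illustration under stated assumptions, not LangChain's implementation: `run_turn`, `fake_condense`, `fake_retrieve`, and `fake_answer` are hypothetical names standing in for the chain, the condenser LLM call, the retriever, and the answer LLM call. When the history is empty there is nothing to resolve, so the sketch passes the question through unchanged.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (human message, AI message) pairs


def run_turn(
    question: str,
    history: History,
    condense: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
) -> str:
    """Run one conversational-retrieval turn and record it in the history."""
    # With no prior turns there is nothing to resolve, so skip the rewrite.
    standalone = condense(question, history) if history else question
    docs = retrieve(standalone)        # retrieval uses the rewritten question
    reply = answer(standalone, docs)   # the answer is built from the documents
    history.append((question, reply))  # store the user's original wording
    return reply


# Hypothetical stand-ins so the sketch runs without an LLM or a vector store.
def fake_condense(question: str, history: History) -> str:
    return question.replace(" it ", " LangChain ")


def fake_retrieve(query: str) -> List[str]:
    return [f"doc matching: {query}"]


def fake_answer(query: str, docs: List[str]) -> str:
    return f"Answer to '{query}' using {len(docs)} document(s)"


history: History = []
first = run_turn("What is LangChain?", history, fake_condense, fake_retrieve, fake_answer)
second = run_turn("Does it support agents?", history, fake_condense, fake_retrieve, fake_answer)
print(first)
print(second)
```

The second call goes through the condenser because the history is no longer empty, so the retriever sees the rewritten question rather than the ambiguous one.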

---

### Core Components

#### Chat History (Memory)

Stores prior Human/AI messages and provides conversational context.
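
Before the history reaches the condenser prompt, it has to be flattened into text. The sketch below shows one plausible way to do that; `format_chat_history` and the `Human:` / `Assistant:` labels are assumptions chosen for illustration, not a description of what the library does internally.

```python
from typing import List, Tuple


def format_chat_history(turns: List[Tuple[str, str]]) -> str:
    """Render (human, ai) turns as a transcript for the condenser prompt."""
    lines = []
    for human, ai in turns:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    return "\n".join(lines)


turns = [
    ("What is LangChain?", "LangChain is a framework for building LLM-powered applications."),
]
transcript = format_chat_history(turns)
print(transcript)
```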

---

#### Question Condenser

An LLM prompt that converts a follow-up question into a **standalone query**.

Example:

```
Original: "Does it support agents?"
Rewritten: "Does LangChain support agents?"
```
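
A condenser prompt typically combines the transcript and the follow-up question in one template. The following is a rough sketch: `CONDENSE_TEMPLATE` and `build_condense_prompt` are hypothetical, and the wording is illustrative rather than the prompt the library ships with.

```python
# Hypothetical template; the wording is illustrative, not the library's prompt.
CONDENSE_TEMPLATE = (
    "Given the conversation below and a follow-up question, rewrite the "
    "follow-up as a standalone question.\n\n"
    "Chat history:\n{chat_history}\n\n"
    "Follow-up question: {question}\n"
    "Standalone question:"
)


def build_condense_prompt(chat_history: str, question: str) -> str:
    """Fill the template; the resulting string is what the LLM would receive."""
    return CONDENSE_TEMPLATE.format(chat_history=chat_history, question=question)


prompt = build_condense_prompt(
    chat_history="Human: What is LangChain?\nAssistant: A framework for LLM apps.",
    question="Does it support agents?",
)
print(prompt)
```

Ending the template with `Standalone question:` nudges the model to return only the rewritten question, which keeps the retrieval query clean.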

---

#### Retriever

Fetches relevant documents using the rewritten question.
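
To see why the rewrite matters for retrieval, here is a toy retriever that ranks documents by word overlap with the query. It is a deliberately simplified stand-in for embedding similarity search; `tokenize`, `retrieve`, and the third (FAISS) document are assumptions added for the example.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[int, str]]:
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return scored[:k]


docs = [
    "LangChain is a framework for building LLM-powered applications.",
    "LangChain supports agents, tools, and retrieval pipelines.",
    "FAISS is a library for vector similarity search.",  # hypothetical extra doc
]

vague = retrieve("How does it work?", docs)
rewritten = retrieve("How does LangChain work?", docs)
print([score for score, _ in vague])
print([score for score, _ in rewritten])
```

The ambiguous query matches nothing, so its ranking is arbitrary; the rewritten query shares a term with both LangChain documents and ranks them ahead of the unrelated one.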

---

#### Answer Generation Chain

Uses retrieved documents + question to generate the final answer.

---

### Basic ConversationalRetrievalChain Demonstration

#### Step 1: Create a Retriever



```python
from langchain_classic.vectorstores import FAISS
from langchain_openai.embeddings import OpenAIEmbeddings

texts = [
    "LangChain is a framework for building LLM-powered applications.",
    "LangChain supports agents, tools, and retrieval pipelines."
]

# Embed the texts, index them in FAISS, and expose the index as a retriever.
vectorstore = FAISS.from_texts(texts, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
```




---

#### Step 2: Create Memory



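```python
from langchain_classic.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",  # key under which the history is passed to the chain
    return_messages=True,       # keep HumanMessage/AIMessage objects, not one string
    output_key="answer"         # which output to store when the chain returns several
)
```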



---

#### Step 3: Create the ConversationalRetrievalChain



```python
from langchain_classic.chains.conversational_retrieval.base import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True,
)
```



---

#### Step 4: Run a Conversation



```python
qa_chain.invoke({"question": "What is LangChain?"})
qa_chain.invoke({"question": "Does it support agents?"})
```

Output of the second call:

```
{'question': 'Does it support agents?',
 'chat_history': [HumanMessage(content='What is LangChain?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='LangChain is a framework for building applications powered by large language models (LLMs). It supports various components such as agents, tools, and retrieval pipelines to facilitate the development of these applications.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Does it support agents?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Yes, LangChain supports agents.', additional_kwargs={}, response_metadata={})],
 'answer': 'Yes, LangChain supports agents.',
 'source_documents': [Document(id='c72ccfcb-188c-4eb0-9ce3-946df5995104', metadata={}, page_content='LangChain supports agents, tools, and retrieval pipelines.'),
  Document(id='84cf6952-5d3b-431c-b442-d69d1cfe9665', metadata={}, page_content='LangChain is a framework for building LLM-powered applications.')]}
```



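The second question is automatically **contextualized**: the chain uses the stored chat history to rewrite “Does it support agents?” into a standalone question about LangChain before retrieving documents.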

---

### What the Chain Returns

```python
{
  "question": "...",
  "chat_history": [...],
  "answer": "...",
  "source_documents": [...]  # present when return_source_documents=True
}
```

Useful for:

* UI display
* Citations
* Debugging
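
For UI display and citations, the result dictionary can be turned into an answer followed by numbered sources. The sketch below assumes each source exposes `page_content` and `metadata`, as the documents in the output above do; `Doc`, `render_answer`, and the `notes.txt` source label are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document (hypothetical)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def render_answer(result: Dict[str, Any]) -> str:
    """Format the answer followed by a numbered list of its sources."""
    lines = [result["answer"], "", "Sources:"]
    for i, doc in enumerate(result.get("source_documents", []), start=1):
        label = doc.metadata.get("source", "unknown source")
        lines.append(f"[{i}] {label}: {doc.page_content}")
    return "\n".join(lines)


result = {
    "question": "Does it support agents?",
    "answer": "Yes, LangChain supports agents.",
    "source_documents": [
        Doc("LangChain supports agents, tools, and retrieval pipelines.", {"source": "notes.txt"}),
    ],
}
rendered = render_answer(result)
print(rendered)
```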

---

### ConversationalRetrievalChain vs RetrievalQA

| Aspect             | RetrievalQA | ConversationalRetrievalChain |
| ------------------ | ----------- | ---------------------------- |
| Chat history       | ❌           | ✅                            |
| Multi-turn         | ❌           | ✅                            |
| Question rewriting | ❌           | ✅                            |
| Context-aware      | ❌           | ✅                            |

---

### Chain Types for Answer Generation

Internally, the answer step supports the same document-combining strategies as RetrievalQA, selected with the `chain_type` argument of `from_llm`:

* `stuff`
* `map_reduce`
* `refine`
* `map_rerank`
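
The strategies differ mainly in how many LLM calls they make and what each call sees. The sketch below illustrates three of them with a stub LLM that counts its calls; the function names and prompt wording are assumptions for illustration, not the library's implementation. `map_rerank` is omitted for brevity: it answers per document with a score and returns the highest-scoring answer.

```python
from typing import Callable, List

LLM = Callable[[str], str]


def stuff(llm: LLM, question: str, docs: List[str]) -> str:
    """One call: all documents go into a single prompt."""
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def map_reduce(llm: LLM, question: str, docs: List[str]) -> str:
    """One call per document, then one call to combine the partial answers."""
    partials = [llm(f"Context:\n{d}\n\nQuestion: {question}") for d in docs]
    summaries = "\n".join(partials)
    return llm(f"Combine these partial answers:\n{summaries}\n\nQuestion: {question}")


def refine(llm: LLM, question: str, docs: List[str]) -> str:
    """Sequential calls: each document updates the running answer."""
    answer = ""
    for d in docs:
        answer = llm(f"Current answer: {answer}\nNew context:\n{d}\n\nQuestion: {question}")
    return answer


calls: List[str] = []


def counting_llm(prompt: str) -> str:
    """Hypothetical LLM stub that records each prompt it receives."""
    calls.append(prompt)
    return f"response {len(calls)}"


docs = ["doc A", "doc B", "doc C"]
call_counts = {}
for name, strategy in [("stuff", stuff), ("map_reduce", map_reduce), ("refine", refine)]:
    calls.clear()
    strategy(counting_llm, "Does LangChain support agents?", docs)
    call_counts[name] = len(calls)

print(call_counts)
```

With three documents, `stuff` makes one call, `refine` makes three, and `map_reduce` makes four, which is the usual trade-off between prompt size and call count.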

---

### Limitations of ConversationalRetrievalChain

* ❌ Legacy abstraction
* ❌ Less control over internals
* ❌ Harder to customize prompts
* ❌ Not fully Runnable-based

---

### Conversational Retrieval vs LCEL (Modern RAG)

#### Legacy Approach

```python
ConversationalRetrievalChain.from_llm(...)
```

#### LCEL-Based Context-Aware RAG (Recommended)

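```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# condense_prompt and answer_prompt are prompt templates you define yourself.
standalone_q = condense_prompt | llm | StrOutputParser()
rag_chain = (
    standalone_q  # rewrite the follow-up into a standalone question string
    | {"context": retriever, "input": RunnablePassthrough()}
    | answer_prompt
    | llm
)
```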

LCEL offers:

* Full control
* Streaming
* Custom guardrails
* Better observability

---

### Common Use Cases

* Chatbots over documents
* Internal knowledge assistants
* Helpdesk bots
* Enterprise search chat
* Multi-turn Q&A systems

---

### Best Practices

* Use low temperature (0–0.2)
* Limit retrieved chunks
* Enable source documents
* Monitor question rewriting quality
* Plan migration to LCEL for production
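
One lightweight way to monitor rewriting quality is to flag rewritten questions that still contain an unresolved pronoun. The check below is a rough heuristic, not a library feature: `AMBIGUOUS` and `unresolved_references` are hypothetical, and words such as "that" will produce false positives when used as conjunctions.

```python
import re
from typing import List

# Pronouns that often signal an unresolved reference (illustrative list).
AMBIGUOUS = {"it", "its", "they", "them", "this", "that", "those", "these"}


def unresolved_references(standalone_question: str) -> List[str]:
    """Return ambiguous pronouns still present in a rewritten question."""
    words = re.findall(r"[a-z']+", standalone_question.lower())
    return [w for w in words if w in AMBIGUOUS]


good = unresolved_references("Does LangChain support agents?")
bad = unresolved_references("Does it support agents?")
print(good, bad)
```

Logging the original question, the rewritten question, and any flagged words side by side makes it easier to spot turns where the condenser failed to resolve the reference.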

---

### When to Use ConversationalRetrievalChain

* Learning conversational RAG
* Rapid prototyping
* Legacy LangChain systems

---

### When NOT to Use It

* New production systems
* Complex agentic workflows
* Advanced reranking pipelines
* Streaming-first APIs

---

### Interview-Ready Summary

> “ConversationalRetrievalChain is a LangChain abstraction for multi-turn, context-aware RAG. It rewrites follow-up questions using chat history, retrieves relevant documents, and generates grounded answers, but is a legacy construct superseded by LCEL-based pipelines.”

---

### Rule of Thumb

* **Single-turn QA → RetrievalQA**
* **Multi-turn document chat → ConversationalRetrievalChain**
* **Production conversational RAG → LCEL or LangGraph**

