# 📘 LangChain — Retrievers and Chains

---

## 1. **Executive Summary**
- **Retrievers:** Components that fetch relevant documents or knowledge chunks from a database/vector store based on a query.  
- **Chains:** Pipelines that define a sequence of steps, combining LLMs, prompts, memory, and tools to produce structured output.  
- **Why they matter:** Together they enable **multi-step reasoning, retrieval-augmented generation (RAG) workflows, and structured GenAI applications**.

---

## 2. **Retrievers**

| Component | Description | Example | Notes |
|-----------|------------|--------|-------|
| **Vector Retriever** | Retrieves top-k documents using vector similarity | FAISS, Pinecone, Chroma | Embeddings → similarity search |
| **Keyword Retriever** | Retrieves docs based on keyword matching | BM25, simple text search | Good for exact terms, IDs, and small datasets |
| **Time/Date Retriever** | Filters by timestamp | Logs, versioned documents | Combine with vector retrieval for hybrid search |
| **Custom Retriever** | User-defined retrieval logic | API calls, business rules | Flexible for domain-specific needs |

**Notes:**  
- Often paired with **Embedding models** for semantic similarity.  
- Retrieval quality directly impacts RAG or chain outputs.  
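
To make "embeddings → similarity search" concrete, here is a minimal pure-Python sketch of top-k retrieval by cosine similarity. It is an illustration only, not how FAISS, Pinecone, or Chroma are implemented; the helper names (`cosine`, `top_k`), the file names, and the 3-dimensional toy vectors are all assumptions made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" (hypothetical); real ones come from an embedding model.
index = {
    "chains.md": [0.9, 0.1, 0.0],
    "retrievers.md": [0.1, 0.9, 0.1],
    "pricing.md": [0.0, 0.1, 0.9],
}
query_vec = [0.2, 0.8, 0.0]  # pretend embedding of a question about retrievers

results = top_k(query_vec, index, k=2)
print([doc_id for doc_id, _ in results])
```

A real vector store does the same ranking, but typically with approximate nearest-neighbor indexes so it scales beyond a linear scan.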

---

## 3. **Chains**

| Type | Description | Example Use Case | Notes |
|------|------------|----------------|------|
| **LLMChain** | Single LLM + PromptTemplate | Explain a concept | Simplest pipeline |
| **SequentialChain** | Multiple chains executed in sequence, with named inputs and outputs | Summarize → Translate → Format | Each step's output is available to later steps |
| **SimpleSequentialChain** | Sequential variant where each step has exactly one input and one output | Quick multi-step tasks | Lightweight; each output feeds the next step directly |
| **RetrievalQA Chain** | LLM + Retriever | Answer questions from documents | Core of RAG workflows |
| **Stuff Chain** | Combines all retrieved docs into one prompt | Small document sets that fit in the context window | Simplest option (`chain_type="stuff"`); can exceed the token limit |
| **Map-Reduce Chain** | Processes chunks individually → combines results | Summarization of long docs | Avoids token overflow; costs one LLM call per chunk plus a combine step |
| **Refine Chain** | Iteratively updates the answer as each new chunk arrives | Long-form answers, summaries | Handles long inputs, but runs sequentially, so it is slower |
| **Custom Chains** | User-defined sequences | Multi-step workflows with agents/tools | Flexible for enterprise apps |
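
The difference between the stuff, map-reduce, and refine strategies is mostly about how many LLM calls are made and what each call sees. The sketch below shows this with a stub in place of a real model. `stub_llm`, `count_calls`, and the three strategy functions are hypothetical names for illustration, not LangChain APIs.

```python
calls = []  # every prompt sent to the stub model

def stub_llm(prompt):
    """Hypothetical stand-in for an LLM: logs the call and echoes the first 5 words."""
    calls.append(prompt)
    return " ".join(prompt.split()[:5])

def stuff(chunks):
    # One call: every chunk goes into a single prompt.
    return stub_llm("\n".join(chunks))

def map_reduce(chunks):
    # Map: one call per chunk (independent, so they could run in parallel).
    partials = [stub_llm(chunk) for chunk in chunks]
    # Reduce: one more call to combine the partial results.
    return stub_llm("\n".join(partials))

def refine(chunks):
    # Sequential: each call sees the running answer plus the next chunk.
    answer = stub_llm(chunks[0])
    for chunk in chunks[1:]:
        answer = stub_llm(answer + "\n" + chunk)
    return answer

def count_calls(strategy, chunks):
    """Run a strategy and report how many model calls it made."""
    calls.clear()
    strategy(chunks)
    return len(calls)

chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta"]
for strategy in (stuff, map_reduce, refine):
    print(strategy.__name__, count_calls(strategy, chunks))
```

For `n` chunks this gives 1 call for stuff, `n + 1` for map-reduce, and `n` for refine, which is the trade-off between prompt size, cost, and latency that the table summarizes.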

---

## 4. **Practical Usage Patterns**

| Pattern | Components | Notes |
|---------|-----------|------|
| RAG | Retriever + RetrievalQA Chain | LLM grounded in external knowledge |
| Multi-step processing | SequentialChain or Map-Reduce Chain | Summarize → Answer → Format |
| Chatbot with context | LLMChain + Memory | Maintains conversational history |
| Domain QA | Retriever + Refine Chain | Revises the answer as each retrieved chunk is processed |
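
The "chatbot with context" pattern comes down to replaying recent turns in each new prompt. The following is a rough sketch of that idea under simple assumptions: `BufferMemory` and `build_prompt` are hypothetical helpers written for this example, not LangChain's memory classes, and the `User:`/`Assistant:` prompt layout is just one possible format.

```python
class BufferMemory:
    """Hypothetical minimal conversation buffer that keeps the last few turns."""

    def __init__(self, max_turns=3):
        self.max_turns = max_turns
        self.turns = []  # list of (user, assistant) pairs

    def add(self, user, assistant):
        self.turns.append((user, assistant))
        self.turns = self.turns[-self.max_turns:]  # drop the oldest turns

    def render(self):
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)

def build_prompt(memory, question):
    """Prepend the remembered history to the new question."""
    parts = [memory.render(), f"User: {question}", "Assistant:"]
    return "\n".join(p for p in parts if p)

memory = BufferMemory(max_turns=2)
memory.add("Hi", "Hello!")
memory.add("What is a retriever?", "It fetches relevant documents.")
memory.add("And a chain?", "A sequence of LLM steps.")

prompt = build_prompt(memory, "How do they combine?")
print(prompt)
```

Capping the buffer at `max_turns` is one simple way to keep the prompt from growing without bound; summarizing older turns is a common alternative.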

---

## 5. **Python Examples**

### 5.1 Retrieval Example

```python
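# Assumes the langchain-openai, langchain-community, faiss-cpu, and pypdf
# packages are installed and OPENAI_API_KEY is set in the environment.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader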

# Load documents
loader = PyPDFLoader("sample_doc.pdf")
docs = loader.load_and_split()

# Create embeddings + vector DB
embeddings = OpenAIEmbeddings()
vector_db = FAISS.from_documents(docs, embeddings)

# Simple retrieval
query = "Explain LangChain architecture"
retrieved_docs = vector_db.similarity_search(query, k=3)
for i, doc in enumerate(retrieved_docs):
    print(f"Doc {i+1}:", doc.page_content[:200], "...\n")
```

### 5.2 LLMChain Example

```python
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in simple terms."
)

chain = LLMChain(llm=llm, prompt=prompt)
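# `invoke` takes a dict of prompt variables; LLMChain returns its text under "text".
# (`run` still works on older releases but is deprecated in recent ones.)
output = chain.invoke({"topic": "LangChain Retrievers and Chains"})["text"]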
print(output)
```

### 5.3 RetrievalQA Chain Example

```python
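from langchain.chains import RetrievalQA

# Reuses `llm` from section 5.2 and `vector_db` from section 5.1.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_db.as_retriever(search_type="similarity", search_kwargs={"k": 3}),
    return_source_documents=True
)

# With return_source_documents=True the chain has two output keys, so call it
# with a dict and read both; `run` only supports a single output key.
result = qa_chain.invoke({"query": "Key concepts of LangChain RAG workflow"})
print(result["result"])
for doc in result["source_documents"]:
    print(doc.metadata)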
```

---

## 6. **Best Practices**

* **Retriever Tips:**

  * Use semantic embeddings for relevance, not just keyword match.
  * Consider hybrid retrieval: semantic + metadata filtering.
* **Chain Tips:**

  * Use Map-Reduce for large documents to avoid token overflow.
  * Use Refine when later chunks should revise an earlier answer; it runs sequentially, so expect higher latency.
  * Modularize chains for reusability and testing.
* **Integration:**

  * Combine retrievers + chains + memory for multi-turn, grounded GenAI apps.
  * Always monitor token usage and latency for cost management.
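
Two of the tips above, hybrid retrieval and token monitoring, can be sketched together: filter on metadata first, rank the survivors by similarity, then keep only as many documents as fit a token budget. Everything here is an assumption for illustration: `hybrid_search` and `fit_budget` are hypothetical helpers, the similarity scores are made up, and counting whitespace-separated words is only a crude proxy for real tokenization.

```python
def hybrid_search(docs, scores, min_year, k=2):
    """Filter by metadata first, then rank the survivors by similarity score."""
    candidates = [d for d in docs if d["year"] >= min_year]
    ranked = sorted(candidates, key=lambda d: scores[d["id"]], reverse=True)
    return ranked[:k]

def fit_budget(docs, max_tokens):
    """Keep ranked docs in order until the rough token budget is spent."""
    kept, used = [], 0
    for d in docs:
        cost = len(d["text"].split())  # crude proxy: whitespace-separated words
        if used + cost > max_tokens:
            break
        kept.append(d)
        used += cost
    return kept

docs = [
    {"id": "a", "year": 2021, "text": "old release notes for the retriever module"},
    {"id": "b", "year": 2024, "text": "current guide to retrievers and chains"},
    {"id": "c", "year": 2023, "text": "migration notes"},
]
scores = {"a": 0.95, "b": 0.80, "c": 0.60}  # made-up similarity scores

hits = hybrid_search(docs, scores, min_year=2023, k=2)
context = fit_budget(hits, max_tokens=7)
print([d["id"] for d in hits], [d["id"] for d in context])
```

Note that document `a` has the highest similarity score but is excluded by the metadata filter, and `c` is retrieved but dropped because adding it would exceed the budget.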

---

✅ *Quick Review*:

* **Retriever = fetch relevant knowledge**
* **Chain = structured sequence of LLM + prompts + tools**
* Core for **RAG, multi-step reasoning, and chatbots** in LangChain.


