
---

## 🧠 LangChain v0.3 for Generative AI — Structured Roadmap

---

### 🟢 **I. Foundation Concepts**

| Topics                     | Description                                                            |
| -------------------------- | ---------------------------------------------------------------------- |
| 🧱 What is LangChain?      | Core purpose, architecture, and how it simplifies GenAI app building   |
| 🪄 Why LangChain?          | Use cases: Chatbots, QA, Agents, Retrieval systems, Workflow pipelines |
| ⚙️ LangChain v0.3 Overview | Key updates, simplified interfaces (LCEL, Runnable), modular design    |

---

### 🟡 **II. Core Components in LangChain v0.3**

#### 🔹 **1. Models**

* Chat models, text-completion LLMs, and embedding models (`BaseChatModel`, `BaseLLM`, `Embeddings`)
* Using OpenAI, Anthropic, HuggingFace, etc.
* Temperature, Top-p, max tokens

#### 🔹 **2. Prompts**

* PromptTemplates (LLM + Chat)
* ChatPromptTemplate (system, user, AI roles)
* Partial prompts & dynamic variables

#### 🔹 **3. Output Parsers**

* `StrOutputParser`, `JsonOutputParser`, `PydanticOutputParser`
* Structured output from LLMs
* Combining with prompt chains

#### 🔹 **4. Chains**

* Runnable + LCEL (LangChain Expression Language)
* Sequential + Parallel chains
* RouterChain, Map-Reduce, Conditionally Routed Chains

#### 🔹 **5. Tools & Toolkits**

* Google Search Tool, Wikipedia Tool
* Custom tool creation using `@tool` decorators
* Tool integration in Agent workflows

#### 🔹 **6. Agents**

* ReAct Agent, OpenAI Functions Agent, Tool-Using Agent
* Multi-tool orchestration
* Memory-integrated agents

#### 🔹 **7. Memory**

* `ConversationBufferMemory`, `ConversationSummaryMemory`, `ConversationEntityMemory`
* Adding memory to chains and agents
* Custom memory types

---

### 🟠 **III. Data-Aware / RAG Systems**

#### 🔹 **1. Document Loaders**

* TextLoader, PyPDFLoader, WebBaseLoader
* Unstructured.io, Playwright

#### 🔹 **2. Text Splitters**

* RecursiveCharacterTextSplitter
* Sentence & Token-based splitters
* Chunk overlap best practices

#### 🔹 **3. Embedding Models**

* OpenAI, HuggingFace, Cohere embeddings
* Creating and normalizing vector representations

#### 🔹 **4. Vector Stores**

* FAISS, Chroma, Qdrant, Weaviate, Pinecone
* Adding, searching, filtering documents
* Custom metadata support

#### 🔹 **5. Retrievers**

* VectorStoreRetriever
* MultiQueryRetriever, ContextualCompressionRetriever

#### 🔹 **6. Retrieval Chains**

* `create_retrieval_chain` (current helper); `RetrievalQA`, `ConversationalRetrievalChain` (legacy)
* RAG with memory, search tools, filters

---

### 🔵 **IV. Advanced Topics**

#### 🔹 **1. LangGraph (stateful flows)**

* Multi-agent collaboration
* Graph-based workflow orchestration
* Building dynamic task routing systems

#### 🔹 **2. Async & Streaming**

* Async LLM calls, streaming outputs
* Event-driven architecture

#### 🔹 **3. LangServe (API deployment)**

* Turning chains/agents into REST APIs
* FastAPI integration
* LangSmith + LangServe monitoring

#### 🔹 **4. Tracing with LangSmith**

* Observability for prompts, tokens, agents
* Debugging, optimization
* Custom metadata logs

---

### 🟣 **V. Practical Projects & Use Cases**

| Level           | Projects                                                      |
| --------------- | ------------------------------------------------------------- |
| 🧩 Beginner     | PDF Q&A Bot, LLM Summarizer, SQL Agent                        |
| 🧠 Intermediate | Memory Chatbot, RAG QA System, AI Tutor                       |
| 🤖 Advanced     | Multi-Agent Planner, Voice-based Assistant, AutoGPT-style app |

---




## 🧠 **Foundation Concepts of LangChain v0.3**

---

### 🧱 **1. What is LangChain?**

**🧑‍🔬 Definition:**
LangChain is an open-source **framework to build applications powered by Large Language Models (LLMs)** like GPT-4, Claude, and others.

**📦 Core Purpose:**
It abstracts away the complexity of LLM workflows by combining:

* 🧠 Language models
* 📄 External data (documents, APIs, tools)
* 🔁 Multi-step logic (chains, agents, routing)
* 🧷 Memory & state (chatbots, long-term history)
* 🔧 Deployment & observability

> 🧰 LangChain = *LLM + Tools + Data + Logic + Deployment*

---

### 📐 **LangChain Architecture Overview**

| Layer                 | Purpose                                      |
| --------------------- | -------------------------------------------- |
| **LLMs & Tools**      | GPT-4, Claude, calculators, APIs             |
| **Prompts & Parsers** | Templates, formatting, structured output     |
| **Chains**            | Sequences of steps (like function pipelines) |
| **Agents**            | Dynamically decide next actions              |
| **Memory**            | Store chat or state context                  |
| **RAG / Data-aware**  | Connect to vector stores for retrieval       |
| **Deployment**        | Via LangServe, traced with LangSmith         |

---

### 🪄 **2. Why LangChain?**

LangChain is used when you want to go beyond a single prompt. It's ideal for building:

| Use Case                       | Description                                           |
| ------------------------------ | ----------------------------------------------------- |
| 🤖 **Chatbots**                | Multi-turn, memory-aware LLM chat interfaces          |
| ❓ **Question Answering (QA)**  | Answering from PDFs, websites, private docs           |
| 🧠 **Agents**                  | Dynamic LLMs that reason, decide, act with tools      |
| 🔎 **Retrieval Systems (RAG)** | Search + generate pipelines using vector DBs          |
| 🔄 **Workflow Pipelines**      | Step-by-step document processing, summarization, etc. |

**✨ Benefits:**

* Abstracts boilerplate LLM logic
* Supports modular composition (`.pipe()`, `.invoke()`)
* Easily scalable, testable, and debuggable
* Fast integration with vector stores, databases, APIs, and tools

---

### ⚙️ **3. What’s New in LangChain v0.3?**

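LangChain v0.3 refines the LCEL and `Runnable` design from earlier releases, with a focus on clarity, speed, and production use. The release itself is mostly a maintenance step (for example, the internal upgrade to Pydantic 2); the features below were introduced in earlier releases and are the recommended approach in v0.3.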

---

#### 🔄 **Major Improvements:**

| Feature                                    | Description                                                  |
| ------------------------------------------ | ------------------------------------------------------------ |
| ✅ **LCEL** (LangChain Expression Language) | Compose steps with `\|` or `.pipe()`, run them with `.invoke()` |
| ✅ **Runnable Interfaces**                  | Standard interface for LLMs, retrievers, chains, etc.        |
| ✅ **Streaming + Async**                    | Built-in real-time output + concurrency                      |
| ✅ **Modular & Typed**                      | You can plug anything into anything (chains, tools, LLMs)    |
| ✅ **LangServe Integration**                | Deploy chains as FastAPI REST APIs                           |
| ✅ **LangSmith**                            | Trace, debug, and evaluate everything easily                 |
| ✅ **LangGraph**                            | Orchestrate complex multi-agent workflows                    |

---

#### 🔧 Example LCEL Code:

```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# Requires OPENAI_API_KEY to be set in the environment.
chain = (
    PromptTemplate.from_template("Tell me a joke about {topic}")
    .pipe(ChatOpenAI())        # send the formatted prompt to the chat model
    .pipe(StrOutputParser())   # reduce the returned message to its text
)

response = chain.invoke({"topic": "AI"})
print(response)
```

✅ That’s it! Clean, composable, async/stream-ready chain logic.

---

### ✅ Summary

| Concept               | Description                                            |
| --------------------- | ------------------------------------------------------ |
| 🧱 What is LangChain? | Framework to build intelligent LLM apps                |
| 🪄 Why use it?        | Ideal for chatbots, agents, retrieval, pipelines       |
| ⚙️ v0.3 Update        | LCEL, modularity, deployability, LangGraph & LangSmith |

---





# 🧱 **1. Models**

---

### 1. ✅ Definition

**LangChain models** are wrappers for different types of large language models (LLMs), chat models, and embedding models. These interfaces abstract away vendor-specific APIs and unify them under a common interface so you can switch easily between OpenAI, Anthropic, HuggingFace, etc.

---

### 2. ✅ Types & Built-in Functions

#### 🔸 A. Chat Models

| Class            | Import                                              | Description            | When to Use                        |
| ---------------- | --------------------------------------------------- | ---------------------- | ---------------------------------- |
| `ChatOpenAI`     | `from langchain_openai import ChatOpenAI`           | OpenAI GPT-3.5 / GPT-4 | Best for conversations             |
| `ChatAnthropic`  | `from langchain_anthropic import ChatAnthropic`     | Claude models          | For long context & safer reasoning |
| `ChatGoogleGenerativeAI` | `from langchain_google_genai import ChatGoogleGenerativeAI` | Google Gemini | Google’s chat models |
| `ChatCohere`     | `from langchain_cohere import ChatCohere`           | Cohere models          | Simpler chat setups                |

```python
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
```

---

#### 🔸 B. LLMs (text completion)

| Class       | Import                                      | Description                    | When to Use                       |
| ----------- | ------------------------------------------- | ------------------------------ | --------------------------------- |
| `OpenAI`       | `from langchain_openai import OpenAI`          | Completion-style LLM (`gpt-3.5-turbo-instruct`) | For legacy/line-based completions |
| `AnthropicLLM` | `from langchain_anthropic import AnthropicLLM` | Claude in non-chat format                       | If chat abstraction isn’t needed  |

```python
llm = OpenAI(model="gpt-3.5-turbo-instruct")  # text-davinci-003 has been retired
```

---

#### 🔸 C. Embedding Models

| Class                   | Import                                                             | Description                     | When to Use            |
| ----------------------- | ------------------------------------------------------------------ | ------------------------------- | ---------------------- |
| `OpenAIEmbeddings`      | `from langchain_openai import OpenAIEmbeddings`                    | Embeds text as vectors          | For search, RAG        |
| `HuggingFaceEmbeddings` | `from langchain_huggingface import HuggingFaceEmbeddings`          | Local or open-source embeddings | Privacy, offline setup |
| `CohereEmbeddings`      | `from langchain_cohere import CohereEmbeddings`                    | Uses Cohere API                 | Alternatives to OpenAI |

```python
embedding = OpenAIEmbeddings()
```
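
To make "for search, RAG" concrete, here is a minimal plain-Python sketch of how embedding vectors are compared once they exist. The three-dimensional vectors and document names are made up for illustration; a real embedding model returns vectors with hundreds or thousands of dimensions, and a vector store would do this ranking for you.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; not output from any real model.
doc_vectors = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax law": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

# Rank documents from most to least similar to the query.
ranked = sorted(
    doc_vectors,
    key=lambda name: cosine_similarity(query_vector, doc_vectors[name]),
    reverse=True,
)
print(ranked)
```

Retrieval in a RAG pipeline is essentially this ranking step applied to the query embedding and the stored document embeddings.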

---

### 3. ✅ Use Cases, Advantages, Disadvantages

| Aspect          | Description                                                                                                           |
| --------------- | --------------------------------------------------------------------------------------------------------------------- |
| 💡 Use Cases    | - Chatbots<br>- Text generation<br>- Document retrieval (embeddings)<br>- Summarization<br>- Structured output        |
| ✅ Advantages    | - Unified abstraction for various models<br>- Easily swappable providers<br>- Supports advanced configs (temp, top-p) |
| ❌ Disadvantages | - External API cost<br>- Rate limits<br>- Model drift across versions                                                 |
| ⚠️ Limitations  | - Requires internet/API access<br>- May not support every new model out of the box                                    |

---

### 4. ✅ Best Model Integrations & When

| Scenario                          | Best Model                                                         |
| --------------------------------- | ------------------------------------------------------------------ |
| Conversational AI                 | `ChatOpenAI`, `ChatAnthropic`                                      |
| Cost-sensitive or quick prototype | `ChatOpenAI` with gpt-3.5                                          |
| Private/Enterprise use            | `ChatGoogleGenerativeAI`, `ChatCohere`, or local models            |
| Long context (>100k tokens)       | `Claude 2/3 (ChatAnthropic)`                                       |
| Embedding for RAG                 | `OpenAIEmbeddings` for accuracy, `HuggingFaceEmbeddings` for local |

---

### 5. ✅ Extras

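* 📌 **Temperature**: Lower = more deterministic. The usual range is 0 to 1; OpenAI accepts values up to 2.
* 📌 **Top-p**: Nucleus sampling. Only the most likely tokens whose cumulative probability reaches `p` are kept (usually tuned instead of temperature, not together with it).
* 📌 Use `.bind()` to fix call-time arguments (for example `stop` sequences) on a model so the bound model can be reused.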
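
The effect of temperature and top-p can be sketched in plain Python. This is an illustration of the sampling idea only, not how any provider implements it; the logits are hypothetical scores for three candidate tokens.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

def top_p_filter(probs: list[float], top_p: float) -> list[int]:
    """Indices of the smallest high-probability set whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in order:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    return kept

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)  # low temperature: sharper peak
warm = softmax_with_temperature(logits, 1.0)  # higher temperature: flatter
print(round(cold[0], 3), round(warm[0], 3))
print(top_p_filter(warm, 0.8))  # tokens that survive nucleus filtering
```

Lowering the temperature pushes almost all probability onto the top token, while top-p simply discards the unlikely tail before sampling.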




---

# ✳️ **2. Prompts**

---

### 1. ✅ Definition

In LangChain, **prompts** are dynamic templates used to structure the input sent to LLMs. They allow you to combine static instructions with dynamic user data — ensuring consistent, reusable, and controlled LLM inputs.

LangChain supports:

* Text prompts → for `LLM` models
* Chat prompts → for `ChatModel` models using system, user, AI roles

---

### 2. ✅ Types & Built-in Functions (with Definitions + Use Cases)

#### 🔹 A. `PromptTemplate`

📘 **Definition**: A template for formatting string prompts with dynamic variables.

```python
from langchain.prompts import PromptTemplate
```

| Function / Class   | Purpose                                                         | When to Use                                 |
| ------------------ | --------------------------------------------------------------- | ------------------------------------------- |
| `PromptTemplate()` | Create a prompt with input variables and a template string      | For traditional LLM models (not chat-based) |
| `from_template()`  | Alternative constructor to quickly define prompt with variables | Faster setup                                |

```python
template = PromptTemplate.from_template("What is the capital of {country}?")
prompt = template.format(country="France")
```

---

#### 🔹 B. `ChatPromptTemplate`

📘 **Definition**: Used to construct a sequence of chat messages for chat-based models (e.g., GPT-4).

```python
from langchain.prompts import ChatPromptTemplate
```

| Function / Class                     | Purpose                                                                    | When to Use                                           |
| ------------------------------------ | -------------------------------------------------------------------------- | ----------------------------------------------------- |
| `ChatPromptTemplate.from_messages()` | Build prompts using a list of role-based messages (`system`, `user`, `ai`) | Best for use with `ChatOpenAI`, `ChatAnthropic`, etc. |

```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a smart tutor."),
    ("user", "Explain {topic} in 2 lines.")
])
```

---

#### 🔹 C. `SystemMessagePromptTemplate`, `HumanMessagePromptTemplate`, `AIMessagePromptTemplate`

📘 **Definition**: Specialized prompt blocks to structure chat messages by role.

```python
from langchain.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate
)
```

| Class                                         | Purpose                                                    | When to Use                               |
| --------------------------------------------- | ---------------------------------------------------------- | ----------------------------------------- |
| `SystemMessagePromptTemplate.from_template()` | Define system role instructions (sets tone or role of LLM) | Use when you want to control LLM behavior |
| `HumanMessagePromptTemplate.from_template()`  | Define user messages with placeholders                     | Use for user inputs                       |
| `AIMessagePromptTemplate.from_template()`     | (Optional) Simulate previous AI responses                  | For chat memory simulation                |

---

#### 🔹 D. `MessagesPlaceholder`

📘 **Definition**: Placeholder for injecting past messages (used with memory).

```python
from langchain.prompts import MessagesPlaceholder
```

| Class                                               | Purpose                                                  | When to Use                  |
| --------------------------------------------------- | -------------------------------------------------------- | ---------------------------- |
| `MessagesPlaceholder(variable_name="chat_history")` | Plug in dynamic memory-based message history into prompt | Use in memory-based chatbots |
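
Conceptually, a history placeholder is a slot in the message list that is filled with earlier turns at call time. The plain-Python sketch below shows only that splice; `build_messages` is a hypothetical helper, not a LangChain function.

```python
def build_messages(system_text, chat_history, user_text):
    """Assemble system message, prior turns, then the new question."""
    messages = [("system", system_text)]
    messages.extend(chat_history)  # the slot a history placeholder would fill
    messages.append(("user", user_text))
    return messages

chat_history = [
    ("user", "My name is Asha."),
    ("ai", "Nice to meet you, Asha!"),
]
messages = build_messages(
    "You are a helpful assistant.", chat_history, "What is my name?"
)
for role, text in messages:
    print(f"{role}: {text}")
```

Because the earlier turns sit between the system message and the new question, the model can answer from context it would otherwise not see.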

---

#### 🔹 E. `prompt.partial()`

📘 **Definition**: Pre-fill or “lock” certain variables in a prompt for reuse.

| Method                      | Purpose                          | When to Use                                                       |
| --------------------------- | -------------------------------- | ----------------------------------------------------------------- |
| `prompt.partial(var=value)` | Set a fixed value for a variable | Useful when the same value is used repeatedly (e.g., system role) |

```python
partial_prompt = prompt.partial(topic="machine learning")
```
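
What `partial()` does can be sketched with a tiny stand-in class. `MiniTemplate` is hypothetical and only mimics the idea (pre-fill some variables, leave the rest open); it is not LangChain's implementation.

```python
from string import Formatter
from typing import Optional

class MiniTemplate:
    """Hypothetical stand-in for a prompt template."""

    def __init__(self, template: str, fixed: Optional[dict] = None):
        self.template = template
        self.fixed = dict(fixed or {})

    @property
    def input_variables(self) -> list:
        """Placeholders that still need a value at format time."""
        names = [field for _, field, _, _ in Formatter().parse(self.template) if field]
        return [name for name in names if name not in self.fixed]

    def partial(self, **values: str) -> "MiniTemplate":
        """Return a new template with some variables pre-filled."""
        return MiniTemplate(self.template, {**self.fixed, **values})

    def format(self, **values: str) -> str:
        return self.template.format(**self.fixed, **values)

base = MiniTemplate("You are a {role}. Explain {topic} in 2 lines.")
tutor = base.partial(role="smart tutor")  # lock the role once
print(tutor.input_variables)
print(tutor.format(topic="machine learning"))
```

The original template is left untouched, so several partially filled variants can be derived from one base prompt.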

---

### 3. ✅ Use Cases, Advantages, Disadvantages, Limitations

| Aspect          | Description                                                                                       |
| --------------- | ------------------------------------------------------------------------------------------------- |
| 💡 Use Cases    | - Dynamic prompting<br>- Chat instruction flows<br>- Role-based responses<br>- Injecting memory   |
| ✅ Advantages    | - Reusable<br>- Cleaner logic<br>- Supports chat format + variable substitution                   |
| ❌ Disadvantages | - Can become complex with too many variables<br>- Needs careful formatting                        |
| ⚠️ Limitations  | - Not model-aware — doesn’t prevent misuse (e.g., using ChatPromptTemplate with text-only models) |

---

### 4. ✅ Best Model Integrations

| Prompt Type                         | Best Paired Model                                     |
| ----------------------------------- | ----------------------------------------------------- |
| `PromptTemplate`                    | `OpenAI`, `Anthropic`, `TextGen models`               |
| `ChatPromptTemplate`                | `ChatOpenAI`, `ChatAnthropic`, `ChatCohere`, `Claude` |
| Role-based prompts (System/User/AI) | `ChatOpenAI`, `Claude`, `Gemini`                      |
| `MessagesPlaceholder`               | Use with `Memory-enabled` chains or agents            |

---

### 5. ✅ Extras / Best Practices

| Tip                                              | Why It Matters                                 |
| ------------------------------------------------ | ---------------------------------------------- |
| Use `ChatPromptTemplate` with modern chat models | Better formatting and separation of roles      |
| Favor `partial()` to reduce prompt complexity    | Improves reusability                           |
| Combine prompts with `OutputParsers`             | Ensure output is structured & machine-readable |
| Use `{variable}` instead of string interpolation | LangChain safely manages variable formatting   |

---

### 📌 Summary Code Example

```python
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "Tell me about {topic}")
])

print(prompt.format_messages(topic="LangChain"))
```

---





# 🧪 **3. Output Parsers**

---

### 1. ✅ Definition

**Output Parsers** in LangChain are components that convert the raw response from a model (usually a text string) into a **structured, usable format** like:

* Plain text
* JSON dict
* Python objects (via Pydantic)

They're essential when chaining prompts → models → usable outputs.

---

### 2. ✅ Types & Built-in Functions

---

#### 🔸 A. `StrOutputParser`

📘 **Definition**: Extracts the text of the model output (for chat models, the message `content`) and returns it as a plain string.

```python
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
```

✅ **When to Use**:

* You just need plain text output (summaries, answers, explanations)
* Default for most LLM chains

🧠 Example:

```python
chain = prompt | llm | StrOutputParser()
response = chain.invoke({"question": "What is LangChain?"})
```

---

#### 🔸 B. `JsonOutputParser`

📘 **Definition**: Parses model output into a Python `dict` assuming it's valid JSON.

```python
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
```

✅ **When to Use**:

* LLM is prompted to return structured JSON
* Ideal for structured tasks like form-filling, RAG metadata, tabular Q&A

🧠 Example Prompt:

```python
"Extract a JSON object with `name`, `age`, and `hobby`: {text}"
```

💡 Pro Tip: With OpenAI chat models, JSON mode (`response_format={"type": "json_object"}`) or `.with_structured_output()` makes valid JSON much more reliable.
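
Models often wrap JSON in a Markdown code fence, so a JSON parser has to tolerate that. The sketch below shows the idea in plain Python; `parse_json_reply` is a hypothetical helper and the reply string is made up, so treat it as an illustration rather than what `JsonOutputParser` does internally.

```python
import json

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

def parse_json_reply(text: str) -> dict:
    """Parse a model reply as JSON, tolerating an optional Markdown fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (it may carry a "json" tag)...
        cleaned = cleaned.split("\n", 1)[1]
        # ...and everything from the closing fence onward.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    return json.loads(cleaned)

# Hypothetical model reply for the prompt above.
raw_reply = FENCE + 'json\n{"name": "Alice", "age": 30, "hobby": "chess"}\n' + FENCE
person = parse_json_reply(raw_reply)
print(person["name"], person["age"], person["hobby"])
```

If the reply is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is the failure a retrying parser would catch.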

---

#### 🔸 C. `PydanticOutputParser`

📘 **Definition**: Converts model output into a Python object based on a defined `Pydantic` schema — validates field types.

```python
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

parser = PydanticOutputParser(pydantic_object=Person)
```

✅ **When to Use**:

* You want **strong type safety**
* Output will be used in APIs, databases, or further processing

🧠 Example:

```python
chain = prompt | llm | parser
result = chain.invoke({"text": "name: Alice, age: 30"})
```
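
The value of a typed parser is that bad fields fail loudly instead of flowing downstream. The following sketch shows that check with a standard-library dataclass in place of Pydantic; `validate_person` is a hypothetical helper, and unlike Pydantic's default mode it does no type coercion.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

def validate_person(data: dict) -> Person:
    """Build a Person, rejecting missing keys and wrong types."""
    for field in fields(Person):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        if not isinstance(data[field.name], field.type):
            raise ValueError(f"{field.name} must be {field.type.__name__}")
    return Person(**{field.name: data[field.name] for field in fields(Person)})

alice = validate_person({"name": "Alice", "age": 30})
print(alice)

try:
    validate_person({"name": "Bob", "age": "thirty"})
except ValueError as error:
    print("rejected:", error)
```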

---

#### 🔸 D. `RetryOutputParser`

📘 **Definition**: Automatically **retries** parsing if LLM output is invalid — using feedback prompts.

```python
from langchain.output_parsers import RetryOutputParser
```

✅ **When to Use**:

* You're using `JsonOutputParser` or `PydanticOutputParser` and want to handle format errors gracefully
* Mission-critical or production systems

🧠 Example:

```python
# The constructor is from_llm: the retry parser needs an LLM to re-ask.
parser = RetryOutputParser.from_llm(parser=JsonOutputParser(), llm=llm)
```
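
The retry idea itself is a small loop: parse, and on failure ask again with the error included. Here is a plain-Python sketch under that assumption; `parse_with_retry` and `fake_model` are hypothetical names, and the scripted replies stand in for a real LLM.

```python
import json

def parse_with_retry(ask_model, prompt: str, max_retries: int = 2) -> dict:
    """Call the model; on a JSON error, re-ask with the error appended."""
    current_prompt = prompt
    for _ in range(max_retries + 1):
        reply = ask_model(current_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as error:
            # Feed the failure back so the next attempt can correct itself.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({error.msg}). Reply with JSON only."
            )
    raise ValueError(f"no valid JSON after {max_retries + 1} attempts")

# Scripted stand-in for an LLM: first reply malformed, second reply valid.
scripted_replies = iter(["name: Alice, age: 30", '{"name": "Alice", "age": 30}'])
calls = []

def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return next(scripted_replies)

result = parse_with_retry(fake_model, "Extract name and age as JSON.")
print(result, "after", len(calls), "calls")
```

Each retry is an extra model call, which is where the added latency and cost of a retrying parser come from.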

---

### 3. ✅ Use Cases, Advantages, Disadvantages

| Use Case        | Examples                           |
| --------------- | ---------------------------------- |
| String output   | Answer generation, summarization   |
| JSON extraction | Structured data, document metadata |
| Typed output    | APIs, validators, downstream logic |

| ✅ Advantages         | ❌ Disadvantages               |
| -------------------- | ----------------------------- |
| Clean parsing logic  | LLM must follow strict format |
| Modular & composable | Fails if output malformed     |
| Works with all LLMs  | Retry parser adds latency     |

---

### 4. ✅ Best Model Integrations

| Parser                 | Best With       | Why                         |
| ---------------------- | --------------- | --------------------------- |
| `StrOutputParser`      | Any LLM         | Default output              |
| `JsonOutputParser`     | GPT-4, Claude 3 | Structured output mode      |
| `PydanticOutputParser` | GPT-4, Claude   | Typed schemas for APIs      |
| `RetryOutputParser`    | OpenAI, Claude  | Handles bad format recovery |

---

### 5. ✅ Tips & Best Practices

✅ Use `output_parser.get_format_instructions()` in your prompt
➡️ Adds automatic formatting hints for the model.

🧠 Example with prompt:

```python
prompt = PromptTemplate.from_template(
  "Extract details:\n{format_instructions}\nText: {input_text}"
).partial(format_instructions=parser.get_format_instructions())
```

---

### 🔗 Common Pattern: Prompt → Model → Parser

```python
chain = prompt | llm | JsonOutputParser()
result = chain.invoke({"input_text": "My name is John and I am 22 years old."})
```

---




# 🔁 **4. Chains (Runnable + LCEL)**

---

### 1. ✅ Definition

**Chains** are sequences of operations (prompt → LLM → output parser) that execute together. LangChain v0.3 composes them with **LCEL (LangChain Expression Language)**, which is built on the `Runnable` interface and keeps chains modular, fast, and flexible.

---

### 2. ✅ Types & Built-in Functions

---

#### 🔸 A. `Runnable` (Base class)

📘 **Definition**: Abstract interface for anything that can “run” — a model, prompt, parser, or even function.

```python
from langchain_core.runnables import Runnable
```

| Function / Class         | Purpose                                   | When to Use                           |
| ------------------------ | ----------------------------------------- | ------------------------------------- |
| `Runnable.invoke(input)` | Runs a single input through the component | Default for most use cases            |
| `Runnable.batch(inputs)` | Run multiple inputs at once               | Use for performance (batch LLM calls) |
| `Runnable.map()`         | Returns a runnable that applies this one to each item of a list | Use when the input is a list |
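
The difference between `invoke` and `batch` can be sketched with a small stand-in class. `MiniRunnable` is hypothetical and only mirrors the contract (one input versus many inputs, results in input order); using a thread pool for `batch` is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

class MiniRunnable:
    """Hypothetical stand-in showing the invoke/batch contract."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        """Run one input through the wrapped function."""
        return self.func(value)

    def batch(self, values, max_concurrency=4):
        """Run many inputs concurrently, returning results in input order."""
        with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
            return list(pool.map(self.invoke, values))

shout = MiniRunnable(lambda text: text.upper() + "!")
print(shout.invoke("hello"))
print(shout.batch(["a", "b", "c"]))
```

For I/O-bound steps such as LLM API calls, running inputs concurrently is what makes batching faster than a loop of single calls.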

---

#### 🔸 B. `RunnableLambda`

📘 **Definition**: Wraps any Python function as a `Runnable`.

```python
from langchain_core.runnables import RunnableLambda
```

| Function             | Purpose                          | When to Use                   |
| -------------------- | -------------------------------- | ----------------------------- |
| `RunnableLambda(fn)` | Wrap function as a runnable step | Add pre/post processing logic |

✅ Example:

```python
strip_input = RunnableLambda(lambda x: x.strip())
```

---

#### 🔸 C. `RunnableSequence`

📘 **Definition**: Chain multiple `Runnable` components manually (alternative to using `|`).

```python
from langchain_core.runnables import RunnableSequence
```

✅ Example:

```python
chain = RunnableSequence(first_step, second_step)
```

📌 **Tip**: `step1 | step2 | step3` is preferred for cleaner syntax.

---

#### 🔸 D. LCEL Composition (`|` operator)

📘 **Definition**: Connect components using the pipe `|` operator.

```python
chain = prompt | llm | output_parser
```

✅ This is the most modern and preferred chaining method in v0.3.
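
The `|` syntax works because Python lets a class define `__or__`. A minimal sketch of that mechanism follows; `Step` and the three lambda stand-ins are hypothetical and far simpler than LangChain's `Runnable`, but the composition idea is the same.

```python
class Step:
    """Hypothetical minimal runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

# Stand-ins for a prompt, a model, and a parser.
prompt = Step(lambda inputs: f"Tell me a joke about {inputs['topic']}")
fake_llm = Step(lambda text: {"content": f"[model reply to: {text}]"})
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"topic": "AI"}))
```

Since the result of `|` is itself a `Step`, chains can be nested and reused as building blocks in larger chains.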

---

#### 🔸 E. `invoke`, `batch`, `stream`

| Method      | Purpose                     | When to Use                   |
| ----------- | --------------------------- | ----------------------------- |
| `.invoke()` | For single inputs           | Most common use               |
| `.batch()`  | For multiple inputs         | Performance-sensitive cases   |
| `.stream()` | For streaming output tokens | Use with chat UI or CLI tools |

---

### 3. ✅ Use Cases, Advantages, Disadvantages

| Use Case             | Example                   |
| -------------------- | ------------------------- |
| Prompt → LLM → Parse | Chatbot, completion, Q&A  |
| File → Embed → Store | RAG pipelines             |
| Tool → LLM → Decide  | Agents                    |

---

| ✅ Advantages                        | ❌ Disadvantages                              |
| ----------------------------------- | -------------------------------------------- |
| Clean composition (`\|` style)      | Complex logic may need custom code           |
| Efficient + parallelizable          | Debugging deeply nested chains is harder     |
| Works with all LangChain components | Requires understanding Runnable architecture |

---

### 4. ✅ Best Model Integrations

| Chain Type                        | Best With                          |
| --------------------------------- | ---------------------------------- |
| `Prompt \| ChatModel \| Parser`   | `ChatOpenAI`, `ChatAnthropic`      |
| `Prompt \| LLM \| Parser`         | `OpenAI`, `Claude`                 |
| `Embedding \| Vector Store`       | `OpenAIEmbeddings`, `HFEmbeddings` |
| `RunnableLambda \| LLM \| Parser` | For preprocessing pipelines        |

---

### 5. ✅ Extras & Best Practices

| Best Practice                              | Why It’s Useful                       |
| ------------------------------------------ | ------------------------------------- |
| Use `RunnableLambda` to preprocess data    | Trim, clean, enrich inputs before LLM |
| Use `.bind()` for partial config injection | Reuse chains with static params       |
| Compose reusable building blocks           | Clean logic & better modularity       |
| Prefer `.invoke()` in notebooks / scripts  | Simple and synchronous                |

---




# 🧰 **5. Tools & Toolkits**

---

### 1. ✅ Definition

**Tools** in LangChain are modular, callable actions that LLM-based agents can use to interact with the outside world. Examples: web search, Wikipedia lookup, code execution, calculators, APIs.

LangChain also allows you to define **custom tools** using the `@tool` decorator.

---

### 2. ✅ Types & Built-in Functions

---

#### 🔸 A. Built-in Tools

| Tool                                  | Description                                   | When to Use                    |
| ------------------------------------- | --------------------------------------------- | ------------------------------ |
| `WikipediaQueryRun`                   | Search & extract content from Wikipedia       | Quick factual lookup           |
| `GoogleSearchRun`                     | Perform Google search via `GoogleSearchAPIWrapper` (SerpAPI and Tavily ship as separate tools) | Up-to-date information         |
| `ArxivQueryRun`                       | Search academic papers                        | Research-based agents          |
| `PythonREPLTool`                      | Run Python code snippets                      | Coding assistants, math agents |
| `RequestsGetTool`, `RequestsPostTool` | Make HTTP requests                            | API-based toolchains           |
| `ShellTool`                           | Run shell commands (⚠️ risky)                 | DevOps, scripting workflows    |

✅ Example:

```python
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wiki_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
```

---

#### 🔸 B. Custom Tools with `@tool`

📘 **Definition**: Use the `@tool` decorator to turn a Python function into a LangChain-compatible tool.

```python
from langchain.tools import tool

@tool
def calculate_area(length: float, width: float) -> float:
    """Calculate area of a rectangle"""
    return length * width
```

✅ **When to Use**:

* You want to add business logic
* Integrate external APIs
* Provide internal databases or calculations to agents

🧠 Tips:

* Use docstrings! Agents rely on them to understand the tool's purpose.
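
Why docstrings matter becomes clearer when you see what a tool decorator has to collect. The sketch below uses a hypothetical `simple_tool` decorator that records a name, a description, and argument types from the function itself. The real `@tool` returns a tool object rather than the bare function, so treat this as an illustration of the idea only.

```python
import inspect

def simple_tool(func):
    """Hypothetical decorator: attach the metadata an agent needs to pick a tool."""
    func.tool_name = func.__name__
    func.tool_description = inspect.getdoc(func) or ""
    func.tool_args = {
        name: param.annotation.__name__
        for name, param in inspect.signature(func).parameters.items()
    }
    return func

@simple_tool
def calculate_area(length: float, width: float) -> float:
    """Calculate area of a rectangle"""
    return length * width

print(calculate_area.tool_name)
print(calculate_area.tool_description)
print(calculate_area.tool_args)
print(calculate_area(10, 5))
```

The description and argument types are all the model sees when deciding whether and how to call the tool, so a missing or vague docstring directly hurts tool selection.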

---

#### 🔸 C. `Tool` Class (Manual Tool)

```python
from langchain_core.tools import Tool

tool = Tool(
    name="Echo",
    func=lambda x: x,
    description="Repeats the input."
)
```

✅ Use this when you want full control (no decorator).

---

### 3. ✅ Use Cases, Advantages, Disadvantages

| Use Case               | Example                               |
| ---------------------- | ------------------------------------- |
| Answer with web search | `GoogleSearchRun`, `WikipediaQueryRun` |
| Math/code agents       | `PythonREPLTool`                      |
| Custom logic           | `@tool` function for price prediction |

| ✅ Advantages                      | ❌ Disadvantages                          |
| --------------------------------- | ---------------------------------------- |
| Easy to add external capabilities | Tools must be deterministic              |
| Integrate any API                 | Agents may misuse poorly described tools |
| Works well with agent workflows   | Requires prompt clarity for usage        |

---

### 4. ✅ Best Model Integrations

| Tool Type    | Best With                                            |
| ------------ | ---------------------------------------------------- |
| Web tools    | `ChatOpenAI`, `ChatAnthropic` (with agent framework) |
| Python tools | GPT-4 (for code reasoning)                           |
| Custom APIs  | Any chat model used in agents                        |

---

### 5. ✅ Tips & Best Practices

| Tip                                                    | Why                                           |
| ------------------------------------------------------ | --------------------------------------------- |
| Always write clear `description` or docstring          | Agent uses it to decide when to call the tool |
| Combine tools into a toolkit                           | Cleaner modular architecture                  |
| Test tools independently                               | Prevent agent confusion during use            |
| Use either `Tool` or `@tool`                           | Both register tools that agents can call      |

---

### 🔗 Tool Integration in Agent Workflows

✅ Once you define tools, you can pass them to agents like this:

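```python
from langchain.agents import AgentType, initialize_agent
from langchain_openai import ChatOpenAI

# initialize_agent is a legacy helper: deprecated, but still importable in v0.3
agent = initialize_agent(
    tools=[wiki_tool, calculate_area],
    llm=ChatOpenAI(),
    agent=AgentType.OPENAI_FUNCTIONS,  # picks tools via OpenAI function calling
)
```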

Then:

```python
agent.invoke("What is the area of a 10x5 rectangle?")
```

---



---

# 🧠 **6. Agents**

LangChain v0.3 — All About Smart, Tool-Using Agents 🔧🧑‍💻

---

### 1. ✅ Definition

**Agents** in LangChain are intelligent LLM-powered decision-makers that dynamically **select tools**, plan tasks, call APIs, and remember past interactions.

Unlike static chains, agents decide what steps to take based on user input and their internal reasoning.

---

### 2. ✅ Types & Built-in Agent Executors

---

#### 🔸 A. **ReAct Agent (Reasoning + Acting)**

📘 Combines reasoning with tool usage — the LLM thinks step-by-step before selecting a tool.

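```python
from langchain.agents import initialize_agent, AgentType

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # LangChain's built-in ReAct agent
    verbose=True,  # print each Thought / Action / Observation step
)
```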

✅ **When to Use**:

* You want detailed intermediate steps (Thought → Action → Observation)
* Transparent reasoning flow

🧠 Outputs:

```
Thought: I should look up the capital
Action: WikipediaTool
Action Input: capital of France
Observation: Capital is Paris
Thought: I now know the final answer
Final Answer: Paris
```
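
🧠 The loop behind that trace can be sketched in plain Python. This is a toy illustration, not LangChain's implementation: the tool, the scripted `fake_llm`, and the step format are all hypothetical stand-ins for a real model and real tools.

```python
from typing import Callable

# Hypothetical tool registry: plain functions keyed by the name the "model" uses
TOOLS: dict[str, Callable[[str], str]] = {
    "WikipediaTool": lambda query: "Capital is Paris",
}

# Scripted stand-in for an LLM: each call returns the next planned step
SCRIPT = iter([
    {"thought": "I should look up the capital",
     "action": "WikipediaTool", "input": "capital of France"},
    {"thought": "I now know the final answer", "final": "Paris"},
])


def fake_llm(scratchpad: list[str]) -> dict:
    """A real agent would send the scratchpad to a model and parse its reply."""
    return next(SCRIPT)


def run_agent(question: str, max_steps: int = 5) -> str:
    scratchpad = [f"Question: {question}"]
    for _ in range(max_steps):
        step = fake_llm(scratchpad)
        scratchpad.append(f"Thought: {step['thought']}")
        if "final" in step:  # the model decided it can answer
            return step["final"]
        observation = TOOLS[step["action"]](step["input"])  # act
        scratchpad.append(f"Action: {step['action']}")
        scratchpad.append(f"Observation: {observation}")    # feed the result back
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("What is the capital of France?")
print(answer)
```

The `max_steps` cap plays the same role as an iteration limit on a real agent: it stops a model that never reaches a final answer.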

---

#### 🔸 B. **OpenAI Functions Agent**

📘 Uses OpenAI’s `function_calling` capability (GPT-3.5/4) to **invoke tools like API functions** directly.

```python
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI

agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(model="gpt-4", temperature=0),
    agent=AgentType.OPENAI_FUNCTIONS,
)
```

✅ **When to Use**:

* You want function-calling interface (JSON-style interaction)
* GPT-4 with structured outputs
* Less verbose than ReAct

---

#### 🔸 C. **Tool-Using Agent (Zero-shot or Multi-tool)**

📘 Uses tool descriptions only (no examples) to **decide which tool to call**.

```python
agent = initialize_agent(
    tools=tools,  # this agent type expects single-input tools
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
```

✅ **When to Use**:

* You have several tools and want the agent to pick the best
* Multi-tool workflows (weather, calculator, search, etc.)

---

### 3. ✅ Multi-Tool Orchestration

LangChain agents can:

* Choose **one tool** or multiple tools in sequence
* Chain thoughts + tool use recursively
* Combine with `@tool`, `Tool`, `Toolkit` classes

✅ Example:

```python
from langchain.agents import AgentType, initialize_agent

tools = [search_tool, calc_tool, code_tool]
agent = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS)
```

Then invoke:

```python
agent.invoke("Search the weather and calculate what to wear for 15°C.")
```

---

### 4. ✅ Memory-Integrated Agents

Agents can **retain memory** of the conversation using `ConversationBufferMemory`, `ConversationSummaryMemory`, etc.

📘 Add memory like this:

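```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import MessagesPlaceholder

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    memory=memory,
    # Give the prompt a slot where the stored messages are injected
    agent_kwargs={
        "extra_prompt_messages": [MessagesPlaceholder(variable_name="chat_history")]
    },
)
```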

✅ Agent will now remember past questions, facts, and its own responses.

---

### 5. ✅ Use Cases, Pros, Cons

| Use Case                | Agent Type                     |
| ----------------------- | ------------------------------ |
| Complex reasoning       | ReAct Agent                    |
| API automation          | OpenAI Functions               |
| Many tools, no training | Zero-shot Tool Agent           |
| Human-like assistant    | Memory agent with chat history |

---

| ✅ Advantages                      | ❌ Disadvantages               |
| --------------------------------- | ----------------------------- |
| Dynamic, intelligent behavior     | Higher latency                |
| Supports tools, memory, reasoning | Can hallucinate tool names    |
| Extensible with any tool          | Needs clear tool descriptions |

---

### 6. ✅ Best Model Integrations

| Agent Type       | Best With                              |
| ---------------- | -------------------------------------- |
| ReAct            | OpenAI, Claude                         |
| OpenAI Functions | GPT-4 with `function_calling`          |
| Tool-Using       | All LLMs via LangChain                 |
| Memory Agents    | Any ChatModel (`ChatOpenAI`, `ChatAnthropic`) |

---

### 7. ✅ Tips & Best Practices

| Tip                                          | Why                             |
| -------------------------------------------- | ------------------------------- |
| Use `@tool` with good docstrings             | LLM uses them to choose tools   |
| Prefer `OpenAI Functions` for clarity        | JSON format & API alignment     |
| Add memory for chatbot-style agents          | Keeps context across steps      |
| Use `AgentExecutor` for fine-grained control | Cap steps and run time (`max_iterations`, `max_execution_time`) |

---




---

# 🧠 **7. Memory**

---

### 1. ✅ Definition

**Memory** in LangChain allows chains and agents to remember past interactions. This enables **contextual conversations**, follow-ups, and continuity — making LLMs feel more intelligent and less repetitive.

Memory stores previous inputs/outputs and feeds them back into prompts automatically.

---

### 2. ✅ Types & Built-in Memory Classes

---

#### 🔸 A. `ConversationBufferMemory`

📘 **Stores** the entire raw conversation (input/output) as a buffer.

```python
from langchain.memory import ConversationBufferMemory
```

✅ **When to Use**:

* You want full chat history preserved
* Best for assistant/chatbots with short-to-mid term memory

🧠 Example:

```python
memory = ConversationBufferMemory()
```

🔁 Feeds entire text history into the prompt like:

```
User: Hello  
AI: Hi!  
User: What’s my name?  
```

---

#### 🔸 B. `ConversationSummaryMemory`

📘 **Summarizes** the previous messages to stay within context window limits.

```python
from langchain.memory import ConversationSummaryMemory
```

✅ **When to Use**:

* Long conversations where full history won't fit into context
* You want compression + memory

🧠 How it works:

* Uses an LLM to summarize past interactions into a single paragraph

```python
memory = ConversationSummaryMemory(llm=llm)
```

---

#### 🔸 C. `ConversationBufferWindowMemory`

📘 Keeps only the **last `k` exchanges** (human + AI message pairs) in the buffer.

```python
from langchain.memory import ConversationBufferWindowMemory
```

✅ **When to Use**:

* You want to limit token usage but preserve recent context (e.g., last 3 exchanges)

🧠 Example:

```python
memory = ConversationBufferWindowMemory(k=3)
```

---

#### 🔸 D. `ConversationKGMemory` (Knowledge Graph-based)

📘 Tracks **entities and their relationships** from conversation history.

```python
# In v0.3 the class lives in langchain-community
from langchain_community.memory.kg import ConversationKGMemory
```

✅ **When to Use**:

* You want to track subjects like people, places, and objects
* Especially useful for QA bots or assistants handling structured data

---

#### 🔸 E. `ConversationEntityMemory`

📘 Extracts and tracks **named entities** across the conversation.

```python
from langchain.memory import ConversationEntityMemory
```

✅ **When to Use**:

* Chatbot needs to remember people, organizations, etc. by name
* Personal assistants, customer service bots

---

### 3. ✅ Adding Memory to Chains & Agents

---

#### 🔹 A. With Agents

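```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(),
    # This agent type's prompt already has a chat_history slot
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    memory=ConversationBufferMemory(memory_key="chat_history", return_messages=True),
)
```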

#### 🔹 B. With Chains

```python
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

chain = ConversationChain(
    llm=ChatOpenAI(),
    memory=ConversationBufferMemory()
)
```

---

### 4. ✅ Custom Memory Types

You can also build your **own memory class** by subclassing:

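```python
from langchain_core.memory import BaseMemory

class MyCustomMemory(BaseMemory):
    @property
    def memory_variables(self) -> list[str]:
        ...  # names of the prompt variables this memory provides

    def load_memory_variables(self, inputs: dict) -> dict:
        ...  # return {variable_name: value} to inject into the prompt

    def save_context(self, inputs: dict, outputs: dict) -> None:
        ...  # persist the latest exchange

    def clear(self) -> None:
        ...  # reset the stored state
```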

✅ Use this when:

* You want to store memory in a DB
* Or fetch memories based on semantic similarity

---

### 5. ✅ Use Cases, Advantages, Limitations

| Use Case                | Best Memory                 |
| ----------------------- | --------------------------- |
| Chatbot (short context) | `ConversationBufferMemory`  |
| Long dialogue           | `ConversationSummaryMemory` |
| Entity tracking         | `ConversationEntityMemory`  |
| Structured info         | `ConversationKGMemory`      |

| ✅ Advantages                                | ❌ Disadvantages                         |
| ------------------------------------------- | --------------------------------------- |
| Retains memory without you managing context | Can leak irrelevant history into prompt |
| Supports summarization and scaling          | Summarizers may hallucinate facts       |
| Multiple plug-and-play types                | Some memory types increase token usage  |

---

### 6. ✅ Best Model Integrations

| Memory Type                                        | Best With                                      |
| -------------------------------------------------- | ---------------------------------------------- |
| `ConversationBufferMemory`                         | Any chat model (`ChatOpenAI`, `ChatAnthropic`) |
| `ConversationSummaryMemory`                        | GPT-4 (for better summaries)                   |
| `ConversationKGMemory`, `ConversationEntityMemory` | Claude, GPT-4 (structured extraction)          |

---

### 7. ✅ Tips & Best Practices

| Tip                                         | Why                                   |
| ------------------------------------------- | ------------------------------------- |
| Use `ConversationBufferMemory` to start            | Easiest and most transparent          |
| For long sessions, use `ConversationSummaryMemory` | Avoids context overflow               |
| Combine with `MessagesPlaceholder`          | Plug memory into `ChatPromptTemplate` |
| Use in agents to enable multi-step planning | Maintains logical continuity          |

---

🧠 Example: Integrating Memory in a Chain

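```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI()
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant."),
    MessagesPlaceholder(variable_name="chat_history"),  # prior turns come before the new input
    ("user", "{input}"),
])

chain = prompt | llm | StrOutputParser()
history = memory.load_memory_variables({})["chat_history"]  # list of messages
answer = chain.invoke({"input": "Hi", "chat_history": history})
memory.save_context({"input": "Hi"}, {"output": answer})  # LCEL chains do not update memory themselves
```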

---



---

# 📄 **1. Document Loaders**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Document Loaders** are LangChain components that **ingest and convert raw data sources** (like PDFs, text files, websites, emails, databases) into a standard `Document` format (containing `page_content` and `metadata`).

These are essential in any **RAG** pipeline, where LLMs answer questions based on external knowledge.

---

### 2. ✅ Common Loader Types & Usage

---

#### 🔸 A. `TextLoader`

📘 Loads plain `.txt` files into documents.

```python
from langchain_community.document_loaders import TextLoader

loader = TextLoader("example.txt")
docs = loader.load()
```

✅ When to Use:

* Simple plain-text files
* CLI data or code exports

---

#### 🔸 B. PDF Loaders (`PyPDFLoader`)

📘 Loads text from PDFs using `pypdf`, producing one `Document` per page by default.

```python
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("sample.pdf")
docs = loader.load()
```

✅ When to Use:

* Academic papers
* Reports and manuals (scanned PDFs without a text layer need OCR first)

📌 Alternative:

```python
from langchain_community.document_loaders import PDFMinerLoader, PDFPlumberLoader, PyMuPDFLoader
```

---

#### 🔸 C. `WebBaseLoader`

📘 Loads content from a web page URL.

```python
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://en.wikipedia.org/wiki/LangChain")
docs = loader.load()
```

✅ When to Use:

* Knowledge scraping from websites
* Wikipedia, blogs, documentation

📌 Uses `requests + BeautifulSoup` for HTML parsing.

---

#### 🔸 D. `UnstructuredLoader` (via Unstructured.io)

📘 Parses complex layouts like tables, headers, lists, multi-column PDFs.

```python
from langchain_community.document_loaders import UnstructuredFileLoader

loader = UnstructuredFileLoader("invoice.pdf")
docs = loader.load()
```

✅ When to Use:

* You need semantic parsing beyond plain text
* Ideal for enterprise docs, forms, and tables

🔁 Requires: `pip install unstructured` (for PDFs, install the `unstructured[pdf]` extra)

---

#### 🔸 E. `PlaywrightURLLoader`

📘 Uses a headless browser to load and render dynamic JavaScript-heavy websites (like React apps).

```python
from langchain_community.document_loaders import PlaywrightURLLoader

loader = PlaywrightURLLoader(["https://example.com"])
docs = loader.load()
```

✅ When to Use:

* Pages that require scrolling, clicking, or dynamic rendering
* Websites with SPA (single-page apps)

🔁 Requires: `pip install playwright unstructured`, then `playwright install`

---

### 3. ✅ Document Format

All loaders output:

```python
[
  Document(
    page_content="Text goes here...",
    metadata={"source": "example.pdf", ...}
  )
]
```

✅ You can access both the **text** and **where it came from** — important for retrieval and citation.

---

### 4. ✅ Use Cases, Pros, Cons

| Use Case            | Loader                                  |
| ------------------- | --------------------------------------- |
| Ingest web articles | `WebBaseLoader`                         |
| Academic PDFs       | `PyPDFLoader`, `UnstructuredFileLoader` |
| Dynamic JS content  | `PlaywrightURLLoader`                   |
| Plain files         | `TextLoader`                            |

---

| ✅ Advantages                | ❌ Disadvantages                   |
| --------------------------- | --------------------------------- |
| Supports all common formats | Parsing quality depends on loader |
| Pluggable and chainable     | Some need extra dependencies      |
| Includes source metadata    | Dynamic content can break parsers |

---

### 5. ✅ Best Practices

| Tip                                                       | Why                            |
| --------------------------------------------------------- | ------------------------------ |
| Use metadata fields like `source`                         | Helps retrieval/citation later |
| Preprocess long PDFs with splitters                       | Avoids context overflow        |
| Use `UnstructuredLoader` for quality layout-aware parsing | Better than naive loaders      |

---




---

# ✂️ **2. Text Splitters**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Text Splitters** in LangChain break large texts into smaller, manageable **chunks** that fit into LLM context limits and can be indexed into vector databases.

They preserve semantic meaning while optimizing for token length, chunk overlap, and relevance.

---

### 2. ✅ Types & Built-in Splitters

---

#### 🔸 A. `RecursiveCharacterTextSplitter` ✅ *(Most Recommended)*

📘 **Smart splitting** that recursively tries separators in order (paragraph → line → word → character) until each chunk fits the desired size.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100
)

chunks = splitter.split_documents(documents)
```

✅ Best for:

* Most RAG tasks
* Unstructured docs with mixed content
* Balances chunk size + coherence

---

#### 🔸 B. `CharacterTextSplitter`

📘 Naively splits text on a single separator string (`"\n\n"` by default) and merges the pieces up to `chunk_size`.

```python
from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(
    separator="\n\n",  # split on blank lines between paragraphs
    chunk_size=1000,
    chunk_overlap=200
)
```

✅ Best for:

* Simple use cases
* Controlled formats (e.g., logs, articles)

⚠️ Limitation: No semantic fallback if text isn't cleanly separated.

---

#### 🔸 C. `SentenceTransformersTokenTextSplitter`

📘 Splits text based on **tokens**, not characters — uses HuggingFace tokenizer.

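```python
from langchain.text_splitter import SentenceTransformersTokenTextSplitter

# tokens_per_chunk must not exceed the model's max sequence length
# (384 tokens for the default all-mpnet-base-v2 model)
splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=50, tokens_per_chunk=256)
```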

✅ Best for:

* Fine-tuned control by token count
* Embedding-quality-sensitive tasks

---

#### 🔸 D. `NLTKTextSplitter` or `SpacyTextSplitter`

📘 Splits using **natural language rules**: paragraphs, sentences (requires NLTK or spaCy).

```python
from langchain.text_splitter import NLTKTextSplitter

splitter = NLTKTextSplitter()
```

✅ Best for:

* Academic documents
* News, legal, medical use cases

---

### 3. ✅ Chunk Overlap (Best Practices)

| Parameter       | Why It Matters                                                               |
| --------------- | ---------------------------------------------------------------------------- |
| `chunk_size`    | How much content fits into one chunk — balance between detail and token cost |
| `chunk_overlap` | How much overlap between chunks — helps with context continuity              |

✅ **Recommended Settings**:

```python
chunk_size = 500
chunk_overlap = 100
```

📌 Why overlap?

> To ensure that related information isn't split between chunks and lost during retrieval.
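
🧠 The effect of `chunk_overlap` is easiest to see on a tiny string. The function below is a simplified character-level sketch, not LangChain's separator-aware algorithm, and `chunk_with_overlap` is a hypothetical helper name:

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a fixed-size window over the text, stepping by size minus overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


chunks = chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, so a fact that straddles a boundary still appears whole in at least one chunk. The cost is more chunks to embed and store.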

---

### 4. ✅ Use Cases, Advantages, Limitations

| Use Case            | Splitter                                |
| ------------------- | --------------------------------------- |
| Raw PDFs, web pages | `RecursiveCharacterTextSplitter`        |
| Logs, clean text    | `CharacterTextSplitter`                 |
| Token-precise tasks | `SentenceTransformersTokenTextSplitter` |
| Sentence integrity  | `NLTKTextSplitter`, `SpacyTextSplitter` |

| ✅ Advantages                    | ❌ Disadvantages                           |
| ------------------------------- | ----------------------------------------- |
| Optimizes for LLM limits        | Poorly tuned parameters = loss of context |
| Preserves semantic meaning      | Token-aware splitting needs setup         |
| Supports overlap for better RAG | Needs experimentation for best chunking   |

---

### 5. ✅ Example: Load → Split

```python
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load
docs = TextLoader("example.txt").load()

# Split
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = splitter.split_documents(docs)
```

Each `chunk` will have:

```python
Document(
  page_content="This is a small section of the full doc...",
  metadata={"source": "example.txt"}
)
```

---




---

# 🧬 **3. Embedding Models**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Embedding Models** convert text (chunks, queries, documents) into **high-dimensional vectors** (lists of floats) that capture semantic meaning. These vectors are used for **similarity search** in vector databases.

LangChain supports several pluggable embedding providers (OpenAI, Cohere, HuggingFace, etc.).

---

### 2. ✅ Common Embedding Providers & Imports

---

| Model                             | Import                                                            | Use Case                          |
| --------------------------------- | ----------------------------------------------------------------- | --------------------------------- |
| 🔹 `OpenAIEmbeddings`             | `from langchain_openai import OpenAIEmbeddings`                   | Cloud, accurate, default for GPT  |
| 🔹 `HuggingFaceEmbeddings`        | `from langchain_huggingface import HuggingFaceEmbeddings`         | Local/offline use, open models    |
| 🔹 `CohereEmbeddings`             | `from langchain_cohere import CohereEmbeddings`                   | Alternative to OpenAI, commercial |
| 🔹 `GoogleGenerativeAIEmbeddings` | `from langchain_google_genai import GoogleGenerativeAIEmbeddings` | Google’s embedding APIs           |

---

### 3. ✅ How to Create Embeddings

---

#### 🔸 A. OpenAI (Default & Most Common)

```python
from langchain_openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings()
vectors = embedder.embed_documents(["Apple is a fruit", "Google is a company"])
```

✅ Best for:

* Plug-and-play with GPT LLMs
* Works with any vector store (Pinecone, Weaviate, FAISS, etc.)

---

#### 🔸 B. HuggingFace (Local Embeddings)

```python
# pip install langchain-huggingface sentence-transformers
from langchain_huggingface import HuggingFaceEmbeddings

embedder = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectors = embedder.embed_documents(["Apple is a fruit", "Google is a company"])
```

✅ Best for:

* Local/offline inference
* Fine-tuned open-source tasks

📦 Requires: `transformers`, `sentence-transformers`

---

#### 🔸 C. Cohere

```python
from langchain_cohere import CohereEmbeddings

# A model name is required; reads COHERE_API_KEY from the environment
embedder = CohereEmbeddings(model="embed-english-v3.0")
vectors = embedder.embed_documents(["LangChain is great"])
```

✅ Best for:

* Commercial alternatives to OpenAI
* Good multilingual support

---

### 4. ✅ Normalization (Best Practice)

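To keep **cosine similarity** scores consistent, **normalize** vectors to unit length.

Some embedding APIs (OpenAI's, for example) already return unit-length vectors, and LangChain's FAISS wrapper can normalize for you with `normalize_L2=True`. You can also do it manually: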

```python
import numpy as np

normalized = [v / np.linalg.norm(v) for v in vectors]
```

📌 Helps with:

* Consistent similarity ranking
* Avoiding bias from vector magnitude
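
🧠 A worked example of why this matters, in plain Python with made-up 2-D vectors. Before normalization, the dot product rewards longer vectors; after normalization, it equals cosine similarity:

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: list[float]) -> list[float]:
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice as long
c = [4.0, 3.0]  # different direction

raw_ab = dot(a, b)  # inflated by b's length
raw_ac = dot(a, c)

na, nb, nc = l2_normalize(a), l2_normalize(b), l2_normalize(c)
unit_ab = dot(na, nb)  # matches cosine(a, b)
unit_ac = dot(na, nc)  # matches cosine(a, c)
print(raw_ab, raw_ac, unit_ab, unit_ac)
```

Once vectors have unit length, ranking by dot product, cosine similarity, or L2 distance gives the same order, which is why many stores only need one of them.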

---

### 5. ✅ Use Cases, Pros, Cons

| Use Case                | Example                              |
| ----------------------- | ------------------------------------ |
| RAG Search              | Embed + store chunks in a vector DB  |
| Semantic Matching       | Search docs, FAQs, knowledge base    |
| Cross-lingual retrieval | Multilingual embeddings (Cohere, HF) |

| ✅ Advantages                  | ❌ Disadvantages                           |
| ----------------------------- | ----------------------------------------- |
| Encodes semantic meaning      | Quality depends on model                  |
| Plug-and-play with many LLMs  | Embeddings are opaque (hard to interpret) |
| Supports both cloud and local | Some models require GPU locally           |

---

### 6. ✅ Best Practices

| Practice                                     | Why It’s Useful                            |
| -------------------------------------------- | ------------------------------------------ |
| Use the **same embedder** for docs & queries | Ensures vector space alignment             |
| Use **chunk metadata** with vectors          | Helps with filtering (e.g., source, topic) |
| Normalize vectors for cosine distance        | Prevents score bias                        |

---

### 7. ✅ Example: Embed and Prepare for Vector Store

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embedder = OpenAIEmbeddings()
texts = ["chunk 1", "chunk 2", "chunk 3"]

# Embed text chunks directly (handy for inspecting the raw vectors)
vectors = embedder.embed_documents(texts)

# Build the vector DB (e.g., FAISS); from_texts embeds the texts itself
db = FAISS.from_texts(texts, embedding=embedder)
```

---




---

# 🧲 **4. Vector Stores**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Vector Stores** are specialized databases for storing and retrieving **embedding vectors** based on similarity (cosine, dot product, etc.).

In LangChain, they store chunked documents and allow fast **semantic search** — finding the most relevant chunks for a query.

---

### 2. ✅ Supported Vector Stores & Imports

| Vector Store  | Import                                               | Key Trait                                        |
| ------------- | ---------------------------------------------------- | ------------------------------------------------ |
| 🔹 `FAISS`    | `from langchain_community.vectorstores import FAISS` | Local, lightweight, fast                         |
| 🔹 `Chroma`   | `from langchain_chroma import Chroma`                | Local/embedded DB, persistent                    |
| 🔹 `Qdrant`   | `from langchain_qdrant import QdrantVectorStore`     | Open-source, scalable, metadata filtering        |
| 🔹 `Weaviate` | `from langchain_weaviate import WeaviateVectorStore` | Cloud/local, hybrid search                       |
| 🔹 `Pinecone` | `from langchain_pinecone import PineconeVectorStore` | Managed service, powerful filtering, cloud-scale |

---

### 3. ✅ Adding Documents to Vector Store

All vector stores follow the same pattern:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

texts = ["Doc1 content", "Doc2 content"]
metadatas = [{"source": "pdf1"}, {"source": "pdf2"}]
embedding = OpenAIEmbeddings()

db = FAISS.from_texts(texts, embedding=embedding, metadatas=metadatas)
```

✅ Each `Document` contains:

* `page_content` (the chunk text)
* `metadata` (source, id, topic, etc.)

---

### 4. ✅ Searching (Retrieval)

```python
retrieved_docs = db.similarity_search("What is LangChain?", k=3)
```

* `similarity_search(query, k)`: Basic nearest neighbor search
* `similarity_search_with_score(query, k)`: Returns `(Document, score)` pairs; with FAISS the score is an L2 distance, so lower means closer

---

### 5. ✅ Metadata Filtering (Advanced)

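Most vector stores accept a `filter` argument, but syntax and behavior differ. **Chroma**, **Qdrant**, **Pinecone**, and **Weaviate** filter inside the index, while LangChain's **FAISS** wrapper filters after fetching `fetch_k` candidates.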

Example:

```python
results = db.similarity_search(
  query="Show me AWS docs",
  filter={"source": "aws_docs"},
  k=2
)
```

✅ Use this to:

* Filter by document source
* Restrict by author, domain, timestamp, etc.

---

### 6. ✅ Persistence (Save & Load)

#### 🔹 FAISS:

```python
db.save_local("faiss_index/")
# Loading unpickles data, so only load index files you trust
db = FAISS.load_local("faiss_index/", OpenAIEmbeddings(), allow_dangerous_deserialization=True)
```

#### 🔹 Chroma:

```python
db = Chroma(persist_directory="db/", embedding_function=embedding)
```

---

### 7. ✅ When to Use Which?

| Store      | Best For                    | Offline?  | Metadata Filtering?   |
| ---------- | --------------------------- | --------- | --------------------- |
| `FAISS`    | Local dev, fast, simple     | ✅         | ⚠️ (post-filter only) |
| `Chroma`   | Lightweight, persistent     | ✅         | ✅                     |
| `Qdrant`   | Scalable + filters          | ✅/☁️      | ✅                     |
| `Weaviate` | Hybrid (semantic + keyword) | ✅/☁️      | ✅                     |
| `Pinecone` | Enterprise RAG at scale     | ❌ (cloud) | ✅ (rich support)      |

---

### 8. ✅ Use Cases, Pros, Cons

| Use Case               | Store                            |
| ---------------------- | -------------------------------- |
| Local notebooks, demos | `FAISS`, `Chroma`                |
| Long-term persistence  | `Chroma`, `Qdrant`               |
| Filtered search        | `Qdrant`, `Weaviate`, `Pinecone` |

| ✅ Advantages               | ❌ Limitations                     |
| -------------------------- | --------------------------------- |
| Fast semantic retrieval    | Most don’t support hybrid ranking |
| Scalable + pluggable       | Must tune chunking for quality    |
| Filters + metadata support | FAISS filters only after retrieval |

---

### 9. ✅ Example: Full Pipeline

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader

# Load + Split
docs = TextLoader("book.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# Embed + Store
db = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Search
results = db.similarity_search("What is the main idea?", k=2)
print(results[0].page_content)
```

---


---

# 🔍 **5. Retrievers**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Retrievers** are LangChain wrappers around vector stores (or other sources) that **fetch the most relevant chunks** for a user query, based on semantic similarity or enhanced logic.

They’re used in **RAG pipelines** to dynamically retrieve the right documents for the LLM to generate grounded answers.

---

### 2. ✅ Built-in Retriever Types & Usage

---

#### 🔸 A. `VectorStoreRetriever`

📘 Wraps a vector store like FAISS, Chroma, etc., for basic similarity search.

```python
retriever = db.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What is LangChain?")
```

✅ Use when:

* You want simple top-k vector search
* No additional filtering or logic required

---

#### 🔸 B. `MultiQueryRetriever`

📘 Generates **multiple reworded queries** using an LLM → retrieves documents using each → aggregates results.

```python
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

retriever = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(),
    llm=ChatOpenAI()
)
docs = retriever.invoke("Tell me about LangChain agents")
```

✅ Use when:

* Your query might be ambiguous or have multiple angles
* You want to maximize recall (get more relevant info)

🧠 Example:
For `"LangChain memory"`, it might generate:

* "How does LangChain store history?"
* "What is ConversationMemory in LangChain?"

---

#### 🔸 C. `ContextualCompressionRetriever`

📘 Retrieves documents → compresses them using an LLM → returns only the **most relevant parts** of each chunk.

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

compressor = LLMChainExtractor.from_llm(ChatOpenAI())
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=db.as_retriever()
)
```

✅ Use when:

* Chunks are too long, noisy, or redundant
* You want to **refine retrieved data** before sending it to the LLM

---

#### 🔸 D. `BM25Retriever`, `ParentDocumentRetriever`, etc.

Other advanced retrievers (optional):

* `BM25Retriever`: Keyword-based retrieval using BM25 ranking (a TF-IDF-style scoring function)
* `ParentDocumentRetriever`: Maps chunk → parent doc for full context
* `EnsembleRetriever`: Combines multiple retrievers (e.g., BM25 + vector)
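
🧠 To show what keyword-based ranking means in practice, here is a compact BM25 scorer in plain Python. It is a sketch of the standard formula with a smoothed, non-negative IDF and whitespace tokenization, not the code behind `BM25Retriever`; the function name, sample documents, and parameter defaults are illustrative.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # only exact keyword matches contribute
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


docs = [
    "langchain agents use tools",
    "vector stores hold embeddings",
    "agents can use memory",
]
scores = bm25_scores("vector embeddings", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Documents that share no words with the query score zero, no matter how related they are in meaning. That is the gap `EnsembleRetriever` closes by combining BM25 with vector search.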

---

### 3. ✅ Use Cases, Pros, and Cons

| Use Case                   | Retriever                        |
| -------------------------- | -------------------------------- |
| Simple FAQ / docs          | `VectorStoreRetriever`           |
| Complex or broad questions | `MultiQueryRetriever`            |
| Large, verbose chunks      | `ContextualCompressionRetriever` |
| Legal/long docs            | `ParentDocumentRetriever`        |

| ✅ Advantages                                  | ❌ Limitations                           |
| --------------------------------------------- | --------------------------------------- |
| Simple API to connect LLMs to relevant data   | Overlap/redundancy if not chunked well  |
| Easy to combine, extend, compose              | LLM-based retrievers increase latency   |
| Contextual compression = smaller prompt sizes | Multi-query can retrieve too much noise |

---

### 4. ✅ Best Practices

| Practice                                        | Why                                                   |
| ----------------------------------------------- | ----------------------------------------------------- |
| Tune `chunk_size` + `k`                         | Prevents under/over-retrieving                        |
| Use `MultiQueryRetriever` in open-ended QA      | Improves recall                                       |
| Use `ContextualCompressionRetriever` with GPT-4 | Gives focused context                                 |
| Combine `metadata filters`                      | Useful for document segmentation by source/topic/date |

---

### 5. ✅ Example RAG Flow with Retriever

```python
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

retriever = db.as_retriever(search_kwargs={"k": 3})

# Connect retriever to chain
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=retriever
)

response = qa_chain.invoke({"query": "Explain memory in LangChain"})
print(response['result'])
```

---

### 6. ✅ Bonus: When to Use What?

| Retriever                        | Best With               | Key Strength          |
| -------------------------------- | ----------------------- | --------------------- |
| `VectorStoreRetriever`           | All vector stores       | Fast, simple          |
| `MultiQueryRetriever`            | GPT-4/Claude            | Deep recall           |
| `ContextualCompressionRetriever` | GPT-4                   | Clean + short results |
| `BM25Retriever`                  | Legal, code, structured | Exact matches         |
| `ParentDocumentRetriever`        | Large docs              | Full-document context |

---

# 🔗 **6. Retrieval Chains**

*(Part of 🟠 III. Data-Aware / RAG Systems)*

---

### 1. ✅ Definition

**Retrieval Chains** are LangChain pipelines that combine:

* 🔍 **Retriever** (fetch relevant documents)
* 💬 **LLM** (answer question using those docs)
* 🧠 Optionally: **memory, search tools, filters**

This forms a complete **RAG (Retrieval-Augmented Generation)** system.

---

### 2. ✅ Built-in Retrieval Chains

---

#### 🔸 A. `RetrievalQA`

📘 Basic question answering over documents using retrieval. (A legacy chain in v0.3; `create_retrieval_chain` is the newer equivalent.)

```python
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=retriever,        # any retriever
    chain_type="stuff",         # how docs are passed into prompt
)
```

✅ Use when:

* You need simple, fast document QA
* No chat memory required

---

#### 🔸 B. `ConversationalRetrievalChain`

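📘 A multi-turn chat-style retrieval chain with **memory support**. (A legacy chain in v0.3; the newer route is `create_history_aware_retriever` + `create_retrieval_chain`.)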

```python
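from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

chat_chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(),
    retriever=retriever,
    # memory_key must match the chain's "chat_history" input
    memory=ConversationBufferMemory(memory_key="chat_history", return_messages=True)
)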
```

✅ Use when:

* You want a chatbot experience with history
* Follow-up questions, clarification, etc.

---

### 3. ✅ How Chains Work Internally

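For both chains:

1. User asks a question
2. Retriever fetches relevant chunks
3. Prompt template fills in question + context
4. LLM answers from the retrieved context (the prompt instructs it to; this is not strictly enforced)

`ConversationalRetrievalChain` adds one step before retrieval: it uses the chat history to rewrite a follow-up into a standalone question.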

> 📌 Prompt style depends on `chain_type`:

* `"stuff"` – simply concatenates all docs
* `"map_reduce"` – processes chunks individually then summarizes
* `"refine"` – incrementally improves answer using each doc
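
The difference between the three chain types is mostly about how many model calls are made and what each call sees. The sketch below mimics that with a stand-in `fake_llm` function (an assumption for illustration; no real model is called) and simplified prompts that are not LangChain's actual templates.

```python
calls = []  # records every prompt sent to the stand-in model

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a short tag instead of text."""
    calls.append(prompt)
    return f"out{len(calls)}"

def stuff(question, docs):
    # One call: all docs concatenated into a single prompt
    context = "\n\n".join(docs)
    return fake_llm(f"Context:\n{context}\n\nQuestion: {question}")

def map_reduce(question, docs):
    # One call per doc (map), then one call to combine the partial answers (reduce)
    partials = [fake_llm(f"Context:\n{d}\n\nQuestion: {question}") for d in docs]
    return fake_llm("Combine these partial answers:\n" + "\n".join(partials))

def refine(question, docs):
    # Sequential: each further doc is used to improve the previous answer
    answer = fake_llm(f"Context:\n{docs[0]}\n\nQuestion: {question}")
    for d in docs[1:]:
        answer = fake_llm(f"Existing answer: {answer}\nNew context:\n{d}\nQuestion: {question}")
    return answer

docs = ["chunk one", "chunk two", "chunk three"]
call_counts = {}
for strategy in (stuff, map_reduce, refine):
    calls.clear()
    strategy("What is memory?", docs)
    call_counts[strategy.__name__] = len(calls)

print(call_counts)  # number of model calls per strategy
```

With three chunks this makes 1, 4, and 3 calls respectively. `"stuff"` is cheapest but must fit everything into one prompt, while `"refine"` cannot be parallelized because each call depends on the previous answer.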

---

### 4. ✅ Adding Tools, Memory, Filters

| Feature                | How                                                                              |
| ---------------------- | -------------------------------------------------------------------------------- |
| ✅ Memory               | Use `ConversationalRetrievalChain.from_llm(..., memory=...)`                     |
| ✅ Metadata filters     | Use `retriever = db.as_retriever(search_kwargs={"filter": {"source": "file1"}})` |
| ✅ Custom prompts       | Use `RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": my_prompt})`  |
| ✅ Tool-enhanced agents | Wrap the retriever with `create_retriever_tool()` and pass it to the agent       |
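
As a rough mental model of what `search_kwargs={"k": ..., "filter": {...}}` asks for, the sketch below filters on metadata and then keeps the `k` best-scoring chunks. The documents, scores, and the `retrieve` helper are made up for illustration; real vector stores differ in whether the filter runs before or after the similarity search.

```python
docs = [
    {"text": "Buffer memory stores raw messages", "source": "file1", "score": 0.91},
    {"text": "Agents pick tools at runtime", "source": "file2", "score": 0.88},
    {"text": "Summary memory condenses history", "source": "file1", "score": 0.75},
    {"text": "Entity memory tracks named things", "source": "file1", "score": 0.60},
]

def retrieve(docs, k, metadata_filter=None):
    """Keep docs matching every filter key, then return the k highest-scoring ones."""
    metadata_filter = metadata_filter or {}
    matching = [
        d for d in docs
        if all(d.get(key) == value for key, value in metadata_filter.items())
    ]
    return sorted(matching, key=lambda d: d["score"], reverse=True)[:k]

top = retrieve(docs, k=2, metadata_filter={"source": "file1"})
print([d["text"] for d in top])
```

Without the filter, the `file2` chunk would take the second slot even though it comes from an unrelated source.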

---

### 5. ✅ Use Cases, Pros & Cons

| Use Case                      | Chain                          |
| ----------------------------- | ------------------------------ |
| Static document QA            | `RetrievalQA`                  |
| Long-form chatbot             | `ConversationalRetrievalChain` |
| Code, legal, or PDF assistant | Either + memory                |

| ✅ Advantages                        | ❌ Limitations                               |
| ----------------------------------- | ------------------------------------------- |
| Easy to plug into any retriever     | RAG quality depends on retrieval relevance  |
| Flexible (LLM + retriever + prompt) | Need to manage chunk size, overlap, filters |
| Supports memory & tools             | Some chain types (e.g. refine) are slow     |

---

### 6. ✅ Example: Build a RAG Chatbot

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

# memory_key must be "chat_history" so the chain can read past turns
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
retriever = db.as_retriever(search_kwargs={"k": 3})  # `db` is your vector store

chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(),
    retriever=retriever,
    memory=memory
)

# With memory attached, pass only the question; history is tracked for you
question = "What is LangChain memory?"

result = chain.invoke({"question": question})
print(result['answer'])
```

---

### 7. ✅ Best Practices

| Tip                                             | Why                           |
| ----------------------------------------------- | ----------------------------- |
| Tune `chunk_size`, `k`, and `chain_type`        | Impacts retrieval relevance   |
| Use `ConversationalRetrievalChain` for chatbots | Tracks conversation           |
| Use metadata filters with retrievers            | More precise control          |
| Try `map_reduce` for long docs                  | Stays within context limits   |

---

### 8. ✅ When to Use What

| Chain                          | Use When              | Memory   |
| ------------------------------ | --------------------- | -------- |
| `RetrievalQA`                  | One-shot Q&A          | ❌        |
| `ConversationalRetrievalChain` | Chatbot w/ follow-ups | ✅        |
| `Agent + Retriever`            | Tools + search + QA   | Optional |

---

# 🔵 **1. LangGraph (Stateful Flows)**

*(Part of 🔵 IV. Advanced Topics)*

---

### 1. ✅ Definition

**LangGraph** is a LangChain-native library for building **stateful, multi-step, and multi-agent workflows** using a **graph-based architecture**.

It’s ideal for:

* 🧠 **Multi-agent systems**
* 🔁 **Dynamic tool/step routing**
* 🗂️ **Persistent state & memory**
* 🔧 **Workflow orchestration with logic control**

Conceptually, LangGraph ≈ `LangChain runnables + a NetworkX-style graph API + a finite-state machine`. NetworkX is a design influence here, not a runtime dependency.

---

### 2. ✅ Key Concepts

| Concept   | Meaning                             |
| --------- | ----------------------------------- |
| **Nodes** | LLMs, tools, functions, agents      |
| **Edges** | Transitions between nodes           |
| **Graph** | A full pipeline with multiple paths |
| **State** | Shared context nodes read & update  |

---

### 3. ✅ Installation

```bash
pip install langgraph
```

---

### 4. ✅ Example Use Case: Multi-Agent Collaboration

> Build a **multi-agent research assistant**:
> One agent searches the web, one summarizes, one critiques.
> The skeleton below wires up the first two steps with placeholder functions.

#### Step-by-Step

```python
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langchain_core.runnables import RunnableLambda

# 1. Define state schema
class ResearchState(TypedDict):
    topic: str
    result: str

# 2. Define nodes (steps); each returns a partial state update
search_node = RunnableLambda(lambda state: {"result": f"Searching for {state['topic']}..."})
summarize_node = RunnableLambda(lambda state: {"result": f"Summarized: {state['result']}"})

# 3. Create graph
graph = StateGraph(ResearchState)
graph.add_node("search", search_node)
graph.add_node("summarize", summarize_node)
graph.set_entry_point("search")
graph.add_edge("search", "summarize")
graph.add_edge("summarize", END)

# 4. Compile and invoke
app = graph.compile()
output = app.invoke({"topic": "LangChain agents"})
print(output)
```

---

### 5. ✅ Advanced: Conditional Routing (Dynamic Graphs)

```python
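def route(state: ResearchState) -> str:
    # Summarize only if the search step produced a result; otherwise stop
    return "summarize" if state.get("result") else END

# Use this instead of the fixed "search" -> "summarize" edge
graph.add_conditional_edges("search", route)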
```

✅ Useful for:

* Deciding next step based on content
* Building feedback loops, retries, approvals

---

### 6. ✅ Multi-Agent Use Case

You can wrap agents or chains as graph nodes:

```python
graph.add_node("coding_agent", code_generation_chain)
graph.add_node("reviewer", reviewer_chain)
```

✅ Example:

* `coding_agent` → generates code
* `reviewer` → checks logic & style
* If rejected, loop back to `coding_agent`
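
The control flow of that reject-and-retry loop can be sketched in plain Python before wiring it into a graph. Everything here (`coding_agent`, `reviewer`, `route_after_review`, the retry cap of 3, and the `END` sentinel string) is a stand-in for illustration, not LangGraph's API.

```python
END = "__end__"  # sentinel naming the terminal step (hypothetical)

def coding_agent(state):
    # Stand-in for an LLM chain: each attempt bumps the draft version
    attempt = state["attempts"] + 1
    return {"code": f"draft v{attempt}", "attempts": attempt}

def reviewer(state):
    # Stand-in review rule: approve from the second draft onward
    return {"approved": state["attempts"] >= 2}

def route_after_review(state):
    # Loop back on rejection, but cap retries so the loop always ends
    if state["approved"] or state["attempts"] >= 3:
        return END
    return "coding_agent"

nodes = {"coding_agent": coding_agent, "reviewer": reviewer}
edges = {"coding_agent": lambda s: "reviewer", "reviewer": route_after_review}

state = {"code": "", "attempts": 0, "approved": False}
current = "coding_agent"
trace = []
while current != END:
    state.update(nodes[current](state))  # merge the node's partial update
    trace.append(current)
    current = edges[current](state)

print(trace)
print(state)
```

In LangGraph, the same router function would be attached with `graph.add_conditional_edges("reviewer", route_after_review)`. The retry cap matters: without it, a reviewer that never approves would loop forever.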

---

### 7. ✅ Use Cases, Pros, Cons

| Use Case                  | Description                               |
| ------------------------- | ----------------------------------------- |
| Multi-agent collaboration | Researcher ↔ Summarizer ↔ Critic          |
| Tool routers              | Route query to calculator, API, SQL, etc. |
| Conditional task flows    | Based on metadata, user input, memory     |
| Complex workflows         | Like Airflow but LLM-aware                |

| ✅ Pros                       | ❌ Cons                        |
| ---------------------------- | ----------------------------- |
| Precise control of LLM flows | Slightly complex setup        |
| Real-time dynamic routing    | Async required for full power |
| Easy to debug step by step   | Limited built-in visualizers  |

---

### 8. ✅ LangGraph + Memory

LangGraph supports **custom shared state** across nodes — like agent memories, counters, task logs, etc.

You define a `TypedDict` or `BaseModel` for state and pass it through.

```python
from typing import List, TypedDict

class TaskState(TypedDict):
    query: str
    history: List[str]
```

✅ Keeps LLMs **stateful and aware** in multi-step settings.
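
A node returns only the keys it wants to change, and the framework merges that partial update into the shared state. The sketch below imitates this with a hypothetical `apply_update` helper: plain keys are overwritten, while keys with a registered reducer are combined. In LangGraph, such a rule is declared on the state field itself, e.g. `Annotated[List[str], operator.add]`.

```python
import operator

# Per-key merge rules: keys listed here are combined, all others are overwritten
reducers = {"history": operator.add}

def apply_update(state, update):
    """Merge a node's partial update into the shared state and return a new dict."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged

state = {"query": "LangChain agents", "history": []}
state = apply_update(state, {"history": ["search done"]})
state = apply_update(state, {"query": "LangGraph", "history": ["summary done"]})
print(state)
```

Without a reducer on `history`, the second update would replace the list instead of appending to it, which is a common source of "lost" task logs.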

---

### 9. ✅ Best Practices

| Tip                                  | Reason                               |
| ------------------------------------ | ------------------------------------ |
| Use `RunnableLambda` for small steps | Quick test/debug logic               |
| Define a clear state schema          | Prevents bugs + improves readability |
| Add retries & fallback nodes         | Improves robustness                  |
| Mix tools + chains + agents          | Build real-world pipelines           |

---

### 10. ✅ Summary

| Feature                     | Benefit                          |
| --------------------------- | -------------------------------- |
| 🌐 Graph-based routing      | Complex flows made easy          |
| 🧠 Shared memory            | Real multi-agent reasoning       |
| 🔁 Loops & conditionals     | Flexible, intelligent automation |
| ⚙️ Plug & play chains/tools | Modular workflows                |

---

# ⚡ **2. Async & Streaming**

*(Part of 🔵 IV. Advanced Topics)*

---

### 1. ✅ Definition

LangChain supports:

* **Async LLM calls** → handle multiple LLM operations concurrently
* **Streaming outputs** → get tokens as they’re generated in real time
* **Event-driven workflows** → trigger updates/UI responses as the model "thinks"

These improve **speed**, **interactivity**, and **real-time user experience** — especially in agents, chatbots, and UI-integrated apps (like Streamlit, Gradio).

---

### 2. ✅ Async LLM Calls

---

#### 🔸 Why Use Async?

* Traditional sync (`.invoke()`) is blocking
* Async allows parallel LLM calls (e.g., agents, tools, retrievals)

#### 🔸 Example

```python
import asyncio
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4", temperature=0)

async def get_response():
    result = await llm.ainvoke("What is LangChain?")
    print(result.content)

asyncio.run(get_response())
```

✅ Useful for:

* **Concurrent** tool execution in agents
* **Multiple user queries** in parallel
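
A single `await` on its own is no faster than a blocking call; the gain comes from running several calls at once. Here is a minimal sketch using `asyncio.gather`, with a stand-in coroutine `fake_ainvoke` (an assumption for illustration) that sleeps instead of calling a real model.

```python
import asyncio
import time

async def fake_ainvoke(prompt: str) -> str:
    """Stand-in for `await llm.ainvoke(prompt)`; sleeps to mimic network latency."""
    await asyncio.sleep(0.1)
    return f"answer to: {prompt}"

async def main(prompts):
    # gather() starts all calls at once and returns results in input order
    return await asyncio.gather(*(fake_ainvoke(p) for p in prompts))

prompts = ["What is LangChain?", "What is LCEL?", "What is RAG?"]
start = time.perf_counter()
answers = asyncio.run(main(prompts))
elapsed = time.perf_counter() - start

print(answers)
print(f"{elapsed:.2f}s for {len(prompts)} calls")
```

The three calls overlap, so the total time is close to one call's latency and not the sum of all three. With a real model, replace `fake_ainvoke(p)` with `llm.ainvoke(p)`.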

---

### 3. ✅ Streaming Outputs (Token-by-Token Generation)

---

#### 🔸 Why Use Streaming?

* Immediate feedback to the user
* Great for chat interfaces and CLI tools
* More engaging UX

#### 🔸 With OpenAI Chat Models

```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(streaming=True)

def stream_tokens():
    stream = llm.pipe(StrOutputParser()).stream("Tell me a joke")
    for token in stream:
        print(token, end="", flush=True)

stream_tokens()
```

---

### 4. ✅ Async + Streaming (Together)

```python
async def stream_async():
    async for chunk in llm.astream("Give me a summary of LangChain"):
        print(chunk.content, end="")

asyncio.run(stream_async())
```

✅ Best for:

* Chatbots with typing effect
* UI apps with progress animations
* LLM dashboards

---

### 5. ✅ Event-Driven Architecture in LangChain

---

LangChain supports **event hooks** through its callbacks system (`BaseCallbackHandler`), letting you capture and handle events like:

* Token generation
* Tool invocation
* Chain start/end
* Errors

#### 🔸 Example: Custom Callback

```python
from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI

class PrintCallbackHandler(BaseCallbackHandler):
    # Called once per generated token when streaming is enabled
    def on_llm_new_token(self, token, **kwargs):
        print(token, end="", flush=True)

llm = ChatOpenAI(streaming=True, callbacks=[PrintCallbackHandler()])
llm.invoke("Tell me a short poem about AI")
```

✅ Useful for:

* Logging
* Live dashboards
* Tool monitoring

---

### 6. ✅ Real-World Use Cases

| Use Case                             | Feature Used                        |
| ------------------------------------ | ----------------------------------- |
| Chat app with token-by-token display | `.stream()`, `on_llm_new_token()`   |
| Dashboard that logs all actions      | Custom callbacks                    |
| Agent with parallel tools            | `async/await` LLM + tools           |
| Voice assistant                      | Streaming + async                   |

---

### 7. ✅ Pros & Limitations

| ✅ Advantages                 | ❌ Limitations                     |
| ---------------------------- | --------------------------------- |
| Faster, responsive UX        | Some LLMs don’t support streaming |
| Enables UI interactivity     | Async syntax more complex         |
| Better tool/agent throughput | Callbacks require setup & testing |

---

### 8. ✅ Best Practices

| Tip                                | Why                              |
| ---------------------------------- | -------------------------------- |
| Stream tokens in UIs (`.stream()`) | Feels live & responsive          |
| Use `ainvoke()` + `asyncio.gather()` | Runs calls concurrently        |
| Combine with LangServe or FastAPI  | Build scalable APIs              |
| Use callbacks for monitoring tools | Full observability over pipeline |

---

### 9. ✅ Summary Table

| Feature           | Usage                        |
| ----------------- | ---------------------------- |
| `ainvoke()`       | Async LLM call               |
| `astream()`       | Async + Streaming            |
| `stream()`        | Streaming in sync mode       |
| `BaseCallbackHandler` | Custom token/tool monitoring |

---

# 🌐 **3. LangServe (API Deployment)**

*(Part of 🔵 IV. Advanced Topics)*

---

### 1. ✅ Definition

**LangServe** is a LangChain-native framework for **deploying chains, agents, tools, or entire apps as RESTful APIs** using **FastAPI**.

It also integrates tightly with **LangSmith** for:

* Monitoring
* Debugging
* Tracing
* Evaluation

> ⚙️ “LangServe = LangChain + FastAPI + Observability + AutoDocs”

---

### 2. ✅ Key Features

| Feature                           | Description                      |
| --------------------------------- | -------------------------------- |
| 🔁 Auto-wraps any LangChain chain | into a FastAPI REST server       |
| 📊 Logs, traces, and metadata     | via LangSmith                    |
| ⚡ Async-ready & scalable          | for production deployment        |
| 🧪 Built-in testing/debug UI      | via `http://localhost:8000/docs` |

---

### 3. ✅ Installation

```bash
pip install "langserve[all]"
```

---

### 4. ✅ Quickstart: Serve a Chain as API

```python
# serve.py

from fastapi import FastAPI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langserve import add_routes

# 1. Define chain (LCEL); StrOutputParser makes the API return a plain string
prompt = PromptTemplate.from_template("Write a poem about {topic}")
chain = prompt | ChatOpenAI() | StrOutputParser()

# 2. Create API app
app = FastAPI()
add_routes(app, chain, path="/poem")

# 3. Run server
# ➤ uvicorn serve:app --reload
```

🌐 Send a `POST` request to:
`http://localhost:8000/poem/invoke`
with body:

```json
{"input": {"topic": "AI"}}
```

✅ It responds with an LLM-generated poem 🎉

---

### 5. ✅ Supported Endpoints

| Endpoint                      | Purpose                            |
| ----------------------------- | ---------------------------------- |
| `/invoke`                     | Run the chain                      |
| `/stream`                     | Stream the result (token-by-token) |
| `/batch`                      | Run a batch of inputs              |
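| `/input_schema`, `/output_schema`, `/config_schema` | JSON schemas for input, output, config |
| `/playground/`                | Interactive test UI                |
| `/docs` (app root)            | Auto-generated OpenAPI UI          |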
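
The request bodies differ slightly per endpoint: `/invoke` and `/stream` take one `"input"`, while `/batch` takes a list under `"inputs"`. The sketch below only builds the requests with the standard library and does not send them; the `build_request` helper is hypothetical, and the `/poem` path is the route from the quickstart above.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/poem"  # route registered with add_routes(...)

def build_request(endpoint: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for a LangServe endpoint."""
    return Request(
        url=f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# /invoke and /stream take a single "input"; /batch takes a list under "inputs"
invoke_req = build_request("invoke", {"input": {"topic": "AI"}})
batch_req = build_request("batch", {"inputs": [{"topic": "AI"}, {"topic": "robots"}]})

print(invoke_req.full_url, invoke_req.data.decode())
print(batch_req.full_url, batch_req.data.decode())
# To actually send one (server must be running): urllib.request.urlopen(invoke_req)
```

Building the payloads in one helper keeps the `"input"` versus `"inputs"` distinction in a single place, which is an easy detail to get wrong when calling the API by hand.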

---

### 6. ✅ Example with Tool/Agent

```python
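from fastapi import FastAPI
from langchain.agents import AgentType, initialize_agent
from langchain_community.agent_toolkits.load_tools import load_tools
from langchain_openai import ChatOpenAI
from langserve import add_routes

tools = load_tools(["serpapi"])  # e.g. Google Search; needs SERPAPI_API_KEY
# initialize_agent is the legacy constructor; the agent type goes in `agent=`
agent = initialize_agent(tools, ChatOpenAI(), agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

app = FastAPI()
add_routes(app, agent, path="/agent")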
```

✅ You’ve now deployed a **tool-using reasoning agent** as a REST API.

---

### 7. ✅ LangSmith + LangServe Integration

Tracing comes from LangChain itself, so a LangServe app reports to LangSmith as soon as your environment includes:

```bash
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="your_langsmith_api_key"
export LANGCHAIN_PROJECT="my-deployment"
```

✅ This enables:

* Request/response tracing
* Token usage stats
* Latency metrics
* Chain visualization

---

### 8. ✅ Use Cases

| Use Case                    | How LangServe Helps                        |
| --------------------------- | ------------------------------------------ |
| Deploy chatbot for frontend | Wrap `ConversationalRetrievalChain` as API |
| Create semantic search API  | Wrap RAG retriever + LLM                   |
| Deploy agent with tools     | Serve as `/agent` route                    |
| QA over private docs        | Load vector store + chain → serve it       |

---

### 9. ✅ Pros & Cons

| ✅ Pros                        | ❌ Cons                                   |
| ----------------------------- | ---------------------------------------- |
| Very easy to deploy any chain | Not as customizable as full FastAPI apps |
| Built-in async + streaming    | Still maturing (API shape may evolve)    |
| Observability via LangSmith   | Requires good design to scale            |

---

### 10. ✅ Example Full Workflow

```bash
uvicorn serve:app --reload
```

Request:

```bash
curl -X POST http://localhost:8000/poem/invoke \
  -H "Content-Type: application/json" \
  -d '{"input": {"topic": "robots"}}'
```

Response:

```json
{"output": "A poem about robots and AI... 🤖"}
```

---

### 11. ✅ Best Practices

| Tip                                    | Reason                      |
| -------------------------------------- | --------------------------- |
| Use environment variables for API keys | Secure your deployment      |
| Use `LangSmith` for observability      | See exactly what went wrong |
| Add custom endpoints for status/logs   | Production-grade needs      |
| Use `stream` endpoint in chat apps     | For smooth UX               |

---

### 12. ✅ Summary Table

| Component                      | Purpose                       |
| ------------------------------ | ----------------------------- |
| `add_routes()`                 | Turns any LangChain runnable into an API |
| `/invoke`, `/stream`, `/batch` | Built-in endpoints            |
| LangSmith integration          | Full tracing & logs           |
| FastAPI-compatible             | Add your own routes if needed |

---

# 🧠 **4. Tracing with LangSmith**

*(Part of 🔵 IV. Advanced Topics)*

---

### 1. ✅ Definition

**LangSmith** is LangChain’s observability and evaluation platform. It helps developers:

* 🪵 **Trace** every LLM call, tool, and chain step
* 🧪 **Debug** and replay errors
* 📊 **Analyze** token usage, latency, outputs
* 🧷 **Log metadata** and inputs/outputs for QA or production

LangSmith = **“Debug Console + Profiler + Analytics + Evaluation Suite”** for LangChain apps.

---

### 2. ✅ Core Features

| Feature                     | Description                                |
| --------------------------- | ------------------------------------------ |
| 🔎 Traces                   | Every step: prompt → LLM → tool → response |
| 🪙 Token & latency tracking | See cost & time per call                   |
| 🛠️ Replay & inspect        | Fix errors without reruns                  |
| 📘 Dataset evals            | Run model comparisons                      |
| 🏷️ Custom metadata         | Add tags, versions, session IDs            |

---

### 3. ✅ Setup LangSmith

#### Step 1: Install SDK

```bash
pip install langsmith
```

#### Step 2: Set Environment Variables

```bash
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="your_langsmith_key"
export LANGCHAIN_PROJECT="MyRAGApp"
```

> You’ll find the API key at [https://smith.langchain.com](https://smith.langchain.com) under ⚙️ → **Account Settings**.

---

### 4. ✅ Basic Usage in Code

LangChain automatically logs everything **if LangSmith is enabled**:

```python
from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Tell me a joke about {topic}")
llm_chain = LLMChain(llm=ChatOpenAI(), prompt=prompt)

response = llm_chain.invoke({"topic": "robots"})
```

✅ This will now appear in your LangSmith dashboard:

```
Project: MyRAGApp
Trace:  LLMChain → PromptTemplate → ChatOpenAI → Response
```

---

### 5. ✅ Adding Custom Metadata (Logging)

You can tag runs with metadata for filtering and tracking:

```python
llm_chain.invoke(
  {"topic": "robots"},
  config={"metadata": {"session_id": "123", "user": "mukesh"}}
)
```

✅ Use cases:

* Multi-user app logging
* Trace-specific debugging
* Version tracking (e.g., model version, prompt version)

---

### 6. ✅ Visual Trace Example

In the LangSmith UI, you’ll see:

```
📄 Root: LLMChain
 ├── 🧠 PromptTemplate
 ├── 🤖 ChatOpenAI
 └── 📝 Output
```

Each step includes:

* Inputs
* Prompt used
* Output
* Token usage
* Time taken

---

### 7. ✅ Tracing Tools, Agents, Memory

LangSmith also traces:

* 🔧 Tool calls in agents
* 🧠 Memory variables in chat
* 🔁 Retry logic in chains
* 🧱 Nested chains (e.g., inside LangGraph)

✅ This means **entire LangChain pipelines**, no matter how complex, are fully traceable.

---

### 8. ✅ Datasets & Evaluation (Bonus)

You can create **test datasets** and evaluate chains:

```python
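from langsmith.evaluation import evaluate

# Custom evaluator: compare the chain's output with the dataset's reference output
# (the "text" keys are assumptions; use the keys your chain and dataset actually have)
def exact_match(run, example):
    return {"key": "exact_match", "score": run.outputs["text"] == example.outputs["text"]}

# "joke-dataset" is a placeholder for a dataset you created in LangSmith
evaluate(lambda inputs: llm_chain.invoke(inputs), data="joke-dataset", evaluators=[exact_match])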
```

Or from UI:

* Create dataset → Add input/output pairs
* Evaluate different chain versions
* Track accuracy, latency, token cost
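
The idea behind a dataset evaluation can be shown without any service: run the chain on each input, compare with the expected output, and aggregate. In this sketch the dataset, the `chain_v1` stand-in, and the `evaluate_chain` helper are all hypothetical; exact string match is only one possible metric.

```python
import time

# Hypothetical dataset of input/expected-output pairs
dataset = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3 * 3", "expected": "9"},
]

def chain_v1(question: str) -> str:
    """Stand-in for a chain under test; a real one would call an LLM."""
    canned = {"2 + 2": "4", "capital of France": "Paris", "3 * 3": "6"}
    return canned[question]

def evaluate_chain(chain, dataset):
    """Run the chain on every example; report exact-match accuracy and mean latency."""
    correct, total_time = 0, 0.0
    for example in dataset:
        start = time.perf_counter()
        output = chain(example["input"])
        total_time += time.perf_counter() - start
        correct += int(output.strip() == example["expected"])
    n = len(dataset)
    return {"accuracy": correct / n, "mean_latency_s": total_time / n}

report = evaluate_chain(chain_v1, dataset)
print(report)
```

Running the same dataset against a second chain version and comparing the two reports is the basic regression-test loop that the LangSmith UI automates.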

---

### 9. ✅ Best Practices

| Practice                          | Why                          |
| --------------------------------- | ---------------------------- |
| Always set `LANGCHAIN_PROJECT`    | Groups related runs together |
| Use `metadata` tags               | Enables powerful filtering   |
| Trace agents + RAG + LangGraph    | Debug even the deepest stack |
| Use datasets for regression tests | LLM eval at scale            |

---

### 10. ✅ Use Cases

| Use Case            | LangSmith Benefit              |
| ------------------- | ------------------------------ |
| Debug broken prompt | See the full prompt + response |
| Compare models      | Use trace diffing + evals      |
| Optimize cost       | Check token usage per call     |
| Fix agent routing   | View tool traces step by step  |

---

### 11. ✅ Summary Table

| Feature        | Value                       |
| -------------- | --------------------------- |
| 🔍 Tracing     | Prompt-to-output visibility |
| 📊 Analytics   | Token + time cost           |
| 🧷 Metadata    | Tags, versions, sessions    |
| 🧪 Evaluations | Ground truth testing        |

---