```{contents}
```
## LLM 

In **LangChain**, an LLM is:

> A **pluggable, provider-agnostic text generation interface** that can be composed with prompts, tools, memory, and retrieval pipelines.

LangChain **does not build models**.
It **standardizes how you call them**.
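
To make "standardizes how you call them" concrete, here is a minimal plain-Python sketch of a provider-agnostic interface. The names `TextModel`, `EchoProvider`, `ShoutProvider`, and `answer` are hypothetical and are not part of LangChain; the point is only that calling code depends on a shared `invoke` method rather than on a specific provider.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with an invoke(prompt) -> str method counts as a model here."""

    def invoke(self, prompt: str) -> str: ...


class EchoProvider:
    """Hypothetical provider A: returns the prompt unchanged."""

    def invoke(self, prompt: str) -> str:
        return prompt


class ShoutProvider:
    """Hypothetical provider B: returns the prompt upper-cased."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def answer(model: TextModel, question: str) -> str:
    # Application code depends only on the shared interface,
    # so swapping providers needs no change here.
    return model.invoke(question)


print(answer(EchoProvider(), "hello"))   # hello
print(answer(ShoutProvider(), "hello"))  # HELLO
```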

---

### Two Core LLM Abstractions in LangChain

#### A. `LLM` (Legacy / Text-only)

* Input: `str`
* Output: `str`
* Example: older OpenAI completions

#### B. `ChatModel` (Primary / Modern)

* Input: list of messages
* Output: AIMessage
* Supports:

  * Tool calling
  * Function calling
  * System messages
  * Multi-turn chat

**Most LangChain code today uses ChatModels.**

---

### Mental Model

```
Prompt → LLM → Output
Prompt + Tools → Agent → LLM → Tool → LLM → Output
Prompt + Docs → Retriever → LLM → Answer
```

LLM is **just one node** in a larger graph.

---

### Basic LLM Usage (Chat Model)

#### Example: OpenAI via LangChain



```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.2,
    max_tokens=200
)

response = llm.invoke("Explain RAG in one sentence")
print(response.content)
```

Output:

```
RAG, or Retrieval-Augmented Generation, is a machine learning approach that combines retrieval of relevant information from a knowledge base with generative models to produce more accurate and contextually relevant responses.
```


#### What LangChain adds

* Unified API
* Retry logic
* Streaming
* Async support
* Easy swapping of providers

---

### Prompt Templates + LLM (LangChain Style)


```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an IT support assistant"),
    ("human", "{question}")
])

chain = prompt | llm

result = chain.invoke({"question": "What is LangChain?"})
print(result.content)
```

Output (cut off by the `max_tokens=200` limit set on `llm`):

```
LangChain is a framework designed to simplify the development of applications that utilize large language models (LLMs). It provides tools and components to help developers build applications that can leverage the capabilities of LLMs for various tasks, such as natural language understanding, text generation, and conversational agents.

Key features of LangChain include:

1. **Modularity**: LangChain is built with a modular architecture, allowing developers to mix and match components according to their needs. This includes various modules for handling prompts, memory, and chains of operations.

2. **Chains**: The framework allows for the creation of "chains," which are sequences of operations that can be executed in order. This is useful for building complex workflows that involve multiple steps or interactions with the language model.

3. **Memory**: LangChain supports memory management, enabling applications to maintain context over multiple interactions. This is particularly useful
```


Key idea:

* Prompts are **first-class objects**
* LLM is **composable**

---

### LCEL (LangChain Expression Language)

LCEL is how LangChain **wires LLMs into pipelines**.


```python
chain = (
    # Map the incoming "query" key onto the prompt's "question" variable
    {"question": lambda x: x["query"]}
    | prompt
    | llm
)
```


Benefits:

* Declarative
* Async-capable (`ainvoke`, `astream`) from the same definition
* Streaming-ready
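
The pipe syntax is easier to reason about once you see how an overloaded `|` can build a pipeline. The sketch below is a plain-Python illustration of the idea, not LangChain's implementation; `Step`, `fake_prompt`, and `fake_llm` are hypothetical names.

```python
class Step:
    """Wraps a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for a prompt template and a model.
fake_prompt = Step(lambda inputs: f"Q: {inputs['question']}")
fake_llm = Step(lambda text: f"[model saw] {text}")

pipeline = fake_prompt | fake_llm
print(pipeline.invoke({"question": "What is LCEL?"}))
# [model saw] Q: What is LCEL?
```

Because every step exposes the same `invoke` method, the composed pipeline is itself a step and can be piped further.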

---



### LLM with Structured Output


```python
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class TicketInfo(BaseModel):
    category: str
    priority: str

llm = ChatOpenAI(model="gpt-4o-mini")

structured_llm = llm.with_structured_output(TicketInfo)

result = structured_llm.invoke(
    "Email service is down for CEO"
)

print(result)
```

Output:

```
category='Technical Issue' priority='High'
```


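LangChain handles:

* Schema generation from the Pydantic model
* Parsing and validating the reply into a `TicketInfo` instance
* Retries only if you add them (for example with `.with_retry()`); otherwise a reply that fails validation raises an error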
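
The parse, validate, and retry steps can be pictured with the standard library alone. This is a rough sketch under stated assumptions, not how `with_structured_output` is implemented: `parse_ticket`, `structured_call`, and the canned `replies` are hypothetical, and a dataclass stands in for the Pydantic model.

```python
import json
from dataclasses import dataclass


@dataclass
class TicketInfo:
    category: str
    priority: str


def parse_ticket(raw: str) -> TicketInfo:
    """Parse model text as JSON and check it against the TicketInfo fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name in ("category", "priority"):
        if not isinstance(data.get(name), str):
            raise ValueError(f"missing or non-string field: {name}")
    return TicketInfo(category=data["category"], priority=data["priority"])


def structured_call(model, prompt: str, max_attempts: int = 2) -> TicketInfo:
    """Call the model and ask again if the reply fails validation."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_ticket(model(prompt))
        except ValueError as err:  # JSONDecodeError is a ValueError subclass
            last_error = err
    raise last_error


# Hypothetical model: the first reply is malformed, the second is valid JSON.
replies = iter(["not json", '{"category": "Email", "priority": "High"}'])
result = structured_call(lambda prompt: next(replies), "Email service is down for CEO")
print(result)
```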

---

### LLM with Tools (Key Differentiator)


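```python
def ticket_count(source: str) -> int:
    """Return the number of tickets in the given source system."""
    return 120

llm = ChatOpenAI(model="gpt-4o-mini")

# bind_tools attaches the function's schema to every request
llm_with_tools = llm.bind_tools([ticket_count])

response = llm_with_tools.invoke(
    "How many tickets are there in Jira?"
)
```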


LangChain:

* Converts function → JSON schema
* Parses tool calls into `response.tool_calls`
* Leaves execution to your code or to an agent executor
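
As a rough illustration of the function-to-schema and dispatch steps, the sketch below uses only `inspect`. It is a simplified approximation, not LangChain's converter: `function_to_schema`, `run_tool_call`, and the `JSON_TYPES` mapping are hypothetical, and only the `name` and `args` keys of a tool call are used.

```python
import inspect

# Partial mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_schema(func) -> dict:
    """Build a JSON-schema-like description from a function signature."""
    params = inspect.signature(func).parameters
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": {
                name: {"type": JSON_TYPES.get(p.annotation, "string")}
                for name, p in params.items()
            },
            "required": list(params),
        },
    }


def ticket_count(source: str) -> int:
    """Return the number of tickets in the given system."""
    return 120


def run_tool_call(call: dict, registry: dict):
    """Look up the requested tool by name and call it with the model's arguments."""
    return registry[call["name"]](**call["args"])


schema = function_to_schema(ticket_count)
print(schema["parameters"]["properties"])  # {'source': {'type': 'string'}}

# A tool call with the same "name" and "args" keys a chat model returns.
call = {"name": "ticket_count", "args": {"source": "Jira"}}
print(run_tool_call(call, {"ticket_count": ticket_count}))  # 120
```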

---

### LLM in RAG (Retriever + LLM)



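```python
from langchain_classic.chains.retrieval_qa.base import RetrievalQA
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

# Assumption: the original cell never defined `vectorstore`; this placeholder
# in-memory store stands in for your real index.
vectorstore = InMemoryVectorStore.from_texts(["<policy text>"], embedding=OpenAIEmbeddings())

qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever()
)

qa.invoke({"query": "What is our password policy?"})
```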


LLM role:

* Read retrieved chunks
* Synthesize answer
* Cite sources (if enabled)
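
The "read chunks, synthesize, cite" role comes down to what ends up in the prompt. The sketch below shows one way to assemble that prompt with a toy keyword retriever; `retrieve`, `build_rag_prompt`, and the sample documents are hypothetical, and a real system would use embeddings rather than word overlap.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by how many query words they share."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Number each chunk so the model can cite it, then append the question."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}"


# Hypothetical documents, for illustration only.
docs = [
    "Passwords must be rotated every 90 days.",
    "The cafeteria opens at 8 am.",
    "Passwords need at least 12 characters.",
]
chunks = retrieve("passwords policy", docs)
rag_prompt = build_rag_prompt("What is our password policy?", chunks)
print(rag_prompt)
```

The numbered `[1]`, `[2]` markers give the model something to point at when it is asked to cite sources.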

---

### Streaming Tokens


```python
for chunk in llm.stream("Explain LangChain"):
    print(chunk.content, end="", flush=True)
```

Output (truncated):

```
LangChain is a framework designed to facilitate the development of applications that leverage large language models (LLMs). It provides a structured way to build applications that can utilize LLMs for various tasks, such as natural language understanding, text generation, and more. The framework is particularly useful for developers looking to create complex applications that require the integration of LLMs with other components, such as databases, APIs, and user interfaces.

### Key Features of LangChain:

1. **Modular Components**: LangChain is built around modular components that can be easily combined and reused. This allows developers to create custom workflows tailored to their specific needs.

2. **Chains**: At the core of LangChain are "chains," which are sequences of operations that can include prompts, model calls, and other processing steps. Chains can be simple or complex, depending on the application's requirements.

3. **Agents**: LangChain supports the concept of agents,
```

Useful for:

* Chat UIs
* SSE / WebSocket
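
For the SSE case, each streamed chunk has to be wrapped in a `data:` frame before it goes over the wire. The generator below is a minimal sketch of that framing; `to_sse` and the `done` event name are assumptions, and `fake_tokens` stands in for the `chunk.content` values a model stream would produce.

```python
from typing import Iterable, Iterator


def to_sse(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap each text chunk in a Server-Sent Events frame."""
    for chunk in chunks:
        # An SSE frame is one or more "data:" lines followed by a blank line.
        # Chunks containing newlines are split across several data: lines.
        lines = chunk.split("\n")
        yield "".join(f"data: {line}\n" for line in lines) + "\n"
    # Hypothetical end-of-stream marker; the event name "done" is an assumption.
    yield "event: done\ndata: \n\n"


# Stand-in for the chunk.content values produced by a model stream.
fake_tokens = ["Lang", "Chain ", "streams"]
frames = list(to_sse(fake_tokens))
print(frames[0], end="")
```

A web framework's streaming response can then send these frames as they are generated.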

---

### LLM Lifecycle in LangChain

```
Prompt
 → Validation
 → Model Call
 → Retry / Backoff
 → Output Parsing
 → Callbacks / Tracing
```

You get all of this **without writing glue code**.
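
To show what the "Retry / Backoff" stage saves you from writing, here is a hand-rolled version using exponential backoff. It is an illustrative sketch, not LangChain's retry code; `call_with_retry` and `flaky_model_call` are hypothetical, and the `sleep` parameter exists only so the delays can be observed without waiting.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt, then try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Hypothetical flaky model call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_model_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"


delays = []
outcome = call_with_retry(flaky_model_call, sleep=delays.append)
print(outcome)  # ok
print(delays)   # [0.5, 1.0]
```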

---

### Why LangChain Uses LLM Abstractions

| Problem           | LangChain Solution    |
| ----------------- | --------------------- |
| Provider lock-in  | Unified LLM interface |
| Prompt reuse      | Prompt templates      |
| Complex workflows | LCEL                  |
| Tool calling      | Automatic schema      |
| Streaming         | Built-in              |
| Production safety | Output parsing        |

---

### Interview-Ready Summary

> “In LangChain, an LLM is a composable runtime component, not a standalone model. It is wired into prompts, retrievers, tools, and agents using LCEL to build structured, production-grade pipelines.”

---

### When to Use What

| Use Case          | LangChain LLM Pattern      |
| ----------------- | -------------------------- |
| Simple Q&A        | Prompt → LLM               |
| Chatbot           | ChatModel                  |
| RAG               | Retriever → LLM            |
| Automation        | Agent + Tools              |
| Structured output | `with_structured_output()` |
| Streaming UI      | `stream()`                 |


### LLM vs. ChatModel

> **`LLM` is a text-completion interface.
> `ChatModel` is a message-based, tool-aware conversational interface.**

LangChain treats them as **distinct abstractions**.

---

#### Abstraction Level

| Aspect           | LLM                       | ChatModel                    |
| ---------------- | ------------------------- | ---------------------------- |
| Input            | `str`                     | `List[BaseMessage]`          |
| Output           | `str`                     | `AIMessage`                  |
| Conversation     | ❌ No                      | ✅ Yes                        |
| System messages  | ❌ No                      | ✅ Yes                        |
| Tool calling     | ❌ No                      | ✅ Yes                        |
| Function calling | ❌ No                      | ✅ Yes                        |
| Agents           | ⚠️ Text-based ReAct only   | ✅ Required for tool calling  |
| Streaming        | `str` chunks only         | Message and tool-call chunks |
| Future-proof     | ❌ Legacy                  | ✅ Primary                    |

**LangChain recommends ChatModel for all new systems.**

---

### Conceptual Model

#### LLM

```
"text in" → model → "text out"
```

#### ChatModel

```
[System, Human, AI] → model → AIMessage
                      ↳ tool_calls
```
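
The two shapes can also be written down as plain Python types. This is a sketch for intuition only: `Message`, `text_model`, and `chat_model` are hypothetical stand-ins, not LangChain classes, and the fake models return canned strings.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str
    tool_calls: list = field(default_factory=list)


def text_model(prompt: str) -> str:
    """Text-completion shape: one string in, one string out."""
    return f"completion for: {prompt}"


def chat_model(messages: list[Message]) -> Message:
    """Chat shape: a list of role-tagged messages in, one AI message out."""
    last_human = next(m for m in reversed(messages) if m.role == "human")
    return Message(role="ai", content=f"reply to: {last_human.content}")


history = [
    Message("system", "You are an IT support assistant"),
    Message("human", "Email is down"),
]
reply = chat_model(history)
history.append(reply)  # multi-turn: the reply becomes part of the next input

print(text_model("Email is down"))
print(reply.role, "|", reply.content)
```

Keeping roles and `tool_calls` on the message object is what lets the chat shape carry system instructions, history, and tool requests that a bare string cannot.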

---

### LLM (Text Completion) – Demonstration

> **Legacy / compatibility abstraction**



```python
from langchain_openai import OpenAI

llm = OpenAI(
    model="gpt-3.5-turbo-instruct",
    temperature=0.2
)

response = llm.invoke("Explain RAG in one sentence")
print(response)
```



### Limitations

* No system role
* No memory
* No tools
* No agents
* No structured output

Used only for:

* Migration
* Simple batch tasks
* Non-chat legacy models

---

### ChatModel – Demonstration

> **Primary LangChain abstraction**



```python
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.2
)

response = chat.invoke("Explain RAG in one sentence")
print(response.content)
```

Output:

```
RAG, or Retrieval-Augmented Generation, is a machine learning approach that combines retrieval of relevant information from a knowledge base with generative models to produce more accurate and contextually relevant responses.
```


### What you gain immediately

* Message roles
* Tool calling
* Function schemas
* Streaming
* Agents
* Multi-turn chat

---

### Message Objects

ChatModels operate on **messages**, not strings.



```python
from langchain_core.messages import SystemMessage, HumanMessage

messages = [
    SystemMessage(content="You are an IT support assistant"),
    HumanMessage(content="Email is down")
]

response = chat.invoke(messages)
```



This enables:

* Instruction hierarchy
* Context control
* Safety rules

---

### Tool Calling (ChatModel-only)



```python
def ticket_count(source: str) -> int:
    return 120

chat_with_tools = chat.bind_tools([ticket_count])

response = chat_with_tools.invoke(
    "How many tickets are there in Jira?"
)

print(response.tool_calls)
```

Output:

```
[{'name': 'ticket_count', 'args': {'source': 'Jira'}, 'id': 'call_PdonKV34RhsI1eJZxSkrES4h', 'type': 'tool_call'}]
```




The text-only `LLM` abstraction **cannot do this**.

---

### Structured Output (ChatModel)



```python
from pydantic import BaseModel

class Ticket(BaseModel):
    category: str
    priority: str

structured_chat = chat.with_structured_output(Ticket)

result = structured_chat.invoke("CEO cannot login to VPN")
print(result)
```

Output:

```
category='Technical Support' priority='High'
```




LangChain:

* Generates schema
* Validates output
* Raises on invalid output; retries are opt-in (for example `.with_retry()`)

---

### Agents Require ChatModels



```python
from langchain_classic.agents import (
    AgentExecutor,
    create_openai_tools_agent,
)
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.tools import Tool

# Define the tool
def ticket_count(source: str) -> int:
    """Returns the number of tickets in the system."""
    return 120

ticket_tool = Tool(
    name="ticket_count",
    func=ticket_count,
    description="Counts the number of tickets in the system."
)

# Initialize the Chat Model
chat = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Define the prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant. Use tools if needed."),
    ("human", "{input}"),
    # The scratchpad is a list of messages, so it needs a placeholder slot
    # rather than a string template inside an assistant message.
    ("placeholder", "{agent_scratchpad}")
])

# Create the agent
agent = create_openai_tools_agent(
    llm=chat,
    tools=[ticket_tool],
    prompt=prompt
)

# Create the executor
executor = AgentExecutor(
    agent=agent,
    tools=[ticket_tool],
    verbose=True
)

# Invoke the agent
response = executor.invoke(
    {"input": "How many tickets are there in Jira?"}
)

print(response["output"])
```

Output:

```
> Entering new AgentExecutor chain...

Invoking: `ticket_count` with `Jira`

120
There are 120 tickets in Jira.

> Finished chain.
There are 120 tickets in Jira.
```




**Tool-calling agents will not work with the text-only `LLM` abstraction.**

---

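### Streaming Comparison

**LLM (limited)**

```python
llm.stream("Hello")
```

Output:

```
<generator object BaseChatModel.stream at 0x000002988EAC6020>
```

Calling `stream()` without iterating only returns a generator. The `BaseChatModel` in this repr also shows that `llm` was still bound to a `ChatOpenAI` instance when the cell ran; a text-only `LLM` yields plain `str` chunks when iterated.

**ChatModel (full)**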


```python
for chunk in chat.stream("Explain LangChain"):
    print(chunk.content, end="")
```

Output (truncated):

```
LangChain is a framework designed to facilitate the development of applications that utilize large language models (LLMs). It provides a structured way to build applications that can leverage the capabilities of LLMs for various tasks, such as natural language understanding, text generation, and more. Here are some key components and features of LangChain:

1. **Modular Design**: LangChain is built with a modular architecture, allowing developers to easily integrate different components such as LLMs, data sources, and tools. This modularity makes it easier to customize and extend applications.

2. **Chains**: At the core of LangChain is the concept of "chains," which are sequences of operations that can be executed in order. For example, a chain might involve taking user input, processing it with an LLM, and then returning a response. Chains can be simple or complex, depending on the application's requirements.

3. **Agents**: LangChain supports the creation of agents that can make dec
```



ChatModels support:

* Token streaming
* Tool streaming
* Partial responses

---

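### LCEL Compatibility

Both abstractions implement the `Runnable` interface, so the LCEL features below are available on either one. What differs is the payload: strings for `LLM`, messages for `ChatModel`.

| Feature   | LLM | ChatModel |
| --------- | --- | --------- |
| Runnable  | ✅   | ✅         |
| Parallel  | ✅   | ✅         |
| Retry     | ✅   | ✅         |
| Fallbacks | ✅   | ✅         |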
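
The "Fallbacks" row refers to trying a backup model when the primary one fails. The wrapper below sketches that behaviour in plain Python; `with_fallbacks` here is a hypothetical helper written for illustration, and `primary_model` and `backup_model` are fake callables rather than real providers.

```python
def with_fallbacks(primary, fallbacks):
    """Return a callable that tries primary first, then each fallback in order."""
    def run(prompt):
        last_error = None
        for model in [primary, *fallbacks]:
            try:
                return model(prompt)
            except Exception as err:
                last_error = err  # remember the failure and try the next model
        raise last_error
    return run


def primary_model(prompt):
    raise ConnectionError("simulated provider outage")


def backup_model(prompt):
    return f"backup answer to: {prompt}"


robust = with_fallbacks(primary_model, [backup_model])
print(robust("Explain RAG"))  # backup answer to: Explain RAG
```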

---

### When Should You Use LLM?

**Rare cases only**:

* Old completion-only models
* Static text generation
* Offline batch processing
* Migration support

---

### When Should You Use ChatModel?

**The default choice, especially for:**

* Chatbot
* RAG
* Tools
* Agents
* Streaming UI
* Production systems

---

### Why LangChain Split Them

LangChain reflects **model evolution**:

| Era      | API Type        |
| -------- | --------------- |
| Pre-2023 | Text completion |
| 2023+    | Chat / messages |
| 2024+    | Tools + agents  |

---

### Interview-Ready Answer

> “LLM is a legacy text-completion abstraction. ChatModel is LangChain’s primary interface for modern, conversational, tool-aware, agentic LLM systems. Tool-calling agents require ChatModels, and RAG and production workflows should default to them.”

---
