```{contents}
```

## Model Parameters

**Model parameters** are **runtime controls** that influence *how* an LLM generates responses.
They do **not change the model weights**, only the **generation behavior**.

In LangChain, these parameters are passed to the **ChatModel / LLM abstraction**, which keeps their names **provider-agnostic** where it can. Which parameters are actually supported still depends on the provider.

---

### Why Model Parameters Matter

Model parameters control:

* Creativity vs determinism
* Factuality vs diversity
* Latency and cost
* Safety and verbosity

Poor tuning can lead to:

* Hallucinations
* Inconsistent answers
* Higher cost
* Slow responses

---

### Core Model Parameters (Most Important)

#### temperature

Controls **randomness** of token selection.

**Behavior**

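* `0.0` → near-deterministic (the most likely token is almost always chosen); this improves consistency but does not by itself guarantee factual output
* `0.7` → balanced
* `>1.0` → more varied and creative, with a higher risk of incoherent or off-topic output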

**Demonstration**



```python
from langchain_openai import ChatOpenAI

# temperature=0.0 makes the model favor the most likely token at each step
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0
)

llm.invoke("Explain RAG").content
```



**Rule of thumb**

* RAG / classification → `0.0–0.2`
* Chat / ideation → `0.6–0.9`
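
To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax in plain Python. It is not how any particular provider implements sampling; the logits are made-up values, and treating `temperature=0` as pure greedy decoding is an assumption made for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing them by temperature."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (all mass on the top logit).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.0, 0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution so that less likely tokens are sampled more often.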

---

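#### top_p (Nucleus Sampling)

Restricts sampling to the **smallest set of most-likely tokens** whose cumulative probability reaches `top_p`.

**Behavior**

* `top_p=1.0` → consider all tokens
* `top_p=0.9` → sample only from the most likely tokens that together account for 90% of the probability mass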

**Demonstration**



```python
# Moderate temperature, with sampling restricted to the top 90% of probability mass
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    top_p=0.9
)

llm.invoke("Explain RAG").content
```




**Important**

* Adjust **either** `temperature` **or** `top_p`, and leave the other at or near its default
* Avoid tuning both aggressively; their combined effect is hard to reason about
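
The nucleus filter itself can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions (a tiny, made-up token distribution, and a hypothetical helper named `nucleus_filter`), not a provider's actual implementation.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalize the survivors."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete; drop the remaining tail
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


probs = {"the": 0.5, "a": 0.3, "this": 0.15, "zebra": 0.05}  # made-up distribution
print(nucleus_filter(probs, 0.9))  # the low-probability tail is removed
print(nucleus_filter(probs, 1.0))  # every token stays eligible
```

Sampling then happens only among the surviving tokens, which is why a lower `top_p` trims unlikely continuations without changing the ranking of the likely ones.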

---

#### max_tokens

Caps the **number of tokens in the generated output**. A response that reaches the cap is cut off at that point, not shortened to fit.

**Demonstration**



```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_tokens=50
)
```




Setting `max_tokens` puts an upper bound on:

* Cost
* Response verbosity
* Latency
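
As a rough worked example of how an output cap bounds cost, the sketch below computes a worst-case price per call. The prices and the helper name `worst_case_cost` are hypothetical, chosen only to show the arithmetic; real prices vary by provider and model.

```python
def worst_case_cost(prompt_tokens, max_tokens, price_in_per_1k, price_out_per_1k):
    """Upper bound on the cost of one call, assuming the output hits the cap."""
    input_cost = (prompt_tokens / 1000) * price_in_per_1k
    output_cost = (max_tokens / 1000) * price_out_per_1k
    return input_cost + output_cost


# Hypothetical prices in dollars per 1,000 tokens; check your provider's price list.
PRICE_IN, PRICE_OUT = 0.001, 0.002

for cap in (50, 300, 2000):
    print(cap, worst_case_cost(1200, cap, PRICE_IN, PRICE_OUT))
```

The prompt side of the bill is unaffected by `max_tokens`; only the output term shrinks as the cap is lowered.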

---

### Token Control Parameters

#### stop

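Stops generation as soon as the model produces one of the given strings (stop sequences). The stop sequence itself is normally not included in the returned text.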





Used in:

* Structured outputs
* Tool responses
* Controlled generation
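
The effect of a stop sequence can be sketched as a truncation of the generated text. The helper below, `apply_stop`, is hypothetical and operates on a finished string for illustration; a real provider stops during generation, before the remaining tokens are ever produced.

```python
def apply_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence.
    The stop sequence itself is dropped from the result."""
    cut = len(text)
    for seq in stop_sequences:
        if not seq:
            continue  # ignore empty strings, which would match everywhere
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Made-up agent-style output containing a marker the caller wants to stop at.
raw = "Answer: 42\nObservation: the tool returned 42\nAnswer: 43"
print(apply_stop(raw, ["\nObservation:"]))
```

In LangChain, stop sequences can usually be supplied per call, for example `llm.invoke("...", stop=["\nObservation:"])`, though support and limits vary by provider.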

---

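#### frequency_penalty

Lowers the likelihood of a token in proportion to **how often it has already appeared**, which reduces verbatim repetition.

**Demonstration**

```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    frequency_penalty=0.5  # positive values discourage repeated tokens
)
```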




---

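#### presence_penalty

Lowers the likelihood of any token that has **already appeared at least once**, regardless of how often. This nudges the model toward **new topics**.

**Demonstration**

```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    presence_penalty=0.5  # positive values favor tokens not used so far
)
```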
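
The difference between the two penalties is easiest to see on the logits. The sketch below follows the adjustment OpenAI documents for its API (subtract `count * frequency_penalty` plus a flat `presence_penalty` for any token already seen); the helper name `penalize` and the example values are assumptions, and other providers may apply penalties differently.

```python
from collections import Counter


def penalize(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Return logits adjusted for tokens that already appear in the output."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1.0 if count > 0 else 0.0
        # frequency term grows with every repeat; presence term is a flat, one-time cost
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


logits = {"cat": 2.0, "dog": 2.0, "fish": 2.0}  # made-up scores
history = ["cat", "cat", "cat", "dog"]           # tokens generated so far

print(penalize(logits, history, frequency_penalty=0.5))
print(penalize(logits, history, presence_penalty=0.5))
```

With the frequency penalty, "cat" is pushed down further than "dog" because it appeared three times. With the presence penalty, both are lowered by the same amount and only the unseen "fish" is untouched.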



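---

### seed (Determinism)

Requests **repeatable outputs**: the same seed, prompt, and parameters should produce the same response. This only works if the provider supports it, and OpenAI describes seeded sampling as best effort, so identical output is likely but not guaranteed.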


```python
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    seed=42
)
```
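
Why a seed helps can be shown with Python's own random number generator. This is only an analogy under simple assumptions (a made-up token distribution and a hypothetical `sample_tokens` helper); hosted models add other sources of nondeterminism that a seed does not remove.

```python
import random


def sample_tokens(probs, n, seed=None):
    """Draw n tokens from a fixed distribution using a private generator."""
    rng = random.Random(seed)  # seed=None falls back to system entropy
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)


probs = {"the": 0.5, "a": 0.3, "this": 0.2}  # made-up distribution

# Same seed, same inputs: the two draws are identical.
print(sample_tokens(probs, 8, seed=42) == sample_tokens(probs, 8, seed=42))

# No seed: the draw can differ from run to run.
print(sample_tokens(probs, 8))
```

Sampling stays random in the sense that `temperature=0.7` still spreads probability across tokens; the seed only fixes which random choices are made.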




---

### response_format (Structured Output)

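You rarely set `response_format` yourself. Instead, call `with_structured_output` with a schema (`MySchema` below is a placeholder, typically a Pydantic model). It returns a new runnable that yields parsed objects instead of raw text.

```python
structured_llm = llm.with_structured_output(MySchema)
```

LangChain then configures the provider-side mechanism for you, such as tool calling or `response_format`, depending on the model and integration.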
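
Conceptually, structured output means the model's reply is parsed and checked against a schema before your code sees it. The standard-library sketch below illustrates only that final parsing step; the `Answer` schema and `parse_structured` helper are hypothetical, and this is not how LangChain implements the feature (there the schema is usually a Pydantic model).

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Answer:  # hypothetical schema, standing in for MySchema
    answer: str
    confidence: float


def parse_structured(raw_json, schema):
    """Parse a JSON reply into the schema, failing loudly on missing fields.
    Only field presence is checked here, not field types."""
    data = json.loads(raw_json)
    names = [f.name for f in fields(schema)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return schema(**{name: data[name] for name in names})


reply = '{"answer": "RAG retrieves context before generating", "confidence": 0.9}'
print(parse_structured(reply, Answer))
```

Failing on a malformed reply, instead of passing partial data along, is what makes structured output dependable enough to feed into downstream code.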

---

### Model Parameters in LangChain Chains



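```python
from langchain_core.prompts import ChatPromptTemplate

# Example prompt; any prompt template works here
prompt = ChatPromptTemplate.from_template("Answer briefly: {question}")

chain = prompt | ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.2,
    max_tokens=200
)
```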




Parameters are set on the model **Runnable**, not on the prompt, so the same prompt can be piped into differently configured models.

---

### Model Parameters in Agents

```python
agent_llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0
)
```

Best practice:

* **Agents → temperature = 0**
* A low temperature keeps tool selection and tool arguments consistent between runs

---

## Model Parameters in RAG

### Recommended RAG Settings

```python
ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0,
    max_tokens=300
)
```

Why:

* Reduce hallucinations
* Improve grounding on retrieved context

---

## Common Parameter Mistakes

### High temperature in RAG

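❌ Raises the risk of hallucinations, because the model strays more easily from the retrieved context

### Unlimited max_tokens

❌ Increases cost and latency, since responses can run long

### Tuning temperature + top_p together

❌ Makes behavior hard to predict and to debug

### Using defaults blindly

❌ Defaults differ between providers and may not suit the task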

---

## Parameter Interaction Summary

| Parameter         | Affects         | Typical Use        |
| ----------------- | --------------- | ------------------ |
| temperature       | Randomness      | Creativity control |
| top_p             | Diversity       | Token filtering    |
| max_tokens        | Length          | Cost & latency     |
| stop              | Termination     | Structured output  |
| frequency_penalty | Repetition      | Reduce loops       |
| presence_penalty  | Topic diversity | Brainstorming      |

---

### Recommended Presets (Production)

#### RAG / QA

```python
temperature=0.0
top_p=1.0
max_tokens=300
```

#### Chatbot

```python
temperature=0.7
top_p=0.9
max_tokens=500
```

#### Agents

```python
temperature=0.0
max_tokens=200
```
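
One way to keep these presets in a single place is a small lookup table that is unpacked into the model constructor. The sketch below uses the values from the presets above; the names `PRESETS` and `params_for` are hypothetical.

```python
# Preset values copied from the recommendations above.
PRESETS = {
    "rag": {"temperature": 0.0, "top_p": 1.0, "max_tokens": 300},
    "chatbot": {"temperature": 0.7, "top_p": 0.9, "max_tokens": 500},
    "agent": {"temperature": 0.0, "max_tokens": 200},
}


def params_for(task, **overrides):
    """Return a copy of the preset for a task, with optional per-call overrides."""
    if task not in PRESETS:
        raise KeyError(f"unknown task {task!r}; expected one of {sorted(PRESETS)}")
    return {**PRESETS[task], **overrides}


print(params_for("rag"))
print(params_for("chatbot", max_tokens=800))
```

The result can be passed as keyword arguments, for example `ChatOpenAI(model="gpt-4o-mini", **params_for("rag"))`, so every chain in a project draws its settings from the same table.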

---

### How LangChain Helps with Parameters

* Unified API across providers
* Safe defaults
* Retry & fallback support
* Structured output enforcement
* Centralized tuning

---

### Interview-Ready Summary

> “Model parameters in LangChain control generation behavior at runtime. They tune creativity, determinism, length, and cost without changing the model. Proper parameter tuning is critical for reliable RAG, agents, and production systems.”

---

### Rule of Thumb

* **Accuracy → temperature ↓**
* **Creativity → temperature ↑**
* **Production → explicit parameters**
* **Never rely on defaults blindly**