# Lesson 2: Customizing Model Parameters in LangChain

Welcome to the next step in your LangChain journey! In the previous lesson, you learned the basics of sending messages to an AI model. Now it's time to customize model parameters so that responses fit your use case: which model answers, how varied its output is, how long a response may be, and how strongly it avoids repeating itself.

---

## Choosing Your AI Brain: The Model Parameter

The model parameter lets you select which AI "brain" will power your conversations:

```python
from langchain_openai import ChatOpenAI  # Later snippets in this lesson assume this import

# Select a specific OpenAI model
chat = ChatOpenAI(
    model="gpt-4o-mini"  # Using the compact version of GPT-4o
)
```

This parameter determines the underlying AI model that processes your messages. Different models offer varying capabilities:

* **gpt-3.5-turbo**: An older, inexpensive model that gpt-4o-mini has largely superseded
* **gpt-4o**: More advanced reasoning capabilities but higher cost
* **gpt-4o-mini**: A smaller, faster version of GPT-4o

Think of this like choosing between different experts for different tasks—some are more specialized, some more general, and they come with different "hiring costs".

---

## Controlling Creativity: The Temperature Dial

Once you've selected your model, you'll want to control how creative it gets. This is where temperature comes in:

```python
# Set the creativity level of the AI
chat = ChatOpenAI(
    temperature=0.7  # Balanced creativity setting
)
```

Temperature values typically range from **0** to **2**:

* **Low temperature (0-0.3)**: More deterministic, focused, and predictable responses
* **Medium temperature (0.4-0.7)**: Balanced creativity and coherence
* **High temperature (0.8-2.0)**: More random, creative, and diverse outputs

For factual questions or coding help, turn the dial down. For creative writing or brainstorming, crank it up. It's like adjusting between "strictly follow the recipe" and "improvise with the ingredients".
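To make the dial concrete, here is a minimal sketch of how temperature is commonly applied: the model's raw scores are divided by the temperature before being converted to probabilities. The function name and the scores are made up for illustration, and real providers may implement details differently (for example, a temperature of 0 is typically treated as "always pick the top token", which this sketch does not handle).

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low setting almost all probability lands on the top-scoring token, while at the high setting the three candidates end up much closer together.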

---

## Setting Boundaries: The Max Tokens Limit

While temperature controls how your AI thinks, max\_tokens controls how much it says:

```python
# Limit the length of the AI's response
chat = ChatOpenAI(
    max_tokens=150  # Caps the response at approximately 100-120 words
)
```

This parameter sets a hard ceiling on how many tokens the model can generate in its response:

* **Lower values (50-100)**: Results in brief responses that may be cut off before completing the thought
* **Medium values (150-500)**: Provides enough space for most explanations, but complex answers might still be truncated
* **Higher values (1000+)**: Allows for comprehensive responses with less risk of mid-sentence cutoffs

It's important to understand that the model has no awareness of this limit while generating its response. If a response would naturally exceed your **max\_tokens** setting, it will simply be cut off mid-sentence or mid-thought. Without setting this limit, models might generate very lengthy responses, potentially increasing your costs and overwhelming your users with too much information.
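Because the cutoff is silent, it can help to check whether a response was truncated. The sketch below assumes that a `ChatOpenAI` response exposes the provider's `finish_reason` in `response.response_metadata`, and that the value is `"length"` when the token cap was hit; treat both as assumptions to verify against your installed version. The helper name is hypothetical, and it works on a plain dictionary so the logic is visible without calling the API.

```python
def was_truncated(response_metadata):
    """Return True if the metadata reports a length-limited stop."""
    return response_metadata.get("finish_reason") == "length"

# Hand-written stand-ins for `response.response_metadata`
cut_off = {"finish_reason": "length"}
complete = {"finish_reason": "stop"}

print(was_truncated(cut_off))   # True
print(was_truncated(complete))  # False
```

In an application you might call `was_truncated(response.response_metadata)` after `chat.invoke(...)` and retry with a larger limit or warn the user.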

---

## Controlling Randomness: The Top P Sampler

Temperature isn't the only way to control randomness. **top\_p** offers a complementary approach:

```python
# Adjust how the AI selects its next words
chat = ChatOpenAI(
    top_p=0.9  # Sample only from the likeliest words that together cover 90% of probability
)
```

This parameter determines how the model selects the next token in its response:

* **Value of 1.0**: Consider all possible next words
* **Value of 0.9**: Only consider the most likely words that add up to 90% probability
* **Lower values (0.5-0.7)**: More focused, less surprising responses

While temperature adjusts "how random" the selection is, **top\_p** filters "which options are even considered". They work well together—like controlling both the size of a menu and how randomly you select from it. Many developers find that adjusting **temperature** is sufficient, but **top\_p** gives you another dimension of control.
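The "menu size" idea can be sketched in a few lines. This is an illustrative implementation of nucleus filtering over a made-up probability table, not the provider's actual code: it keeps the most likely tokens until their cumulative probability reaches `top_p`, then renormalizes what is left.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of likeliest tokens whose probabilities sum to at least top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    # Renormalize so the surviving probabilities sum to 1
    return {token: p / total for token, p in kept.items()}

# Hypothetical next-token probabilities
probs = {"blue": 0.5, "deep": 0.3, "vast": 0.15, "purple": 0.05}

print(nucleus_filter(probs, 0.9))  # drops only the unlikely "purple"
print(nucleus_filter(probs, 0.5))  # keeps only "blue"
```

Note that the number of surviving tokens depends on the shape of the distribution: a confident model may keep one or two candidates, an uncertain one many more.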

---

## Preventing Repetition: The Frequency Penalty

Even with temperature and top\_p set correctly, AI models can sometimes get stuck repeating themselves. That's where **frequency\_penalty** comes in:

```python
# Discourage the AI from repeating itself
chat = ChatOpenAI(
    frequency_penalty=0.5  # Moderate penalty for word repetition
)
```

This parameter discourages the model from repeating the same words or phrases:

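* **Value of 0.0**: No penalty for repetition
* **Positive values (0.1 to 1.0)**: Penalize a token more the more often it has already appeared
* **Negative values (-1.0 to 0.0)**: Encourage repetition

The OpenAI API accepts values from -2.0 to 2.0, but moderate settings are usually enough.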

A moderate frequency penalty creates more natural-sounding text by reducing the "stuck in a loop" effect that AI models sometimes exhibit. It's like telling someone "try not to use the same word twice". This is particularly useful for longer conversations or content generation.
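As a rough sketch of the mechanism, following the adjustment described in OpenAI's API documentation, each candidate token's score is reduced in proportion to how many times that token has already been generated. The function name, scores, and history below are invented for illustration.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, frequency_penalty):
    """Subtract penalty * (times already used) from each token's score."""
    counts = Counter(generated_tokens)
    return {
        token: score - frequency_penalty * counts[token]
        for token, score in logits.items()
    }

# Hypothetical scores and the tokens produced so far
logits = {"great": 2.0, "good": 1.8, "fine": 1.5}
history = ["great", "great", "good"]

print(apply_frequency_penalty(logits, history, 0.5))
```

Here "great" was used twice, so it loses the most and falls below the unused "fine". With a negative penalty the same arithmetic would raise the scores of repeated tokens instead.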

---

## Encouraging Exploration: The Presence Penalty

While frequency\_penalty prevents repetition of specific words, **presence\_penalty** encourages broader topic exploration:

```python
# Encourage the AI to introduce new topics
chat = ChatOpenAI(
    presence_penalty=0.3  # Gently encourage topic exploration
)
```

This parameter influences how likely the model is to talk about new concepts:

* **Value of 0.0**: No special treatment for new topics
* **Positive values (0.1 to 1.0)**: Penalize any token that has already appeared, nudging the model toward new words and topics
* **Negative values (-1.0 to 0.0)**: Encourage the model to stay on existing topics

The presence penalty is useful when you want the AI to think more broadly rather than drilling down on what's already been mentioned—like encouraging someone to "think outside the box". This parameter works hand-in-hand with **frequency\_penalty** to create more dynamic, interesting conversations.
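The difference from the frequency penalty is easiest to see in code. In this illustrative sketch (hypothetical names and scores, based on the behavior described in OpenAI's API documentation), the presence penalty is a flat, one-time deduction for any token that has appeared at all, no matter how often.

```python
def apply_presence_penalty(logits, generated_tokens, presence_penalty):
    """Subtract a flat penalty from every token that has appeared at least once."""
    seen = set(generated_tokens)
    return {
        token: score - (presence_penalty if token in seen else 0.0)
        for token, score in logits.items()
    }

# Hypothetical scores; "ocean" was used three times, "wave" once
logits = {"ocean": 2.0, "wave": 2.0, "forest": 2.0}
history = ["ocean", "ocean", "ocean", "wave"]

print(apply_presence_penalty(logits, history, 0.3))
```

Both "ocean" and "wave" receive the same deduction even though one was used three times, while the unseen "forest" keeps its full score.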

---

## Putting It All Together: Your Custom AI Recipe

Now that you understand each parameter individually, you can combine them to create your perfect AI recipe:

```python
from langchain_openai import ChatOpenAI

# Initialize ChatOpenAI with a complete set of custom parameters
chat = ChatOpenAI(
    model="gpt-4o-mini",      # Choose a compact but powerful model
    temperature=0.7,          # Balanced creativity setting
    max_tokens=50,            # Keep responses concise
    top_p=0.9,                # Consider 90% of probability mass
    frequency_penalty=0.5,    # Discourage repetition
    presence_penalty=0.3      # Gently encourage new topics
)

# Send a message to the AI model
response = chat.invoke("Hello, can you tell me a joke?")

# Print the AI's response
print("AI Response:")
print(response.content)
```

Each parameter adjustment contributes to the overall behavior of your AI, allowing you to fine-tune it for specific use cases. Like a chef combining ingredients, you'll develop your own preferred "recipes" for different situations.

---

## Learning Through Play: Experimenting with Parameters

The best way to understand these parameters is to experiment with them. Try adjusting one parameter at a time to observe its effect on the AI's responses:

* Set **temperature** to 0 vs. 1.5 for the same prompt.
* Compare **max\_tokens** of 50 vs. 500.
* Test different combinations of **frequency** and **presence penalties**.

Through experimentation, you'll develop an intuitive feel for how to configure the model for different use cases—just like a chef learns to adjust recipes by tasting as they go.
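One way to keep such experiments organized is to generate the configurations up front, changing a single parameter from a baseline each time. The helper below is a hypothetical convenience, not part of LangChain, and it only builds the settings; it does not call the model.

```python
def one_at_a_time(baseline, variations):
    """Build configs that each change exactly one parameter from the baseline."""
    configs = []
    for name, values in variations.items():
        for value in values:
            config = dict(baseline)  # copy so the baseline stays untouched
            config[name] = value
            configs.append(config)
    return configs

baseline = {"temperature": 0.7, "max_tokens": 150}
variations = {"temperature": [0, 1.5], "max_tokens": [50, 500]}

for config in one_at_a_time(baseline, variations):
    print(config)
```

Each dictionary could then be passed along as `ChatOpenAI(model="gpt-4o-mini", **config)` and invoked with the same prompt, so that any difference in the output can be traced to the one setting that changed.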

---

## Summary and Next Steps

In this lesson, we explored how to **customize model parameters** to tailor AI responses to your specific needs. You learned about key parameters such as:

* **Model selection**
* **Temperature**
* **Max tokens**
* **Top p**
* **Penalties for frequency and presence**

As you move on to the practice exercises, experiment with different parameter values to reinforce your understanding and achieve the desired AI behavior. This skill will be invaluable as you continue your journey into conversational AI with **LangChain**.

---

By learning how to adjust these settings, you unlock the true potential of **AI customization**, making your interactions more tailored to specific tasks. Experimenting with these parameters will give you a greater understanding of how to craft the ideal user experience for any given situation.


## Choosing Your AI Brain

In this exercise, you'll focus on the first key parameter — model selection. Your task is to modify the code to use the "gpt-4o-mini" model instead of the default model.

This simple change will give you hands-on experience with choosing which AI "brain" powers your conversations. After making this modification, run the code to see how the model responds to a greeting message.

By completing this exercise, you'll take your first step toward customizing AI behavior to better suit your specific needs — an essential skill for creating more effective AI applications.

```python
from langchain_openai import ChatOpenAI

# TODO: Modify the ChatOpenAI to use the "gpt-4o-mini" model
chat = ChatOpenAI()

# Send a message to the AI model
response = chat.invoke("Hello! How are you today?")

# Print the AI's response
print("AI Response:")
print(response.content)
```

To complete this exercise, you'll need to modify the `ChatOpenAI` initialization to use the `"gpt-4o-mini"` model instead of the default model. Here's the modified code:

```python
from langchain_openai import ChatOpenAI

# Modify the ChatOpenAI to use the "gpt-4o-mini" model
chat = ChatOpenAI(
    model="gpt-4o-mini"  # Use the "gpt-4o-mini" model
)

# Send a message to the AI model
response = chat.invoke("Hello! How are you today?")

# Print the AI's response
print("AI Response:")
print(response.content)
```

### Explanation:

* **`model="gpt-4o-mini"`**: This is the key modification. It specifies that the model used for processing the conversation will be **"gpt-4o-mini"** instead of the default model.

After running this code, the model will respond to the greeting, and you'll be able to observe how the **"gpt-4o-mini"** model handles the input. This is an essential skill to understand when customizing AI to suit specific needs, especially when considering the trade-offs between model size, capability, and cost.


## Limiting AI Responses with Max Tokens

Now that you've selected your AI model, let's explore how to control response length! In this exercise, you'll work with the max_tokens parameter, which limits how much your AI can say. If the model hasn't finished its response by the time it hits the token limit, it will abruptly stop mid-sentence.

Your task is to add the `max_tokens` parameter to the `ChatOpenAI` instance and set it to just 20 tokens. When you run the code, observe how the AI's response about artificial intelligence is cut off abruptly: the model does not shorten its answer to fit the limit.

This technique is valuable when you need very brief answers for user interfaces with limited space or to reduce token usage costs.

```python
from langchain_openai import ChatOpenAI

# TODO: Add the max_tokens parameter set to 20 to force short responses
chat = ChatOpenAI(
    model="gpt-4o-mini"
)

# Send an open-ended question to the AI model
response = chat.invoke("Tell me about artificial intelligence")

# Print the AI's response
print("AI Response:")
print(response.content)

```

To complete this exercise, you'll need to add the `max_tokens` parameter to the `ChatOpenAI` instance and set it to **20 tokens**. Here's the modified code:

```python
from langchain_openai import ChatOpenAI

# Add the max_tokens parameter set to 20 to force short responses
chat = ChatOpenAI(
    model="gpt-4o-mini",
    max_tokens=20  # Limit the response to just 20 tokens
)

# Send an open-ended question to the AI model
response = chat.invoke("Tell me about artificial intelligence")

# Print the AI's response
print("AI Response:")
print(response.content)
```

### Explanation:

* **`max_tokens=20`**: This limits the AI's response to 20 tokens (approximately 15-20 words). If the response is longer than this, it will be cut off mid-sentence.
* This parameter is useful when you need brief responses, for instance, when displaying answers in small UI elements or when trying to minimize token usage for cost-saving.

When you run this code, the response stops after 20 tokens, most likely in the middle of a sentence. The model is not aware of the limit, so it does not compress its answer to fit.


## Controlling AI Creativity with Temperature

With your AI model selected and response length set, let's explore how to control its creativity! In this exercise, you'll work with the temperature parameter, which acts as a creativity dial for your AI.

Your task is to add a high temperature value (1.5) to the ChatOpenAI instance that already has a model and max_tokens set. This setting will make the AI produce more creative and varied responses — perfect for generating fun, unexpected content.

After adding the parameter, run the code to see how the model responds to the request for a fun fact about dolphins. Notice how imaginative and diverse the answer is compared to what you might get with lower temperature settings.

By mastering temperature control, you'll be able to fine-tune your AI's responses for different scenarios — whether you need factual accuracy or creative brainstorming.

```python
from langchain_openai import ChatOpenAI

# TODO: Add a temperature parameter with value 1.5 to get creative, varied responses
chat = ChatOpenAI(
    model="gpt-4o-mini",
    max_tokens=150
)

# Ask the AI for a fun fact
response = chat.invoke("Tell me a fun fact about dolphins.")

# Print the AI's response
print("AI Response:")
print(response.content)

```

To complete this exercise, you'll need to add the **`temperature`** parameter with a value of **1.5** to the `ChatOpenAI` instance. This will make the AI generate more creative and diverse responses. Here's the modified code:

```python
from langchain_openai import ChatOpenAI

# Add a temperature parameter with value 1.5 to get creative, varied responses
chat = ChatOpenAI(
    model="gpt-4o-mini",
    max_tokens=150,
    temperature=1.5  # Increase creativity and diversity of responses
)

# Ask the AI for a fun fact
response = chat.invoke("Tell me a fun fact about dolphins.")

# Print the AI's response
print("AI Response:")
print(response.content)
```

### Explanation:

* **`temperature=1.5`**: This setting encourages the model to be more creative and random in its responses. With higher temperature values, the AI is more likely to generate unique and imaginative answers.
* **`max_tokens=150`**: Leaves enough room for a fun fact, so the response is unlikely to be cut off.

When you run this code, you'll notice that the AI's response to "Tell me a fun fact about dolphins" will be more creative and diverse compared to lower temperature settings. This is a great technique for generating creative content, brainstorming ideas, or having fun with your AI.


## Expanding Vocabulary with Nucleus Sampling

You've learned about model selection, response length, and temperature control — now let's explore another dimension of AI creativity with the top_p parameter! This parameter (also called nucleus sampling) determines the range of words your AI considers when generating text.

Your task is to add the `top_p` parameter with a value of 0.9 to the `ChatOpenAI` instance. This setting instructs the model to sample, at each step, only from the most likely words that together account for 90% of the probability, providing a good balance between focus and variety.

When you run the code, observe how the AI creates a poem about the ocean with a more varied vocabulary than it might with a lower top_p setting. This parameter works alongside temperature to give you fine-grained control over your AI's creative expression — a valuable tool when you need both creativity and coherence in your AI-generated content.

```python
from langchain_openai import ChatOpenAI

# TODO: Add the top_p parameter set to 0.9 to allow for more varied vocabulary
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=150
)

# Ask the AI to write a creative poem
response = chat.invoke("Write a short poem about the ocean.")

# Print the AI's response
print("AI Response:")
print(response.content)


```

To complete this exercise, you'll need to add the **`top_p`** parameter with a value of **0.9** to the `ChatOpenAI` instance. This setting tells the model to sample only from the most likely words that together cover 90% of the probability, which trims very unlikely words while keeping room for variety. Here's the modified code:

```python
from langchain_openai import ChatOpenAI

# Add the top_p parameter set to 0.9 to allow for more varied vocabulary
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=150,
    top_p=0.9  # Sample from the likeliest words covering 90% of probability
)

# Ask the AI to write a creative poem
response = chat.invoke("Write a short poem about the ocean.")

# Print the AI's response
print("AI Response:")
print(response.content)
```

### Explanation:

* **`top_p=0.9`**: This value restricts the model to the smallest set of most likely next words whose combined probability reaches 90%. This method is also called **nucleus sampling**, and it provides a balance between randomness and coherence in the output.
* **`temperature=0.7`**: This provides a good level of creativity while still maintaining some structure and coherence in the response.
* **`max_tokens=150`**: Leaves enough room for a short poem, so it is unlikely to be cut off.

When you run this code, you will notice that the poem about the ocean is more varied in vocabulary and creative expression than if you had used a lower `top_p` setting. This parameter gives you additional control over the AI's creativity, working alongside temperature to adjust the variety of language used in the response.


## Reducing Repetition with Frequency Penalty

After exploring nucleus sampling with top_p, let's tackle another common issue in AI responses — repetitive language! In this exercise, you'll work with the frequency_penalty parameter, which helps the AI avoid repeating the same words and phrases.

Your task is to add a high frequency_penalty value (1.0) to the ChatOpenAI instance. This setting will strongly discourage the AI from reusing words, resulting in more varied and natural-sounding text.

When you run the code, notice how the AI explains the internet to a child in five different ways without falling into repetitive patterns. Each explanation uses fresh language and approaches the topic from different angles.

This parameter is especially useful when generating lists, multiple examples, or any content where a variety of expressions makes the AI sound more human and engaging.

```python
from langchain_openai import ChatOpenAI

# TODO: Add the frequency_penalty parameter set to 1.0 to reduce repetitive language
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=350
)

# Ask the AI for content that might naturally become repetitive
response = chat.invoke("Explain how the internet works to a child in five short and different ways")

# Print the AI's response
print("AI Response:")
print(response.content)

```

To incorporate the `frequency_penalty` parameter in your code, you can modify the `ChatOpenAI` instance to include the penalty. Here's the updated code with the `frequency_penalty` set to `1.0`:

```python
from langchain_openai import ChatOpenAI

# Add the frequency_penalty parameter set to 1.0 to reduce repetitive language
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=350,
    frequency_penalty=1.0  # Reducing repetitive language
)

# Ask the AI for content that might naturally become repetitive
response = chat.invoke("Explain how the internet works to a child in five short and different ways")

# Print the AI's response
print("AI Response:")
print(response.content)
```

This will make the AI more likely to use diverse language and avoid repetition, especially when generating explanations or examples.


## Encouraging Topic Diversity with Presence Penalty

Now let's explore the presence_penalty parameter, which encourages the AI to introduce new topics rather than focusing on what's already been mentioned.

Your task is to add a high presence_penalty value (0.8) to the ChatOpenAI instance. This setting will push the AI to generate more diverse content.

This parameter is valuable when you need responses that provide broad coverage instead of a single domain.

```python
from langchain_openai import ChatOpenAI

# TODO: Add the presence_penalty parameter set to 0.8 to encourage diversity
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=250
)

# Ask the AI to list unique items
response = chat.invoke("List five unique items.")

# Print the AI's response
print("AI Response:")
print(response.content)
```

To incorporate the `presence_penalty` parameter, you simply need to add it to the `ChatOpenAI` instance with the value set to `0.8`. Here's the updated code:

```python
from langchain_openai import ChatOpenAI

# Add the presence_penalty parameter set to 0.8 to encourage diversity
chat = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=250,
    presence_penalty=0.8  # Encouraging the AI to introduce new topics
)

# Ask the AI to list unique items
response = chat.invoke("List five unique items.")

# Print the AI's response
print("AI Response:")
print(response.content)
```

With this setting, the AI will focus on generating responses that introduce new ideas and avoid repeating the same topics, which is ideal when you need more diverse and comprehensive content.
