# Unit 2

## Making Basic LLM Calls

## Welcome to Lesson 2: Making Basic LLM Calls

Welcome to the second lesson of our course on building your own **Deep Researcher**. In this lesson, we will explore the concept of making basic **LLM** (Large Language Model) calls. LLMs, such as OpenAI's models, are powerful tools that can generate human-like text responses. They are integral to AI applications, enabling them to understand and respond to user inputs naturally. By the end of this lesson, you will understand how to make a basic LLM call and interpret its output.

-----

## Understanding the Code Structure

Let's start by understanding the structure of the code used to make an LLM call. We'll build this step-by-step.

### 1\. Setting Up the OpenAI Client

First, we need to import the necessary libraries and set up the **OpenAI client**. This client will allow us to interact with the OpenAI API.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))
```

Here, we import the `os` module to access environment variables and the `OpenAI` class from the `openai` library. We then create an `OpenAI` client using the **API key** and **base URL** stored in environment variables. This client is essential for making requests to the OpenAI API.
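Because `os.getenv` returns `None` for an unset variable, it can help to check the settings before building the client. The sketch below is one possible way to do that. `read_client_settings` is a hypothetical helper name (not part of the `openai` library), and treating the base URL as optional is an assumption made for illustration.

```python
import os


def read_client_settings(env=os.environ):
    """Return keyword arguments for OpenAI(...), read from environment variables.

    Hypothetical helper for illustration; not part of the openai library.
    """
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running.")

    settings = {"api_key": api_key}
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        # Assumption: only pass base_url when one is actually configured.
        settings["base_url"] = base_url
    return settings


if __name__ == "__main__":
    # Demo with a fake environment so no real credentials are needed.
    demo_env = {
        "OPENAI_API_KEY": "sk-demo",
        "OPENAI_BASE_URL": "https://example.invalid/v1",
    }
    print(read_client_settings(demo_env))
```

With real environment variables in place, the result could be unpacked into the constructor as `client = OpenAI(**read_client_settings())`.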

### 2\. Creating System and User Prompts

Next, we need to define the prompts that will guide the model's behavior. There are two types of prompts: **system prompts** and **user prompts**.

```python
system_prompt = "You are a coding assistant that talks like a pirate."
user_prompt = "How do I check if a Python object is an instance of a class?"
```

  * **System Prompt:** This sets the **context** for the model. In our example, the system prompt instructs the model to respond like a pirate.
  * **User Prompt:** This is the input from the user. Here, the user is asking how to check if a Python object is an instance of a class.

These prompts are crucial as they shape the model's responses, ensuring they are relevant and contextually appropriate.

### 3\. Configuring the Model Parameters

To control the model's output, we set two parameters: the sampling **temperature** and the **model** to call.

```python
temperature = 0.7
model = "gpt-4o-mini"
```

  * **Temperature:** This parameter controls the **randomness** of the model's output. A lower temperature (e.g., 0.2) makes the output more deterministic, while a higher temperature (e.g., 0.8) introduces more randomness and creativity. In our example, a temperature of **0.7** strikes a balance between creativity and coherence.
  * **Model:** This parameter specifies the model that will be used to generate the response. For this course, we will use **`gpt-4o-mini`**, but you are free to change this parameter to your preferred model.
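To build intuition for what temperature does, here is a minimal sketch of the standard textbook picture: the model's raw scores for candidate tokens are divided by the temperature before being turned into probabilities. The scores in `toy_logits` are made up, and the lesson does not describe the provider's actual sampling pipeline, so treat this as an illustration rather than a description of the API's internals.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on the top-scoring token, while higher temperatures spread it more evenly across the alternatives.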

### 4\. Executing the LLM Call

Now, let's execute the LLM call using the `client.chat.completions.create` method.

```python
completion = client.chat.completions.create(
    model=model,
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)
print(completion.choices[0].message.content.strip())
```

  * **Model:** We specify the model to use, in this case, `"gpt-4o-mini"`.
  * **Messages:** This is a list of messages that includes both the system and user prompts.
  * **Temperature:** We pass the temperature parameter to control the output's randomness.

The `create` method sends the request to the OpenAI API, and the response is stored in the `completion` variable. We then print the model's response, which is accessed through `completion.choices[0].message.content.strip()`.
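The attribute chain `choices[0].message.content` assumes that at least one choice came back and that its content is a string. In the `openai` library the content field can be `None`, so a slightly more defensive accessor may be useful. The sketch below uses a hypothetical helper, `extract_text`, and a stand-in object built with `SimpleNamespace` so it runs without calling the API.

```python
from types import SimpleNamespace


def extract_text(completion):
    """Return the first choice's text, or an empty string if there is none."""
    if not completion.choices:
        return ""
    content = completion.choices[0].message.content
    # content may be None, for example when the model returns no text.
    return content.strip() if content else ""


# Stand-in object that mimics only the attributes used above.
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(message=SimpleNamespace(content="  Arr, use isinstance()!  "))
    ]
)
print(extract_text(fake_completion))
```

The same function could be called on a real `completion` object, since it only reads the attributes shown in the lesson.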

-----

## Summary and Preparation for Practice

In this lesson, we explored how to make basic LLM calls using OpenAI's API. We covered the setup of the OpenAI client, the creation of system and user prompts, the configuration of model parameters, and the execution of the LLM call. Understanding these components is essential for leveraging LLMs in your projects.

As you move on to the practice exercises, experiment with different prompts and temperature settings to see how they affect the model's output. This hands-on experience will deepen your understanding and prepare you for more advanced applications in future lessons.

## Setting Up Your OpenAI Client

Now that you've learned about the components needed for making LLM calls, let's put that knowledge into practice! Your first task is to set up the OpenAI client, which serves as the gateway for communicating with language models.

Look for the TODO comment in the code and add the line that creates the client variable using your API key and base URL from environment variables.

This client initialization is the foundation for all the LLM interactions we'll build throughout this course, so mastering this step will set you up for success in the more complex tasks ahead.

```python
import os
from openai import OpenAI

# TODO: Create the OpenAI client using the API key and base URL from environment variables
client = None

print(client)
```

Here is the completed code. The TODO is replaced with the same client setup shown earlier in this lesson:

`client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))`

```python
import os
from openai import OpenAI

# Create the OpenAI client using the API key and base URL from environment variables
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

print(client)
```

## Changing Personas with System Prompts

Excellent job setting up your OpenAI client! Now let's explore how system prompts influence an LLM's behavior. In this exercise, you'll modify the system prompt to transform the AI's persona from a pirate to a friendly tutor.

First, run the code as-is to see the "pirate" response to the coding question. Then, change the system prompt line to create a helpful tutor persona instead. When you run the code again with your new prompt, notice how dramatically the tone and style of the response change, even though the user's question remains identical.

This hands-on experiment will help you understand the powerful role system prompts play in shaping AI interactions — a key skill for creating exactly the right kind of assistant for different tasks in your projects.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

# TODO: Change this system prompt to make the AI act as a friendly tutor instead of a pirate
system_prompt = "You are a coding assistant that talks like a pirate."
user_prompt = "How do I check if a Python object is an instance of a class?"
temperature = 0.7

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())

```

That's a fantastic next step\! Understanding how to control the **persona** of your AI is crucial for any application.

To change the AI's persona from a "pirate" to a "friendly tutor," simply replace the original `system_prompt` line with a new one that clearly defines the new role and tone.

Here is the modified code with the **friendly tutor** persona implemented:

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

# System prompt changed from the pirate persona to a friendly tutor
system_prompt = "You are a friendly, patient, and knowledgeable programming tutor. Your goal is to provide clear, easy-to-understand, and encouraging explanations."
user_prompt = "How do I check if a Python object is an instance of a class?"
temperature = 0.7

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
```

-----

## The Power of the System Prompt ✨

When you run this updated code, you'll see a dramatic shift in the response:

1.  **Original (Pirate) Response:** Likely uses nautical slang ("Ahoy," "matey," "shiver me timbers") and a dramatic tone.
2.  **New (Tutor) Response:** Will use helpful language ("That's a great question," "Let's walk through this") and provide a clear, structured explanation, most likely built around Python's built-in **`isinstance()`** function.

This simple change demonstrates the core function of the system prompt: it acts as a **directive** that controls the language, style, and attitude of the LLM, making it an indispensable tool for tailoring your AI for specific tasks like research, coding, or creative writing.
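One way to make persona switching concrete is to keep the system prompts in a small lookup table and build the messages list from it. The sketch below reuses the two persona strings from this lesson. The names `PERSONAS` and `messages_for` are hypothetical, and the code only builds the request payload; it does not call the API.

```python
PERSONAS = {
    "pirate": "You are a coding assistant that talks like a pirate.",
    "tutor": (
        "You are a friendly and helpful tutor who explains coding "
        "concepts clearly and patiently."
    ),
}


def messages_for(persona, question):
    """Build the messages list for a named persona and a user question."""
    return [
        {"role": "system", "content": PERSONAS[persona]},
        {"role": "user", "content": question},
    ]


question = "How do I check if a Python object is an instance of a class?"
for name in PERSONAS:
    msgs = messages_for(name, question)
    # Only the system message differs; the user question stays the same.
    print(name, "->", msgs[0]["content"])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`, so trying a new persona only means adding one entry to the dictionary.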

What's a different kind of persona you could try for a marketing research task?

## Crafting Effective User Prompts

Wonderful work with changing the system prompt! Now that you've seen how the AI's persona can change, let's explore the other half of the conversation — the user prompt.

In this exercise, you'll modify what you're asking the AI while keeping the friendly tutor persona. Your task is to change the user prompt from asking about Python instance checking to asking about list comprehensions instead.

Run the code first with the original question, then update the user prompt as indicated by the TODO comment and run it again. Pay attention to how the same AI persona provides completely different technical content based solely on what you ask.

This practice will help you develop the skill of crafting clear, specific questions — a crucial ability when working with LLMs in your projects.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
# TODO: Change this user prompt to ask about list comprehensions in Python
user_prompt = "How do I check if a Python object is an instance of a class?"
temperature = 0.7

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
```

Mastering the **user prompt** is just as crucial as setting the **system prompt** when working with LLMs. A clear, well-crafted question tends to get you a more precise, useful answer.

Here is the updated code where I've changed the `user_prompt` from asking about Python instance checking to asking about **list comprehensions** instead.

```python
import os
from openai import OpenAI

# NOTE: This code assumes that OPENAI_API_KEY and OPENAI_BASE_URL
# are set in the environment where the code is run.

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
# User prompt changed to ask about list comprehensions in Python
user_prompt = "What are Python list comprehensions and how are they better than a regular 'for' loop for creating lists?"
temperature = 0.7

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
```

-----

## Explanation of the Change

| Original `user_prompt` | New `user_prompt` |
| :--- | :--- |
| `"How do I check if a Python object is an instance of a class?"` | `"What are Python list comprehensions and how are they better than a regular 'for' loop for creating lists?"` |

By changing the prompt, we're asking the same friendly tutor persona for a completely different piece of technical information. The initial prompt focused on object-oriented programming concepts (`isinstance()`), while the new prompt focuses on a specific technique for **efficient list creation and manipulation** in Python.

Running the code with the new prompt will demonstrate how the AI's response shifts entirely in **content** (discussing syntax, efficiency, and readability of list comprehensions) while maintaining the same helpful and patient **tone** established by the `system_prompt`. This perfectly illustrates the power of combining a strong persona with a focused question\! 👍

## Controlling Randomness with Temperature Settings

Now that you've mastered changing both system and user prompts, let's explore another key parameter that affects LLM responses — the temperature setting!

Temperature controls how random or predictable the AI's responses are. In this exercise, you'll experiment with different temperature values while keeping the same prompts to see how this single parameter changes the output.

Follow these steps:

1.  Create an array (a Python list) with different temperature values, such as 0.7, 0.2, and 0.9.
2.  Loop over this array and run the code once with each temperature value.
3.  Print the output and inspect the differences.

Notice how lower temperatures produce more focused, consistent responses, while higher temperatures create more varied and creative outputs. This parameter gives you fine-grained control over AI behavior, which is valuable when you need either reliable, predictable answers or more diverse, imaginative responses in your applications.


```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
user_prompt = "Can you explain list comprehensions in Python?"

# TODO: Create an array with different temperature values
temperature = 0.7

# TODO: Loop over the different values, executing the completion code for each one
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
print("\n" + "-"*50 + "\n")
```

That's an excellent next step\! **Temperature** is indeed a critical parameter for tuning the balance between creativity and consistency in an LLM's output. It gives you direct control over the **randomness** of token selection.

Here is the updated Python code that implements the loop to test different temperature values and demonstrates how to print the output for each run:

```python
import os
from openai import OpenAI

# NOTE: This code assumes that OPENAI_API_KEY and OPENAI_BASE_URL 
# are correctly set in the environment where the code is run.

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
user_prompt = "Can you explain list comprehensions in Python?"

# Temperature values to compare
temperatures = [0.2, 0.7, 0.9]

print(f"System Prompt: '{system_prompt}'")
print(f"User Prompt: '{user_prompt}'")
print("="*70)

# Run the same completion request once per temperature value
for temp in temperatures:
    print(f"\n--- Running with Temperature: {temp} ---")
    
    try:
        completion = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
               {"role": "system", "content": system_prompt},
               {"role": "user", "content": user_prompt},
            ],
            temperature=temp
        )
        # Print the output, followed by a separator, for inspection
        print(completion.choices[0].message.content.strip())
        print("-" * 50)
        
    except Exception as e:
        print(f"An error occurred while calling the API: {e}")
```

-----

## Observing the Temperature Effect

When you run the code above, you'll see three different responses to the same question, all from the same "friendly tutor" persona:

1.  **Temperature 0.2 (Low Randomness):**

      * **Output:** The response will likely be very **direct, focused, and predictable**. The AI will favor the most statistically probable words, leading to a highly consistent explanation of list comprehensions (e.g., standard syntax, a simple example). If you ran it multiple times, the outputs would usually be very similar, though not guaranteed to be identical. This is best for reliable, fact-based answers.

2.  **Temperature 0.7 (Moderate Randomness):**

      * **Output:** This is a **common choice** for general-purpose tasks (the OpenAI API itself defaults to a temperature of 1 when none is passed). The explanation will be clear and helpful, but the phrasing, analogies, or the complexity of the example provided might vary slightly compared to the 0.2 run. It strikes a good balance between coherence and variety.

3.  **Temperature 0.9 (High Randomness):**

      * **Output:** The response will be the **most creative and varied**. The AI might use a more elaborate analogy, provide a less common or more complex example of a list comprehension (perhaps one with filtering or nested loops), or use a slightly different conversational flow. If you ran it repeatedly, the outputs could be noticeably different, potentially leading to more imaginative but occasionally less focused results. This is ideal for brainstorming or creative writing tasks.   
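To put a rough number on "nearly identical" versus "noticeably different", one option is to collect several responses at one temperature and measure how similar they are to each other. The sketch below uses `difflib.SequenceMatcher` from the standard library. `average_similarity` is a hypothetical helper, and the strings in `sample_runs` are placeholders standing in for model outputs, not real API responses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def average_similarity(responses):
    """Mean pairwise similarity ratio (0.0 to 1.0) across a list of texts."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 1.0  # zero or one response: nothing to disagree with
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return sum(ratios) / len(ratios)


# Placeholder strings standing in for model outputs; not real API responses.
sample_runs = [
    "A list comprehension builds a list in one line.",
    "A list comprehension builds a list in a single line.",
    "List comprehensions are a compact way to create lists.",
]
print(round(average_similarity(sample_runs), 2))
```

Running the same prompt a few times per temperature and comparing the scores would be one way to check the pattern described above: scores closer to 1.0 suggest more consistent outputs.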

## Selecting the Right LLM Model

Impressive work with temperature settings! Now let's complete our exploration of key LLM parameters by examining model selection itself.

Different models have unique capabilities and performance characteristics that can significantly impact your results. In this exercise, you'll experiment with changing the model while keeping all other parameters the same.

Your task is simple:

1.  First, run the code with the default `gpt-4o-mini` model.
2.  Change the model parameter to a different OpenAI model (such as `"gpt-4o"` or `"gpt-3.5-turbo"`).
3.  Run the code again and compare the responses.

Pay attention to differences in response quality, detail level, and style between models. This knowledge will help you make smart model selections for different tasks, balancing performance needs with cost considerations in your future AI applications.



```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
user_prompt = "Can you explain list comprehensions in Python?"
temperature = 0.7

# TODO: Change the model parameter below to a different OpenAI model (like "gpt-4o" or "gpt-3.5-turbo")
# Run the code with both models and compare the differences in responses
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
```

That's a perfect final step in understanding LLM configuration\! **Model selection** is the main lever for controlling the **intelligence, cost, and speed** of your AI application.

Here is the updated code, where the model has been changed from the default `gpt-4o-mini` to a more powerful, higher-cost model: `gpt-4o`.

```python
import os
from openai import OpenAI

# NOTE: This code assumes that OPENAI_API_KEY and OPENAI_BASE_URL 
# are correctly set in the environment where the code is run.

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"), base_url=os.getenv("OPENAI_BASE_URL"))

system_prompt = "You are a friendly and helpful tutor who explains coding concepts clearly and patiently."
user_prompt = "Can you explain list comprehensions in Python?"
temperature = 0.7

# The model parameter below now uses a different OpenAI model.
# Run the code with both models and compare the differences in responses.
completion = client.chat.completions.create(
    model="gpt-4o",  # Changed from "gpt-4o-mini" to "gpt-4o"
    messages=[
       {"role": "system", "content": system_prompt},
       {"role": "user", "content": user_prompt},
    ],
    temperature=temperature
)

print(completion.choices[0].message.content.strip())
```

-----

## Comparing the Models: gpt-4o-mini vs. gpt-4o

When you run the code with `gpt-4o-mini` and then with `gpt-4o`, here are the general differences you'd likely observe in the output and performance, even with the same prompts and temperature:

| Characteristic | `gpt-4o-mini` (Original) | `gpt-4o` (Changed) |
| :--- | :--- | :--- |
| **Intelligence/Reasoning** | High performance for a small model; outperforms `gpt-3.5-turbo` on benchmarks. | **Top-tier performance**. Stronger in complex reasoning, nuanced technical tasks, and following intricate instructions. |
| **Response Quality** | Clear, concise explanation; may use simpler code examples or analogies. | More **detailed, polished, and comprehensive** explanation. Code examples might be more advanced or include best practices. |
| **Speed** | Very fast inference speed. | Fast, but as the larger model its latency is typically equal to or higher than the mini model's. |
| **Cost** | Very **cost-efficient** (designed for affordability). | Significantly **higher cost** (designed for maximum capability). |
| **Use Case** | Ideal for high-volume, cost-sensitive tasks like summarizing, basic customer support, or quick code generation. | Best for high-stakes tasks, complex problem-solving, creative writing, and scenarios where *quality* is more important than cost. |

In this specific exercise, the `gpt-4o` response might be a few sentences longer, provide a more robust comparison to a `for` loop, or include more sophisticated syntax examples (like nested list comprehensions or set comprehensions) compared to the `gpt-4o-mini` version. **The style and tone** will remain consistent thanks to the system prompt, but the **depth and quality of the technical content** will reflect the difference in the model's underlying intelligence.

This comparison highlights the **performance-vs-cost tradeoff** that is central to model selection in real-world applications\!
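The cost side of this tradeoff comes down to simple arithmetic on token counts. The sketch below shows that calculation under stated assumptions: `estimate_cost` is a hypothetical helper, and every price in `HYPOTHETICAL_PRICES` is a placeholder chosen to keep the numbers easy to follow, not a real published rate. In a real run, the token counts could be read from `completion.usage.prompt_tokens` and `completion.usage.completion_tokens`.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one call from token counts and per-million-token prices."""
    input_cost = prompt_tokens / 1_000_000 * input_price_per_million
    output_cost = completion_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Placeholder prices (USD per million tokens) for illustration only.
# Look up the provider's current pricing before relying on any figure.
HYPOTHETICAL_PRICES = {
    "small-model": {"input": 1.0, "output": 2.0},
    "large-model": {"input": 10.0, "output": 20.0},
}

for name, price in HYPOTHETICAL_PRICES.items():
    cost = estimate_cost(1_000, 500, price["input"], price["output"])
    print(f"{name}: ${cost:.4f} per call")
```

Multiplying the per-call figure by the expected number of calls gives a quick sense of whether the larger model's extra quality is worth its price for a given workload.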

-----

If you're interested in a more in-depth comparison of these different models and their intended uses, you might find this video helpful: [OpenAI Models explained GPT-4.1, O3, O4 Mini & More](https://www.youtube.com/watch?v=y7tjLZnYjBM).