# Get Started With Reasoning Models

This notebook is a code-first introduction to working with reasoning models. We'll primarily use the `o4-mini` model, and briefly explore the `o1` model to see how the API works for both. See [this table](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python-secure%2Cpy#api--feature-support) for the latest information on supported features, and the [Reasoning Models](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python%2Cpy) documentation page for API details and code snippets. Here's a quick summary for convenience:

<br/>

| Characteristic | o4-mini | o1 |
|:--- |:---|:---|
| Developer Messages    | ✅ | ✅ |
| Structured Outputs    | ✅ | ✅ |
| Context Window Input  | 200K | 100K |
| Context Window Output | 200K | 100K |
| Reasoning Effort      | ✅ | ✅ |
| Vision Support        | ✅ | ✅ |
| Chat Completions API  | ✅ | ✅ |
| Responses API         | ✅ |    |
| Functions / Tools     | ✅ | ✅ |
| max_completion_tokens | ✅ | ✅ |
| System messages       | ✅ | ✅ |
| Reasoning summary     | ✅ | |
| Streaming             | ✅ | |
| Model Card | [o4-mini](https://ai.azure.com/explore/models/o4-mini/version/2025-04-16/registry/azure-openai) | [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai)  |
| api_version | 2025-04-01-preview | 2025-03-01-preview |

---

## 1. Quickstart

In this section, we'll check that the right dependencies are installed and test both the `o1` and `o4-mini` models. The rest of the notebook uses `o4-mini` unless explicitly stated otherwise.

### 1.1 Install Python Dependencies

In [None]:
# Uncomment the lines below to install the dependencies.

# Install and upgrade pip
# !pip install --upgrade pip --quiet

# Install required packages
# !pip install openai azure-identity python-dotenv --quiet

# You may need to update your OpenAI Python library
# !pip install openai --upgrade --quiet


### 1.2 Check Env Variables

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

if not os.getenv("AZURE_OPENAI_ENDPOINT"):
    print("Missing env: AZURE_OPENAI_ENDPOINT")
elif not os.getenv("AZURE_OPENAI_API_KEY"):
    print("Missing env: AZURE_OPENAI_API_KEY")
else: 
    print("Azure OpenAI endpoint and key are set.")

Azure OpenAI endpoint and key are set.
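
The check above stops at the first missing variable. As a minimal sketch, the same check can report every missing variable in one pass. The helper name `missing_env_vars` and its `env` parameter are hypothetical and exist only to make the function easy to test.

```python
import os

# Same variable names as the check above.
REQUIRED_ENV_VARS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")

def missing_env_vars(names=REQUIRED_ENV_VARS, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars()
if missing:
    print("Missing env: " + ", ".join(missing))
else:
    print("Azure OpenAI endpoint and key are set.")
```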


### 1.3 Define Utility Functions

In [2]:
# Print stats for the response
def pretty_print(response, response_time):
    """
    Prints the response details in a formatted manner.
    Args:
        response (openai.types.chat.chat_completion.ChatCompletion): The response object containing the generated completion.
        response_time (float): The time taken to get the response.
    """
    print(".........................")
    print(f"Response time: {response_time:.2f} seconds")
    print(f"Total Tokens: {response.usage.total_tokens}")
    print(f"Prompt Tokens: {response.usage.prompt_tokens}")
    print(f"Completion Tokens: {response.usage.completion_tokens}")
    print(f"Reasoning Tokens: {response.usage.completion_tokens_details.reasoning_tokens}")
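    # Note: this is total tokens minus reasoning tokens, so the figure includes
    # the prompt tokens as well as the visible completion tokens.
    print(f"Output Tokens: {response.usage.total_tokens - response.usage.completion_tokens_details.reasoning_tokens}")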
    print(f"\nResponse:\n {response.choices[0].message.content}")
    print(f".........................")
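
The "Output Tokens" line in `pretty_print` subtracts reasoning tokens from the total, so it still counts the prompt. Reasoning tokens are reported as part of the completion tokens, so the visible answer alone is `completion_tokens - reasoning_tokens`. The sketch below shows that split. `token_breakdown` is a hypothetical helper, and the usage object is a stand-in with illustrative numbers, not a real API response.

```python
from types import SimpleNamespace

def token_breakdown(usage):
    """Split completion tokens into reasoning tokens and visible output tokens."""
    reasoning = usage.completion_tokens_details.reasoning_tokens
    return {
        "prompt": usage.prompt_tokens,
        "reasoning": reasoning,
        "visible_output": usage.completion_tokens - reasoning,
        "total": usage.total_tokens,
    }

# Stand-in usage object with illustrative numbers (not a real API response).
sample_usage = SimpleNamespace(
    prompt_tokens=25,
    completion_tokens=233,
    total_tokens=258,
    completion_tokens_details=SimpleNamespace(reasoning_tokens=192),
)
print(token_breakdown(sample_usage))
```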


In [3]:

from openai import AzureOpenAI
import time

# Chat with o1
def o1_chat(prompt="hi", reasoning_level="medium", developer_message="You are a helpful assistant"):
    """
    Sends a chat completion request to the Azure OpenAI API o1 model.
    Args:
        prompt (str): The input prompt to generate a response for.
        reasoning_level (str): The reasoning effort level ('low', 'medium', 'high').
        developer_message (str): The developer message to set the context for the assistant.
    Returns:
        response (openai.types.chat.chat_completion.ChatCompletion): The response object containing the generated completion.
    """
    client = AzureOpenAI(
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),  # same variable as the check in section 1.2
        api_version="2025-04-01-preview"
    )
    try:
        request_time = time.time()
        response = client.chat.completions.create(
            model="o1",
            messages=[
                {"role": "developer", "content": developer_message},
                {"role": "user", "content": prompt},
            ],
            max_completion_tokens=5000,
            reasoning_effort=reasoning_level
        )
        response_time = time.time() - request_time
        pretty_print(response, response_time)
        return response

    except Exception as e:
        print(f"[ERROR] {e}")
        return None
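
The function above measures latency with `time.time()`, which follows the wall clock and can jump if the system clock is adjusted. A minimal sketch of the same measurement with the monotonic `time.perf_counter()` follows. The `timed` helper is hypothetical, and it is demonstrated on a local function rather than an API call.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Local demonstration; any callable can be wrapped the same way.
result, elapsed = timed(sum, [1, 2, 3])
print(f"result={result}, elapsed={elapsed:.6f}s")
```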


In [4]:
# Chat with o4_mini
def o4mini_chat(prompt="hi", reasoning_level="medium", developer_message="You are a helpful assistant", response_format=None):
    """
    Sends a chat completion request to the Azure OpenAI API o4-mini model.
    Args:
        prompt (str): The input prompt to generate a response for.
        reasoning_level (str): The reasoning effort level ('low', 'medium', 'high').
        developer_message (str): The developer message to set the context for the assistant.
        response_format (dict, optional): The expected structured output format, for example a `json_schema` response format. It is passed through to `chat.completions.create` unchanged.
    Returns:
        response (openai.types.chat.chat_completion.ChatCompletion): The response object containing the generated completion.
    """
    client = AzureOpenAI(
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),  # same variable as the check in section 1.2
        api_version="2025-04-01-preview"
    )
    try:
        request_time = time.time()
        response = client.chat.completions.create(
            model="o4-mini",
            messages=[
                {"role": "developer", "content": developer_message},
                {"role": "user", "content": prompt},
            ],
            max_completion_tokens=5000,
            reasoning_effort=reasoning_level,
            response_format=response_format
        )
        response_time = time.time() - request_time
        pretty_print(response, response_time)
        return response

    except Exception as e:
        print(f"[ERROR] {e}")
        return None
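
`o4mini_chat` accepts a `response_format` argument, but the notebook never passes one. As a sketch, assuming the `json_schema` response format used for structured outputs, the value can be built as a plain dictionary. The helper `json_schema_format` and the `letter_count` schema are hypothetical names chosen for illustration.

```python
import json

def json_schema_format(name, properties, required=None):
    """Build a `json_schema` response_format dict (hypothetical helper)."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": properties,
                # Strict mode expects every property to be listed as required.
                "required": list(required if required is not None else properties),
                "additionalProperties": False,
            },
        },
    }

answer_format = json_schema_format(
    "letter_count",
    {"letter": {"type": "string"}, "count": {"type": "integer"}},
)
print(json.dumps(answer_format, indent=2))

# Usage with the function defined above (requires a live deployment):
# response = o4mini_chat("How many p's in hippopotamus?", response_format=answer_format)
```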

### 1.4 Test Models

In [5]:
# Test o1
response = o1_chat(
    prompt="How many p's in hippopotamus?",
    reasoning_level="medium",
    developer_message="You are a helpful assistant."
)

.........................
Response time: 3.77 seconds
Total Tokens: 258
Prompt Tokens: 25
Completion Tokens: 233
Reasoning Tokens: 192
Output Tokens: 66

Response:
 There are three “p”s in the word “hippopotamus.”
.........................


In [6]:
# Test o4mini
response = o4mini_chat(
    prompt="How many p's in hippopotamus?",
    reasoning_level="low",
    developer_message="You are a helpful assistant."
)

.........................
Response time: 7.60 seconds
Total Tokens: 188
Prompt Tokens: 25
Completion Tokens: 163
Reasoning Tokens: 128
Output Tokens: 60

Response:
 The word “hippopotamus” contains 3 letter p’s.
.........................
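
The expected answer to this test prompt is easy to confirm locally, which makes it a convenient sanity check. A minimal sketch:

```python
word = "hippopotamus"

# Count the letter and record where it occurs (0-based positions).
p_count = word.count("p")
p_positions = [i for i, ch in enumerate(word) if ch == "p"]

print(f"'{word}' contains {p_count} p's at positions {p_positions}")
```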


---

## 2. Let's Prompt!

In [7]:
# -------------------
# MATH EXAMPLE
# -------------------
response = o4mini_chat(
    prompt="A train travels at 60 mph. How long does it take to travel 90 miles?",
    reasoning_level="low",
    developer_message="You are a math tutor. Explain the solution"
)
print(response.choices[0].message.content)

.........................
Response time: 1.95 seconds
Total Tokens: 255
Prompt Tokens: 38
Completion Tokens: 217
Reasoning Tokens: 64
Output Tokens: 191

Response:
 To find the time t it takes to go 90 miles at 60 mph, use the relation  
   time = distance ÷ speed.  

Here, distance = 90 mi, speed = 60 mi/h, so  
   t = 90 mi ÷ 60 mi/h = 1.5 h.  

In hours and minutes, 1.5 h = 1 hour + 0.5 hour  
0.5 hour = 0.5×60 min = 30 min.  

Answer: It takes 1 hour 30 minutes.
.........................
To find the time t it takes to go 90 miles at 60 mph, use the relation  
   time = distance ÷ speed.  

Here, distance = 90 mi, speed = 60 mi/h, so  
   t = 90 mi ÷ 60 mi/h = 1.5 h.  

In hours and minutes, 1.5 h = 1 hour + 0.5 hour  
0.5 hour = 0.5×60 min = 30 min.  

Answer: It takes 1 hour 30 minutes.
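
The model's arithmetic can be checked with a few lines of code. This is a minimal sketch of time = distance ÷ speed; the helper name `travel_time` is hypothetical.

```python
def travel_time(distance_miles, speed_mph):
    """Return (hours, minutes) needed to cover the distance at constant speed."""
    total_minutes = round(distance_miles / speed_mph * 60)
    return divmod(total_minutes, 60)

hours, minutes = travel_time(90, 60)
print(f"{hours} hour(s) {minutes} minute(s)")
```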


In [8]:
# -------------------
#  MATH REASONING
# -------------------
prompt = "Jane has twice as many apples as Tom. Together they have 18 apples. How many does each person have?"
response = o4mini_chat(prompt)
print(response.choices[0].message.content)

.........................
Response time: 10.56 seconds
Total Tokens: 239
Prompt Tokens: 38
Completion Tokens: 201
Reasoning Tokens: 128
Output Tokens: 111

Response:
 Let Tom have T apples. Then Jane has 2T, and together  
T + 2T = 18  
3T = 18  
T = 6  

So Tom has 6 apples and Jane has 2·6 = 12 apples.
.........................
Let Tom have T apples. Then Jane has 2T, and together  
T + 2T = 18  
3T = 18  
T = 6  

So Tom has 6 apples and Jane has 2·6 = 12 apples.
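
The same equation, T + 2T = 18, can be verified programmatically. The sketch below generalizes it slightly to any whole-number ratio; `split_by_ratio` is a hypothetical helper.

```python
def split_by_ratio(total, ratio):
    """Split `total` between two people where the first has `ratio` times the second."""
    smaller, remainder = divmod(total, ratio + 1)
    if remainder:
        raise ValueError("total does not split into whole numbers")
    return ratio * smaller, smaller

jane, tom = split_by_ratio(18, 2)
print(f"Jane: {jane}, Tom: {tom}")
```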


In [9]:
# -------------------
#  SCIENCE REASONING
# -------------------
prompt = "Why does a metal spoon feel colder than a wooden spoon when left in the same room?"
response = o4mini_chat(prompt)
print(response.choices[0].message.content)

.........................
Response time: 4.16 seconds
Total Tokens: 652
Prompt Tokens: 33
Completion Tokens: 619
Reasoning Tokens: 320
Output Tokens: 332

Response:
 Although both spoons settle at the same room temperature, your hand “feels” temperature by sensing heat flow. Metal and wood differ sharply in how quickly they conduct heat away from (or toward) your skin:

1. Thermal conductivity  
   – Metals (e.g. steel, silver) have thermal conductivities tens to hundreds of times higher than wood.  
   – At the moment you touch the metal spoon, it draws heat from your warmer hand much faster than the wood does.

2. Heat flux and sensation  
   – Thermoreceptors in your skin respond to the rate of temperature change (how fast heat is lost), not just the absolute temperature difference.  
   – A contact surface that pulls heat away more rapidly feels “colder,” even if its actual temperature is identical.

3. Thermal effusivity  
   – More generally, a material’s thermal effusivity (√[k·

In [10]:
# -------------------
#  MULTI-STEP PLANNING
# -------------------
prompt = "Design a basic weekly study schedule to learn Python in 6 weeks, assuming 1 hour per weekday."
response = o4mini_chat(prompt)
print(response.choices[0].message.content)

.........................
Response time: 47.22 seconds
Total Tokens: 1542
Prompt Tokens: 36
Completion Tokens: 1506
Reasoning Tokens: 640
Output Tokens: 902

Response:
 Here’s a 6-week, Monday–Friday (1 hr/day) plan to take you from zero to a simple Python project in about 30 hours total.

Week 1 – Python Fundamentals  
 Mon  Setup & Hello World  
 • Install Python & an editor/IDE (e.g. VS Code, PyCharm)  
 • Run the REPL, write your first “Hello, world!” script  
 Tue  Variables & Data Types  
 • int, float, str, bool  
 • Assignment, basic arithmetic, type conversion  
 Wed  Strings  
 • Indexing, slicing, methods (len, .upper(), .split(), etc.)  
 • f-strings & .format()  
 Thu  Basic I/O & Comments  
 • input(), print(), formatting output  
 • Writing clear comments and docstrings  
 Fri  Mini Practice  
 • Write a script that asks for name/age and prints a greeting & birth year  

Week 2 – Flow Control  
 Mon  Booleans & Comparisons  
 • ==, !=, >, <, >=, <=, and, or, not  
 Tue  

In [11]:
# --------------------------------
# CONSTRAINT-BASED SCHEDULING
# --------------------------------
prompt = "Three employees—Alex, Sam, and Riley—must each work one 4-hour shift today. Only Alex and Riley can work before noon, and Sam can’t work past 4 PM. Create a valid schedule."
response = o4mini_chat(prompt)
print(response.choices[0].message.content)

.........................
Response time: 5.80 seconds
Total Tokens: 826
Prompt Tokens: 59
Completion Tokens: 767
Reasoning Tokens: 640
Output Tokens: 186

Response:
 Here’s one way to do it using three back‐to‐back, non‐overlapping 4-hour shifts:

• Alex:  8 AM – 12 PM  
• Sam: 12 PM – 4 PM  
• Riley: 4 PM – 8 PM  

– Only Alex (8–12) and Riley (4–8) ever start before noon.  
– Sam’s shift ends at 4 PM, so he never works past 4.
.........................
Here’s one way to do it using three back‐to‐back, non‐overlapping 4-hour shifts:

• Alex:  8 AM – 12 PM  
• Sam: 12 PM – 4 PM  
• Riley: 4 PM – 8 PM  

– Only Alex (8–12) and Riley (4–8) ever start before noon.  
– Sam’s shift ends at 4 PM, so he never works past 4.
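
Constraint problems like this one are worth checking mechanically, because a model's explanation can sound plausible even when a constraint is violated. The sketch below validates the schedule from the response. It assumes one reading of the prompt: Sam may not start before noon (since only Alex and Riley can), and Sam's shift must end by 4 PM. `check_schedule` is a hypothetical helper.

```python
# Hours on a 24-hour clock; the schedule is the one proposed in the response.
schedule = {"Alex": (8, 12), "Sam": (12, 16), "Riley": (16, 20)}

def check_schedule(schedule):
    """Return a list of violated constraints (empty means the schedule is valid)."""
    problems = []
    for name, (start, end) in schedule.items():
        if end - start != 4:
            problems.append(f"{name}: shift is not 4 hours")
        # Assumed reading: anyone other than Alex or Riley may not start before noon.
        if name not in ("Alex", "Riley") and start < 12:
            problems.append(f"{name}: works before noon")
    if schedule["Sam"][1] > 16:
        problems.append("Sam: works past 4 PM")
    return problems

print(check_schedule(schedule) or "Schedule is valid")
```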


In [None]:
# -------------------
# TRY YOUR OWN
# -------------------

# Write your own prompt here
prompt = ""  

# Change these if you want to experiment
reasoning_level = "medium"
developer_message = "You are a helpful assistant."
response = o4mini_chat(prompt, reasoning_level, developer_message)
print(response.choices[0].message.content)
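
Because `o4mini_chat` returns `None` when the request fails, the final `print` raises an `AttributeError` in that case. A minimal sketch of a guard follows. `response_text` is a hypothetical helper, and the objects below are stand-ins for illustration, not real API responses.

```python
from types import SimpleNamespace

def response_text(response, default="[no response]"):
    """Return the first choice's content, or `default` if the call failed."""
    if response is None or not response.choices:
        return default
    return response.choices[0].message.content or default

# Stand-in objects for illustration only (not real API responses).
ok = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content="hello"))])
print(response_text(ok))
print(response_text(None))
```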