# Advanced Usage

**IMPORTANT** - Run the getting started notebook first and make sure your environment is set up!

- Use [this table](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python-secure%2Cpy#api--feature-support) for the latest information on supported features. 
- Visit the [Reasoning Models](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python%2Cpy) documentation page for API details and code snippets. 

In this notebook, we'll explore capabilities like structured outputs, developer messages, reasoning effort, and vision support.

<br/>

| Characteristic | o4-mini | o1 |
|:--- |:---|:---|
| Developer Messages    | ✅ | ✅ |
| Structured Outputs    | ✅ | ✅ |
| Context Window Input  | 200K | 100K |
| Context Window Output | 200K | 100K |
| Reasoning Effort      | ✅ | ✅ |
| Vision Support        | ✅ | ✅ |
| Chat Completions API  | ✅ | ✅ |
| Responses API         | ✅ |    |
| Functions / Tools     | ✅ | ✅ |
| max_completion_tokens | ✅ |    |
| System messages       | ✅ | ✅ |
| Reasoning summary     | ✅ | |
| Streaming             | ✅ | |
| Model Card | [o4-mini](https://ai.azure.com/explore/models/o4-mini/version/2025-04-16/registry/azure-openai) | [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai)  |
| api_version | 2025-04-01-preview | 2025-03-01-preview |

---

## 1. Developer Messages

Functionally, developer messages (`"role": "developer"`) are the same as system messages.
However, they are the recommended way to set reasoning goals, persona, and context.
Don't specify both system and developer messages in the same request; that will confuse the model.
[Learn more](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python-secure%2Cpy#developer-messages)
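
Because a request should carry only one of the two roles, older prompts that use `system` can be normalized before they are sent. The sketch below is an illustration, not part of the Azure OpenAI SDK: `to_developer_messages` is a hypothetical helper that renames `system` entries to `developer` and rejects lists that mix both roles.

```python
def to_developer_messages(messages):
    """Return a copy of `messages` with every "system" role renamed to "developer".

    Hypothetical helper for illustration. Raises ValueError if the input
    mixes both roles, since one request should not carry both.
    """
    roles = {m["role"] for m in messages}
    if {"system", "developer"} <= roles:
        raise ValueError("Use either system or developer messages, not both.")
    return [
        {**m, "role": "developer"} if m["role"] == "system" else dict(m)
        for m in messages
    ]


legacy = [
    {"role": "system", "content": "You are a developer assistant."},
    {"role": "user", "content": "Sum the even numbers in a list."},
]
print(to_developer_messages(legacy))
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create(...)`; the original list is left unchanged.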



In [1]:
import os
from openai import AzureOpenAI

developer_message = """
You are a developer assistant. 
Your task is to help me with Python code.
You will receive a prompt that describes the code I want to write.
You return well documented code that is easy to understand.
"""

prompt = """
Write a Python function that takes a list of integers and returns the sum of the even numbers in the list.
"""

reasoning_level = "low"  # Options: low, medium, high


# Set up the Azure OpenAI client
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),  
    api_version="2025-04-01-preview"
)

response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "developer", "content": developer_message},
        {"role": "user", "content": prompt},
    ],
    max_completion_tokens=5000,
    reasoning_effort=reasoning_level
)

# Print the response
print(response.choices[0].message.content)

Here’s a simple, well-documented Python function that sums the even numbers in a list of integers:

```python
def sum_of_evens(numbers):
    """
    Calculate the sum of all even integers in the provided list.

    Parameters:
    numbers (list of int): A list of integer values.

    Returns:
    int: The sum of all even numbers in the list. If there are no evens,
         returns 0.
    """
    total = 0
    for num in numbers:
        # Check if the number is even
        if num % 2 == 0:
            total += num
    return total


# Example usage:
if __name__ == "__main__":
    sample_list = [1, 2, 3, 4, 5, 6]
    print(f"Sum of evens in {sample_list} is {sum_of_evens(sample_list)}")
    # Output: Sum of evens in [1, 2, 3, 4, 5, 6] is 12
```

Explanation:
1. We define `sum_of_evens(numbers)` accepting a list of integers.
2. We initialize a running total `total = 0`.
3. For each integer `num` in the list:
   - We check if it’s even (`num % 2 == 0`).
   - If it is, we add it to `total`.

---

## 2. Structured Outputs

Structured outputs make a model follow a JSON Schema definition that you provide as part of your inference API call. This is in contrast to the older JSON mode feature, which guaranteed valid JSON would be generated, but was unable to ensure strict adherence to the supplied schema. Structured outputs are recommended for function calling, extracting structured data, and building complex multi-step workflows. [Learn more](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/structured-outputs?tabs=python%2Cdotnet-keys&pivots=programming-language-python)
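
To make the contrast with JSON mode concrete, here is a sketch of a strict schema written by hand instead of generated from a Pydantic model. The `response_format` shape (`"type": "json_schema"` with `"strict": True`) follows the structured outputs documentation linked above. The names `calendar_event_format` and `conforms` are hypothetical, and `conforms` only checks top-level keys locally, which is far less than the service enforces.

```python
import json

# Hand-written schema for a calendar event (illustrative).
calendar_event_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "CalendarEvent",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "date": {"type": "string"},
                "participants": {"type": "array", "items": {"type": "string"}},
            },
            # Strict mode expects every property to be listed as required
            # and additional properties to be disallowed.
            "required": ["name", "date", "participants"],
            "additionalProperties": False,
        },
    },
}


def conforms(payload: str, response_format: dict) -> bool:
    """Local sanity check: the JSON object has exactly the required keys."""
    schema = response_format["json_schema"]["schema"]
    data = json.loads(payload)
    return isinstance(data, dict) and set(data) == set(schema["required"])


reply = '{"name": "science fair", "date": "Friday", "participants": ["Alice", "Bob"]}'
print(conforms(reply, calendar_event_format))
print(conforms('{"name": "science fair"}', calendar_event_format))
```

With JSON mode, the second payload would still count as valid JSON; a strict schema is what rules it out.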

In [2]:
import os
from pydantic import BaseModel
from openai import AzureOpenAI

# Set up the Azure OpenAI client
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),  
    api_version="2025-04-01-preview"
)

# Define a Pydantic model to represent the event information
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

# Request a completion and parse the response into a CalendarEvent
completion = client.beta.chat.completions.parse(
    model="o4-mini", 
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

# Get the parsed event information
event = completion.choices[0].message.parsed
print(event)
#print(completion.model_dump_json(indent=2))

name='science fair' date='Friday' participants=['Alice', 'Bob']


---

## 3. Function Calling 

- Structured outputs for function calling can be enabled with a single parameter: `strict: true`.
- Structured outputs are not supported with parallel function calls. When using structured outputs, set `parallel_tool_calls` to `false`.
- [Learn more](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/structured-outputs?tabs=python%2Cdotnet-entra-id&pivots=programming-language-python#function-calling-with-structured-outputs)
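
For reference, the two settings named above look roughly like this when the tool is written as a plain dictionary. This is a sketch: `delivery_tool`, `get_delivery_date`, and `request_kwargs` are illustrative names, and whether a given model accepts `parallel_tool_calls` is something to confirm in the feature table linked at the top of this notebook.

```python
import json

# A function tool with strict schema adherence switched on.
delivery_tool = {
    "type": "function",
    "function": {
        "name": "get_delivery_date",
        "description": "Look up the delivery date for a customer order.",
        "strict": True,
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
            "additionalProperties": False,
        },
    },
}

# Keyword arguments that could be passed to client.chat.completions.create(...),
# together with a `messages` list.
request_kwargs = {
    "model": "o4-mini",
    "tools": [delivery_tool],
    "parallel_tool_calls": False,  # per the note above on structured outputs
}

print(json.dumps(request_kwargs, indent=2))
```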

In [3]:
import os

import openai
from openai import AzureOpenAI
from pydantic import BaseModel

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),  
    api_version="2025-04-01-preview"
)

class GetDeliveryDate(BaseModel):
    order_id: str

tools = [openai.pydantic_function_tool(GetDeliveryDate)]

messages = []
messages.append({"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."})
messages.append({"role": "user", "content": "Hi, can you tell me the delivery date for my order #12345?"}) 

response = client.chat.completions.create(
    model="o4-mini", 
    messages=messages,
    tools=tools
)

print(response.choices[0].message.tool_calls[0].function)
#print(response.model_dump_json(indent=2))

Function(arguments='{"order_id":"12345"}', name='GetDeliveryDate')
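
The model only returns a function name and a JSON string of arguments; running the function is up to your code. Below is a minimal sketch of that step. It assumes a hypothetical local `get_delivery_date` implementation with made-up data, and a hypothetical `run_tool_call` dispatcher that wraps the result as a `tool` message.

```python
import json


def get_delivery_date(order_id: str) -> str:
    """Stand-in for a real order lookup (hypothetical data)."""
    fake_orders = {"12345": "2025-05-02"}
    return fake_orders.get(order_id, "unknown")


# Map tool names, as the model reports them, to local callables.
TOOL_REGISTRY = {"GetDeliveryDate": get_delivery_date}


def run_tool_call(name: str, arguments: str, tool_call_id: str) -> dict:
    """Execute one tool call and wrap the result as a `tool` message."""
    kwargs = json.loads(arguments)  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[name](**kwargs)
    return {"role": "tool", "tool_call_id": tool_call_id, "content": result}


# Name and arguments copied from the printed Function(...) above;
# the id is a placeholder for the real tool call id.
message = run_tool_call("GetDeliveryDate", '{"order_id":"12345"}', "call_0")
print(message)
```

Appending the assistant message that contains the tool call, followed by this `tool` message, to `messages` and calling the API again lets the model phrase the final answer for the user.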


---

## 4.a. Vision - Identify
Vision-enabled chat models are large multimodal models (LMMs) developed by OpenAI that can analyze images and provide textual responses to questions about them. They incorporate both natural language processing and visual understanding. [Learn more](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/gpt-with-vision?tabs=rest)

Let's try an image from [the Library of Congress](https://www.loc.gov/free-to-use/lighthouses/) and see if it can reason about the location.

![Lighthouse](./assets/lighthouses-2.jpg)

<!-- <img src="https://www.loc.gov/static/portals/free-to-use/public-domain/lighthouses/lighthouses-2.jpg" alt="NYPL" width="20%"> -->


In [4]:

from openai import AzureOpenAI
import os

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),  
    api_version="2025-04-01-preview"
)

response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": [  
            { 
                "type": "text", 
                "text": "Can you recognize the location? Tell me about it." 
            },
            { 
                "type": "image_url",
                "image_url": {
                    "url": "https://www.loc.gov/static/portals/free-to-use/public-domain/lighthouses/lighthouses-2.jpg"
                }
            }
        ] } 
    ],
    max_completion_tokens=2000 
)
print(response.choices[0].message.content)

This is the famous Cape Hatteras Lighthouse on Hatteras Island in North Carolina’s Outer Banks. Key facts:

• Appearance: Its 198-ft tower is painted in the distinctive black-and-white spiral “barber-pole” pattern.  
• History: The original brick lighthouse on that site dated to 1803. The current tower was completed in 1870, and the spiral day-mark was added in 1873 to distinguish it from neighboring lights.  
• Location: It stands within Cape Hatteras National Seashore near the village of Buxton, guarding one of the most treacherous stretches of the Atlantic coast known as the “Graveyard of the Atlantic.”  
• Engineering Feat: In 1999, due to severe shoreline erosion, the entire lighthouse—masonry and all—was moved about 2,900 feet inland to protect it from encroaching seas. It remains the tallest brick lighthouse in the United States.  
• Today: It is an active Coast Guard aid to navigation, a National Historic Landmark, and a popular visitor attraction for its panoramic seaside view

## 4.b. Vision - Decipher

Let's try an image from [the Library of Congress](https://www.loc.gov/static/portals/free-to-use/public-domain/presidential-papers) and see if it can recognize handwriting and reason about it.

![Decipher](./assets/decipher.jpg)

<!-- <img src="https://www.loc.gov/static/portals/free-to-use/public-domain/presidential-papers/2.jpg" alt="NYPL" width="20%"> -->


In [5]:
from openai import AzureOpenAI
import os

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), 
    api_key=os.getenv("AZURE_OPENAI_KEY"),  
    api_version="2025-04-01-preview"
)

response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": [  
            { 
                "type": "text", 
                "text": "Why is this famous? Summarize it in a short list." 
            },
            { 
                "type": "image_url",
                "image_url": {
                    "url": "https://www.loc.gov/static/portals/free-to-use/public-domain/presidential-papers/2.jpg"
                }
            }
        ] } 
    ],
    max_completion_tokens=2000 
)
print(response.choices[0].message.content)

Here’s why Lincoln’s handwritten Gettysburg Address is so famous, in brief:

• Historic moment – delivered at the 1863 dedication of Gettysburg’s Soldiers’ National Cemetery, amid the Civil War.  
• Extraordinary brevity – just about 270 words, yet perfectly structured and memorable.  
• Powerful principles – invokes “all men are created equal,” redefining the war as a fight for liberty and democracy.  
• Rhetorical mastery – its plainspoken, rhythmic style set a new standard for public oratory.  
• Enduring legacy – continuously memorized, taught, quoted and celebrated as a centerpiece of American identity.
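
Both vision cells above send the same message shape and differ only in the prompt text and image URL, so the user message can be built by a small helper. This is a sketch; `image_question` is a hypothetical name, and the `detail` default of `"auto"` is an assumption based on the detail parameter settings described in the vision documentation.

```python
def image_question(text: str, image_url: str, detail: str = "auto") -> dict:
    """Build a user message that pairs a text prompt with one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }


lighthouse_url = (
    "https://www.loc.gov/static/portals/free-to-use/"
    "public-domain/lighthouses/lighthouses-2.jpg"
)
vision_message = image_question(
    "Can you recognize the location? Tell me about it.", lighthouse_url
)
print(vision_message["content"][0]["text"])
```

The result can be placed in the `messages` list after the system or developer message.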


---

## 5. Vision - Local Image


If you want to use a local image, you can use the following Python code to convert it to base64 so it can be passed to the API. Alternative file conversion tools are available online. Learn more about this and [other settings](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/gpt-with-vision?tabs=python#detail-parameter-settings). Let's try an example with this local image.

<img src="./assets/lavacake.webp" alt="Lava Cake" width="50%">

In [6]:
import base64

# Transforming local image to base64
image_path = './assets/lavacake.webp'
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')
# The MIME type in the data URI should match the file format (WebP here)
image_uri = f"data:image/webp;base64,{base64_image}"

# Make a request to the o4-mini model with the image
response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": [  
            {
                "type": "text",
                "text": "How many calories are in this?"
            },
            { 
                "type": "image_url",
                "image_url": {
                    "url": image_uri,
                    "detail" : "high"
                }       
            }
        ] } 
    ],
    max_completion_tokens=2000 
)
print(response.choices[0].message.content)

What you’ve got there looks like a chocolate molten (lava) cake with a berry compote, a scoop of vanilla ice cream, and a few fresh strawberries. Roughly speaking:  
• Molten chocolate cake (≈100 g): 350–400 kcal  
• Vanilla ice cream (≈½ cup or one scoop): 140–180 kcal  
• Berry sauce (≈2 Tbsp) + strawberries: 30–50 kcal  

All told, you’re in the ballpark of 520–630 kcal. Exact numbers will vary with the recipe and serving size, but that range should give you a reasonable estimate.
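
When the image format varies, the MIME type in the data URI can be derived from the file extension instead of being written by hand. A minimal sketch, assuming a hypothetical `image_to_data_uri` helper and a small hand-maintained extension map; the demo writes a throwaway file so it runs without the notebook assets.

```python
import base64
import tempfile
from pathlib import Path

# Small, explicit extension-to-MIME map (extend as needed).
MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_to_data_uri(path) -> str:
    """Read an image file and return it as a base64 data URI."""
    path = Path(path)
    mime = MIME_TYPES.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"Unsupported image type: {path.suffix}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# Demo with placeholder bytes; a real call would point at an actual image.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.webp"
    demo.write_bytes(b"not a real image")
    demo_uri = image_to_data_uri(demo)

print(demo_uri.split(",", 1)[0])
```

For the lava cake example, `image_to_data_uri('./assets/lavacake.webp')` would produce the value to pass as the `url` field.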


---

## 6. Your Turn - Try It!

Copy one of the above examples and modify it to see how the model responds to different prompts or images.
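
One easy experiment is to send the same prompt at each reasoning effort level and compare the answers. The sketch below only assembles the request arguments and makes no API call. `build_request` and `REASONING_LEVELS` are hypothetical names; the levels are the ones listed in the first example (low, medium, high).

```python
REASONING_LEVELS = ("low", "medium", "high")


def build_request(developer_message, prompt, reasoning_level="low",
                  max_completion_tokens=5000):
    """Assemble keyword arguments for client.chat.completions.create(...)."""
    if reasoning_level not in REASONING_LEVELS:
        raise ValueError(f"reasoning_level must be one of {REASONING_LEVELS}")
    return {
        "model": "o4-mini",
        "messages": [
            {"role": "developer", "content": developer_message},
            {"role": "user", "content": prompt},
        ],
        "max_completion_tokens": max_completion_tokens,
        "reasoning_effort": reasoning_level,
    }


# One request per level, identical apart from reasoning_effort.
request_variants = [
    build_request("You are a concise tutor.", "Explain recursion.", level)
    for level in REASONING_LEVELS
]
for kwargs in request_variants:
    print(kwargs["reasoning_effort"], len(kwargs["messages"]))
```

Each dictionary can then be sent with `client.chat.completions.create(**kwargs)` using the client created earlier in this notebook.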

In [None]:
# Write your developer message
# e.g., "You are a helpful assistant that uses visual information to solve problems"
developer_message = """

"""

# Write your prompt 
# e.g., Copy over the content section from a previous prompt, then change the image URL
prompt = """

"""

# Write your reasoning level
reasoning_level = "low"  

# ---------- Run the cell to see result ------
response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "developer", "content": developer_message},
        {"role": "user", "content": prompt},
    ],
    max_completion_tokens=5000,
    reasoning_effort=reasoning_level
)

# Print the response
print(response.choices[0].message.content)
# --------------------------------------------

---