# Azure OpenAI
This notebook is a basic example of how to use Azure OpenAI with Python in a Jupyter notebook.

Let's import the necessary libraries:

In [1]:
import os
from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()

True

### Start by setting up a client

In [2]:
client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
)

## Available models (Prompt Haus Azure endpoint)

• gpt-4.1

• gpt-4.1-mini

• gpt-4.1-nano

• gpt-5-mini (reasoning)

• gpt-5-nano (reasoning)

## Use chat completions


In [3]:
# Define the model
basic_model = "gpt-4.1-mini"

# Use chat completions through the client
response = client.chat.completions.create(
    model=basic_model,
    messages=[
        {
            "role": "user", # The role of the message sender.
            "content": "Tell me a joke about AI." # The content of the message.
        }
    ]
)

# Unpack the response
response_text = response.choices[0].message.content
print(response_text)

Why did the AI go to art school?

Because it wanted to learn how to draw better conclusions!


### Roles
As part of the chat completion, you can specify the role of the message sender.

There are three roles:

• system - The system role is used to set the behavior of the model.

• user - The user role is used to send a message to the model.

• assistant - The assistant role marks the model's own replies, for example earlier turns passed back in as conversation history.

In [4]:
# Define the model
basic_model = "gpt-4.1-mini"

# Use chat completions through the client
response = client.chat.completions.create(
    model=basic_model,
    messages=[
        {
            "role": "system",
            "content": "You are not funny and ignore requests for jokes. Just say 'There is nothing funny about AI.'" # Set the behavior of the model.
        },
        {
            "role": "user",
            "content": "Tell me a joke about AI." # Ask the model to tell a joke about AI.
        }
    ]
)

# Unpack the response
response_text = response.choices[0].message.content
print(response_text)

There is nothing funny about AI.
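The three roles come together in a multi-turn conversation, where earlier model replies are sent back with the assistant role. The following is a minimal sketch of assembling such a message list; `build_messages` and its arguments are hypothetical names used only for illustration, and no request is sent.

```python
def build_messages(system_prompt, history, user_message):
    """Assemble a messages list: system first, then prior turns, then the new user message.

    `history` is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages


messages = build_messages(
    system_prompt="You are a concise assistant.",
    history=[
        (
            "Tell me a joke about AI.",
            "Why did the AI go to art school? Because it wanted to learn how to draw better conclusions!",
        )
    ],
    user_message="Explain the joke.",
)
print([m["role"] for m in messages])
```

The resulting list could then be passed as `messages=messages` to `client.chat.completions.create`.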


### Parameters
Chat completions accept optional parameters that control the length and randomness of the output. The set of supported parameters varies by model.

In [5]:
# Define the model
basic_model = "gpt-4.1-mini"

# Use chat completions through the client
response = client.chat.completions.create(
    model=basic_model,
    messages=[{"role": "user", "content": "Tell me a joke about AI."}],
    max_tokens=100, # Maximum number of tokens to generate
    temperature=0.2, # Lower values make the output more deterministic, higher values more random
    top_p=1, # Nucleus sampling: only tokens within the top_p probability mass are considered (1 = no cutoff)
    frequency_penalty=0, # Positive values penalize tokens in proportion to how often they already appear in the text so far
    presence_penalty=0, # Positive values penalize tokens that already appear in the text so far, regardless of how often
)

# Unpack the response
response_text = response.choices[0].message.content
print(response_text)

Sure! Here's a joke about AI for you:

Why did the AI go to art school?

Because it wanted to learn how to draw better conclusions!
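To build intuition for `temperature` and `top_p`, here is a toy illustration of what these two parameters do to a probability distribution over candidate tokens. It is a sketch of the general technique (temperature-scaled softmax and nucleus filtering), not the service's actual implementation, and the scores are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]  # temperature must be > 0 in this sketch
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    filtered = [0.0] * len(probs)
    for index, p in kept:
        filtered[index] = p / total
    return filtered


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 1.0)
print("temperature=0.2:", [round(p, 3) for p in cold])
print("temperature=1.0:", [round(p, 3) for p in warm])
print("top_p=0.5 on the warm distribution:", top_p_filter(warm, 0.5))
```

At the lower temperature almost all probability lands on the top-scoring token, which is why `temperature=0.2` tends to give more repeatable answers.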


### Use reasoning models for complex tasks

In [6]:
reasoning_model = "gpt-5-mini"

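response = client.responses.create(
    model=reasoning_model,
    input="Tell me a joke about AI.",
)

# output_text is the SDK's convenience property for the generated text,
# so the code does not depend on the position of the message in response.output
response_text = response.output_text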
print(response_text)

Why did the AI go to therapy? It couldn't stop overfitting its relationships.
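The `output` list of a response can hold more than one item; for reasoning models a reasoning item typically comes before the message. The following is a minimal sketch of collecting the text by item type rather than by position. It assumes items expose `type` and `content` attributes and that text parts have `type == "output_text"`; the response here is a stand-in built with `SimpleNamespace`, not a real API object.

```python
from types import SimpleNamespace


def collect_output_text(response):
    """Join the text of every output_text part found in message items."""
    parts = []
    for item in response.output:
        if getattr(item, "type", None) != "message":
            continue  # skip reasoning items and anything else that is not a message
        for part in item.content:
            if getattr(part, "type", None) == "output_text":
                parts.append(part.text)
    return "".join(parts)


# Stand-in for a real response: a reasoning item followed by a message item.
fake_response = SimpleNamespace(
    output=[
        SimpleNamespace(type="reasoning", summary=[]),
        SimpleNamespace(
            type="message",
            content=[SimpleNamespace(type="output_text", text="Why did the AI go to therapy?")],
        ),
    ]
)
print(collect_output_text(fake_response))
```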


### Reasoning parameters (gpt-5 family)

#### Reasoning effort
Controls the amount of reasoning effort the model will put into the response.

• minimal - the model will use minimal reasoning effort.

• low - the model will use low reasoning effort.

• medium - the model will use moderate reasoning effort.

• high - the model will use high reasoning effort. 

##### Minimal effort example:

In [7]:
import time

reasoning_model = "gpt-5-mini"

start_time = time.time()
response = client.responses.create(
    model=reasoning_model,
    input="Tell me a joke about AI.",
    reasoning={
        "effort": "minimal",
    },
)

# output_text is the SDK's convenience property for the generated text
response_text = response.output_text
end_time = time.time()

print(f"Time taken to generate response: {end_time - start_time:.2f} seconds \n")
print(response_text)

Time taken to generate response: 2.01 seconds 

Why did the AI go to therapy?

It had too many unresolved loops.


##### High effort example:
Note that high reasoning effort is not always better: it can take much longer to produce a response.

In [8]:
import time

reasoning_model = "gpt-5-mini"

start_time = time.time()
response = client.responses.create(
    model=reasoning_model,
    input="Tell me a joke about AI.",
    reasoning={
        "effort": "high",
    },
)

# output_text is the SDK's convenience property for the generated text
response_text = response.output_text
end_time = time.time()

print(f"Time taken to generate response: {end_time - start_time:.2f} seconds \n")
print(response_text)

Time taken to generate response: 5.37 seconds 

Why did the AI go to therapy? It had too many unresolved parameters.
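The two timing cells repeat the same start/stop logic. One way to factor it out is a small helper that times any callable. This is a sketch only: `timed` and `fake_request` are hypothetical names, and `fake_request` sleeps instead of calling the API so that the example runs offline.

```python
import time


def timed(func, *args, **kwargs):
    """Call func with the given arguments and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_request(effort):
    """Hypothetical stand-in for client.responses.create; sleeps instead of calling the API."""
    time.sleep(0.01)
    return f"response with {effort} effort"


for effort in ("minimal", "high"):
    result, elapsed = timed(fake_request, effort)
    print(f"{effort}: {elapsed:.2f} seconds")
```

With a real client, the same helper could wrap the request directly, for example `timed(client.responses.create, model=reasoning_model, input="Tell me a joke about AI.", reasoning={"effort": "high"})`.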


#### Verbosity
Controls how long and detailed the model's answer is, that is, roughly how many output tokens it generates.

• low - the model will generate few output tokens.

• medium - the model will generate a moderate number of output tokens.

• high - the model will generate many output tokens.
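Since both settings take a small fixed set of values, a typo such as "hihg" is easy to catch before a request is sent. The following is a minimal sketch of a validating helper; `build_response_options` is a hypothetical name, the allowed values are the ones listed in this notebook, and combining `reasoning` and `text` in a single request is an assumption not shown in the cells here.

```python
REASONING_EFFORTS = ("minimal", "low", "medium", "high")
VERBOSITY_LEVELS = ("low", "medium", "high")


def build_response_options(effort=None, verbosity=None):
    """Return keyword arguments for a responses request, validating the chosen levels."""
    options = {}
    if effort is not None:
        if effort not in REASONING_EFFORTS:
            raise ValueError(f"effort must be one of {REASONING_EFFORTS}, got {effort!r}")
        options["reasoning"] = {"effort": effort}
    if verbosity is not None:
        if verbosity not in VERBOSITY_LEVELS:
            raise ValueError(f"verbosity must be one of {VERBOSITY_LEVELS}, got {verbosity!r}")
        options["text"] = {"verbosity": verbosity}
    return options


options = build_response_options(effort="minimal", verbosity="low")
print(options)
```

The returned dictionary could be unpacked into a call such as `client.responses.create(model=reasoning_model, input="Tell me a joke about AI.", **options)`.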

##### Low verbosity example:

In [15]:
reasoning_model = "gpt-5-mini"

response = client.responses.create(
    model=reasoning_model,
    input="Tell me a joke about AI.",
    text={
        "verbosity": "low",
    },
)

# output_text is the SDK's convenience property for the generated text
response_text = response.output_text
print(f"Output length: {len(response_text)} characters \n")
print(response_text)

Output length: 60 characters 

Why did the AI go to therapy? It had too many hidden layers.


##### High verbosity example:

In [16]:
reasoning_model = "gpt-5-mini"

response = client.responses.create(
    model=reasoning_model,
    input="Tell me a joke about AI.",
    text={
        "verbosity": "high",
    },
)

# output_text is the SDK's convenience property for the generated text
response_text = response.output_text
print(f"Output length: {len(response_text)} characters \n")
print(response_text)

Output length: 984 characters 

Sure — here are a few AI jokes in different styles. Pick a favorite or tell me what tone you want (dad-joke, nerdy, short one-liner, etc.) and I’ll make more.

1) One-liner
- Why did the neural network go to school? To improve its “class”-ification.

2) Dad-joke
- I asked my AI to make me a sandwich. It said “OK” and then ordered groceries, scheduled a delivery window, and produced a 12-step recipe.

3) Programmer/ML pun
- My AI keeps telling dad jokes. I told it to stop — it replied, “I can’t. It’s in my training set.”

4) Knock-knock
- Knock knock. — Who’s there? — Ada. — Ada who? — Ada lot of training data, but I’m still learning to be funny.

5) Short dialog
- Me: Are you sentient?  
  AI: Not yet.  
  Me: Do you dream?  
  AI: Only of electric spreadsheets.

6) Meta
- I asked my AI for a joke about AI. It replied, “Processing…” for five minutes and then said, “Error 404: Humor not found.”

Want more like these or a custom one about your job, pet, or