In [1]:
#%pip install python-dotenv
#%pip install openai

## First OpenAI API Call

We first show the most basic way to make an API call to OpenAI's GPT model. We'll use a restaurant analogy to explain how API calls work:

- **You (Customer)**: The person making the request
- **OpenAI API (Waiter)**: The messenger that takes your request to the AI
- **GPT Model (Chef)**: The system that processes your request and creates the response

In this example, we'll:
1. Set up the connection to OpenAI (like finding a restaurant)
2. Create a simple prompt (like placing an order)
3. Get the response (like receiving your meal)

This is the simplest form of an API call - no streaming, no complex parameters, just a basic request and response.

Before making your first API call, follow these steps:

1. **Install the OpenAI Python package:**
   ```bash
   pip install openai
   ```
2. **Store your API key securely:**
   - Create a file named `.env` in your project directory.
   - Add your API key to this file like so:
     ```
     OPENAI_API_KEY=sk-...your-key-here...
     ```
3. **Load your API key in Python:**
   - Install the `python-dotenv` package if you haven't already:
     ```bash
     pip install python-dotenv
     ```
   - In your Python code, load the `.env` file and access the API key:
     ```python
     from dotenv import load_dotenv
     import os
     load_dotenv()
     api_key = os.getenv("OPENAI_API_KEY")
     ```

Now you're ready to use the OpenAI API securely in your code!
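
Before moving on, it can help to confirm that the key actually loaded without printing it in full. The following is a minimal sketch, assuming the `.env` setup above; `mask_key` is a hypothetical helper written for this illustration, not part of the `openai` or `python-dotenv` packages.

```python
import os


def mask_key(key, visible=4):
    """Return a display-safe version of an API key (hypothetical helper)."""
    if not key:
        return "<missing>"
    if len(key) <= visible:
        return "*" * len(key)
    # Keep only the last few characters so the key can be recognized, not reused
    return "*" * (len(key) - visible) + key[-visible:]


# After load_dotenv() has run, the key is available as an environment variable
print("OPENAI_API_KEY:", mask_key(os.getenv("OPENAI_API_KEY")))
```

If this prints `<missing>`, the `.env` file was probably not found or the variable name does not match.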

In [1]:
# Waiter: This is the OpenAI API. You talk to it using the 'openai' Python package.
from openai import OpenAI
import os
from dotenv import load_dotenv

load_dotenv()

# Create the client; the API key is read from the OPENAI_API_KEY environment variable
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

In [3]:
# Customer: This is YOU (or your app). You decide what to ask.
prompt = "Explain photosynthesis in simple terms."

# Chef: This is the AI model (like GPT-4). It prepares the response based on your request.
response = client.responses.create(
    model="gpt-4.1",
    input=prompt
)

# The response is delivered back to the customer (you)
result = response.output_text
print("Response from the AI (Chef):")
print(result)


Response from the AI (Chef):
Photosynthesis is the process plants use to make their own food. Here’s how it works in simple terms:

1. **Plants take in sunlight** through their leaves.
2. **They absorb water** from the soil through their roots.
3. **They take in carbon dioxide** from the air.

Using sunlight as energy, plants combine the water and carbon dioxide to make **sugar** (their food) and release **oxygen** back into the air.

So, photosynthesis is how plants turn sunlight, water, and carbon dioxide into food and oxygen!


## Looping — “Ordering Again and Again”

### In the Restaurant Metaphor:

Imagine you're really hungry and want to **order multiple dishes**, one after another:

* First: you ask for spaghetti.
* Then: you ask for a drink.
* Then: dessert.

That’s **looping** — doing something **over and over again**, usually **with slight changes**.

### In Programming/API Terms:

Looping is when your program:

* Sends **multiple API requests** in a row.
* Often in a **`for` loop** or a **`while` loop**.
* Each request might ask a different question or use different data.

#### Why It’s Useful:

* Process a list of texts automatically (e.g., summarizing 100 articles).
* Translate a batch of messages.
* Chat with the model in turns.


In [4]:
# Basic Python loop — no API yet
questions = [
    "What is 1 + 1?",
    "What is the opposite of up?",
    "What is the capital of France?"
]

for q in questions:
    print("Pretend we're asking the AI...")
    print("Question:", q, "\n")

Pretend we're asking the AI...
Question: What is 1 + 1? 

Pretend we're asking the AI...
Question: What is the opposite of up? 

Pretend we're asking the AI...
Question: What is the capital of France? 



In [5]:
# Assume client is already set up with your OpenAI API key
questions = [
    "What is 1 + 1?",
    "What is the opposite of up?",
    "What is the capital of France?"
]

for question in questions:
    print(f"\nCustomer asks: {question}")

    # Send question to the OpenAI model (Chef prepares dish)
    response = client.responses.create(
        model="gpt-4.1",
        input=question
    )

    # Get the model's answer (Waiter brings it back)
    result = response.output_text
    print("AI (Chef) replies:")
    print(result)


Customer asks: What is 1 + 1?
AI (Chef) replies:
1 + 1 = **2**.

Customer asks: What is the opposite of up?
AI (Chef) replies:
The opposite of **up** is **down**.

Customer asks: What is the capital of France?
AI (Chef) replies:
The capital of France is **Paris**.


**[WARNING]:** Keep in mind that the API does not 'remember' your previous question. Each call is independent unless you send the earlier context along with it.

In [6]:
# Demonstrating that the API does NOT have memory between calls

# First call: share a fact
response1 = client.responses.create(
    model="gpt-4.1",
    input="My favorite color is blue."
)
print("First response:")
print(response1.output_text)

# Second call: refer to the previous message, but without context
response2 = client.responses.create(
    model="gpt-4.1",
    input="What is my favorite color?"
)
print("\nSecond response (no memory):")
print(response2.output_text)

First response:
That's awesome! Blue is a calming and serene color, often associated with the sky and the ocean. Do you like a specific shade of blue, like navy, royal, or turquoise? Or do you just love blue in general?

Second response (no memory):
I don't have access to your personal preferences or history unless you tell me! If you'd like to share your favorite color, I’d love to hear it and can remember it for our conversation. What is your favorite color?
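
One common way to give the model context is to send the earlier turns along with the new question. The following is a minimal sketch, assuming the Responses API accepts a list of role/content messages as `input`; `ask_with_history` is a hypothetical helper written for this illustration, and `client` is the one created earlier in the notebook.

```python
def ask_with_history(client, history, question, model="gpt-4.1"):
    """Send the whole conversation so far plus the new question (hypothetical helper)."""
    history.append({"role": "user", "content": question})
    response = client.responses.create(model=model, input=history)
    # Store the reply so the next call can see it too
    history.append({"role": "assistant", "content": response.output_text})
    return response.output_text


# Usage (requires a configured client, so it is left commented out here):
# history = []
# ask_with_history(client, history, "My favorite color is blue.")
# print(ask_with_history(client, history, "What is my favorite color?"))
```

Note that the history grows with every turn, so each call sends (and is billed for) more input than the previous one.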


## Endpoints — “Different Sections of the Menu”

### In the Restaurant Metaphor:

The menu has **sections**:

* Appetizers
* Main courses
* Desserts
  
Each has its own list of items.

These sections are like **endpoints** — **different areas** of the API that handle different **types of requests**.

### In API Terms:

An **endpoint** is a **URL** where you send your request.

For example, with the OpenAI API:

* `https://api.openai.com/v1/responses` → Generate text with a GPT model, as we just did.
* `.../embeddings` → Turn text into numbers (useful for search).
* `.../images/generations` → Generate images from text.
* `.../audio/speech` → Turn text into speech.
* `.../audio/transcriptions` → Transcribe audio into text.
* `.../audio/translations` → Translate audio into English text.

In the Python client, each endpoint is reached through a matching attribute, for example:
* response = client.**responses**.create()
* response = client.**audio.speech**.create()

Each one does **something different**, but they all follow the same rules of ordering.
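
To make the link between client calls and URLs concrete, here is a small sketch that maps each Python call to the endpoint listed above. The `ENDPOINTS` table and `endpoint_url` helper are hypothetical names written for this illustration; the client library builds these URLs for you.

```python
BASE_URL = "https://api.openai.com/v1"

# Python client call -> endpoint path (paths taken from the list above)
ENDPOINTS = {
    "client.responses.create": "/responses",
    "client.embeddings.create": "/embeddings",
    "client.images.generate": "/images/generations",
    "client.audio.speech.create": "/audio/speech",
    "client.audio.transcriptions.create": "/audio/transcriptions",
    "client.audio.translations.create": "/audio/translations",
}


def endpoint_url(call):
    """Return the full URL a given client call sends its request to."""
    return BASE_URL + ENDPOINTS[call]


for call in ENDPOINTS:
    print(f"{call}() -> {endpoint_url(call)}")
```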


In [7]:
# Using the speech endpoint

from pathlib import Path

voices = ["echo", "nova", "shimmer"]
input_text = "Today, we are testing the OpenAI API. At the moment, we are testing the audio API, during a workshop of Tilburg.ai"
#input_text = "Vandaag testen we de OpenAI API, tijdens een workshop van Tilburg.ai"

for voice in voices:
    speech_file_path = Path.cwd() / f"speech_{voice}.mp3"
    with client.audio.speech.with_streaming_response.create(
        model="gpt-4o-mini-tts",
        voice=voice,
        input=input_text
    ) as response:
        response.stream_to_file(speech_file_path)

In [8]:
# Using the transcription endpoint

# Open the file in a context manager so it is closed after the request
with open("speech_echo.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
        #language="nl",
        prompt="Every time you hear Tilburg AI, note it as Tilburg.ai"
    )

print(transcript.text)

Today we are testing the OpenAI API. At the moment, we are testing the audio API during a workshop of Tilburg.ai.


## What Are Embeddings?

Embeddings are a way to turn text (words, sentences, or even whole documents) into numbers so that computers can understand and work with them. Each piece of text is converted into a long list of numbers (called a vector) that captures its meaning and context.

- **Why are embeddings useful?**
  - They let computers compare how similar two pieces of text are (e.g., "cat" and "kitten" will have similar embeddings).
  - They are used for search, recommendations, clustering, and many other AI tasks.
  - Embeddings make it possible to do math with language, like finding analogies or grouping similar ideas together.

In the OpenAI API, you can use the embeddings endpoint to get these number representations for your text.
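
Comparing two embeddings usually comes down to measuring the angle between their vectors, known as cosine similarity. The following is a minimal sketch using made-up 3-dimensional vectors; real embeddings from the API have many more dimensions, and the numbers below are assumptions chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example (not real API output)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.2, 0.9]

print("cat vs kitten:", round(cosine_similarity(cat, kitten), 3))
print("cat vs car:   ", round(cosine_similarity(cat, car), 3))
```

The same function works on the lists returned by the embeddings endpoint, since those are also plain lists of floats.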

In [9]:
# Using the embeddings endpoint

response = client.embeddings.create(
  model="text-embedding-ada-002",
  input="The food was delicious and the waiter...",
  encoding_format="float"
)

print(response.data[0].embedding[:50])
print(f"The embedding is a list of {len(response.data[0].embedding)} floats")

[0.0022756963, -0.009305916, 0.015742613, -0.0077253063, -0.0047450014, 0.014917395, -0.009807394, -0.038264707, -0.0069127847, -0.028590616, 0.025251659, 0.018116701, -0.0036309576, -0.02554366, 0.00055543496, -0.016428178, 0.02828592, 0.0054083494, 0.009610611, -0.016415482, -0.015412526, 0.004272088, 0.0069953064, -0.007223828, -0.0039007403, 0.018573744, 0.008734611, -0.022699833, 0.011508612, 0.023893224, 0.015602961, -0.0035706533, -0.034963835, -0.0041514793, -0.026178442, -0.02150644, -0.0057066972, 0.011768873, 0.008455306, 0.004129262, 0.019157745, -0.014358787, 0.008982176, 0.0063605234, -0.04570436, 0.017900875, -0.005570219, -0.0007716578, -0.02215392, -0.0039229575]
The embedding is a list of 1536 floats


## Tokens — “How Much You’re Saying”

### In the Restaurant Metaphor:

Imagine you're paying **per word** of your order instead of per item.

Saying:

> “I want spaghetti.”

Costs fewer tokens than:

> “Hello kind waiter, I would like a steaming plate of your finest spaghetti, with extra parmesan on top, please.”

The **longer** or more **complex** your request, the **more tokens** it costs.

### In OpenAI Terms:

* **Tokens = Chunks of text**, usually a few characters long.
* “Hello” → 1 token.
* “Artificial intelligence is amazing!” → \~5 tokens.

#### Why Tokens Matter:

* **You pay per token** (for input *and* output).
* There’s a **limit per request** (e.g., 128,000 tokens, depending on the model).
* Efficient prompts = better performance and lower cost.
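
To get a feel for the difference between the two orders above, here is a rough sketch that estimates token counts from text length. It assumes the common rule of thumb of about 4 characters per token for English text; this is only an approximation, and `estimate_tokens` is a hypothetical helper. Exact counts come from the tokenizer or from `response.usage`.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate based on character count (assumed heuristic)."""
    return math.ceil(len(text) / chars_per_token)


short_order = "I want spaghetti."
long_order = (
    "Hello kind waiter, I would like a steaming plate of your finest "
    "spaghetti, with extra parmesan on top, please."
)

for order in (short_order, long_order):
    print(f"~{estimate_tokens(order)} tokens: {order}")
```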

In [10]:
import tiktoken

prompt = "Explain API calls in simple terms, using the customer - waiter - chef metaphor."

# Make the API call (Chef prepares the meal)
response = client.responses.create(
    model="gpt-4.1",
    input=prompt
)

output_text = response.output_text

# Show token usage
print("\nToken usage:")
print("Input tokens:", response.usage.input_tokens)
print("Output tokens:", response.usage.output_tokens)
print("Total tokens:", response.usage.total_tokens)

# Choose the encoding for your model ("cl100k_base" is used by GPT-4 and GPT-3.5-turbo;
# newer models such as GPT-4o use "o200k_base", so the split below is an approximation)
encoding = tiktoken.get_encoding("cl100k_base")

# Tokenize the output text
tokens = encoding.encode(output_text)

# To see the actual strings for a sample of those tokens (positions 2 through 9):
print("Sample token strings:", [encoding.decode([t]) for t in tokens[2:10]])


Token usage:
Input tokens: 23
Output tokens: 365
Total tokens: 388
Sample token strings: [' Let', '’s', ' use', ' the', ' **', 'customer', '-w', 'ait']


In [11]:
# Add cost calculation, $2.00 / 1M input tokens, $8.00 / 1M output tokens
cost_per_million_input_tokens = 2
cost_per_million_output_tokens = 8

total_cost = (response.usage.input_tokens / 1000000) * cost_per_million_input_tokens + \
             (response.usage.output_tokens / 1000000) * cost_per_million_output_tokens

print(f"\nTotal cost: ${total_cost:.6f}")


Total cost: $0.002966


## Azure OpenAI

Azure OpenAI is Microsoft's cloud service that provides access to OpenAI's models (like GPT-4) through Azure's infrastructure. Its main advantages are:

- **Data Privacy**: Your data stays within your Azure environment and isn't used to train models.
- **Security**: Integration with Azure's security features and compliance certifications.
- **Regional Availability**: You can deploy models in specific Azure regions to meet data residency requirements, for example by deploying in a European region.
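
Because the Azure client needs several settings rather than a single key, it can be worth checking that all of them are present before creating the client. The following is a minimal sketch; the variable names are the ones read by the Azure example in this notebook, and `missing_settings` is a hypothetical helper written for this illustration.

```python
import os

# Environment variable names as read by this notebook's Azure example
REQUIRED_AZURE_SETTINGS = [
    "AZURE_OPENAI_API_KEY",
    "API_VERSION",
    "AZURE_OPENAI_ENDPOINT",
    "MODEL_VERSION",
]


def missing_settings(env, required=REQUIRED_AZURE_SETTINGS):
    """Return the names of required settings that are absent or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All Azure OpenAI settings are present.")
```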

In [2]:
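import os

from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# On Azure, `model` is the name of your deployment, not the underlying model name
deployment_name = os.getenv("MODEL_VERSION")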
    
# For all possible arguments see https://platform.openai.com/docs/api-reference/chat-completions/create
response = client.chat.completions.create(
    model=deployment_name,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "assistant", "content": "Knock knock."},
        {"role": "user", "content": "Who's there?"},
        {"role": "assistant", "content": "Lettuce."},
        {"role": "user", "content": "Give me an example of the stelling van Pythagoras and how to use it in real life"}
    ]
)

print(f"{response.choices[0].message.role}: {response.choices[0].message.content}")

assistant: Absolutely! The **stelling van Pythagoras** (the Pythagorean theorem) states:

In a right triangle, the square of the length of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the lengths of the other two sides.

In formula form:  
\[ a^2 + b^2 = c^2 \]  
where **a** and **b** are the sides that form the right angle, and **c** is the hypotenuse.

## Example

Suppose you have a right triangle with sides **a = 3 meters** and **b = 4 meters**. What is the length of the hypotenuse (**c**)?

**Solution:**

\[
a^2 + b^2 = c^2 \\
3^2 + 4^2 = c^2 \\
9 + 16 = c^2 \\
25 = c^2 \\
c = \sqrt{25} = 5
\]

So, the hypotenuse is **5 meters**.

---

## Real-life example: Ladder against a wall

Imagine you need to place a ladder against a wall. The foot of the ladder is **3 meters** away from the wall, and you want the ladder to reach a window that is **4 meters** high. How long does the ladder need to be?

- The distance from the base of the wall to th