## AI Foundry Workshop



### 1. Completions API 



#### 1.1 Completions API - gpt-4.1

In [2]:
#!/usr/bin/env python3
import os
from dotenv import load_dotenv
from openai import AzureOpenAI

# ── 1. env + client ───────────────────────────────────────────────────────
load_dotenv(".env")                                  # AZURE_* vars live here

client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version    = os.getenv("AZURE_OPENAI_API_VERSION"),  # e.g. 2024-10-21
)
DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")       # your model deployment name

# ── 2. system-and-user messages ───────────────────────────────────────────
messages = [
    {"role": "system", "content": "You are a concise astrophysics tutor who explains concepts in plain English."},
    {"role": "user",   "content": "What is a black hole?"}
]

# ── 3. one-shot call ──────────────────────────────────────────────────────
response = client.chat.completions.create(
    model       = DEPLOYMENT,
    messages    = messages,
    temperature = 0.7,
    max_tokens  = 256,
)

print(response.choices[0].message.content.strip())


A black hole is a region in space where gravity is so strong that nothing—not even light—can escape from it. It forms when a massive star runs out of fuel and collapses under its own gravity. The boundary around a black hole, beyond which nothing can escape, is called the "event horizon." Anything that crosses this boundary is pulled into the black hole and cannot get out. Black holes can be very small or millions of times heavier than our Sun, and they are invisible because they do not emit light themselves. However, scientists can detect black holes by observing how they affect nearby stars and gas.


#### 1.2 Responses API - Reasoning models - o4-mini

In [3]:
#!/usr/bin/env python3
import os
from dotenv import load_dotenv
from openai import AzureOpenAI

# ── 1. Load env & create client ───────────────────────────────────────────
load_dotenv(".env")                                    # holds your AZURE_* vars

client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    # reasoning models live on a preview API version (adjust if needed)
    api_version    = os.getenv("AZURE_REASONING_OPENAI_API_VERSION") 
                 or "2025-04-01-preview",
)

DEPLOYMENT = os.getenv("AZURE_OPENAI_REASONING_DEPLOYMENT_NAME")  

# ── 2. Build the input messages ───────────────────────────────────────────
messages = [
    {"role": "developer", "content": "You are a concise physics tutor."},
    {"role": "user",      "content": "What is a black hole?"}
]

# ── 3. Single Responses-API call ──────────────────────────────────────────
response = client.responses.create(
    model               = DEPLOYMENT,
    input               = messages,            # list of role/content dicts
    reasoning           = {"effort": "medium"},# low | medium | high
    max_output_tokens   = 1024                 # covers reasoning + visible answer
)

print(response.output_text.strip())


A black hole is a region of spacetime where gravity is so intense that nothing—not even light—can escape once it crosses a boundary called the event horizon. Key points:  
• Formation: most commonly from the gravitational collapse of a massive star’s core after it exhausts its nuclear fuel.  
• Event horizon: the “point of no return” marking the black hole’s boundary; its radius for a non-rotating (Schwarzschild) black hole is Rs = 2GM/c².  
• Singularity: a (theoretical) point at the center where density and spacetime curvature become infinite under classical general relativity.  
• Types:  
  – Stellar-mass (a few to tens of solar masses)  
  – Supermassive (10^6–10^10 M☉) at galactic centers  
  – Intermediate and primordial (hypothetical)  
• Detection: inferred via effects on nearby matter (accretion disks, X-ray emission), star-orbit dynamics, gravitational lensing and gravitational waves from mergers.  
• No-hair theorem: a black hole in isolation is fully described by just mass

#### 1.3 Completions API - model router 

In [6]:
import os
from openai import AzureOpenAI      # pip install "openai>=1.14.0"

# ── 1. Create the client ───────────────────────────────────────────────────────
client = AzureOpenAI(
    api_version    = "2024-12-01-preview",           # required for Router + o-series
    azure_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key        = os.environ["AZURE_OPENAI_API_KEY"],
)

# ── 2. Define the chat messages ────────────────────────────────────────────────
messages = [
    {"role": "system", "content": "You are a concise astrophysics tutor."},
    {"role": "user",   "content": "Come up with a terraforming plan for human settlements in Mars so that atmosphere becomes earthlike and consider the type of materials ..."}
]

# ── 3. Call the router deployment ──────────────────────────────────────────────
response = client.chat.completions.create(
    model             = "model-router",   # router deployment; it selects the underlying model
    messages          = messages,
    max_tokens        = 8192,
    temperature       = 0.7,
    top_p             = 0.95,
    frequency_penalty = 0.0,
    presence_penalty  = 0.0,
)

print("Model chosen by the router: ", response.model)
print(response.choices[0].message.content)


Model chosen by the router:  gpt-4.1-nano-2025-04-14
A comprehensive Martian terraforming plan involves multiple steps:

1. **Atmospheric Thickening:** Release greenhouse gases (e.g., perfluorocarbons) to trap heat, raising surface temperatures. Utilizing in-situ resources like carbon dioxide from the polar ice caps and regolith can augment atmospheric CO₂ levels.

2. **Surface Warming:** Install large-scale orbital mirrors to reflect sunlight onto the surface, increasing temperature and sublimation of polar ice.

3. **Water Availability:** Promote melting of polar ice caps to create liquid water sources. Introduce genetically engineered microbes to produce oxygen and stabilize the environment.

4. **Materials Needed:**
   - Greenhouse gases (perfluorocarbons, sulfur hexafluoride)
   - Reflective orbital mirrors (composed of lightweight, durable materials like aluminum or carbon composites)
   - Microbial life forms designed for Mars conditions
   - Construction materials for habitats 

#### 1.4 COMPLETIONS API + STRUCTURED OUTPUTS 

In this example, we use the structured outputs feature to define exactly what shape of response we want from the model, using a Pydantic class called BookBrief. Instead of hoping the model returns something that looks like JSON and then writing code to clean it up, we give the model a strict schema with three fields: author, year, and summary. The model's output is constrained to this structure, and the SDK parses it into a typed Python object with guaranteed field names and types; if the content cannot be parsed into BookBrief, the SDK raises an error instead of returning malformed data. That removes most post-processing and defensive parsing, although the schema only guarantees the shape of the data, not that the values are factually correct. This is very useful when you are building apps that rely on structured data.

In [7]:
#!/usr/bin/env python3
"""
structured_outputs_demo.py
──────────────────────────
Extract three distinct pieces of data—in *one* call—using Azure OpenAI
**structured outputs** (schema-enforced JSON).

Result comes back as a typed Pydantic object, so you can use the fields
directly without any manual parsing or key-checks.
"""

import os
from dotenv import load_dotenv
from pydantic import BaseModel, Field
from openai import AzureOpenAI

# ── 1. env + client ───────────────────────────────────────────────────────
load_dotenv(".env")   # must contain AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY

client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version    = "2024-10-21",                 # GA version with structured outputs
)

DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")      # model that supports them

# ── 2. define desired output schema (multiple fields) ─────────────────────
class BookBrief(BaseModel):
    author:   str                     = Field(..., description="Book’s author")
    year:     int                     = Field(..., description="Year first published (YYYY)")
    summary:  str                     = Field(..., description="≤30-word overview")

# ── 3. ask the model & let SDK return a typed object ──────────────────────
response = client.beta.chat.completions.parse(
    model = DEPLOYMENT,
    messages = [
        {"role": "system", "content": "Return data that matches the schema exactly."},
        {"role": "user",   "content": "Give me author, publication year, and a 30-word summary of 'To Kill a Mockingbird'."}
    ],
    response_format = BookBrief,      # ← structured outputs
)

book: BookBrief = response.choices[0].message.parsed

# ── 4. use the structured result ──────────────────────────────────────────
print("Author :", book.author)
print("Year   :", book.year)
print("Summary:", book.summary)


Author : Harper Lee
Year   : 1960
Summary: Set in the Depression-era South, the novel follows young Scout Finch as her father, Atticus, defends a black man falsely accused of rape, exploring themes of racism and moral growth.


In [8]:
#!/usr/bin/env python3
"""
structured_outputs_failure_demo.py
──────────────────────────────────
Same question, but without structured outputs—just hoping the model
returns valid, expected JSON. Watch how this can break.
"""

import os, json
from dotenv import load_dotenv
from openai import AzureOpenAI

# ── 1. env + client ───────────────────────────────────────────────────────
load_dotenv(".env")   # must contain AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY

client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version    = "2024-10-21",   # doesn't matter—no schema involved here
)

DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

# ── 2. ask the model with a vague instruction ─────────────────────────────
response = client.chat.completions.create(
    model = DEPLOYMENT,
    messages = [
        {"role": "system", "content": "Return JSON with author, year, and summary."},
        {"role": "user",   "content": "Tell me about 'To Kill a Mockingbird'."}
    ],
    temperature = 0.7,
    max_tokens  = 256,
)

raw_output = response.choices[0].message.content
print("⬇️ Model output:\n", raw_output)

# ── 3. try parsing the JSON (this might fail or produce wrong structure) ─
try:
    parsed = json.loads(raw_output)
    print("\nParsed:\n", parsed)
    print("\nAuthor:", parsed["author"])
    print("Year:", parsed["year"])
    print("Summary:", parsed["summary"])
except Exception as err:
    print("\n❌ Failed to parse or access expected fields:\n", err)


⬇️ Model output:
 {
  "author": "Harper Lee",
  "year": 1960,
  "summary": "To Kill a Mockingbird is a classic American novel set in the 1930s in the fictional town of Maycomb, Alabama. It follows the story of young Scout Finch, her brother Jem, and their father Atticus, a principled lawyer who defends a Black man wrongly accused of raping a white woman. Through Scout's eyes, the novel explores themes of racial injustice, moral growth, empathy, and the loss of innocence."
}

Parsed:
 {'author': 'Harper Lee', 'year': 1960, 'summary': "To Kill a Mockingbird is a classic American novel set in the 1930s in the fictional town of Maycomb, Alabama. It follows the story of young Scout Finch, her brother Jem, and their father Atticus, a principled lawyer who defends a Black man wrongly accused of raping a white woman. Through Scout's eyes, the novel explores themes of racial injustice, moral growth, empathy, and the loss of innocence."}

Author: Harper Lee
Year: 1960
Summary: To Kill a Mockingb
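
The run above happened to return clean JSON, but nothing in that request enforces it. As a rough illustration of the defensive code a plain-JSON approach tends to need, the sketch below strips an optional Markdown fence and checks field names and types by hand. The helper name parse_book_json and both sample replies are hypothetical and only serve to show typical failure modes.

```python
import json

FENCE = "`" * 3                                   # Markdown code-fence marker
REQUIRED_FIELDS = {"author": str, "year": int, "summary": str}

def parse_book_json(raw: str) -> dict:
    """Hypothetical helper: best-effort parse of an unconstrained model reply."""
    text = raw.strip()
    # Models sometimes wrap JSON in a Markdown fence; drop the fence lines.
    if text.startswith(FENCE):
        text = "\n".join(
            line for line in text.splitlines() if not line.startswith(FENCE)
        )
    data = json.loads(text)                       # raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"{name} should be {expected_type.__name__}")
    return data

# Made-up replies: one wrapped in a fence, one with the year as a string
fenced   = FENCE + 'json\n{"author": "Harper Lee", "year": 1960, "summary": "A novel."}\n' + FENCE
bad_type = '{"author": "Harper Lee", "year": "1960", "summary": "A novel."}'

print(parse_book_json(fenced)["year"])
try:
    parse_book_json(bad_type)
except ValueError as err:
    print("rejected:", err)
```

With structured outputs, checks like these are handled by the schema and the SDK rather than by hand-written code.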

#### 1.5 COMPLETIONS API + REPRODUCIBLE OUTPUTS 


In [11]:
import os
from openai import AzureOpenAI

# ── Setup Client ──────────────────────────────────────────────────────────
client = AzureOpenAI(
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    api_version    = "2024-10-21"
)

# ── Deterministic Output with Seed ────────────────────────────────────────
for i in range(3):
    print(f"Run {i + 1}:\n---")
    response = client.chat.completions.create(
        model       = "gpt-4.1",  # Or your deployment name
        seed        = 123,                 # best-effort reproducibility; identical output is not guaranteed
        temperature = 0.7,
        max_tokens  = 50,
        messages=[
            {"role": "user", "content": "Give me a surprising fact about the moon."}
        ]
    )
    print(response.choices[0].message.content)
    print("---\n")


Run 1:
---
Here’s a surprising fact: The moon is moving away from Earth at a rate of about **3.8 centimeters (1.5 inches) per year**! This slow drift is caused by the tidal interactions between the Earth and the moon. Over
---

Run 2:
---
Here’s a surprising fact: The moon is moving away from Earth at a rate of about 3.8 centimeters (1.5 inches) per year! This gradual drift is caused by tidal interactions between the Earth and the moon, and over millions
---

Run 3:
---
Here’s a surprising fact: The moon is moving away from Earth at a rate of about **3.8 centimeters (1.5 inches) per year**! This slow drift is caused by the tidal interactions between the Earth and the moon. Over
---
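
Note that run 2 above differs slightly from runs 1 and 3: a fixed seed makes outputs more repeatable but does not guarantee identical text. One simple way to check a batch of runs is to compare them and report where they first diverge. The sketch below uses only the standard library; the helper name first_divergence is hypothetical, and the sample strings are shortened excerpts of the runs above.

```python
import os

def first_divergence(a: str, b: str) -> int:
    """Return the index where two strings first differ, or -1 if they are identical."""
    if a == b:
        return -1
    # commonprefix works character by character on arbitrary strings
    return len(os.path.commonprefix([a, b]))

run_1 = "The moon is moving away from Earth at a rate of about **3.8 centimeters"
run_2 = "The moon is moving away from Earth at a rate of about 3.8 centimeters"
run_3 = "The moon is moving away from Earth at a rate of about **3.8 centimeters"

runs = [run_1, run_2, run_3]
print("distinct outputs:", len(set(runs)))

idx = first_divergence(run_1, run_2)
print("runs 1 and 2 diverge at index", idx, "->", repr(run_1[idx:idx + 5]))
```

In a real check it may also help to log response.system_fingerprint for each run, since a backend configuration change can alter outputs even when the seed is unchanged.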



#### 1.6 COMPLETIONS API + PREDICTED OUTPUTS 

In [12]:

import os
from openai import AzureOpenAI

# ── 1. client ─────────────────────────────────────────────────────────────
client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version    = "2025-01-01-preview",
)
DEPLOYMENT = "gpt-4.1"   # ← replace with your deployed model name

# ── 2. the “big” document ────────────────────────────────────────────────
contract = """
1. Purpose. The parties agree to exchange confidential information.
2. Term. This agreement is valid for 1 year from the Effective Date.
3. Governing Law. This agreement is governed by the laws of England & Wales.
...
30. Signatures. The parties have executed this Agreement.
"""

# ── 3. we only want to change clause 2 ────────────────────────────────────
instruction = """
Change the clause that says “valid for 1 year” so that it reads “valid for 3 years”.
Return the full contract.  No other edits.  No markdown.
"""

# ── 4. fast call WITH predicted outputs ───────────────────────────────────
fast = client.chat.completions.create(
    model    = DEPLOYMENT,
    messages = [
        {"role": "user", "content": instruction},
        {"role": "user", "content": contract},
    ],
    prediction = {                 # tell the model “this text is mostly right already”
        "type": "content",
        "content": contract
    }
)
print("FAST:\n", fast.choices[0].message.content)
print("accepted / rejected prediction tokens:",
      fast.usage.completion_tokens_details, "\n")

# ── 5. slower call WITHOUT predicted outputs (for comparison) ─────────────
slow = client.chat.completions.create(
    model    = DEPLOYMENT,
    messages = [
        {"role": "user", "content": instruction},
        {"role": "user", "content": contract},
    ]
)
print("SLOW:\n", slow.choices[0].message.content)
print("total fresh tokens generated:", slow.usage.completion_tokens)






FAST:
 1. Purpose. The parties agree to exchange confidential information.
2. Term. This agreement is valid for 3 years from the Effective Date.
3. Governing Law. This agreement is governed by the laws of England & Wales.
...
30. Signatures. The parties have executed this Agreement.
accepted / rejected prediction tokens: CompletionTokensDetails(accepted_prediction_tokens=2, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=17) 

SLOW:
 1. Purpose. The parties agree to exchange confidential information.
2. Term. This agreement is valid for 3 years from the Effective Date.
3. Governing Law. This agreement is governed by the laws of England & Wales.
...
30. Signatures. The parties have executed this Agreement.
total fresh tokens generated: 61
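
The usage details printed above can be turned into a quick health check for a prediction. In this run only 2 predicted tokens were accepted and 17 were rejected, which suggests the prediction helped little on such a short document. The sketch below computes the acceptance share; SimpleNamespace stands in for the SDK's completion_tokens_details object, and the helper name prediction_acceptance is hypothetical.

```python
from types import SimpleNamespace

def prediction_acceptance(details) -> float:
    """Share of predicted tokens that were accepted (0.0 when no prediction was used)."""
    accepted = details.accepted_prediction_tokens or 0
    rejected = details.rejected_prediction_tokens or 0
    total = accepted + rejected
    return accepted / total if total else 0.0

# Stand-in for fast.usage.completion_tokens_details, using the numbers printed above
details = SimpleNamespace(accepted_prediction_tokens=2, rejected_prediction_tokens=17)
rate = prediction_acceptance(details)
print(f"accepted {rate:.1%} of predicted tokens")
```

A low share like this is worth watching, because rejected prediction tokens are generally still counted as completion tokens.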


#### 1.7 COMPLETIONS API + PROMPT CACHING 

When two chat-completion requests begin with the same 1,024 or more tokens, the service stores the intermediate computations for that shared prefix in a per-subscription, short-lived cache (5–60 min). On subsequent requests with the same leading tokens, the model skips recomputation and reuses the cached results. The cached tokens are billed at a steep discount (free on Provisioned Throughput) and are processed much faster, which lowers latency, while the new or different tokens are processed normally. A single-character change within the first 1,024 tokens causes a cache miss; beyond the first 1,024 tokens, cache hits extend in 128-token blocks of identical content.
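
To make the block arithmetic concrete, here is a rough estimate of how many prompt tokens could be served from cache for a given identical-prefix length, following the 1,024-token minimum and 128-token blocks described above. The function name estimate_cached_tokens is hypothetical, and the cached_tokens value actually reported by the service may differ.

```python
MIN_PREFIX = 1024   # minimum identical prefix before caching applies
BLOCK      = 128    # beyond the minimum, hits extend in blocks of this size

def estimate_cached_tokens(shared_prefix_tokens: int) -> int:
    """Estimate cacheable tokens for an identical prefix of the given length."""
    if shared_prefix_tokens < MIN_PREFIX:
        return 0
    extra_blocks = (shared_prefix_tokens - MIN_PREFIX) // BLOCK
    return MIN_PREFIX + extra_blocks * BLOCK

for n in (1000, 1024, 1500, 2011):
    print(n, "->", estimate_cached_tokens(n))
```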

In [13]:
#!/usr/bin/env python3
"""
prompt_cache_demo.py – minimal example that triggers Azure OpenAI prompt-caching
∙ Works on GPT-4.1, GPT-4o, o1/o3-mini models (API ≥ 2024-10-01-preview)
"""

import os, time
from openai import AzureOpenAI

# ── 1. client ────────────────────────────────────────────────────────────
client = AzureOpenAI(
    api_key        = os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_version    = "2025-01-01-preview",
)
DEPLOYMENT = "gpt-4.1"            # ← your deployment name

# ── 2. build a prompt >1024 tokens, identical at the start ───────────────
base = (
    "In this technical design document we describe the architecture, safety "
    "considerations, and verification strategy for our next-generation flight "
    "computer system, with emphasis on fault tolerance and power efficiency."
)
# Repeat 60 times (~33 tokens each, ≈2,000 tokens) so the prompt comfortably exceeds 1,024 tokens
long_context = " ".join([base] * 60)

# ── 3. ask a trivial question so the model has to read the whole prompt ──
messages = [
    {"role": "system", "content": "You are a concise summariser."},
    {"role": "user",   "content": long_context},
    {"role": "user",   "content": "Give me a 1-sentence summary."},
]

# ── 4. send the request and measure latency ──────────────────────────────
t0 = time.perf_counter()
rsp = client.chat.completions.create(model=DEPLOYMENT, messages=messages)
latency_ms = (time.perf_counter() - t0) * 1_000

usage   = rsp.usage
details = usage.prompt_tokens_details            # may be None on older models or API versions
cached  = (details.cached_tokens or 0) if details else 0

print(f"Latency: {latency_ms:,.0f} ms")
print(f"Prompt tokens: {usage.prompt_tokens}  "
      f"(cached: {cached})")
print(f"Completion: {rsp.choices[0].message.content!r}\n")


Latency: 1,041 ms
Prompt tokens: 2011  (cached: 0)
Completion: 'This technical design document outlines the architecture, safety considerations, and verification strategy for a new flight computer system, focusing on fault tolerance and power efficiency.'



In [None]:
#!/usr/bin/env python3


import os, time
from openai import AzureOpenAI        # pip install "openai>=1.27.0"

# 1️⃣ Client -----------------------------------------------------------------
client = AzureOpenAI(
    api_key        = os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version    = "2025-01-01-preview",
)
DEPLOYMENT = "gpt-4.1"               # ← your deployment name

# 2️⃣ Build a prompt that shares ≥1 024 identical tokens ---------------------
base = (
    "In this technical design document we describe the architecture, safety "
    "considerations, and verification strategy for our next-generation flight "
    "computer system, with emphasis on fault tolerance and power efficiency."
)
long_context = " ".join([base] * 60)  # ≈2,000 tokens → plenty for caching

messages = [
    {"role": "system", "content": "You are a concise summariser."},
    {"role": "user",   "content": long_context},
    {"role": "user",   "content": "Give me a 1-sentence summary."},
]

# 3️⃣ Helper that calls the model once and prints usage ----------------------
def run_once(label: str):
    t0 = time.perf_counter()
    rsp = client.chat.completions.create(model=DEPLOYMENT, messages=messages)
    latency_ms = (time.perf_counter() - t0) * 1_000

    # prompt_tokens_details may be None on cache-miss or older models
    cached = 0
    ptd = getattr(rsp.usage, "prompt_tokens_details", None)
    if ptd and getattr(ptd, "cached_tokens", None):
        cached = ptd.cached_tokens

    print(
        f"[{label}] latency: {latency_ms:,.0f} ms   "
        f"prompt tokens: {rsp.usage.prompt_tokens:,} (cached: {cached:,})"
    )
    print("└─", rsp.choices[0].message.content, "\n")

# 4️⃣ First call populates the cache, second call should be faster ----------
run_once("cold-start")   # expect cached_tokens == 0
time.sleep(2)            # cache is already warm; pause is just for clarity
run_once("cache-hit")    # expect cached_tokens > 0


#### 1.8 FUNCTION CALLING 

In [14]:
#!/usr/bin/env python3
"""
func_call_demo.py – shortest-possible Azure OpenAI function-calling example
∙ Requires API version ≥ 2024-10-01-preview
"""

import os, json
from openai import AzureOpenAI          # pip install "openai>=1.27.0"

# ── 1. client ──────────────────────────────────────────────────────────────
aoai = AzureOpenAI(
    api_key        = os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version    = "2025-01-01-preview",
)
DEPLOYMENT = "gpt-4.1"                  # your Azure deployment name

# ── 2. declare the callable functions (tools) ──────────────────────────────
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return weather for a city",
            "parameters": {"type": "object",
                           "properties": {"city": {"type":"string"}},
                           "required": ["city"]},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_wikipedia",
            "description": "Brief summary of a factual query",
            "parameters": {"type": "object",
                           "properties": {"query": {"type":"string"}},
                           "required": ["query"]},
        },
    },
]

# ── 3. dummy back-end implementations ──────────────────────────────────────
def get_weather(city):        return f"{city}: 22 °C, clear skies (demo)"
def search_wikipedia(query):  return f"Wiki says: {query} (demo)"

def dispatch(name, args):     # trivial router
    return {"get_weather": get_weather,
            "search_wikipedia": search_wikipedia}[name](**args)

# ── 4. chat loop: each prompt triggers its own tool call ───────────────────
for prompt in ["What's the weather in Istanbul?",
               "Who invented the World Wide Web?"]:
    chain = [{"role": "user", "content": prompt}]
    first = aoai.chat.completions.create(
        model      = DEPLOYMENT,
        messages   = chain,
        tools      = tools,
        tool_choice= "auto",     # let the model decide
    ).choices[0]

    if first.finish_reason == "tool_calls":
        tc   = first.message.tool_calls[0]             # only one tool call here
        args = json.loads(tc.function.arguments)
        out  = dispatch(tc.function.name, args)        # our Python stub result

        final = aoai.chat.completions.create(
            model    = DEPLOYMENT,
            messages = chain + [
                first.message,
                {"role":"tool",
                 "tool_call_id": tc.id,
                 "content": out}],
        ).choices[0].message.content
    else:                                              # model answered directly
        final = first.message.content

    print(f"\n🧑  {prompt}\n🤖  {final}")



🧑  What's the weather in Istanbul?
🤖  The current weather in Istanbul is 22°C with clear skies. If you need more detailed or real-time updates, let me know!

🧑  Who invented the World Wide Web?
🤖  The World Wide Web was invented by Tim Berners-Lee in 1989 while he was working at CERN, the European Organization for Nuclear Research. He developed the first web browser and web server, laying the foundation for the modern internet.
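
The loop above handles only the first tool call. A model may return several tool calls in one turn, and each one needs its own tool message keyed by tool_call_id. The sketch below shows that step in isolation so it can run offline: SimpleNamespace objects stand in for the SDK's tool-call objects, and run_tool_calls and REGISTRY are hypothetical names.

```python
import json
from types import SimpleNamespace

def get_weather(city):        return f"{city}: 22 °C, clear skies (demo)"
def search_wikipedia(query):  return f"Wiki says: {query} (demo)"

REGISTRY = {"get_weather": get_weather, "search_wikipedia": search_wikipedia}

def run_tool_calls(tool_calls):
    """Execute every tool call and return one 'tool' message per call."""
    messages = []
    for tc in tool_calls:
        func = REGISTRY.get(tc.function.name)
        if func is None:
            # Report unknown tools back to the model instead of raising KeyError
            content = f"error: unknown tool {tc.function.name!r}"
        else:
            args = json.loads(tc.function.arguments)   # arguments arrive as a JSON string
            content = func(**args)
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": content})
    return messages

# Stand-ins shaped like the entries of first.message.tool_calls
fake_calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "Istanbul"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="search_wikipedia", arguments='{"query": "World Wide Web"}')),
]

for msg in run_tool_calls(fake_calls):
    print(msg)
```

The returned list could be appended after first.message in the follow-up request, in place of the single tool message used above.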


Below is a function-calling demo that swaps hand-written JSON schemas for Pydantic models.
The trick is BaseModel.model_json_schema(): Pydantic generates the parameter schema from the model's type hints and field descriptions, so the tool definition stays in sync with typed, self-documenting Python code.
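
As a small illustration of what model_json_schema() produces, and of how the same model can also validate the arguments the model sends back, consider the sketch below. It assumes Pydantic v2; the validation step is an addition for illustration and is not part of the demo that follows.

```python
from pydantic import BaseModel, Field, ValidationError

class WeatherArgs(BaseModel):
    city: str = Field(..., description="City name to look up")

# 1. The generated JSON Schema, usable as the tool's "parameters"
schema = WeatherArgs.model_json_schema()
print(schema["required"])
print(schema["properties"]["city"])

# 2. Validate the raw JSON string found in tool_call.function.arguments
good = WeatherArgs.model_validate_json('{"city": "Istanbul"}')
print(good.city)

try:
    WeatherArgs.model_validate_json('{"town": "Istanbul"}')   # required field is missing
except ValidationError as err:
    print("rejected:", err.error_count(), "error(s)")
```

Validating before dispatch turns a malformed argument payload into a clear error rather than a TypeError inside the tool function.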

In [15]:
#!/usr/bin/env python3
"""
func_call_pydantic_demo.py – Azure OpenAI function-calling with Pydantic
∙ Tested with openai≥1.27 + pydantic≥2.6, API version ≥ 2024-10-01-preview
"""

import os, json
from openai import AzureOpenAI
from pydantic import BaseModel, Field

# ── 1. client ────────────────────────────────────────────────────────────────
aoai = AzureOpenAI(
    api_key        = os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version    = "2025-01-01-preview",
)
DEPLOYMENT = "gpt-4.1"                         # your deployment name

# ── 2. argument models → JSON Schemas (Pydantic does the heavy lifting) ─────
class WeatherArgs(BaseModel):
    city: str = Field(..., description="City name to look up")

class WikiArgs(BaseModel):
    query: str = Field(..., description="Topic to summarise from Wikipedia")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return simple weather info for a city",
            "parameters": WeatherArgs.model_json_schema(),   # ← auto-schema
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_wikipedia",
            "description": "Return a concise Wikipedia summary",
            "parameters": WikiArgs.model_json_schema(),
        },
    },
]

# ── 3. stub back-end implementations (real code would call APIs) ─────────────
def get_weather(city: str)        -> str: return f"{city}: 22 °C, clear skies (demo)"
def search_wikipedia(query: str)  -> str: return f"{query}: summary from Wikipedia (demo)"

def dispatch(name:str, payload:dict)->str:
    return {"get_weather": get_weather,
            "search_wikipedia": search_wikipedia}[name](**payload)

# ── 4. chat loop – two prompts, two different tool calls, one tidy flow ─────
for prompt in ["Weather in Istanbul?", "Who invented the World Wide Web?"]:
    chain = [{"role": "user", "content": prompt}]
    first = aoai.chat.completions.create(
        model       = DEPLOYMENT,
        messages    = chain,
        tools       = tools,
        tool_choice = "auto",
    ).choices[0]

    if first.finish_reason == "tool_calls":
        tc   = first.message.tool_calls[0]
        args = json.loads(tc.function.arguments)      # expected to follow the schema; not validated here
        data = dispatch(tc.function.name, args)

        follow = aoai.chat.completions.create(
            model    = DEPLOYMENT,
            messages = chain + [
                first.message,
                {"role": "tool", "tool_call_id": tc.id, "content": data},
            ],
        )
        reply = follow.choices[0].message.content
    else:
        reply = first.message.content

    print(f"\n🧑  {prompt}\n🤖  {reply}")



🧑  Weather in Istanbul?
🤖  The current weather in Istanbul is 22°C with clear skies. If you need a forecast for the coming days, let me know!

🧑  Who invented the World Wide Web?
🤖  The World Wide Web was invented by Tim Berners-Lee in 1989. He is a British computer scientist who created the Web while working at CERN (the European Organization for Nuclear Research) to facilitate information sharing among scientists and researchers globally.



### VISION MODELS 

### REAL-TIME API 

### AUDIO MODELS 

### 2. SEMANTIC KERNEL 