# Model Router

**Model router** for Azure AI Foundry is a deployable AI chat model that is trained to select the best large language model (LLM) to respond to a given prompt in real time. By evaluating factors like query complexity, cost, and performance, it intelligently routes requests to the most suitable model. Thus, it delivers high performance while saving on compute costs where possible, all packaged as a single model deployment.

**Model router selects the underlying model for each prompt to optimize costs** while maintaining quality. It uses smaller, cheaper models when they are sufficient for the task and reserves larger, more expensive models for more complex tasks. Likewise, it uses reasoning models for tasks that require complex reasoning and non-reasoning models otherwise. Model router provides a single deployment and chat experience that combines the strengths of all the underlying chat models.

Model router version 2025-05-19 uses these models:
- GPT-4.1 (2025-04-14)
- GPT-4.1-mini (2025-04-14)
- GPT-4.1-nano (2025-04-14)
- o4-mini (2025-04-16)

In [1]:
#%pip install azure-identity
#%pip install python-dotenv openai

In [2]:
import os

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from dotenv import load_dotenv
from openai import AzureOpenAI

In [3]:
load_dotenv()

endpoint = os.getenv("azure_openai_endpoint")
model = "model-router-2"

print(f"Using Azure OpenAI endpoint: {endpoint}")

Using Azure OpenAI endpoint: https://aq-ai-foundry-sweden-central.cognitiveservices.azure.com/
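
If `azure_openai_endpoint` is missing from the environment or the `.env` file, `os.getenv` returns `None` and the problem shows up only in a later cell. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper name, not part of this notebook or of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Hypothetical usage in place of the plain os.getenv call:
# endpoint = require_env("azure_openai_endpoint")
```

Raising early keeps the error message next to its cause instead of in the cell that builds or uses the client.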


In [4]:
# Initialize Azure OpenAI client with Entra ID authentication
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    azure_ad_token_provider=token_provider,
    api_version="2025-01-01-preview",
)

In [5]:
def modelrouter(prompt):
    """
    Generates a response from the chat model based on the provided prompt.

    Args:
        prompt (str): The user's input to which the chat model will respond.

    Returns:
        ChatCompletion | None: The complete response object from the chat model, including the generated message and model details, or None if the request fails.
    """
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant."
                },
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            temperature=0.7,
        )
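        # Print the answer in blue, then the routed model name in green.
        print()
        print("\033[1;31;34m")
        print(response.choices[0].message.content)
        print("\033[1;31;32m")
        print()
        print(f"This model was automatically selected: {response.model}")
        print("\033[0m", end="")  # reset terminal colors for later output
        return response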

    except Exception as e:
        print(f"An error occurred: {e}")
        return None
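
Because `modelrouter` returns the response object (or `None` on error), the routed model can be collected across several prompts rather than read from the printed line each time. The sketch below is one way to tally the selections; `tally_models` is a hypothetical helper, and the stand-in objects only mimic the `model` attribute of a real response so that the example runs without calling the service.

```python
from collections import Counter
from types import SimpleNamespace


def tally_models(responses):
    """Count how often each model name appears in a list of responses.

    Entries that are None (failed calls) are skipped.
    """
    return Counter(r.model for r in responses if r is not None)


# Stand-in objects so the sketch runs offline; real ChatCompletion
# objects expose the same `model` attribute.
fake_responses = [
    SimpleNamespace(model="gpt-5-mini-2025-08-07"),
    SimpleNamespace(model="gpt-5-chat-latest"),
    SimpleNamespace(model="gpt-5-mini-2025-08-07"),
    None,  # what modelrouter() returns on error
]

print(tally_models(fake_responses))
```

With real calls, the list would be built as `[modelrouter(p) for p in prompts]` for a list of prompt strings.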


## Examples

### Basic prompt

In [6]:
prompt = "What is FIFA?"

modelrouter(prompt)


FIFA (Fédération Internationale de Football Association) is the international governing body for association football (soccer), as well as futsal and beach soccer. Key points:

- Founded in 1904 and headquartered in Zurich, Switzerland.
- Membership: 211 national football associations (more than the United Nations).
- Main roles:
  - Organizes major international competitions, most notably the men’s FIFA World Cup (first held in 1930) and the FIFA Women’s World Cup, plus youth, futsal and beach-soccer world championships and the FIFA Club World Cup.
  - Oversees international match calendars and tournament rules at the global level.
  - Distributes development funding, coaching and referee education, and other support to member associations.
  - Works with regional confederations (e.g., UEFA, CONMEBOL, CONCACAF, CAF, AFC, OFC) to run continental competitions and development programs.
- Relationship to the Laws of the Game: The Laws are maintained by the International Footba


### Medium-complexity prompt

In [7]:
prompt = "Who won the last world cup? Explain the details for the country in terms of financial aspects."

modelrouter(prompt)


The most recent FIFA World Cup was held in 2022, and Argentina won the tournament. Argentina defeated France in the final, securing their third World Cup title.

### Financial Aspects for Argentina:

1. **Prize Money:**
   - Argentina, as the champion of the 2022 World Cup, received a significant prize money reward from FIFA. The total prize money for the winner was approximately $42 million.
   - This prize money is distributed among the national football association, players, coaching staff, and other stakeholders.

2. **Economic Impact:**
   - Winning the World Cup often boosts a country's football economy. For Argentina, the victory increased interest in football domestically, leading to higher merchandise sales, increased sponsorship deals, and greater investment in football infrastructure.
   - The success also enhances the marketability of Argentine players internationally, potentially increasing their transfer values and salaries.

3. **Sponsorship and Endorsements:


### Reasoning prompt

In [8]:
prompt = """
Who won the 2022 FIFA World Cup? Consider different perspectives, analyze the impact of the victory on global football, and discuss the team’s tactical approach in key matches.
"""

modelrouter(prompt)


The 2022 FIFA World Cup was won by **Argentina**.

### Different Perspectives on Argentina’s Victory:

- **From Argentina’s Perspective:**
  The victory was monumental, marking their third World Cup title and their first since 1986. It was a crowning achievement for Lionel Messi, often regarded as one of the greatest footballers of all time, who finally lifted the trophy that had eluded him. The win sparked massive celebrations across Argentina, symbolizing national pride and unity.

- **Global Football Community:**
  Argentina’s triumph was widely celebrated and viewed as a testament to the enduring legacy of traditional footballing powerhouses. It also reinforced the narrative of Messi’s illustrious career. For many fans worldwide, it was an emotional and dramatic tournament, culminating in a thrilling final that showcased the beauty and unpredictability of football.

- **For Competing Teams:**
  Teams such as France, the runners-up, and other strong contenders like Brazi


In [9]:
# question for GPT-5 series models
prompt = "Convert this product title list into lowercase kebab-case (comma-separated): ‘Wireless Headphones Pro, Bluetooth Speaker X3, Smartwatch Series 7’."

modelrouter(prompt)


Sure! Here’s your list converted into lowercase kebab-case and separated by commas:

```
wireless-headphones-pro, bluetooth-speaker-x3, smartwatch-series-7
```

This model was automatically selected: gpt-5-chat-latest
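
The answer above can be checked locally. The following is a small sketch, independent of the model's output, that performs the same conversion; `to_kebab_case` is a hypothetical helper, and it assumes titles contain only letters, digits, and spaces.

```python
import re


def to_kebab_case(title: str) -> str:
    """Lowercase a title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


titles = "Wireless Headphones Pro, Bluetooth Speaker X3, Smartwatch Series 7"
result = ", ".join(to_kebab_case(t) for t in titles.split(","))
print(result)  # wireless-headphones-pro, bluetooth-speaker-x3, smartwatch-series-7
```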


ChatCompletion(id='chatcmpl-C3PZOqHQvEQxdaixroqa1F4wL9iCf', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Sure! Here’s your list converted into lowercase kebab-case and separated by commas:\n\n```\nwireless-headphones-pro, bluetooth-speaker-x3, smartwatch-series-7\n```', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None), content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}})], created=1754929310, model='gpt-5-chat-latest', object='chat.completion', service_tier=None, system_fingerprint='fp_7560107781', usage=CompletionUsage(completion_tokens=38, prompt_tokens=48, total_tokens=86, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_predictio

In [10]:
# EASY question to test the model router
prompt = """
Who was the first President of Mexico?  
"""
modelrouter(prompt)



The first constitutional President of Mexico was Guadalupe Victoria (born José Miguel Ramón Adaucto Fernández y Félix). He served as President from October 10, 1824 to March 31, 1829.

(If you mean the first head of an independent Mexico overall, Agustín de Iturbide ruled first as Emperor from 1822–1823, but he was not a president.)

This model was automatically selected: gpt-5-mini-2025-08-07
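
The reported model name can carry a date-stamped version, as in the output above, which makes it awkward to group results by model. A minimal sketch of stripping that stamp follows; `model_family` is a hypothetical helper, and the `-YYYY-MM-DD` suffix pattern is an assumption based only on the names seen in this notebook.

```python
import re

# Matches a trailing "-YYYY-MM-DD" version stamp (assumed format).
_DATE_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def model_family(model_name: str) -> str:
    """Strip a trailing date stamp, leaving names without one unchanged."""
    return _DATE_SUFFIX.sub("", model_name)


print(model_family("gpt-5-mini-2025-08-07"))  # gpt-5-mini
print(model_family("gpt-5-chat-latest"))      # gpt-5-chat-latest
```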


ChatCompletion(id='chatcmpl-C3PZP3g5iqIBGjO9HP9dgqBg8sKUf', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The first constitutional President of Mexico was Guadalupe Victoria (born José Miguel Ramón Adaucto Fernández y Félix). He served as President from October 10, 1824 to March 31, 1829.\n\n(If you mean the first head of an independent Mexico overall, Agustín de Iturbide ruled first as Emperor from 1822–1823, but he was not a president.)', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None), content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'protected_material_code': {'filtered': False, 'detected': False}, 'protected_material_text': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}})], created=1754929311, model='gpt-5-m

In [11]:
# question for gpt-5 reasoning models
prompt = """
You are solving an optimization problem: maximize f(x,y)=3x+4y subject to 2x+3y ≤ 12, x,y≥0. Show step-by-step reasoning, derive the solution, and explain why it’s optimal.
"""
modelrouter(prompt)


Let's solve the optimization problem step-by-step:

---

**Problem:**

Maximize \( f(x,y) = 3x + 4y \)

subject to the constraints:

\[
\begin{cases}
2x + 3y \leq 12 \\
x \geq 0 \\
y \geq 0
\end{cases}
\]

---

### Step 1: Understand the problem

We want to find values of \(x\) and \(y\) that maximize the linear function \(f(x,y) = 3x + 4y\), but these values must satisfy the constraints.

The constraints form a feasible region in the first quadrant (since \(x,y \geq 0\)) bounded by the line \(2x + 3y = 12\).

---

### Step 2: Identify the feasible region

The inequality \(2x + 3y \leq 12\) can be rewritten as:

\[
3y \leq 12 - 2x \implies y \leq \frac{12 - 2x}{3}
\]

Since \(x,y \geq 0\), the feasible region is the set of points \((x,y)\) in the first quadrant under or on the line \(2x + 3y = 12\).

---

### Step 3: Find the vertices of the feasible region

The feasible region is a polygon bounded by:

- \(x = 0\)
- \(y = 0\)
- \(2x + 3y = 12\)

The vertices are where the 

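
The expected answer to this prompt can be verified independently of the model. For a linear program with a bounded, non-empty feasible region, the maximum is attained at a vertex, so it is enough to evaluate the objective at the three corners of the feasible triangle. The sketch below does that; it is a separate check and not part of the model's response.

```python
def objective(x, y):
    """Objective function f(x, y) = 3x + 4y from the prompt."""
    return 3 * x + 4 * y


def is_feasible(x, y):
    """Check the constraints 2x + 3y <= 12, x >= 0, y >= 0."""
    return 2 * x + 3 * y <= 12 and x >= 0 and y >= 0


# Corner points of the feasible triangle: the origin and the two
# intercepts of the line 2x + 3y = 12 with the axes.
vertices = [(0, 0), (6, 0), (0, 4)]
assert all(is_feasible(x, y) for x, y in vertices)

best = max(vertices, key=lambda v: objective(*v))
print(best, objective(*best))  # (6, 0) 18
```

The objective values at the vertices are 0, 18, and 16, so the maximum is 18 at x = 6, y = 0.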

In [12]:
# question for GPT-4.1 series models
prompt = """
Draft a 900–1200 word white paper explaining how federated learning can be applied to hospital data across different regulatory jurisdictions. Include privacy challenges, a proposed architecture, and a three-step rollout plan.
"""
modelrouter(prompt)


# Leveraging Federated Learning for Hospital Data Across Regulatory Jurisdictions: Privacy, Architecture, and Implementation

## Executive Summary

Hospitals generate vast amounts of sensitive patient data critical for improving clinical outcomes, advancing medical research, and optimizing healthcare delivery. However, sharing this data across institutions, especially those located in different regulatory jurisdictions, poses significant privacy, legal, and technical challenges. Federated Learning (FL), a decentralized machine learning paradigm, offers a promising approach to collaboratively train models on distributed hospital data without exposing raw patient information. This white paper explores how FL can be effectively applied to hospital data spanning multiple regulatory environments, addressing privacy challenges, proposing a secure and compliant architecture, and outlining a pragmatic three-step rollout plan.

---

## Introduction

Healthcare institutions worldwide


In [13]:
# HARD question to test the model router
prompt = """
The new model router model in azure openAI can route between these new models: 
GPT-4.1 series
o4-mini
gpt-5 reasoning models
gpt-5-chat

what questions do you suggest I ask so that I can see the model router selecting each and every model of the list?
"""
modelrouter(prompt)


Below are suggested test prompts you can send through the Azure OpenAI model router so you reliably trigger selection of each option (GPT-4.1 series, o4-mini, gpt-5 reasoning, gpt-5-chat). For each prompt I include the reason the router should prefer that model and one or two variants to increase the chance you’ll hit every model in the router’s decision logic.

1) Target: gpt-5 reasoning (high-complexity, deep multi-step reasoning)
- Prompt: “You are a logic tutor. Step-by-step, solve this puzzle and show your reasoning: A and B each pick a number 1–100. A says ‘my number is prime.’ B says ‘my number is a multiple of 3.’ After two further truthful yes/no statements by both, they can deduce each other’s numbers. Find one possible pair and explain each deduction step.”
- Why: gpt-5 reasoning is optimized for extended, rigorous chain-of-thought-style reasoning and complex multi-step problems.
- Variants: Hard math proof request, long causal chain in debugging a complex algori
