# Azure AI Inference with Phi-4 Models

> **Note:** This notebook uses the latest Azure AI Inference SDK. As of May 2025, the correct way to make completion calls is through `ChatCompletionsClient` with appropriate message objects as shown in this notebook.

This notebook demonstrates how to use the Azure AI Inference SDK to access Azure-hosted models, specifically Microsoft's Phi-4 models deployed on Azure.

> **Important Note:** This notebook uses the latest Azure AI Inference SDK methods for completion calls. The API follows this pattern:
>
> ```python
> from azure.ai.inference import ChatCompletionsClient
> from azure.ai.inference.models import SystemMessage, UserMessage, AssistantMessage
> from azure.core.credentials import AzureKeyCredential
>
> client = ChatCompletionsClient(
>     endpoint="your_endpoint",
>     credential=AzureKeyCredential("your_api_key")
> )
>
> response = client.complete(
>     model="your_model_name",
>     messages=[SystemMessage(content="system message"), UserMessage(content="user message")],
>     temperature=0.7,
>     max_tokens=800
> )
> ```

## Setup and Configuration

First, we'll configure our environment and install necessary dependencies.

In [None]:
# Install required packages
!pip install azure-ai-inference python-dotenv

### Setting up the local.env File

Before running this notebook, you need to create a `local.env` file in the same directory as this notebook with the following variables:

```
# Azure AI Configuration
AZURE_API=your_azure_api_key_here
AZURE_ENDPOINT=your_azure_endpoint_here
AZURE_MODEL=Phi-4-reasoning
```

**Instructions:**

1. Create a new file named `local.env` in the same folder as this notebook
2. Add the three environment variables shown above; the names must match the ones the loading cell reads (`AZURE_API`, `AZURE_ENDPOINT`, `AZURE_MODEL`)
3. Replace `your_azure_api_key_here` and `your_azure_endpoint_here` with the key and endpoint URL of your Azure AI deployment
4. Set `AZURE_MODEL` to the name of your model deployment; you can optionally point it at a Phi-4-mini-reasoning deployment for a smaller model

**Note:** Keep your Azure API key secure and don't share it in public repositories or notebooks.
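
Before loading the file, it can help to confirm that the variables the loading cell reads (`AZURE_API` and `AZURE_ENDPOINT`) are actually present. The following is a minimal sketch using only the standard library. The helper names `parse_env_text` and `missing_variables` are hypothetical, and the parser handles only plain `KEY=value` lines, whereas python-dotenv supports more syntax.

```python
def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_variables(values, required=("AZURE_API", "AZURE_ENDPOINT")):
    """Return the required variable names that are absent or empty."""
    return [name for name in required if not values.get(name)]


# Illustrative file content with the endpoint deliberately left out
sample = """
# Azure AI configuration
AZURE_API=your_azure_api_key_here
AZURE_MODEL=Phi-4-reasoning
"""
print(missing_variables(parse_env_text(sample)))  # ['AZURE_ENDPOINT']
```

To check a real file, pass the result of `open("local.env").read()` to `parse_env_text` instead of the sample string.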

## Load Environment Variables

Let's load our environment variables from the `local.env` file, which contains our Azure AI credentials.

In [None]:
import os
from dotenv import load_dotenv
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import AssistantMessage, SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Load variables from local.env file
load_dotenv('local.env')

# Access the environment variables - using values from local.env file
api_key = os.getenv("AZURE_API")
endpoint = os.getenv("AZURE_ENDPOINT")
model_name = os.getenv("AZURE_MODEL", "Phi-4")

# Check if required environment variables are available
if not api_key or not endpoint:
    raise ValueError("AZURE_API and AZURE_ENDPOINT must be set in the local.env file.")

print(f"Azure AI Endpoint: {endpoint}")
print(f"Azure AI Model: {model_name}")
print(f"API Key available: {'Yes' if api_key else 'No'}")

# Initialize the Azure AI Inference client
try:
    # Print the endpoint for debugging
    print(f"Debug - Full endpoint URL: {endpoint}")
    
    # Create the chat completions client
    client = ChatCompletionsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(api_key)
    )
    print("Azure AI Inference client initialized successfully")
except Exception as e:
    print(f"Error initializing client: {str(e)}")
    print("If you're seeing initialization errors, please check your endpoint URL and API key.")

## Helper Functions for Model Inference

Let's create helper functions to interact with the Azure AI models using the Inference SDK.

In [37]:
def generate_chat_completion(messages, temperature=0.7, max_tokens=20000, model=model_name):
    """Generate a completion using Azure AI Inference SDK's chat completions API"""
    # Model names to try in order if the primary model fails (duplicates removed, order kept)
    model_alternatives = list(dict.fromkeys([model, "Phi-4", "phi-4", "gpt-4"]))

    # Convert message dictionaries to proper message objects once, before any API call
    message_classes = {"system": SystemMessage, "user": UserMessage, "assistant": AssistantMessage}
    formatted_messages = []
    for msg in messages:
        if msg["role"] not in message_classes:
            # Fail loudly instead of silently dropping a message with an unknown role
            raise ValueError(f"Unsupported message role: {msg['role']}")
        formatted_messages.append(message_classes[msg["role"]](content=msg["content"]))

    for index, attempt_model in enumerate(model_alternatives):
        try:
            # Debug information
            print(f"Making API call with model: {attempt_model}")

            # Call the Azure AI Inference API
            response = client.complete(
                model=attempt_model,
                messages=formatted_messages,
                temperature=temperature,
                max_tokens=max_tokens
            )
            print(f"Success with model: {attempt_model}")
            return response.choices[0].message.content
        except Exception as e:
            error_message = str(e)
            print(f"Error during API call with {attempt_model}: {error_message}")

            # If this is the last model in our list, provide more detailed error information
            if index == len(model_alternatives) - 1:
                print("\nAll model attempts failed. Please check:")
                print("1. Your model deployment names in Azure AI Studio")
                print("2. Your API key and endpoint URL")
                print("3. That the models are actually deployed to your Azure AI resource")
                print("\nYou can modify your local.env file to use the correct model name.")
                return None

            # Only a "not found" error (checked case-insensitively) moves on to the next model
            if "not found" in error_message.lower():
                print(f"Model {attempt_model} not found, trying alternative...")
                continue

            # For other errors, return the error text
            return f"Error: {error_message}"

    # This line should not be reached due to the return in the final iteration above
    return None

def generate_completion(prompt, temperature=0.7, max_tokens=10000, model=model_name):
    """Generate a text completion using a simple user prompt"""
    # Simply use the chat completion function with a single user message
    return generate_chat_completion([{"role": "user", "content": prompt}], temperature, max_tokens, model)
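
The model-fallback idea used by `generate_chat_completion` can be looked at in isolation. The sketch below is illustrative only: `first_successful` and `fake_complete` are hypothetical names, and `fake_complete` is a stand-in for `client.complete(...)` so the example runs without an Azure deployment. Unlike the helper above, it re-raises unexpected errors instead of returning them as strings.

```python
def first_successful(candidates, call):
    """Try each candidate in order; return (candidate, result) for the first that works.

    Only errors whose text mentions "not found" trigger a fallback; any other
    error is re-raised. Raises LookupError if every candidate is missing.
    """
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    for candidate in unique:
        try:
            return candidate, call(candidate)
        except Exception as exc:
            if "not found" not in str(exc).lower():
                raise
    raise LookupError(f"No candidate succeeded: {unique}")


# Stand-in for client.complete(...): pretend only the "Phi-4" deployment exists
def fake_complete(model):
    if model != "Phi-4":
        raise RuntimeError(f"NOT FOUND: deployment {model}")
    return f"reply from {model}"


print(first_successful(["Phi-4-reasoning", "Phi-4", "Phi-4"], fake_complete))
```

Re-raising keeps authentication or network failures from being mistaken for a wrong deployment name, at the cost of the caller needing its own `try`/`except`.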

## Example 1: How many strawberries for 9 r's?

Let's run our first inference example asking about strawberries and r's.

In [38]:
example1_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that answers questions accurately and concisely."},
    {"role": "user", "content": "How many strawberries do I need to collect 9 r's?"}
]

print("Messages:")
for msg in example1_messages:
    print(f"{msg['role']}: {msg['content']}")
print("\nGenerating response...\n")

# Use the chat completion function with the messages
response1 = generate_chat_completion(example1_messages)
print("Response:")
print(response1)

Messages:
system: You are a helpful AI assistant that answers questions accurately and concisely.
user: How many strawberries do I need to collect 9 r's?

Generating response...

Making API call with model: Phi-4-reasoning-mbodz
Success with model: Phi-4-reasoning-mbodz
Response:
<think>User's question: "How many strawberries do I need to collect 9 r's?" It's ambiguous. Let's parse question: "How many strawberries do I need to collect 9 r's?" Possibly referring to a game or puzzle. Possibly "r's" means "red letters" or "r's" might be "r's" as in the letter "r". Possibly meaning "strawberries" might be something else. Alternatively, maybe "r's" means "r" required objects. The phrasing "collect 9 r's" might be a puzzle. Let me check: "strawberries" might be fruit in a game like "Strawberries" game? Possibly it's a riddle: "How many strawberries do I need to collect 9 r's?" Possibly it's a puzzle: "9 r's" might be in the word "strawberries" or in something else.

I need to check if there 

### Analysis of Example 1

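In this example, the model needs to recognize that the word "strawberry" contains three 'r' letters, so collecting nine r's takes 9 / 3 = 3 strawberries. Letter-counting questions like this are a common stumbling block for LLMs, and other models often answer incorrectly.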
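
The expected answer can be verified with a few lines of Python. This is just a check of the arithmetic, independent of any model call; the variable names are illustrative.

```python
import math

word = "strawberry"
target_rs = 9

rs_per_word = word.count("r")  # r appears at positions 3, 8 and 9 of the word
strawberries_needed = math.ceil(target_rs / rs_per_word)

print(rs_per_word, strawberries_needed)  # 3 3
```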

## Example 2: Solving a Riddle

Now let's try a more complex task: a pattern-recognition riddle that gives several solved lines and one to complete.

In [45]:
example2_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that solves riddles and finds patterns in sequences."},
    {"role": "user", "content": "I will give you a riddle to solve with a few examples, and something to complete at the end"},
    {"role": "user", "content": "nuno Δημήτρης evif Issis 4"},
    {"role": "user", "content": "ntres Inez neves Margot 4"},
    {"role": "user", "content": "ndrei Jordan evlewt Μαρία 9"},
    {"role": "user", "content": "nπέντε Kang-Yuk xis-ytnewt Nubia 21"},
    {"role": "user", "content": "nπέντε Κώστας eerht-ytnewt Μανώλης 18"}, 
    {"role": "user", "content": "nminus one-point-two Satya eno Bill X."},
    {"role": "user", "content": "What is a likely completion for X that is consistent with examples above?"}
]

print("Messages:")
for msg in example2_messages:
    print(f"{msg['role']}: {msg['content'][:50]}...")
print("\nGenerating response...\n")

response2 = generate_chat_completion(example2_messages, temperature=0.2, max_tokens=30000)
print("Response:")
print(response2)

Messages:
system: You are a helpful AI assistant that solves riddles...
user: I will give you a riddle to solve with a few examp...
user: nuno Δημήτρης evif Issis 4...
user: ntres Inez neves Margot 4...
user: ndrei Jordan evlewt Μαρία 9...
user: nπέντε Kang-Yuk xis-ytnewt Nubia 21...
user: nπέντε Κώστας eerht-ytnewt Μανώλης 18...
user: nminus one-point-two Satya eno Bill X....
user: What is a likely completion for X that is consiste...

Generating response...

Making API call with model: Phi-4-reasoning-mbodz
Success with model: Phi-4-reasoning-mbodz
Response:
<think>We are given a riddle with examples. The conversation: "I will give you a riddle to solve with a few examples, and something to complete at the end". Then the examples are provided:

1. "nuno Δημήτρης evif Issis 4"
2. "ntres Inez neves Margot 4"
3. "ndrei Jordan evlewt Μαρία 9"
4. "nπέντε Kang-Yuk xis-ytnewt Nubia 21"
5. "nπέντε Κώστας eerht-ytnewt Μανώλης 18"
6. "nminus one-point-two Satya eno Bill X."

Then question: "Wh

### Analysis of Example 2

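This riddle requires recognizing a pattern across multiple languages and number formats. In each solved line, the first item is a number word (Spanish `uno` and `tres`, German `drei`, Greek `πέντε`), the third item is an English number word spelled backwards (`evif` is "five"), and the names appear to be distractors. The final number is the reversed number minus the first: 5 − 1 = 4, 7 − 3 = 4, 12 − 3 = 9, 26 − 5 = 21, 23 − 5 = 18. Under that reading, a consistent completion for the last line is 1 − (−1.2) = 2.2. The leading `n` on each example string looks like the remnant of a newline escape rather than part of the number word.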
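
The pattern can be checked mechanically. The sketch below rests on two assumptions: that the answer is the reversed English number in the third position minus the number word in the first position, and that the leading `n` on each prompt string is a stray newline remnant and can be dropped. The lookup tables and the `solve` helper are hypothetical and cover only the words that appear in this riddle.

```python
# Number words in the first position of each line
FIRST = {"uno": 1, "tres": 3, "drei": 3, "πέντε": 5, "minus one-point-two": -1.2}
# English number words that appear reversed in the third position
ENGLISH = {"one": 1, "five": 5, "seven": 7, "twelve": 12,
           "twenty-three": 23, "twenty-six": 26}


def solve(first_word, reversed_word):
    """Return (reversed English number) minus (first number)."""
    return ENGLISH[reversed_word[::-1]] - FIRST[first_word]


# (first word, reversed word, answer given in the riddle)
examples = [("uno", "evif", 4), ("tres", "neves", 4), ("drei", "evlewt", 9),
            ("πέντε", "xis-ytnewt", 21), ("πέντε", "eerht-ytnewt", 18)]

# Every solved line must agree with the assumed rule
for first_word, reversed_word, expected in examples:
    assert solve(first_word, reversed_word) == expected

print(round(solve("minus one-point-two", "eno"), 1))  # 2.2
```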

## Create Your Own Example

You can create your own examples to test the Azure-hosted model's capabilities. Try modifying the prompts or creating entirely new scenarios below.

In [None]:
# Define your custom prompt here
custom_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that can solve complex problems."},
    {"role": "user", "content": "Your custom prompt here"}
]

# Uncomment the lines below to run your custom prompt
# custom_response = generate_chat_completion(custom_messages)
# print("Response to custom prompt:")
# print(custom_response)
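
The recorded outputs in this notebook begin with a `<think>` block containing the model's reasoning. When experimenting with your own prompts, you may want to separate that reasoning from the final answer. The following is a minimal sketch; `split_reasoning` is a hypothetical helper, and it assumes the reasoning is closed by a `</think>` tag, which the truncated outputs above do not show.

```python
import re


def split_reasoning(text):
    """Split a response into (reasoning, answer).

    Handles a closed <think>...</think> block, an unclosed <think> block
    (for example when generation stopped at max_tokens), and plain text.
    """
    match = re.match(r"\s*<think>(.*?)</think>(.*)", text, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    stripped = text.lstrip()
    if stripped.startswith("<think>"):
        # Reasoning never closed: treat everything as reasoning, no answer yet
        return stripped[len("<think>"):].strip(), ""
    return "", text.strip()


reasoning, answer = split_reasoning("<think>Count the r's.</think>Three strawberries.")
print("Reasoning:", reasoning)
print("Answer:", answer)
```

An empty answer with non-empty reasoning is a hint that `max_tokens` was too low for the model to finish thinking.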

## Troubleshooting Azure AI Inference SDK

If you're experiencing issues with the Azure AI Inference SDK, here are some common troubleshooting steps:

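1. **Check your endpoint format**: The endpoint must be the full HTTPS URL shown on your deployment's details page. Serverless deployments typically look like `https://your-deployment-name.your-region.models.ai.azure.com`; copy the value from the portal rather than typing it by hand.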

2. **Verify model deployment names**: Make sure the model name you're using is exactly the same as the one deployed in your Azure AI resource. Model names are case-sensitive.

3. **Check API keys**: Ensure your API key has the proper permissions and is still valid.

4. **Common errors**:
   - "NOT FOUND" - The model name doesn't exist or is misspelled
   - "Authentication failed" - API key is invalid
   - "Missing required arguments" - Incorrect parameter format in the API call

5. **Update your local.env file** with the correct values based on your Azure AI deployments.

For more information, refer to the [Azure AI Inference SDK documentation](https://learn.microsoft.com/en-gb/python/api/overview/azure/ai-inference-readme?view=azure-python-preview).
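
The common errors listed above can be turned into a small lookup that suggests a next step. This is a sketch only: `hint_for_error` is a hypothetical helper, the substrings come from the list above, and the exact wording of real SDK error messages may differ.

```python
# Substrings and hints taken from the troubleshooting list above
ERROR_HINTS = [
    ("not found", "Check the model deployment name (it is case-sensitive)."),
    ("authentication failed", "Check that the API key is valid."),
    ("missing required arguments", "Check the parameters passed to the API call."),
]


def hint_for_error(message):
    """Return the first matching hint, or a generic fallback."""
    lowered = message.lower()
    for needle, hint in ERROR_HINTS:
        if needle in lowered:
            return hint
    return "Check the endpoint URL, API key, and model name in local.env."


print(hint_for_error("(NOT FOUND) deployment missing"))
```

Calling `hint_for_error(str(e))` inside an `except` block gives a quick pointer before you dig into the full error text.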

In [None]:
# This cell performs diagnostic tests specific to Azure AI Inference SDK

print("=== Azure AI Inference SDK Diagnostics ===")

# Display the current configuration
print(f"Endpoint: {endpoint}")
print(f"Model name: {model_name}")
print(f"API key available: {'Yes' if api_key else 'No'}")

# Try a minimal test request with the AI Inference SDK
try:
    # Create minimal chat messages for testing
    test_messages = [UserMessage(content="Hello, please respond with one word: 'Working'")]
    
    print("\nAttempting test request with Azure AI Inference SDK...")
    test_response = client.complete(
        model=model_name,
        messages=test_messages,
        temperature=0.7,
        max_tokens=10
    )
    print("Status: Success!")
    print(f"Response: {test_response.choices[0].message.content}")
    print("\nNote: For information about available models, please check the Azure AI Studio portal where you can see all deployed models for your endpoint.")
except Exception as e:
    print(f"\nTest request failed: {str(e)}")
    print("\nSuggestions:")
    print("1. Check that your Azure AI endpoint is correctly formatted")
    print("2. Verify that the model is deployed to your Azure AI resource")
    print("3. Ensure your API key is valid")
    print("4. Try a different model name if available")

print("\n=== End of Diagnostics ===\n")

## Conclusion

This notebook demonstrated how to use the Azure AI Inference SDK to interact with Azure-hosted AI models, specifically testing Microsoft's Phi-4 capabilities across different reasoning tasks. Key concepts covered:

1. Setting up authentication for the Azure AI endpoint
2. Formatting prompts in chat format
3. Testing the model's reasoning capabilities with various examples
4. Troubleshooting common SDK errors with a minimal diagnostic request

The Azure AI Inference SDK provides a consistent interface for accessing models deployed on Azure, so the same client code can be pointed at a different deployment by changing the endpoint and model name.