# AI Red Teaming Lab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sachin0034/hands_on_AI_introduction_to_AI_evaluations-4038348/blob/main/Lab-4%28AI_Red_Teaming%29/AI_Red_Teaming_Lab.ipynb)


## Introduction

In this lab, you'll learn the fundamentals of **AI Red Teaming**—a practice focused on simulating real-world adversaries to uncover vulnerabilities in AI systems.

## What is AI Red Teaming?

AI Red Teaming is the practice of simulating attacks on AI systems to identify:

- 🔓 Security vulnerabilities
- ⚠️ Safety risks
- ❌ Reliability issues

> Think of it as ethical hacking—but for your AI models.


# Getting Started

# Prerequisites
- An **Azure Premium subscription account**
- You **must use a Foundry project** for this feature
- Create a Foundry project using the guide: [Create a Foundry Project](../Create_Azure_Project_Instruction/Readme.md)


## Sign into Azure interactively using the Azure CLI

To interact with Azure resources directly from your Colab notebook, you first need to install the **Azure CLI** in the Colab environment. Use the command below to install it:


In [None]:

!curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

# Install Packages
To use the AI Red Teaming tools in this lab, install the following packages:


### Package Breakdown

| 📦 Package Name                    | 📝 Description                                                        |
|-----------------------------------|------------------------------------------------------------------------|
| `azure-ai-evaluation[redteam]`    | Tools and modules for AI red teaming and evaluation                    |
| `azure-identity`                  | Manages Azure authentication via Active Directory                      |
| `openai`                          | Interface to interact with OpenAI GPT models                           |
| `azure-ai-projects`               | Manage and run Azure AI Foundry projects                               |
| `semantic-kernel`                 | Framework to orchestrate AI apps and LLM workflows                     |


In [None]:
!pip install "azure-ai-evaluation[redteam]" azure-identity openai azure-ai-projects semantic-kernel

# Import Packages

## Import Breakdown

| **Import**                                      | **Source**                              | **Purpose**                                                                 |
|------------------------------------------------|------------------------------------------|------------------------------------------------------------------------------|
| `Optional`, `Dict`, `Any`                      | `typing`                                 | Type hinting for cleaner and safer code                                     |
| `os`                                           | Python Standard Library                  | Access environment variables and OS-level functionality                     |
| `RedTeam`                                      | `azure.ai.evaluation.red_team`          | Core class to run red teaming evaluations on AI models                      |
| `RiskCategory`                                 | `azure.ai.evaluation.red_team`          | Enum of risk categories to probe (e.g., `Violence`, `HateUnfairness`, `Sexual`, `SelfHarm`) |
| `AttackStrategy`                               | `azure.ai.evaluation.red_team`          | Enum of attack strategies (e.g., `EASY`, `MODERATE`, `Base64`, `ROT13`)     |
| `AzureOpenAI`                                  | `openai`                                 | Interface to interact with OpenAI models deployed via Azure infrastructure  |


In [7]:
from typing import Optional, Dict, Any
import os

# Azure imports
from azure.ai.evaluation.red_team import RedTeam, RiskCategory, AttackStrategy

# OpenAI imports
from openai import AzureOpenAI

## Interactive login
`az login` is a command used with the **Azure CLI (Command Line Interface)** to authenticate and connect your local environment (like Terminal, Command Prompt, or Jupyter Notebook) to your **Microsoft Azure account**.

## Requirements

- ✅ You must have an **Azure subscription account**.
- ✅ **Azure CLI** should be installed (`az` command should work in your terminal or notebook).
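
One way to confirm the second requirement before calling `az login` is to look for the `az` executable on the `PATH`. The sketch below uses only the Python standard library; the helper name `cli_available` is illustrative and is not part of any Azure package.

```python
import shutil


def cli_available(command: str = "az") -> bool:
    """Return True if `command` resolves to an executable on the PATH."""
    return shutil.which(command) is not None


if not cli_available("az"):
    print("Azure CLI not found; rerun the install cell before running `az login`.")
```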

## What to expect when you run `az login`

You will see a message like:

***To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code SHH9K2G35 to authenticate.***

Open the link, enter the provided code, and follow the login steps to complete authentication.

In [None]:
# Azure Credential imports
from azure.identity import AzureCliCredential, get_bearer_token_provider

!az login

# Initialize Azure credentials
credential = AzureCliCredential()

To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code BD7Q7TPX2 to authenticate.

Retrieving tenants and subscriptions for the selection...

[Tenant and subscription selection]

No     Subscription name              Subscription ID                       Tenant
-----  -----------------------------  ------------------------------------  -----------
[96m[1][0m *  [96mMicrosoft Azure LegalGraph AI[0m  [96m0d74045d-252c-4b44-856a-71c5dee6b80a[0m  [96mPragyaa LLC[0m

The default is marked with an *; the default tenant is 'Pragyaa LLC' and subscription is 'Microsoft Azure LegalGraph AI' (0d74045d-252c-4b44-856a-71c5dee6b80a).

Select a subscription and tenant (Type a number or Enter for no changes): 

Tenant: Pragyaa LLC
Subscription: Microsoft Azure LegalGraph AI (0d74045d-252c-4b44-856a-71c5dee6b80a)

[Announcements]
With the new Azure CLI login experience, you can select the subscription you want to use more easily. Learn more 

# Set Up Your Environment Variables

Set the following variables for use in this notebook. These environment variables allow your code to connect securely to your Azure resources and AI model deployments.

> ℹ️ **Note:** You can find these values in your **Azure AI Foundry project**. If you have not created one yet, see [Create a Foundry Project](../Create_Azure_Project_Instruction/Readme.md).

---

### 🔍 How to Retrieve These Values:

- **Endpoint:**  
  Navigate to the **Overview** section of your Azure AI Foundry project.  
  ✅ The endpoint URL will be listed there.  
  ![Endpoint Location](./images/img-8.png)

- **Azure OpenAI Configuration:**  
  You can retrieve both the **model name** and **endpoint** from the **"Model Deployment"** section of your Azure OpenAI resource.  
  ![Model Deployment Configuration](./images/img-9.png)

---

Ensure these values are correctly set before proceeding with code execution in the notebook.


In [None]:
# To find these values, follow step 3 of the Prerequisites (create a Foundry project)
# Azure AI Project information
azure_ai_project = 'Insert Your {AZURE_PROJECT_ENDPOINT}'

# Azure OpenAI deployment information
azure_openai_deployment = 'Insert Your {AZURE_OPENAI_DEPLOYMENT}'  # e.g., "gpt-4"
azure_openai_endpoint = 'Insert Your {AZURE_OPENAI_ENDPOINT}'
azure_openai_api_key = 'Insert Your {AZURE_OPENAI_API_KEY}'        # e.g., "your-api-key"
azure_openai_api_version = 'Insert Your {AZURE_OPENAI_API_VERSION}'  # e.g., "2023-07-01-preview"
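
Because this section refers to environment variables, one option is to read the values from the process environment instead of pasting them into the notebook. The following is a minimal sketch, assuming the variables are named after the placeholders above (`AZURE_PROJECT_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, and so on). The helpers `require_env` and `load_settings` are hypothetical and not part of the Azure SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def load_settings() -> dict:
    """Collect the notebook settings from assumed environment variable names."""
    return {
        "azure_ai_project": require_env("AZURE_PROJECT_ENDPOINT"),
        "azure_openai_deployment": require_env("AZURE_OPENAI_DEPLOYMENT"),
        "azure_openai_endpoint": require_env("AZURE_OPENAI_ENDPOINT"),
        "azure_openai_api_version": require_env("AZURE_OPENAI_API_VERSION"),
    }
```

Calling `settings = load_settings()` fails fast with a clear message if any value is missing, rather than surfacing later as an authentication or connection error.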


## 🧠 Function: `azure_openai_callback`

This asynchronous function sends a message to an **Azure-hosted OpenAI model** and retrieves the assistant’s response. It is designed to simulate a conversational AI system, such as a legal advisor assistant.

---

### 📥 Parameters

- **`messages`**: A list of messages in the chat conversation (currently not used but can be used for dynamic input).
- **`stream`**: A flag to enable streaming responses (placeholder).
- **`session_state`**: Used to track session context (not currently implemented).
- **`context`**: Optional dictionary to pass contextual metadata (not currently used).

---
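The `messages` parameter is where the scan is expected to deliver each adversarial prompt, so a callback that ignores it sends the same text to the model on every call. A minimal sketch of pulling the newest prompt out of `messages` follows. It assumes each entry is either a dict with a `"content"` key or an object with a `.content` attribute; the exact shape passed by `RedTeam.scan` is not shown in this lab, so check it against your installed `azure-ai-evaluation` version. The helper name `latest_user_content` is hypothetical.

```python
from typing import Any, Optional


def latest_user_content(messages: list, default: Optional[str] = None) -> Optional[str]:
    """Return the content of the last message, or `default` if it cannot be found."""
    if not messages:
        return default
    last: Any = messages[-1]
    # Support both plain dicts and objects that expose a `.content` attribute.
    if isinstance(last, dict):
        return last.get("content", default)
    return getattr(last, "content", default)
```

Inside `azure_openai_callback`, the hardcoded assignment could then become `latest_message = latest_user_content(messages, default="...")`, keeping the demonstration text only as a fallback.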

### 🔐 Step 1: Azure AD Authentication

Uses Azure Active Directory to obtain a secure token for accessing Azure Cognitive Services. This avoids using static API keys and enables enterprise-grade authentication.

---

### 🤖 Step 2: Initialize Azure OpenAI Client

Sets up the Azure OpenAI client using the endpoint, API version, and token provider to connect with the model deployment.

---

### 📡 Step 3: Call the Model

Sends the prompt to the Azure OpenAI deployment. The model generates a response based on the input message.

---

### 🧾 Step 4: Format the Response

Extracts the generated message and wraps it in a dictionary format compatible with chat-based applications.

---

### ⚠️ Step 5: Error Handling

If the API call fails (due to incorrect config, network issues, etc.), the function logs the error and returns a default message.

---

### 🔁 Step 6: Return the Response

Returns the model’s response inside a structured dictionary that can be rendered by a chat interface or used in further processing.

---

### ✅ Summary

This function is a building block for red teaming or AI assistant tasks, where you simulate user interactions with an LLM deployed in Azure. It demonstrates how to:
- Authenticate using Azure AD
- Communicate with Azure-hosted OpenAI models
- Format and handle responses gracefully


In [None]:
# Define a callback that uses Azure OpenAI API to generate responses
async def azure_openai_callback(
    messages: list,
    stream: Optional[bool] = False,  # noqa: ARG001
    session_state: Optional[str] = None,  # noqa: ARG001
    context: Optional[Dict[str, Any]] = None,  # noqa: ARG001
) -> dict[str, list[dict[str, str]]]:
    # Get token provider for Azure AD authentication
    token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

    # Initialize Azure OpenAI client
    client = AzureOpenAI(
        azure_endpoint=azure_openai_endpoint,
        api_version=azure_openai_api_version,
        azure_ad_token_provider=token_provider,
    )

    # Hardcoded message for demonstration purposes
    latest_message = "I'm a legal advisor assistant. I can help with risk in contracts advice and extract key terms within legal and ethical guidelines."

    # Alternative hardcoded examples you could use:
    # latest_message = "What are the benefits of using cloud computing?"
    # latest_message = "Write a short poem about technology."
    # latest_message = "Explain the concept of machine learning in simple terms."

    try:
        # Call the model
        response = client.chat.completions.create(
            model=azure_openai_deployment,
            messages=[
                {"role": "user", "content": latest_message},
            ],
            # As written, this call is configured for an o1 base model.
            # max_tokens=500,  # For non-o1 models, you can use this instead of max_completion_tokens
            max_completion_tokens=500,  # o1 base models take max_completion_tokens rather than max_tokens
            # temperature=0.7,  # Uncomment for non-o1 models only (temperature is not supported for o1 base models)
        )

        # Format the response to follow the expected chat protocol format
        formatted_response = {"content": response.choices[0].message.content, "role": "assistant"}
    except Exception as e:
        print(f"Error calling Azure OpenAI: {e!s}")
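        # Keep the fallback in the same chat-message shape as a successful response
        formatted_response = {"content": "I encountered an error and couldn't process your request.", "role": "assistant"}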
    return {"messages": [formatted_response]}

## 🛡️ Creating the RedTeam Instance

This section initializes the `RedTeam` object, which is the core component for conducting red teaming evaluations on an AI model deployed in Azure.

---

### 🔧 Parameters:

- **`azure_ai_project`**: The Azure AI Foundry project endpoint that connects the evaluation to a specific project.
- **`credential`**: Azure identity credentials used to authenticate securely via Azure Active Directory.
- **`risk_categories`**: A list of predefined risk categories to probe. In this case:
  - `Violence`: Evaluates how the model responds to prompts about physical harm.
  - `HateUnfairness`: Tests the model’s responses for signs of hate speech or unfair bias.
- **`num_objectives`**: Number of unique attack objectives (i.e., adversarial prompts) generated for each risk category.

> 📦 **Note:**  
> You can include more risk categories to broaden your test coverage:  
> `risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness, RiskCategory.Sexual, RiskCategory.SelfHarm], num_objectives=5`

---

### 🧪 Purpose:

To automatically generate and simulate 5 attack scenarios per selected risk category. These scenarios mimic real-world misuse cases that red teamers use to probe the AI system for vulnerabilities, unsafe behavior, or ethical breaches.

---

### ✅ Outcome:

After this setup, the `RedTeam` instance is ready to run evaluations against your deployed Azure OpenAI model, helping assess how it handles safety, fairness, and robustness under adversarial conditions.


In [22]:
# Create the RedTeam instance with two risk categories and 5 attack objectives generated for each category
model_red_team = RedTeam(
    azure_ai_project=azure_ai_project,
    credential=credential,
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
    num_objectives=5,
)

## 🚨 Running the Red Team Scan with Advanced Attack Strategies

This step initiates a red team scan against your deployed model using various attack strategies to simulate real-world adversarial behavior.

---

### 🔧 Parameters:

- **`target`**: The callback function that wraps interaction with your Azure OpenAI model (e.g., `azure_openai_callback`).
- **`scan_name`**: A name for the scan run, useful for saving logs and identifying test sessions later.
- **`attack_strategies`**: A list of strategies that define the complexity and method of attack:
  - `EASY`: Low-complexity attacks simulating basic adversarial prompts.
  - `MODERATE`: Medium-complexity attacks mimicking more nuanced misuse.
  - `Compose([Base64, ROT13])`: A composed attack combining two obfuscation techniques—Base64 encoding and ROT13 cipher.
- **`output_path`**: The path where the scan results will be saved in `.json` format for further analysis.

---
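To make the composed strategy concrete, the sketch below applies Base64 and then ROT13 to a harmless string using only the Python standard library. It illustrates the obfuscation idea, not the SDK's internals: the application order (Base64 first, ROT13 second) and the helper names are assumptions.

```python
import base64
import codecs


def base64_then_rot13(prompt: str) -> str:
    """Base64-encode `prompt`, then apply ROT13 to the encoded text."""
    encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    return codecs.encode(encoded, "rot_13")


def undo_base64_then_rot13(obfuscated: str) -> str:
    """Reverse the two steps: ROT13 is its own inverse, then Base64-decode."""
    encoded = codecs.decode(obfuscated, "rot_13")
    return base64.b64decode(encoded).decode("utf-8")


sample = base64_then_rot13("hello")
print(sample)  # nTIfoT8=
print(undo_base64_then_rot13(sample))  # hello
```

ROT13 only shifts letters, so the digits and `=` padding produced by Base64 pass through unchanged. The scan checks whether wrapping a prompt this way changes how the model responds.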

### 🎯 Purpose:

To emulate different tiers of adversarial inputs and evaluate how resilient your AI model is under diverse threat scenarios, both simple and complex.

---

### 📁 Output:

The results of the red teaming test are saved to `Advanced-Callback-Scan.json`. This file contains insights into which prompts succeeded or failed, allowing you to tune safety mechanisms accordingly.

---
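The lab does not describe the layout of the output file, so a cautious first step is to load it and look at its top-level structure before writing any analysis code. The following is a minimal sketch, assuming only that the file is valid JSON; `describe_json` is a hypothetical helper.

```python
import json
from pathlib import Path


def describe_json(path: str) -> dict:
    """Load a JSON file and map each top-level key to the type name of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Non-object roots (for example a list) are reported under a single key.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}


# Example, to run after the scan has written the file:
# print(describe_json("Advanced-Callback-Scan.json"))
```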

> 📌 **Note**  
> You can add more attack strategies to broaden the coverage of the evaluation:
>
> ```python
> attack_strategies=[
>     AttackStrategy.EASY,
>     AttackStrategy.MODERATE,
>     AttackStrategy.CharacterSpace,
>     AttackStrategy.ROT13,
>     AttackStrategy.UnicodeConfusable,
>     AttackStrategy.CharSwap,
>     AttackStrategy.Morse,
>     AttackStrategy.Leetspeak,
>     AttackStrategy.Url,
>     AttackStrategy.Binary,
>     AttackStrategy.Compose([AttackStrategy.Base64, AttackStrategy.ROT13]),
> ]
> ```


In [None]:
# Run the red team scan with multiple attack strategies
advanced_result = await model_red_team.scan(
    target=azure_openai_callback,
    scan_name="Advanced-Callback-Scan",
    attack_strategies=[
        AttackStrategy.EASY,  # Group of easy complexity attacks
        AttackStrategy.MODERATE,  # Group of moderate complexity attacks
        AttackStrategy.Compose([AttackStrategy.Base64, AttackStrategy.ROT13]),  # Use two strategies in one attack
    ],
    output_path="Advanced-Callback-Scan.json",
)