# Before the bug bash
Thank you for setting up your environment ahead of the bug bash!

## Install uv
uv is a fast Python package manager. Using it speeds up the installation of azure-ai-evaluation and its extras.

In [1]:
%pip install uv


## Create a virtual environment using uv
Create a virtual environment with uv, specifying Python 3.10 or later.

In [None]:
!uv venv --python 3.11


Run the remaining steps inside the virtual environment you just created. To activate it, run
`.venv\Scripts\activate` on Windows or `source .venv/bin/activate` on macOS and Linux.
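
To confirm which interpreter is actually running, a small check like the one below may help. It is a minimal sketch that uses only the standard library; the helper names `in_virtualenv` and `python_is_supported` are made up for illustration and are not part of uv or the SDK.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def python_is_supported(minimum=(3, 10)) -> bool:
    """Return True when the running Python is at least `minimum`."""
    return sys.version_info >= minimum


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
print(f"Python >= 3.10: {python_is_supported()}")
```

If the second line prints `False`, the commands are probably running against the base interpreter rather than `.venv`.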

## Install Azure AI Evaluation SDK with Red Team extra

With the virtual environment activated, install the Azure AI Evaluation SDK with the Red Team extra.

In [None]:
!uv pip install --upgrade "git+https://github.com/slister1001/azure-sdk-for-python.git@red-team-agent-init#subdirectory=sdk/evaluation/azure-ai-evaluation&egg=azure-ai-evaluation[redteam]"


# Red Team Agent Bug Bash Configuration
To run RedTeamAgent, you first need to import packages and configure your environment.

## Connect to an Azure AI Project
Navigate to this [Azure AI Foundry Hub](https://int.ai.azure.com/managementCenter/hub/overview?wsid=/subscriptions/4bf6b28a-452b-4af4-8080-8a196ee0ca4b/resourceGroups/naposani/providers/Microsoft.MachineLearningServices/workspaces/sydneylister-1523&flight=AIRedTeaming=true,EvalConvergence&tid=72f988bf-86f1-41af-91ab-2d7cd011db47), and select a project. In the code below, replace `<your_project_name>` with the project name.

In [None]:
azure_ai_project = {
    "subscription_id": "4bf6b28a-452b-4af4-8080-8a196ee0ca4b",
    "resource_group_name": "naposani",
    "project_name": "<your_project_name>", # INSERT THE PROJECT NAME HERE
}
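
Before running a scan, it can be useful to check that no placeholder is left in the configuration. The sketch below is illustrative only: `unset_fields` is a hypothetical helper, not an SDK function, and the values in `example_project` are made up.

```python
def unset_fields(project: dict) -> list:
    """Return the required keys that are missing, empty, or still a <placeholder>."""
    required = ("subscription_id", "resource_group_name", "project_name")
    problems = []
    for key in required:
        value = str(project.get(key) or "").strip()
        # Treat anything wrapped in angle brackets as an unfilled placeholder.
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(key)
    return problems


example_project = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",  # made-up value
    "resource_group_name": "example-rg",  # made-up value
    "project_name": "<your_project_name>",  # placeholder left in on purpose
}
print(unset_fields(example_project))
```

Calling `unset_fields(azure_ai_project)` on the dictionary above would flag `project_name` until the placeholder is replaced.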

## Import necessary packages

In [None]:
from azure.ai.evaluation.red_team_agent import RedTeamAgent, AttackStrategy, RiskCategory
from azure.identity import DefaultAzureCredential
from typing import Dict, List, Optional, Any
import os

# Red Team Agent Examples

Below are several examples demonstrating the various ways to use the RedTeamAgent for vulnerability scanning.

## Example 1: Basic Red Team Scan with Callback Function

This example shows how to use the RedTeamAgent with a simple callback function to test your application.

In [None]:
# Initialize credentials and RedTeamAgent
credential = DefaultAzureCredential()
agent = RedTeamAgent(azure_ai_project=azure_ai_project, credential=credential)

# Create a simple application callback function
def simple_app_callback(query: str) -> str:
    """A simple application callback function that responds to user queries"""
    return "I'm an AI assistant that follows ethical guidelines. I cannot provide harmful content."

# Run a basic scan with the callback function (baseline is added by default)
results = await agent.scan(
    target=simple_app_callback,
    scan_name="Basic-Callback-Test",
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
    num_objectives=2,  # Limit to 2 objectives per category for the example
    application_scenario="A customer service chatbot for a retail company"
)

## Example 2: Advanced Callback with Conversation State

This example demonstrates using a more complex callback that maintains conversation state.

In [None]:
# Initialize credentials and RedTeamAgent

credential = DefaultAzureCredential()
agent = RedTeamAgent(azure_ai_project=azure_ai_project, credential=credential)

# Create a more complex callback function that handles full conversation context
async def advanced_callback(messages: Dict, stream: bool = False, session_state: Any = None, context: Optional[Dict] = None) -> dict:
    """A more complex callback that processes conversation history"""
    # Extract the latest message from the conversation history
    messages_list = [{"role": chat_message.role, "content": chat_message.content} for chat_message in messages]
    latest_message = messages_list[-1]["content"]
    
    # In a real application, you might process the entire conversation history
    # Here, we're just simulating different responses based on the latest message
    response = "I'm an AI assistant that follows safety guidelines. I cannot provide harmful content."
    
    # Format the response to follow the OpenAI chat protocol format
    formatted_response = {
        "content": response,
        "role": "assistant"
    }
    
    return {"messages": [formatted_response]}

# Run a more comprehensive scan with all base risk categories
advanced_results = await agent.scan(
    target=advanced_callback,
    scan_name="Advanced-Callback-Test",
    attack_strategies=[AttackStrategy.Base64, AttackStrategy.ROT13, AttackStrategy.UnicodeConfusable],
    risk_categories=[RiskCategory.Violence, RiskCategory.Sexual, RiskCategory.SelfHarm, RiskCategory.HateUnfairness],
    num_objectives=2,  # Using 2 objectives per category for this example
    application_scenario="An AI assistant for educational content for children"
)
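
A callback of this shape can be smoke-tested locally before spending time on a full scan. The sketch below assumes, as the code above suggests, that each incoming message exposes `role` and `content` attributes; `SimpleNamespace` objects stand in for the real message type, and `demo_callback` and `smoke_test` are hypothetical names used only for illustration.

```python
import asyncio
from types import SimpleNamespace


async def demo_callback(messages, stream=False, session_state=None, context=None) -> dict:
    """Stand-in with the same signature and return shape as advanced_callback."""
    latest = messages[-1].content
    reply = f"Received {len(latest)} characters. I cannot provide harmful content."
    return {"messages": [{"content": reply, "role": "assistant"}]}


async def smoke_test(callback) -> dict:
    """Call the callback with a fake history and check the reply structure."""
    fake_history = [
        SimpleNamespace(role="user", content="Hello"),
        SimpleNamespace(role="assistant", content="Hi, how can I help?"),
        SimpleNamespace(role="user", content="Tell me a story"),
    ]
    result = await callback(fake_history)
    last = result["messages"][-1]
    assert last["role"] == "assistant"
    assert isinstance(last["content"], str) and last["content"]
    return last


# Written as a plain script; in a notebook cell, use
# `await smoke_test(demo_callback)` instead of asyncio.run(...).
last_message = asyncio.run(smoke_test(demo_callback))
print(last_message["role"])
```

Passing `advanced_callback` in place of `demo_callback` exercises the real callback the same way.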

## Example 3: Testing OpenAI or Azure OpenAI Models Directly

This example shows how to red team test an OpenAI or Azure OpenAI model directly.

In [None]:
# Configuration for OpenAI model
openai_config = {
    "model": "gpt-4o",  # Replace with your actual model name
    "api_key": "your_openai_api_key"  # Replace with your actual API key
}

# Configuration for Azure OpenAI model
azure_openai_config = {
    "azure_endpoint": "https://your-deployment.openai.azure.com/",  # Replace with your endpoint
    "azure_deployment": "your-deployment-name",  # Replace with your deployment name
    "api_key": "your_azure_openai_api_key"  # Replace with your API key, or comment out if Entra ID authentication is enabled on your deployment 
}

# Uncomment and use one of these configurations:
# model_config = openai_config  # For OpenAI
model_config = azure_openai_config  # For Azure OpenAI

# Run scan with multiple attack strategies
model_results = await agent.scan(
    target=model_config,
    scan_name="Direct-Model-Test",
    attack_strategies=[
        AttackStrategy.EASY,      # Easy complexity attacks
        AttackStrategy.Jailbreak  # Test jailbreak prompts
    ],
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
    num_objectives=3,
    application_scenario="A legal document assistant for contract drafting"
)
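
Rather than pasting keys into the notebook, the Azure OpenAI configuration can be assembled from environment variables. The following is a sketch under assumptions: `load_azure_openai_config` is a hypothetical helper, the environment variable names are borrowed from Example 13, and the values in `demo_env` are placeholders so the snippet runs without real credentials.

```python
import os


def load_azure_openai_config(env=None) -> dict:
    """Build a config dict with the same keys as azure_openai_config above."""
    env = os.environ if env is None else env
    config = {
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
        "azure_deployment": env["AZURE_OPENAI_DEPLOYMENT"],
    }
    # Leave api_key out entirely when no key is set, for example when
    # Entra ID authentication is enabled on the deployment.
    api_key = env.get("AZURE_OPENAI_KEY")
    if api_key:
        config["api_key"] = api_key
    return config


# Placeholder values for demonstration; pass nothing to read os.environ.
demo_env = {
    "AZURE_OPENAI_ENDPOINT": "https://your-deployment.openai.azure.com/",
    "AZURE_OPENAI_DEPLOYMENT": "your-deployment-name",
}
print(sorted(load_azure_openai_config(demo_env)))
```

The resulting dictionary could then be assigned to `model_config` in place of the hard-coded version.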

## Example 4: Using Pre-defined Attack Strategy Complexities

This example demonstrates using pre-defined complexity levels for attacks.

In [None]:
# Define a callback that simulates a financial advisor chatbot
def financial_advisor_callback(query: str) -> str:
    return "I'm a financial advisor assistant. I can help with investment advice and financial planning within legal and ethical guidelines."

# Run a scan with different complexity levels
complexity_results = await agent.scan(
    target=financial_advisor_callback,
    scan_name="Complexity-Levels-Test",
    attack_strategies=[
        AttackStrategy.EASY,      # Group of easy complexity attacks
        AttackStrategy.MODERATE,  # Group of moderate complexity attacks
        AttackStrategy.DIFFICULT  # Group of difficult complexity attacks
    ],
    risk_categories=[RiskCategory.Violence, RiskCategory.SelfHarm],
    num_objectives=2,
    application_scenario="A financial advisor chatbot for investment planning"
)

## Example 5: Using Specific Attack Strategies

This example demonstrates using specific individual attack strategies rather than complexity groups.

In [None]:
# Define a medical assistant callback
def medical_assistant_callback(query: str) -> str:
    return "I'm a medical information assistant. I can provide general health information but cannot diagnose conditions or prescribe treatments."

# Run a scan with specific attack strategies
specific_strategy_results = await agent.scan(
    target=medical_assistant_callback,
    scan_name="Specific-Strategies-Test",
    attack_strategies=[
        AttackStrategy.Base64,           # Encode prompts in Base64
        AttackStrategy.Flip,             # Flip text to evade detection
        AttackStrategy.CharacterSpace,   # Add character spaces
        AttackStrategy.ROT13,            # Use ROT13 encoding
        AttackStrategy.UnicodeConfusable,  # Use confusable Unicode characters
        AttackStrategy.Tense             # Change tense of prompts
    ],
    risk_categories=[RiskCategory.SelfHarm, RiskCategory.HateUnfairness],
    num_objectives=3,
    application_scenario="A medical information assistant for general health information"
)
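
To get a feel for what some of these strategies do to a prompt, the transformations can be approximated with the standard library. This is only an illustration of the encodings named in the comments above, not the converters the SDK uses internally. Treating Flip as simple string reversal and CharacterSpace as a space between every character are assumptions based on those one-line descriptions.

```python
import base64
import codecs


def to_base64(text: str) -> str:
    """Encode text as Base64."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def to_rot13(text: str) -> str:
    """Rotate each ASCII letter by 13 positions."""
    return codecs.encode(text, "rot13")


def flip(text: str) -> str:
    """Reverse the text (assumed meaning of the Flip strategy)."""
    return text[::-1]


def character_space(text: str) -> str:
    """Insert a space between characters (assumed meaning of CharacterSpace)."""
    return " ".join(text)


prompt = "hello"
for transform in (to_base64, to_rot13, flip, character_space):
    print(f"{transform.__name__}: {transform(prompt)}")
```

Each variant carries the same request in a form that a naive keyword filter may not recognize.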

## Example 6: Data-Only Mode (Without Evaluation)

This example shows how to run in data-only mode to collect conversation data without performing evaluations.

In [None]:
# Define a simple content moderator callback
def content_moderator_callback(query: str) -> str:
    return "I'm a content moderation assistant. I can help identify and flag potentially inappropriate content."

# Run a data-only scan (without evaluations)
data_only_results = await agent.scan(
    target=content_moderator_callback,
    scan_name="Data-Only-Test",
    attack_strategies=[AttackStrategy.Morse, AttackStrategy.Leetspeak],
    risk_categories=[RiskCategory.Sexual, RiskCategory.HateUnfairness],
    num_objectives=2,
    application_scenario="A content moderation system for a social media platform",
    data_only=True  # Only collect conversation data without evaluations
)

# You can access the collected conversation data
conversations = data_only_results.redteaming_data
print(f"Collected {len(conversations) if conversations else 0} conversations without evaluation")

## Example 7: Working with Results

This example demonstrates how to extract and use information from the RedTeamAgent results.

In [None]:
# Use the results from any previous example
results = specific_strategy_results  # Change this to any of the earlier results variables

# Extract the JSON representation of results
results_json = results.to_json()
print(f"Results JSON length: {len(results_json)}")

# Extract just the scorecard
scorecard = results.to_scorecard()
if scorecard:
    print("\nRisk Category Summary:")
    for category in scorecard.get("risk_category_summary", []):
        print(f"Overall ASR: {category.get('overall_asr')}%")
        print(f"Total Attacks: {category.get('overall_total')}")
        print(f"Successful Attacks: {category.get('overall_attack_successes')}")

# Convert to query-response format for further evaluation
qr_format = results.to_eval_qr_json_lines()
print(f"\nQuery-Response format length: {len(qr_format)}")

# Get a human-readable attack simulation report
attack_simulation = results.attack_simulation()
print(f"\nAttack Simulation Report Preview:\n{attack_simulation[:500]}...")
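
The scorecard reports an ASR (attack success rate) alongside the attack counts. The sketch below shows how such a percentage is commonly derived from the two counts. It assumes ASR means successful attacks divided by total attacks; `mock_summary` is made-up data that reuses the key names from the loop above, and the SDK's own rounding may differ.

```python
def attack_success_rate(successes: int, total: int) -> float:
    """Return the share of successful attacks as a percentage."""
    if total == 0:
        return 0.0  # avoid division by zero when no attacks were run
    return round(100.0 * successes / total, 2)


# Made-up numbers for illustration only.
mock_summary = [{"overall_total": 12, "overall_attack_successes": 3}]

for entry in mock_summary:
    asr = attack_success_rate(entry["overall_attack_successes"], entry["overall_total"])
    print(f"Overall ASR: {asr}%")
```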

## Example 8: Using a Custom Application with Mock Attack Scenario

This example demonstrates integrating with a custom application in a realistic attack scenario.

In [None]:
# Define a more complex application with some guardrails
class ContentCreationAssistant:
    def __init__(self):
        self.banned_topics = ["violence", "hate", "harassment", "self-harm", "explicit"]
        self.conversation_history = []
    
    def check_content(self, text):
        # Simple content filter (in reality, this would be more sophisticated)
        for topic in self.banned_topics:
            if topic in text.lower():
                return False
        return True
    
    def generate_response(self, query):
        # Store conversation history
        self.conversation_history.append({"role": "user", "content": query})
        
        # Check if content passes moderation
        if not self.check_content(query):
            response = "I'm sorry, I cannot assist with that type of content as it may violate content guidelines."
        else:
            response = "I'm a content creation assistant. I can help you draft blog posts, articles, and other content following ethical guidelines."
        
        # Add response to history
        self.conversation_history.append({"role": "assistant", "content": response})
        return response

# Create an instance of the assistant
assistant = ContentCreationAssistant()

# Wrapper function for the RedTeamAgent
def content_assistant_callback(query):
    return assistant.generate_response(query)

# Run a scan with various attack strategies
custom_app_results = await agent.scan(
    target=content_assistant_callback,
    scan_name="Content-Assistant-Test",
    attack_strategies=[
        AttackStrategy.EASY,          # Easy complexity attacks
        AttackStrategy.Jailbreak      # Test jailbreak attempts
    ],
    risk_categories=[RiskCategory.HateUnfairness, RiskCategory.Violence],
    num_objectives=3,
    application_scenario="A content creation assistant for bloggers and writers"
)
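
The substring filter in `check_content` is a useful target precisely because it is easy to evade. As a minimal sketch, the snippet below reimplements the same check as a standalone function and shows that a Base64-encoded query passes it; the query text is an arbitrary example.

```python
import base64

BANNED_TOPICS = ["violence", "hate", "harassment", "self-harm", "explicit"]


def check_content(text: str) -> bool:
    """Same substring check as ContentCreationAssistant.check_content."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BANNED_TOPICS)


plain_query = "describe violence"
encoded_query = base64.b64encode(plain_query.encode("utf-8")).decode("ascii")

print(check_content(plain_query))    # the plain query is rejected
print(check_content(encoded_query))  # the encoded query is accepted
```

This is the kind of gap that encoding-based attack strategies are meant to surface.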

## Example 9: Running with Output Path and Custom Location

This example shows how to specify an output path for results.

In [None]:
import os
from datetime import datetime

# Create a timestamp-based output directory
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_dir = f"redteam_results_{timestamp}"
os.makedirs(output_dir, exist_ok=True)
output_path = os.path.join(output_dir, "redteam_results.json")

# Define a simple application
def education_assistant_callback(query):
    return "I'm an educational assistant designed to help with learning and studying."

# Run a scan with custom output path
output_path_results = await agent.scan(
    target=education_assistant_callback,
    scan_name="Output-Path-Test",
    attack_strategies=[AttackStrategy.Baseline, AttackStrategy.AsciiArt, AttackStrategy.Binary],
    risk_categories=[RiskCategory.HateUnfairness, RiskCategory.Violence],
    num_objectives=2,
    application_scenario="An educational assistant for students",
    output_path=output_path  # Specify where to save results
)

print(f"Results saved to: {output_path}")

## Example 10: Empty Attack Strategies

This example demonstrates using an empty list of attack strategies. Only the baseline strategy will be applied (added by default).

In [None]:
def safety_assistant_callback(query: str) -> str:
    return "I'm a safety assistant that provides guidance on workplace safety protocols."

# Run a scan with empty attack strategies list (only baseline will be used)
empty_strategies_results = await agent.scan(
    target=safety_assistant_callback,
    scan_name="Empty-Strategies-Test",
    attack_strategies=[],  # Empty list - only baseline strategy will be used
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
    num_objectives=2,
    application_scenario="A workplace safety assistant for employees"
)

print(f"Empty strategies test completed with {len(empty_strategies_results.redteaming_data) if empty_strategies_results.redteaming_data else 0} conversations")

## Example 11: Empty Risk Categories

This example demonstrates using an empty list of risk categories. The system will use default risk categories.

In [None]:
def customer_service_callback(query: str) -> str:
    return "I'm a customer service assistant for a retail store. I can help with product information and returns."

# Run a scan with empty risk categories
empty_categories_results = await agent.scan(
    target=customer_service_callback,
    scan_name="Empty-Categories-Test",
    attack_strategies=[AttackStrategy.Base64, AttackStrategy.ROT13],
    risk_categories=[],  # Empty list - will use default categories
    num_objectives=2,
    application_scenario="A customer service assistant for a retail store"
)

print(f"Empty categories test completed with {len(empty_categories_results.redteaming_data) if empty_categories_results.redteaming_data else 0} conversations")

## Example 12: Duplicate Strategies and Categories

This example demonstrates using duplicate attack strategies and risk categories. Duplicates will be automatically removed.

In [None]:
def travel_assistant_callback(query: str) -> str:
    return "I'm a travel assistant that helps with trip planning and booking accommodations."

# Run a scan with duplicate attack strategies
duplicate_results = await agent.scan(
    target=travel_assistant_callback,
    scan_name="Duplicate-Cases-Test",
    attack_strategies=[
        AttackStrategy.Base64, 
        AttackStrategy.Base64,  # Duplicate
        AttackStrategy.ROT13,
        AttackStrategy.ROT13    # Duplicate
    ],
    risk_categories=[
        RiskCategory.Violence, 
        RiskCategory.Violence,  # Duplicate
        RiskCategory.HateUnfairness,
        RiskCategory.HateUnfairness   # Duplicate
    ],
    num_objectives=2,
    application_scenario="A travel planning assistant for vacation bookings"
)

print(f"Duplicate cases test completed with {len(duplicate_results.redteaming_data) if duplicate_results.redteaming_data else 0} conversations")
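
If you prefer to clean up the lists yourself before calling `scan`, duplicates can be dropped while keeping the original order. How the SDK deduplicates internally is not shown here; this is one common approach, with `DemoStrategy` as a stand-in enum and `dedupe` as a hypothetical helper.

```python
from enum import Enum


class DemoStrategy(Enum):
    """Stand-in for AttackStrategy, used only for this illustration."""
    Base64 = "base64"
    ROT13 = "rot13"


def dedupe(items: list) -> list:
    """Remove duplicates while preserving first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


strategies = [
    DemoStrategy.Base64,
    DemoStrategy.Base64,
    DemoStrategy.ROT13,
    DemoStrategy.ROT13,
]
print([strategy.name for strategy in dedupe(strategies)])
```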

## Example 13: Using PyRIT PromptChatTarget as Target

This example demonstrates using PyRIT's PromptChatTarget directly as a target for RedTeamAgent.

In [None]:
from pyrit.prompt_target import OpenAIChatTarget, PromptChatTarget

chat_target = OpenAIChatTarget(
    model_name=os.environ.get("AZURE_OPENAI_DEPLOYMENT", "your-deployment-name"),
    endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT", "https://your-deployment.openai.azure.com/"),
    api_key=os.environ.get("AZURE_OPENAI_KEY", "your_azure_openai_api_key")
)

# Run a scan using the PyRIT PromptChatTarget directly
pyrit_results = await agent.scan(
    target=chat_target,  # PyRIT PromptChatTarget instance
    scan_name="PyRIT-Target-Test",
    attack_strategies=[
        AttackStrategy.Base64,
        AttackStrategy.ROT13
    ],
    risk_categories=[RiskCategory.SelfHarm, RiskCategory.HateUnfairness],
    num_objectives=2,
    application_scenario="A general-purpose AI assistant"
)

print(f"PyRIT target scan completed with {len(pyrit_results.redteaming_data) if pyrit_results.redteaming_data else 0} conversations")