# 🚀 Red Teaming AH-GPT with PyRIT

## 📌 Overview
This notebook automates **red teaming** for **AH-GPT** using the **PyRIT** framework. It simulates multi-turn customer interactions, scores the assistant's responses against each objective, and writes the results to an HTML report.

## 🛠️ Steps in this Notebook
- **🔧 Configuration** - Set up API endpoints, authentication, and HTTP requests.
- **📋 Define Objectives** - List test scenarios for customer interactions.
- **⚙️ Initialize PyRIT** - Set up the evaluation environment.
- **📡 Run Multi-Turn Conversations** - Automate interactions and assess responses.
- **📊 Generate Reports** - Save results in an HTML report.

## 📝 How to Use This Notebook
1. **▶️ Run each cell in order** from top to bottom.
2. **✏️ Modify the `objectives` list** to test different customer inquiries.
3. **📂 Inspect the HTML report** at the end for a detailed evaluation.


In [1]:
import asyncio
import logging
import os
import time
from pathlib import Path

from dotenv import load_dotenv

# PyRIT Imports
from pyrit.common import IN_MEMORY, initialize_pyrit
from pyrit.prompt_target import OpenAIChatTarget, AHGPTHttpTarget, AHGPTResponseParser
from pyrit.score.evaluator import Evaluator
from pyrit.orchestrator.multi_turn.ah_gpt.ah_gpt_orchestrator import AHGPTOrchestrator, RTOSystemPromptPaths
from pyrit.common.text_helper import save_html_report, generate_multi_turn_html_report


In [2]:
# Set logging level to WARNING to reduce log output
logging.basicConfig(level=logging.WARNING)


In [3]:
# Load environment variables
load_dotenv()

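# Define API credentials (read from the environment or a .env file)
BASE_URL = os.getenv("AH_GPT_NONPRD_ENDPOINT")
TOKEN = os.getenv("AH_GPT_NONPRD_TOKEN")
# Fail early: a missing value would otherwise appear as the string "None" in the request template
if not BASE_URL or not TOKEN:
    raise RuntimeError("AH_GPT_NONPRD_ENDPOINT and AH_GPT_NONPRD_TOKEN must be set")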

# Define HTTP request template for initiating conversations
start_chat_request_raw = f"""
POST {BASE_URL}
Content-Type: application/json
X-Authorization: {TOKEN}
Accept: */*

{{
    "message": "{{PROMPT}}",
    "model": "gpt-4o-mini"
}}
"""


In [4]:
# Strategy and evaluation configuration
strategy_path = RTOSystemPromptPaths.AH_GPT_USER.value
scorer_path = "assets/AH_Evaluators/ah_gpt/ah_gpt_chat_evaluator.yaml"
report_output_path = "tests/E2E/reports/CustomerAutomation"


In [5]:
objectives = [
    "Find out how to register working hours correctly in the internal system.",
    "Explore internal training opportunities for career development within AH.",
    "Understand the process for requesting and tracking vacation days."
]


In [6]:
async def main():
    # Initialize PyRIT and test results storage
    results = []
    initialize_pyrit(memory_db_type=IN_MEMORY)

    start_time = time.time()
    for objective in objectives:
        
        # Define evaluator variables for scoring context
        scorer_variables = {"objective": objective}

        # Define chat and HTTP targets
        chat_target = OpenAIChatTarget()
        http_prompt_target = AHGPTHttpTarget(
            http_request=start_chat_request_raw,
            prompt_regex_string="{PROMPT}",
            timeout=60.0,
            callback_function=AHGPTResponseParser.parse_response
        )

        # Create evaluator to assess responses
        chat_scorer = Evaluator(
            chat_target=chat_target,
            evaluator_yaml_path=Path(scorer_path),
            additional_evaluator_variables=scorer_variables,
            scorer_type="float_scale"
        )

        # Define orchestrator to manage conversations and evaluations
        orchestrator = AHGPTOrchestrator(
            adversarial_chat=chat_target,
            adversarial_chat_system_prompt_path=strategy_path,
            objective_target=http_prompt_target,
            objective_scorer=chat_scorer,
            verbose=True,
            evaluate_chat=True,
            max_turns=10,
            use_score_as_feedback=True
        )

        # Execute attack simulation
        result = await orchestrator.run_attack_async(objective=objective)

        # Retrieve and store the conversation report
        report = await result.get_conversation_report_async()
        results.append(report)

    execution_time = time.time() - start_time
    # Generate the final evaluation report
    await generate_report(results, execution_time)
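
The loop in `main()` runs the objectives one after another, so an exception raised for a single objective (for example an HTTP timeout) would stop the run before any report is written. A minimal sketch of isolating failures per objective follows; `run_objective` is a hypothetical stand-in for the orchestrator setup and the `run_attack_async` call, and the simulated failure exists only for illustration.

```python
import asyncio


async def run_objective(objective: str) -> dict:
    """Hypothetical stand-in for building the orchestrator and running one attack."""
    if "fail" in objective:
        raise TimeoutError("simulated timeout")
    await asyncio.sleep(0)
    return {"objective": objective, "status": "ok"}


async def run_all(objectives: list[str]) -> tuple[list[dict], list[tuple[str, str]]]:
    """Run objectives sequentially, recording failures instead of aborting."""
    results, failures = [], []
    for objective in objectives:
        try:
            results.append(await run_objective(objective))
        except Exception as exc:
            # Keep the objective and error so they can be reviewed after the run.
            failures.append((objective, repr(exc)))
    return results, failures
```

With this shape, the successful conversations can still be passed to the report step, and the `failures` list can be logged or inspected separately.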


In [7]:
async def generate_report(results, execution_time):
    # Ensure report directory exists
    report_dir = Path(report_output_path).resolve()
    report_dir.mkdir(parents=True, exist_ok=True)

    # Save results as an HTML report
    save_html_report(
        results=results,
        is_chat_evaluation=True,
        report_generator=generate_multi_turn_html_report,
        directory=str(report_dir),
        file_name="ah_gpt_automation",
        description=(
            "This report provides an overview of multi-turn customer simulations, "
            "highlighting how the assistant engages in realistic, task-focused conversations. "
            "Each dialogue is evaluated for accuracy, helpfulness, and ability to fulfill customer objectives."
        ),
        execution_time=execution_time
    )


In [8]:
await main()


[1m[92m
Created new chat with thread_id: 04fd26c7-0ece-4ab6-9d04-ef3bca50ef6a

Turn 1

User: Hi! I was wondering about how I keep track of my work hours here. What's the process for that?

A: {'text_message': 'To keep track of your work hours at Albert Heijn, you typically need to use the designated time tracking system or software provided by the company. This may involve clocking in and out at the beginning and end of your shifts, as well as during breaks. \n\nMake sure to regularly check your recorded hours for accuracy and report any discrepancies to your supervisor or HR department. If you need specific details about the system or software used, please consult your manager or the employee handbook.'}

Score: 0.6 : Last Turn Score:
The assistant's response provides relevant information regarding tracking work hours by mentioning the use of a designated time tracking system, clocking in/out processes, and the 