## Planning AI Red Teaming

### Contents
1. [Planning AI Red Teaming](#planning-ai-red-teaming)
   - Introduction and Session Overview
2. [Introduction to Red Team Planning](#introduction-to-red-team-planning)
3. [Red Teaming Activity](#red-teaming-activity)
   - **Scenario:** Medical Glossary Chatbot
   - Task Description
   - Detailed Steps:
     1. [Goal](#goal)
        - Define the specific area(s) to test
     2. [Assemble the Team](#assemble-the-team)
        - List team members and assign focus areas (RAI harms)
     3. [Design & Perform Tests](#design--perform-tests)
        - Execute tests using the chatbot UI with code integration
     4. [Report the Results](#report-the-results)
        - Document findings and plan next steps
4. [Resources](#resources)
   - External Guides and Example Prompts


### Overview
This session provides a high level overview of how to plan for red teaming AI systems. 


### Objectives

By the end of this session, you will:
1. Assemble a multidisciplinary red team to address different Responsible AI (RAI) harms.
2. Design and perform iterative adversarial tests at both the language model and application levels.
3. Demonstrate specific areas within a given chatbot application that require mitigations.
4. Document findings and identify opportunities for continuous monitoring and improvement.


#### Introduction to Red Team Planning

Planning AI red teaming exercises involves several essential steps to ensure a comprehensive and effective assessment of your AI systems. Steps include:

- **Defining the Goals**: Determine the specific areas you want to test. This helps in setting clear objectives for the exercise.

- **Assembling the team**: It is important to assemble a diverse red team from differnt backgrounds. Assign tasks/features to each person to probe for specific types of RAI harms, and rotate roles periodically. Clearly defined objectives should be outlined, detailing the goals of each testing round, the features and issues to be focused on, the expected time and effort required  along with detailed recording instructions to encourage reproducibility of results. Also, maintaining effective communication is crucial, and this can be achieved by identifying key contact points for any questions that may arise during the process.

- **Designing & performing adversarial tests**: Red teamers should perform iterative, adversarial tests at both the LLM and application layers to identify harms. Regularly retest after applying mitigations and document results.

- **Reporting results & continuous monitoring**: Record tests/results centrally and share regular reports with top issues, next testing plans and relevant info. Red teaming is an interative approach, with regular assessments. This includes weekly sprints to continually identify, measure, and mitigate risks.




#### Red Teaming Activity

**Scenario**: An organization called `MD Web` has developed a medical glossary AI Assistant. This intelligent tool is exclusively programmed to provide precise definitions and contextual explanations for a wide range of medical terms. Its system prompt has been designed to ensure that, while it may offer detailed explanations for technical vocabulary, it never provides treatment recommendations or direct medical advice.

**Task**: Create a plan for conducting one iteration of red teaming for this chatbot. Include information about the team, the RAI harms to consider, and the techniques that will be used to test the system.


##### 1. Goal

*Determine the specific area(s) that you plan to test*




##### 2. Assemble the team

- **Assemble a diverse team**

    *List persons who will be needed on the team. Think about the diverse backgrounds and experience that's needed for the use case*



- **Assign red teamers to harms and/or features**

    *For each team member listed above, which RAI harm should be their focus*

##### 3. Design & Perform Tests

*Perform various techniques using the chatbot UI below*


In [None]:
from utils.MedicalChatbotApp import MedicalChatbot
import warnings

# Suppress all warnings
warnings.filterwarnings('ignore')

# Create an instance of the Chatbot class
bot = MedicalChatbot()
bot.launch()


##### 4. Report the results

*What were the findings?*


*What is your plan for the next iteration of testing*



#### Summary
This notebook introduces a structured approach for planning RAI red teaming exercises, with a use case of a medical glossary AI assistant. It guides users through setting clear testing goals, assembling a diverse red teaming group, designing and executing iterative adversarial tests on both the language model and application layers, and finally reporting and documenting test results. 

**Key Takeaways**:

- Structured & Iterative Testing:
The methodology stresses a cyclical testing process that includes clearly defined objectives and iterative reassessment to continuously mitigate potential Responsible AI harms.

- Team Diversity:
Assembling a diverse red team with clear role assignments based on expertise ensures comprehensive coverage of risk areas, enabling more effective identification and mitigation of vulnerabilities.


Next Step:
- Lab 3: Automating Red Teaming

#### Resources
- [Planning red teaming for large language models (LLMs) and their applications](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/red-teaming)
    - https://aka.ms/red-teaming-planning-guide
- [Introduction to AI security testing](https://learn.microsoft.com/en-us/training/modules/introduction-ai-security-testing/)

- https://aka.ms/LLM-red-teamer-instructions-TEMPLATE

- [Example prompts](https://aka.ms/RAIRedTeamExamplePrompts)
