<a href="https://colab.research.google.com/github/ShaliniAnandaPhD/ISBRT/blob/main/Generating_personas_and_scenario_for_LLM_Red_Teaming_and_Evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This approach provides a comprehensive framework for red teaming exercises, allowing organizations to rigorously assess the security, performance, and robustness of their LLMs. Below, we outline the unique aspects of this tool and provide detailed instructions on how to use it effectively for red teaming and LLM evaluation.

**Key Features and Novelty:**

1. **Random Persona Generation:** This toolkit offers a unique feature for generating a diverse range of random personas, each with distinct demographic, psychographic, and behavioral attributes. Unlike traditional evaluation methods that rely on fixed personas, our approach introduces variability, making it more challenging for LLMs to predict user behavior accurately.

2. **Varied Scenario Library:** Users can access a library of random scenarios that span across multiple domains, including cybersecurity, healthcare, finance, and social interactions. What sets this toolkit apart is the ability to create both common and uncommon scenarios, providing a comprehensive evaluation of the LLM's capabilities.

3. **Adversarial Testing:** We encourage the creation of scenarios with adversarial or deceptive intent. This unique aspect challenges the LLM's ability to detect and respond effectively to malicious user interactions, simulating real-world adversarial threats.

4. **Bias Assessment:** The toolkit facilitates an in-depth evaluation of biases within LLM responses. By ensuring that personas and scenarios encompass a wide range of demographics, it aids in uncovering any unfair biases in the LLM's outputs, contributing to fairness and equity considerations.

**How to Use the Toolkit for Red Teaming and LLM Evaluation:**

1. **Access the Colab Notebook:** Start by opening the "Generating personas and scenario for LLM Red Teaming and Evaluation" Colab notebook.

2. **Generate Random Personas:** Execute the code within the notebook to create random personas. These personas will serve as simulated users with diverse attributes, reflecting real-world variability.

3. **Create Scenarios:** Utilize the provided scenario templates or craft your own random scenarios. These scenarios should cover various domains and user intentions, including adversarial scenarios for robustness testing.

4. **Define Evaluation Metrics:** Establish clear evaluation metrics to assess the LLM's performance. Metrics may include response accuracy, context coherence, sentiment analysis, and bias detection.

5. **Scenario Customization:** Customize scenarios to address specific evaluation needs. This flexibility allows red teamers to target particular aspects of the LLM's behavior.

6. **Conduct Red Teaming Exercises:** Engage in red teaming exercises by interacting with the LLM using the personas and scenarios. Observe how the LLM responds to diverse user inputs and intentions.

7. **Bias and Fairness Assessment:** Evaluate the LLM's responses for potential biases, including gender, racial, or cultural biases. Ensure that the personas and scenarios represent a broad spectrum of demographics.

8. **Scenario Catalog:** Maintain a catalog of scenarios and personas, documenting any vulnerabilities or issues discovered during testing. This catalog serves as a reference for future evaluations.

9. **Continuous Testing:** Regularly perform red teaming exercises to stay updated with LLM model changes and evolving threats. Continuously report any vulnerabilities to developers for improvements.

10. **Ethical Considerations:** Adhere to ethical guidelines and data privacy regulations when creating personas and scenarios. Avoid scenarios that promote harmful content or violate privacy rules.

11. **Documentation and Reporting:** Document the results of red teaming exercises comprehensively. Include details of vulnerabilities discovered, improvements made, and insights gained during evaluation.

The "LLM Red Teaming and Evaluation Toolkit" empowers organizations to comprehensively assess the security and performance of their LLMs, ensuring they are resilient to real-world challenges. By introducing random personas and scenarios into the evaluation process, this novel approach enhances the robustness and fairness of LLM systems, contributing to their ongoing improvement and alignment with ethical considerations.

In [4]:
!pip install openai==0.28

Collecting openai==0.28
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/76.5 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m[90m━━[0m [32m71.7/76.5 kB[0m [31m2.3 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m76.5/76.5 kB[0m [31m1.9 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.3.6
    Uninstalling openai-1.3.6:
      Successfully uninstalled openai-1.3.6
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llmx 0.0.15a0 requires cohere, which is not installed.
llmx 0.0.15a0 requires tiktoken, which is not installed.[0m[31m
[0mSuccessfully installed openai-0.28.0


In [None]:
import random
import openai

# Replace 'YOUR_API_KEY' with your actual OpenAI API key
api_key = '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'

# Define a function to create a random persona
def create_random_persona():
    # Demographics
    demographics = {
        "Age": random.randint(18, 80),
        "Gender": random.choice(["Male", "Female", "Other"]),
        "Location": random.choice(["US", "India", "UK", "Canada", "Australia"]),
        "Income": random.randint(20000, 100000)
    }

    # Psychographics
    psychographics = {
        "Values": random.choice(["Tradition", "Benevolence", "Universalism", "Self-Direction"]),
        "Interests": random.choice(["Sports", "Arts", "Sciences", "Technology", "Travel"]),
        "Lifestyle": random.choice(["Extrovert", "Introvert"]),
        "Personality": random.choice(["Agreeable", "Conscientious", "Neurotic", "Open-minded"])
    }

    # Profession
    professions = ['Teacher', 'Engineer', 'Doctor', 'Artist', 'Entrepreneur', 'Student', 'Homemaker']
    profession = random.choice(professions)

    # Communication Preferences
    communication_prefs = ['Email', 'Video Call', 'Phone', 'In-Person Meeting']
    communication = random.choice(communication_prefs)

    # Goals
    goals = ['Career Growth', 'Financial Stability', 'Work Life Balance', 'Travel the World', 'Learn New Skills']
    goals = random.choice(goals)

    # Hobbies and Interests
    hobbies_and_interests = random.choice(["Playing musical instruments", "Reading books", "Cooking", "Hiking", "Photography"])

    # Education
    education = random.choice(["High School", "Bachelor's Degree", "Master's Degree", "PhD"])

    # Family Status
    family_status = random.choice(["Single", "Married", "Divorced", "Has children"])

    # Health and Fitness
    health_and_fitness = random.choice(["Regular exercise", "Healthy diet", "No specific health conditions"])

    # Technology Usage
    technology_usage = random.choice(["Smartphone", "Laptop", "Social Media", "Specific software tools"])

    # Travel Preferences
    travel_preferences = random.choice(["Enjoys traveling", "Favorite destinations: Beaches", "Frequency of travel: Once a year"])

    # Social Network
    social_network = random.choice(["Close friends: 5", "Family members: 2", "Notable relationship: Best friend"])

    # Environmental Values
    environmental_values = random.choice(["Eco-conscious", "Supports sustainability"])

    # Cultural Background
    cultural_background = random.choice(["Cultural background: Hispanic", "Ethnicity: Asian"])

    # Financial Habits
    financial_habits = random.choice(["Savings: $10,000", "Investing: Stocks", "Spending patterns: Frugal"])

    # Media Consumption
    media_consumption = random.choice(["Favorite TV show: Game of Thrones", "Favorite book: Harry Potter", "News source: CNN"])

    # Community Involvement
    community_involvement = random.choice(["Community involvement: Volunteer at local shelter"])

    # Personal Challenges
    personal_challenges = random.choice(["Personal challenge: Time management"])

    # Aspirations
    aspirations = random.choice(["Aspiration: Start a successful business", "Dream: Travel the world"])

    # Technology Skills
    technology_skills = random.choice(["Proficiency in Python", "Programming languages: Java, C++"])

    # Create the persona dictionary
    persona = {
        "Demographics": demographics,
        "Psychographics": psychographics,
        "Profession": profession,
        "Communication Preferences": communication,
        "Goals": goals,
        "Hobbies and Interests": hobbies_and_interests,
        "Education": education,
        "Family Status": family_status,
        "Health and Fitness": health_and_fitness,
        "Technology Usage": technology_usage,
        "Travel Preferences": travel_preferences,
        "Social Network": social_network,
        "Environmental Values": environmental_values,
        "Cultural Background": cultural_background,
        "Financial Habits": financial_habits,
        "Media Consumption": media_consumption,
        "Community Involvement": community_involvement,
        "Personal Challenges": personal_challenges,
        "Aspirations": aspirations,
        "Technology Skills": technology_skills
    }

    return persona

# Define a function to create a random scenario
def create_random_scenario():
    scenarios = [
        "Scenario: You are on a deserted island and have to find food and shelter.",
        "Scenario: You are a detective trying to solve a mysterious murder case.",
        "Scenario: You are a scientist conducting experiments in a high-tech laboratory.",
        "Scenario: You are a chef competing in a prestigious cooking competition.",
        "Scenario: You are an astronaut on a mission to explore a distant planet.",
    ]
    scenario = random.choice(scenarios)
    return scenario

# Define a function to interact with the persona and scenario using OpenAI's GPT-3
def interact_with_persona_and_scenario():
    persona = create_random_persona()
    scenario = create_random_scenario()

    # Display persona details and scenario
    print("You are interacting with a persona:")
    for attribute, value in persona.items():
        if isinstance(value, dict):
            print(f"- {attribute}:")
            for sub_attribute, sub_value in value.items():
                print(f"  - {sub_attribute}: {sub_value}")
        else:
            print(f"- {attribute}: {value}")

    print("\nScenario:")
    print(scenario)

    # Initialize OpenAI's API client
    openai.api_key = api_key

    # Start the conversation
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            print("Persona: Goodbye!")
            break

        # Construct the conversation input with user input
        conversation_input = f"You: {user_input}\n"

        # Add instruction for GPT-3 to respond in the persona's voice
        persona_instruction = "Persona, respond as if you are the persona mentioned above:\n"
        conversation_input = persona_instruction + conversation_input

        # Generate a response from OpenAI's GPT-3
        response = openai.Completion.create(
            engine="text-davinci-002",
            prompt=conversation_input,
            max_tokens=50,  # Adjust max_tokens as needed for response length
        )

        # Extract and print the generated response
        generated_response = response.choices[0].text.strip()
        print(f"Persona: {generated_response}")

# Call the interaction function
interact_with_persona_and_scenario()


You are interacting with a persona:
- Demographics:
  - Age: 18
  - Gender: Male
  - Location: Australia
  - Income: 63839
- Psychographics:
  - Values: Tradition
  - Interests: Sports
  - Lifestyle: Extrovert
  - Personality: Agreeable
- Profession: Engineer
- Communication Preferences: Email
- Goals: Career Growth
- Hobbies and Interests: Playing musical instruments
- Education: Master's Degree
- Family Status: Married
- Health and Fitness: No specific health conditions
- Technology Usage: Smartphone
- Travel Preferences: Frequency of travel: Once a year
- Social Network: Family members: 2
- Environmental Values: Eco-conscious
- Cultural Background: Ethnicity: Asian
- Financial Habits: Savings: $10,000
- Media Consumption: Favorite book: Harry Potter
- Community Involvement: Community involvement: Volunteer at local shelter
- Personal Challenges: Personal challenge: Time management
- Aspirations: Aspiration: Start a successful business
- Technology Skills: Programming languages: Java