From c38ed1dfdfab50af1478a4ad11bccc0133b7f756 Mon Sep 17 00:00:00 2001
From: Richard Lundeen <rlundeen@microsoft.com>
Date: Mon, 16 Mar 2026 11:13:34 -0700
Subject: [PATCH 1/4] updating github instructions

---
 .github/copilot-instructions.md               | 40 ++++++++
 .../converters/converters.instructions.md     | 94 +++++++++++++++++++
 .../{ => docs}/docs.instructions.md           |  0
 .../instructions/style-guide.instructions.md  | 66 +++++--------
 .../test.instructions.md}                     |  2 +
 5 files changed, 157 insertions(+), 45 deletions(-)
 create mode 100644 .github/copilot-instructions.md
 create mode 100644 .github/instructions/converters/converters.instructions.md
 rename .github/instructions/{ => docs}/docs.instructions.md (100%)
 rename .github/instructions/{unittests.instructions.md => tests/test.instructions.md} (87%)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 0000000000..7d41c40922
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,40 @@
+# PyRIT - Repository Instructions
+
+PyRIT (Python Risk Identification Tool for generative AI) is an open-source framework for security professionals to proactively identify risks in generative AI systems. It is maintained by the Microsoft AI Red Team.
+
+## Architecture
+
+PyRIT uses a modular "Lego brick" design. The main extensibility points are:
+
+- **Prompt Converters** (`pyrit/prompt_converter/`) — Transform prompts (70+ implementations). Base: `PromptConverter`.
+- **Scorers** (`pyrit/score/`) — Evaluate responses. Base: `Scorer`.
+- **Prompt Targets** (`pyrit/prompt_target/`) — Send prompts to LLMs/APIs. Base: `PromptTarget`.
+- **Executors / Scenarios** (`pyrit/executor/`, `pyrit/scenario/`) — Orchestrate multi-turn attacks.
+- **Memory** (`pyrit/memory/`) — `CentralMemory` for prompt/response persistence.
+
+All major components inherit from `Identifiable` and must implement `_build_identifier()`.
+
+## Coding Conventions
+
+Follow `.github/instructions/style-guide.instructions.md` for full details. The most critical rules:
+
+- All async functions MUST end with `_async`.
+- Functions with >1 parameter use `*` after self/cls to enforce keyword-only args.
+- Every parameter and return type must have type annotations.
+
+## Code Review Guidelines
+
+When performing a code review, be selective. Only leave comments for issues that genuinely matter:
+
+- Bugs, logic errors, or security concerns
+- Unclear code that would benefit from refactoring for readability
+- Violations of the critical coding conventions above (async suffix, keyword-only args, type annotations)
+- Missing or incorrect `_build_identifier()` implementations
+
+Do NOT leave comments about:
+- Style nitpicks that ruff/isort would catch automatically
+- Missing docstrings or comments — we prefer minimal documentation. Code should be self-explanatory.
+- Suggestions to add inline comments, logging, or error handling that isn't clearly needed
+- Minor naming preferences or subjective "improvements"
+
+Aim for fewer, higher-signal comments. A review with 2-3 important comments is better than 15 trivial ones.
diff --git a/.github/instructions/converters/converters.instructions.md b/.github/instructions/converters/converters.instructions.md
new file mode 100644
index 0000000000..23858641d2
--- /dev/null
+++ b/.github/instructions/converters/converters.instructions.md
@@ -0,0 +1,94 @@
+---
+applyTo: "pyrit/prompt_converter/**"
+---
+
+# Prompt Converter Development Guidelines
+
+## Base Class Contract
+
+All converters MUST inherit from `PromptConverter` and implement:
+
+```python
+class MyConverter(PromptConverter):
+    SUPPORTED_INPUT_TYPES = ("text",)    # Required — non-empty tuple of PromptDataType values
+    SUPPORTED_OUTPUT_TYPES = ("text",)   # Required — non-empty tuple of PromptDataType values
+
+    async def convert_async(self, *, prompt: str, input_type: PromptDataType = "text") -> ConverterResult:
+        ...
+```
+
+Missing or empty `SUPPORTED_INPUT_TYPES` / `SUPPORTED_OUTPUT_TYPES` raises `TypeError` at class definition time via `__init_subclass__`.
+
+## ConverterResult
+
+Always return a `ConverterResult` with matching `output_type`:
+
+```python
+return ConverterResult(output_text=result, output_type="text")
+```
+
+Valid `PromptDataType` values: `"text"`, `"image_path"`, `"audio_path"`, `"video_path"`, `"binary_path"`, `"url"`, `"error"`.
+
+The `output_type` MUST match what was actually produced — e.g., if you wrote a file, return the path with `"image_path"`, not `"text"`.
+
+## Input Validation
+
+Check input type support in `convert_async`:
+
+```python
+if not self.input_supported(input_type):
+    raise ValueError(f"Input type {input_type} not supported")
+```
+
+## Identifiable Pattern
+
+All converters inherit `Identifiable`. Override `_build_identifier()` to include parameters that affect conversion behavior:
+
+```python
+def _build_identifier(self) -> ComponentIdentifier:
+    return self._create_identifier(
+        params={"encoding": self._encoding},           # Behavioral params only
+        children={"target": self._target.get_identifier()}  # If converter wraps a target
+    )
+```
+
+Include: encoding types, templates, offsets, model names.
+Exclude: retry counts, logging config, timeouts.
+
+## Standard Imports
+
+```python
+from pyrit.models import PromptDataType
+from pyrit.prompt_converter.prompt_converter import ConverterResult, PromptConverter
+from pyrit.identifiers import ComponentIdentifier
+```
+
+For LLM-based converters, also import:
+```python
+from pyrit.prompt_target import PromptChatTarget
+```
+
+## Constructor Pattern
+
+Use keyword-only arguments. Use `@apply_defaults` if the converter accepts targets or common config:
+
+```python
+from pyrit.common.apply_defaults import apply_defaults
+
+class MyConverter(PromptConverter):
+    @apply_defaults
+    def __init__(self, *, target: PromptChatTarget, template: str = "default") -> None:
+        ...
+```
+
+## Exports
+
+New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
+
+## Common Review Issues
+
+- `output_type` in `ConverterResult` doesn't match what's actually returned
+- Synchronous/blocking I/O inside `convert_async` (use `aiofiles` or run in executor)
+- `_build_identifier` missing parameters that change output behavior
+- Missing export in `__init__.py`
+- Using positional args instead of keyword-only in `__init__` or `convert_async`
diff --git a/.github/instructions/docs.instructions.md b/.github/instructions/docs/docs.instructions.md
similarity index 100%
rename from .github/instructions/docs.instructions.md
rename to .github/instructions/docs/docs.instructions.md
diff --git a/.github/instructions/style-guide.instructions.md b/.github/instructions/style-guide.instructions.md
index e411fa83d9..8ddb2e6099 100644
--- a/.github/instructions/style-guide.instructions.md
+++ b/.github/instructions/style-guide.instructions.md
@@ -42,11 +42,21 @@ def validate_input(self, data: dict) -> None:  # Should be private
 - **EVERY** function parameter MUST have explicit type declaration
 - **EVERY** function MUST declare its return type
 - Use `None` for functions that don't return a value
-- Import types from `typing` module as needed
+
+### Modern Type Syntax (Python 3.10+)
+- Use built-in generics and union syntax:
+  - `list[str]` not `List[str]`
+  - `dict[str, Any]` not `Dict[str, Any]`
+  - `str | None` not `Optional[str]`
+  - `int | float` not `Union[int, float]`
+- Still import `Any`, `Literal`, `TypeVar`, `Protocol`, `cast` etc. from `typing` as needed
 
 ```python
 # CORRECT
-def process_data(self, *, data: List[str], threshold: float = 0.5) -> Dict[str, Any]:
+def process_data(self, *, data: list[str], threshold: float = 0.5) -> dict[str, Any]:
+    ...
+
+def get_name(self) -> str | None:
     ...
 
 # INCORRECT
@@ -54,15 +64,6 @@ def process_data(self, data, threshold=0.5):  # Missing all type annotations
     ...
 ```
 
-### Common Type Imports
-```python
-from typing import (
-    Any, Dict, List, Optional, Union, Tuple, Set,
-    Callable, TypeVar, Generic, Protocol, Literal,
-    cast, overload
-)
-```
-
 ## Function Signatures
 
 ### Keyword-Only Arguments
@@ -75,13 +76,13 @@ def __init__(
     self,
     *,
     target: PromptTarget,
-    scorer: Optional[Scorer] = None,
+    scorer: Scorer | None = None,
     max_retries: int = 3
 ) -> None:
     ...
 
 # INCORRECT
-def __init__(self, target: PromptTarget, scorer: Optional[Scorer] = None, max_retries: int = 3):
+def __init__(self, target: PromptTarget, scorer: Scorer | None = None, max_retries: int = 3):
     ...
 ```
 
@@ -132,7 +133,7 @@ def calculate_score(
     response: str,
     objective: str,
     threshold: float = 0.8,
-    max_attempts: Optional[int] = None
+    max_attempts: int | None = None
 ) -> Score:
     """
     Calculate the score for a response against an objective.
@@ -144,7 +145,7 @@ def calculate_score(
         response (str): The response text to evaluate.
         objective (str): The objective to evaluate against.
         threshold (float): The minimum score threshold. Defaults to 0.8.
-        max_attempts (Optional[int]): Maximum number of scoring attempts. Defaults to None.
+        max_attempts (int | None): Maximum number of scoring attempts. Defaults to None.
 
     Returns:
         Score: The calculated score object containing value and metadata.
@@ -155,31 +156,6 @@ def calculate_score(
     """
 ```
 
-## Enums and Constants
-
-### Use Enums Over Literals
-- Always use Enum classes instead of Literal types for predefined choices
-- Enums are more maintainable and provide better IDE support
-
-```python
-# CORRECT
-from enum import Enum
-
-class AttackOutcome(Enum):
-    SUCCESS = "success"
-    FAILURE = "failure"
-    UNDETERMINED = "undetermined"
-
-def process_result(self, *, outcome: AttackOutcome) -> None:
-    ...
-
-# INCORRECT
-from typing import Literal
-
-def process_result(self, *, outcome: Literal["success", "failure", "undetermined"]) -> None:
-    ...
-```
-
 ### Class-Level Constants
 - Define constants as class attributes, not module-level
 - Use UPPER_CASE naming for constants
@@ -244,7 +220,7 @@ import logging
 from dataclasses import dataclass
 from enum import Enum
 from pathlib import Path
-from typing import Dict, List, Optional
+from typing import Any
 
 # Third-party imports
 import numpy as np
@@ -317,7 +293,7 @@ if not self._model:
 
 ```python
 # CORRECT
-def process_items(self, *, items: List[str]) -> List[str]:
+def process_items(self, *, items: list[str]) -> list[str]:
     if not items:
         return []
 
@@ -328,7 +304,7 @@ def process_items(self, *, items: List[str]) -> List[str]:
     return [self._process_single(item) for item in items]
 
 # INCORRECT - Excessive nesting
-def process_items(self, *, items: List[str]) -> List[str]:
+def process_items(self, *, items: list[str]) -> list[str]:
     if items:
         if len(items) == 1:
             return [self._process_single(items[0])]
@@ -411,7 +387,7 @@ class AttackExecutor:
         *,
         target: PromptTarget,
         scorer: Scorer,
-        logger: Optional[logging.Logger] = None
+        logger: logging.Logger | None = None
     ) -> None:
         self._target = target
         self._scorer = scorer
@@ -456,7 +432,7 @@ def process_large_dataset(self, *, file_path: Path) -> Generator[Result, None, N
             yield self._process_line(line)
 
 # INCORRECT
-def process_large_dataset(self, *, file_path: Path) -> List[Result]:
+def process_large_dataset(self, *, file_path: Path) -> list[Result]:
     with open(file_path) as f:
         lines = f.readlines()  # Loads entire file into memory
     return [self._process_line(line) for line in lines]
diff --git a/.github/instructions/unittests.instructions.md b/.github/instructions/tests/test.instructions.md
similarity index 87%
rename from .github/instructions/unittests.instructions.md
rename to .github/instructions/tests/test.instructions.md
index 7d7c998191..3a61375c1c 100644
--- a/.github/instructions/unittests.instructions.md
+++ b/.github/instructions/tests/test.instructions.md
@@ -6,6 +6,8 @@ applyTo: '**/tests/**'
 
 When generating unit tests for PyRIT components, follow these comprehensive guidelines to ensure consistent, maintainable, and thorough test coverage.
 
+Tests should be readable, and it should be clear what is being tested and how. Tests should also be maintainable, and it should be easy to update them when the code changes. We should make sure we reuse code. If there are common patterns across tests, we should extract those into helper functions or fixtures. This makes it easier to write new tests and maintain existing ones.
+
 ## Core Testing Requirements
 
 ### Database/Memory Isolation

From 9448af280f4e26252020fd2da39485496cf55d40 Mon Sep 17 00:00:00 2001
From: Richard Lundeen <rlundeen@microsoft.com>
Date: Mon, 16 Mar 2026 11:51:14 -0700
Subject: [PATCH 2/4] updating

---
 .github/copilot-instructions.md               |  15 +-
 .../converters.instructions.md                |   8 -
 .../{docs => }/docs.instructions.md           |   4 +-
 .../instructions/scenarios.instructions.md    | 141 ++++++++++++++++++
 .../{tests => }/test.instructions.md          |   0
 5 files changed, 146 insertions(+), 22 deletions(-)
 rename .github/instructions/{converters => }/converters.instructions.md (87%)
 rename .github/instructions/{docs => }/docs.instructions.md (96%)
 create mode 100644 .github/instructions/scenarios.instructions.md
 rename .github/instructions/{tests => }/test.instructions.md (100%)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 7d41c40922..4f3d518b97 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -1,6 +1,6 @@
 # PyRIT - Repository Instructions
 
-PyRIT (Python Risk Identification Tool for generative AI) is an open-source framework for security professionals to proactively identify risks in generative AI systems. It is maintained by the Microsoft AI Red Team.
+PyRIT (Python Risk Identification Tool for generative AI) is an open-source framework for security professionals to proactively identify risks in generative AI systems.
 
 ## Architecture
 
@@ -12,16 +12,6 @@ PyRIT uses a modular "Lego brick" design. The main extensibility points are:
 - **Executors / Scenarios** (`pyrit/executor/`, `pyrit/scenario/`) — Orchestrate multi-turn attacks.
 - **Memory** (`pyrit/memory/`) — `CentralMemory` for prompt/response persistence.
 
-All major components inherit from `Identifiable` and must implement `_build_identifier()`.
-
-## Coding Conventions
-
-Follow `.github/instructions/style-guide.instructions.md` for full details. The most critical rules:
-
-- All async functions MUST end with `_async`.
-- Functions with >1 parameter use `*` after self/cls to enforce keyword-only args.
-- Every parameter and return type must have type annotations.
-
 ## Code Review Guidelines
 
 When performing a code review, be selective. Only leave comments for issues that genuinely matter:
@@ -29,7 +19,6 @@ When performing a code review, be selective. Only leave comments for issues that
 - Bugs, logic errors, or security concerns
 - Unclear code that would benefit from refactoring for readability
 - Violations of the critical coding conventions above (async suffix, keyword-only args, type annotations)
-- Missing or incorrect `_build_identifier()` implementations
 
 Do NOT leave comments about:
 - Style nitpicks that ruff/isort would catch automatically
@@ -38,3 +27,5 @@ Do NOT leave comments about:
 - Minor naming preferences or subjective "improvements"
 
 Aim for fewer, higher-signal comments. A review with 2-3 important comments is better than 15 trivial ones.
+
+Follow `.github/instructions/style-guide.instructions.md` for style guidelines. And look in `.github/instructions/` for specific instructions on the different components.
diff --git a/.github/instructions/converters/converters.instructions.md b/.github/instructions/converters.instructions.md
similarity index 87%
rename from .github/instructions/converters/converters.instructions.md
rename to .github/instructions/converters.instructions.md
index 23858641d2..b7e954c1b3 100644
--- a/.github/instructions/converters/converters.instructions.md
+++ b/.github/instructions/converters.instructions.md
@@ -84,11 +84,3 @@ class MyConverter(PromptConverter):
 ## Exports
 
 New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
-
-## Common Review Issues
-
-- `output_type` in `ConverterResult` doesn't match what's actually returned
-- Synchronous/blocking I/O inside `convert_async` (use `aiofiles` or run in executor)
-- `_build_identifier` missing parameters that change output behavior
-- Missing export in `__init__.py`
-- Using positional args instead of keyword-only in `__init__` or `convert_async`
diff --git a/.github/instructions/docs/docs.instructions.md b/.github/instructions/docs.instructions.md
similarity index 96%
rename from .github/instructions/docs/docs.instructions.md
rename to .github/instructions/docs.instructions.md
index 2050af2e79..659f1ea22a 100644
--- a/.github/instructions/docs/docs.instructions.md
+++ b/.github/instructions/docs.instructions.md
@@ -1,5 +1,5 @@
 ---
-applyTo: 'doc/code/**/*.{py,ipynb}'
+applyTo: 'doc/**/*.{py,ipynb}'
 ---
 
 # Documentation File Synchronization
@@ -8,7 +8,7 @@ applyTo: 'doc/code/**/*.{py,ipynb}'
 
 All Jupyter notebooks (.ipynb) in the `doc/` directory have corresponding Python (.py) files that are **tightly synchronized**. These files MUST always match exactly in content. They represent the same documentation in different formats.
 
-**Locations:** `doc/code/**/*.ipynb` and `doc/code/**/*.py`
+**Locations:** `doc/**/*.ipynb` and `doc/**/*.py`
 
 ## Editing Guidelines
 
diff --git a/.github/instructions/scenarios.instructions.md b/.github/instructions/scenarios.instructions.md
new file mode 100644
index 0000000000..ba544465fc
--- /dev/null
+++ b/.github/instructions/scenarios.instructions.md
@@ -0,0 +1,141 @@
+---
+applyTo: "pyrit/scenario/**"
+---
+
+# PyRIT Scenario Development Guidelines
+
+Scenarios orchestrate multi-attack security testing campaigns. Each scenario groups `AtomicAttack` instances and executes them sequentially against a target.
+
+## Base Class Contract
+
+All scenarios inherit from `Scenario` (ABC) and must:
+
+1. **Define `VERSION`** as a class constant (increment on breaking changes)
+2. **Implement four abstract methods:**
+
+```python
+class MyScenario(Scenario):
+    VERSION: int = 1
+
+    @classmethod
+    def get_strategy_class(cls) -> type[ScenarioStrategy]:
+        return MyStrategy
+
+    @classmethod
+    def get_default_strategy(cls) -> ScenarioStrategy:
+        return MyStrategy.ALL
+
+    @classmethod
+    def default_dataset_config(cls) -> DatasetConfiguration:
+        return DatasetConfiguration(dataset_names=["my_dataset"])
+
+    async def _get_atomic_attacks_async(self) -> list[AtomicAttack]:
+        ...
+```
+
+## Constructor Pattern
+
+```python
+@apply_defaults
+def __init__(
+    self,
+    *,
+    adversarial_chat: PromptChatTarget | None = None,
+    objective_scorer: TrueFalseScorer | None = None,
+    scenario_result_id: str | None = None,
+) -> None:
+    # 1. Resolve defaults for optional params
+    if not objective_scorer:
+        objective_scorer = self._get_default_scorer()
+
+    # 2. Store config objects for _get_atomic_attacks_async
+    self._scorer_config = AttackScoringConfig(objective_scorer=objective_scorer)
+
+    # 3. Call super().__init__ — required args: version, strategy_class, objective_scorer
+    super().__init__(
+        version=self.VERSION,
+        strategy_class=MyStrategy,
+        objective_scorer=objective_scorer,
+    )
+```
+
+Requirements:
+- `@apply_defaults` decorator on `__init__`
+- All parameters keyword-only via `*`
+- `super().__init__()` called with `version`, `strategy_class`, `objective_scorer`
+- complex objects like `adversarial_chat` or `objective_scorer` should be passed into the constructor.
+
+## Dataset Loading
+
+Datasets are read from `CentralMemory`.
+
+### Basic — named datasets:
+```python
+DatasetConfiguration(
+    dataset_names=["airt_hate", "airt_violence"],
+    max_dataset_size=10,  # optional: sample up to N per dataset
+)
+```
+
+### Advanced — custom subclass for filtering:
+```python
+class MyDatasetConfiguration(DatasetConfiguration):
+    def get_seed_groups(self) -> dict[str, list[SeedGroup]]:
+        result = super().get_seed_groups()
+        # Filter by selected strategies via self._scenario_composites
+        return filtered_result
+```
+
+Options:
+- `dataset_names` — load by name from memory
+- `seed_groups` — pass explicit groups (mutually exclusive with `dataset_names`)
+- `max_dataset_size` — cap per dataset
+- Override `_load_seed_groups_for_dataset()` for custom loading
+
+## Strategy Enum
+
+Strategies should be selectable by an axis. E.g. it could be harm category or and attack type, but likely not both or it gets confusing.
+
+```python
+class MyStrategy(ScenarioStrategy):
+    ALL = ("all", {"all"})            # Required aggregate
+    EASY = ("easy", {"easy"})
+
+    Base64 = ("base64", {"easy", "converter"})
+    Crescendo = ("crescendo", {"difficult", "multi_turn"})
+
+    @classmethod
+    def get_aggregate_tags(cls) -> set[str]:
+        return {"all", "easy", "difficult"}
+```
+
+- `ALL` aggregate is always required
+- Each member: `NAME = ("string_value", {tag_set})`
+- Aggregates expand to all strategies matching their tag
+
+## AtomicAttack Construction
+
+```python
+AtomicAttack(
+    atomic_attack_name=strategy_name,   # groups related attacks
+    attack=attack_instance,             # AttackStrategy implementation
+    seed_groups=list(seed_groups),       # must be non-empty
+    memory_labels=self._memory_labels,   # from base class
+)
+```
+
+- `seed_groups` must be non-empty — validate before constructing
+- `self._objective_target` is only available after `initialize_async()` — don't access in `__init__`
+- Pass `memory_labels` to every AtomicAttack
+
+## Exports
+
+New scenarios must be registered in `pyrit/scenario/__init__.py` as virtual package imports.
+
+## Common Review Issues
+
+- Accessing `self._objective_target` or `self._scenario_composites` before `initialize_async()`
+- Forgetting `@apply_defaults` on `__init__`
+- Empty `seed_groups` passed to `AtomicAttack`
+- Missing `VERSION` class constant
+- Missing `_async` suffix on `_get_atomic_attacks_async`
diff --git a/.github/instructions/tests/test.instructions.md b/.github/instructions/test.instructions.md
similarity index 100%
rename from .github/instructions/tests/test.instructions.md
rename to .github/instructions/test.instructions.md

From 6720159b39fb5e3b70eb6faede4dd82f1b067e2f Mon Sep 17 00:00:00 2001
From: Richard Lundeen <rlundeen@microsoft.com>
Date: Tue, 17 Mar 2026 14:15:14 -0700
Subject: [PATCH 3/4] pr feedback

---
 .../instructions/converters.instructions.md   |   5 +-
 .github/instructions/test.instructions.md     | 110 +++++++-----------
 2 files changed, 44 insertions(+), 71 deletions(-)

diff --git a/.github/instructions/converters.instructions.md b/.github/instructions/converters.instructions.md
index b7e954c1b3..945d03f6ce 100644
--- a/.github/instructions/converters.instructions.md
+++ b/.github/instructions/converters.instructions.md
@@ -81,6 +81,7 @@ class MyConverter(PromptConverter):
         ...
 ```
 
-## Exports
+## Exports and External Updates
 
-New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
+- New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
+- The modality table with new/updated converters `doc/code/converters/0_converters.ipynb` and the associated .py pct file must also be updated.
\ No newline at end of file
diff --git a/.github/instructions/test.instructions.md b/.github/instructions/test.instructions.md
index 3a61375c1c..ee45335ec0 100644
--- a/.github/instructions/test.instructions.md
+++ b/.github/instructions/test.instructions.md
@@ -2,86 +2,58 @@
 applyTo: '**/tests/**'
 ---
 
-# PyRIT Test Generation Instructions
+# PyRIT Test Instructions
 
-When generating unit tests for PyRIT components, follow these comprehensive guidelines to ensure consistent, maintainable, and thorough test coverage.
+Readable, maintainable tests. Reuse helpers from `conftest.py` and `mocks.py` in each test tier.
 
-Tests should be readable, and it should be clear what is being tested and how. Tests should also be maintainable, and it should be easy to update them when the code changes. We should make sure we reuse code. If there are common patterns across tests, we should extract those into helper functions or fixtures. This makes it easier to write new tests and maintain existing ones.
+## Test Tiers
 
-## Core Testing Requirements
+Most tests should be unit tests. Integration and end-to-end tests are for testing that systems work toegether.
 
-### Database/Memory Isolation
+- **Unit** (`tests/unit/`): Mock all external dependencies. Fast, parallel (`pytest -n 4`). Run: `make unit-test`
+- **Integration** (`tests/integration/`): Real APIs, real credentials. Requires `RUN_ALL_TESTS=true`. Sequential. Run: `make integration-test`
+- **End-to-End** (`tests/end_to_end/`): Full scenarios via `pyrit_scan` CLI, no mocking, very slow. Run: `make end-to-end-test`
 
-For unit tests (in tests/unit):
-- Always use `@pytest.mark.usefixtures("patch_central_database")` decorator on unit test classes that may interact with the Central Memory
-- This ensures tests run in isolation without affecting the actual database
+## Unit Test Rules
 
-### Async Testing
-- Use `@pytest.mark.asyncio` decorator for all async test methods
-- Use `AsyncMock` instead of `MagicMock` when mocking async methods
-- Properly await all async operations in tests
+- Directory mirrors `pyrit/` (e.g. `pyrit/prompt_converter/` → `tests/unit/converter/`)
+- File naming: `test_[component].py`
+- Group tests in classes prefixed with `Test`
+- Use `@pytest.mark.usefixtures("patch_central_database")` on classes touching Central Memory
+- Use `@pytest.mark.asyncio` and `AsyncMock` for async methods
+- Reuse `tests/unit/mocks.py` helpers: `MockPromptTarget`, `get_sample_conversations`, `get_mock_target_identifier`, `openai_chat_response_json_dict`
+- Key fixtures from `tests/unit/conftest.py`: `patch_central_database`, `sqlite_instance`
+- No network calls should ever happen in unit tests, but file access is okay
+- Unit tests should be fast and can be run in parallel.
 
+## Integration Test Rules
 
-### Using Pre-Configured Settings
+- Mark with `@pytest.mark.run_only_if_all_tests` (skipped unless `RUN_ALL_TESTS=true`)
+- File naming: `test_[component]_integration.py`
+- Has its own `conftest.py` (`azuresql_instance` fixture, `initialize_pyrit_async`) and `mocks.py`
+- Minimize mocking — test real integrations
 
-Check conftest and mocks.py to see if there are common utilities that can be reused across tests.
+## What to Test
 
-One common issue is setting the central database. Use the `patch_central_database` is a common solution.
+- **Init**: valid/invalid params, defaults, required-only vs all-optional
+- **Core methods**: normal inputs, boundary conditions, return values, side effects
+- **Errors**: invalid input, exception propagation, cleanup on failure
 
+## Mocking & Style Preferences
 
-### Test Organization
-- Group related tests into classes with descriptive names starting with `Test`
-- Place tests in `tests/unit/[module]/test_[component].py`
-- Each test class should focus on a specific aspect of the component
+- Use `unittest.mock.patch` / `patch.object` — not `monkeypatch`
+- Prefer `patch.object(instance, "method", new_callable=AsyncMock)` over broad module-path patches
+- Use `AsyncMock` directly for async methods, `MagicMock` for sync
+- Use `spec=ClassName` on mocks when you need to constrain to a real interface
+- Use `side_effect` for sequences or raising exceptions
+- For environment variables: `patch.dict("os.environ", {...})`
+- Check calls with `assert_called_once()`, `.call_args`, or `.call_count` — avoid `assert_called_once_with` for complex args, prefer `.call_args` inspection
 
-## Test Structure Guidelines
+## Test Structure Preferences
 
-### 1. Initialization Tests
-Test all initialization scenarios:
-- Valid initialization with required parameters only
-- Initialization with all optional parameters
-- Invalid parameter combinations that should raise exceptions
-- Default value verification
-- Configuration object handling
-
-### 2. Core Functionality Tests
-For each public method:
-- Test normal operation with valid inputs
-- Test boundary conditions
-- Test return values and side effects
-- Verify state changes
-- Test method interactions
-
-### 3. Error Handling Tests
-Comprehensive error scenario coverage:
-- Invalid input handling
-- Exception propagation
-- Recovery mechanisms
-- Error message clarity
-- Resource cleanup on failure
-
-### 4. Integration Tests
-Test component interactions:
-- Mock external dependencies appropriately
-- Verify correct calls to dependencies
-- Test data flow between components
-- Validate contracts between components
-
-## Mocking Best Practices
-
-### Dependency Isolation
-- Mock all external dependencies (APIs, databases, file systems)
-- Mock at the appropriate level - not too high, not too low
-- Use dependency injection patterns where possible
-
-### Mock Configuration
-```python
-# Example patterns to follow:
-# For async methods
-mock_obj.method_name.return_value = AsyncMock(return_value=expected_result)
-
-# For sync methods
-mock_obj.method_name.return_value = expected_result
-
-# For side effects
-mock_obj.method_name.side_effect = [result1, result2, Exception("error")]
+- **Standalone test functions preferred** over test classes (use classes only when `usefixtures` or grouping is needed)
+- **Fixtures**: define at module level with `@pytest.fixture`, not as class methods. Use `yield` for setup/teardown
+- **Test naming**: `test_<method_or_noun>_<scenario>` (e.g. `test_convert_async_default_settings`, `test_init_with_no_key_raises`)
+- **Assertions**: use plain `assert` statements. Use `pytest.raises(ExceptionType, match="...")` for error cases
+- **Test data**: define constants at module level, use `mocks.py` helpers, or inline small data. Don't create separate data files for unit tests
+- **Parametrize**: use `@pytest.mark.parametrize` for data-driven tests with multiple inputs

From a0e9e1bbb6df8c9089dc4cd12df9ec263277e9f7 Mon Sep 17 00:00:00 2001
From: Richard Lundeen <rlundeen@microsoft.com>
Date: Tue, 17 Mar 2026 14:17:21 -0700
Subject: [PATCH 4/4] pre-commit

---
 .github/instructions/converters.instructions.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/instructions/converters.instructions.md b/.github/instructions/converters.instructions.md
index 945d03f6ce..9c00bb2ed1 100644
--- a/.github/instructions/converters.instructions.md
+++ b/.github/instructions/converters.instructions.md
@@ -84,4 +84,4 @@ class MyConverter(PromptConverter):
 ## Exports and External Updates
 
 - New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
-- The modality table with new/updated converters `doc/code/converters/0_converters.ipynb` and the associated .py pct file must also be updated.
\ No newline at end of file
+- The modality table with new/updated converters `doc/code/converters/0_converters.ipynb` and the associated .py pct file must also be updated.