
FIX: Restore airt.cyber E2E + azure-ai-evaluation partner contract #1864

Open
romanlutz wants to merge 2 commits into microsoft:main from romanlutz:romanlutz/airt-cyber-and-partner-ci


Conversation

@romanlutz
Contributor

Two unrelated CI failures on main, surfaced together because the GitHub check-run annotations only carried wrapper messages. Full investigation lives in the commit body; short version below.

End-to-End scenario tests (AzDO build #11909)

Every test_scenario_with_pyrit_scan[*] parametrization fails with Server not available at http://localhost:8000 because PR #1545 turned pyrit_scan into a thin client of a separate pyrit_backend server, and the e2e test never started one. The airt.cyber attribution in the GitHub annotation is just whichever scenario pytest printed last — every airt scenario, benchmark.adversarial, garak.encoding, etc., fail the same way.

Fixing the server launch surfaced a second latent failure: PR #1785 (scenario technique consolidation) made GET /api/scenarios/catalog/<name> instantiate every scenario class. Cyber.__init__ calls _build_cyber_strategy(), which requires AttackTechniqueRegistry to be populated — but the scenario_technique initializer was not in the test config, so the catalog GET returned 500 before any scenario run could begin.
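The failure mode described above can be reproduced in a toy model. This is only a sketch: `TechniqueRegistry`, `ToyScenario`, `catalog_entry`, and `technique_initializer` are hypothetical stand-ins, not PyRIT classes or endpoints. It shows why a catalog handler that instantiates scenario classes returns 500 until the initializer that fills the registry has run.

```python
class TechniqueRegistry:
    """Toy process-wide registry (hypothetical stand-in, not the PyRIT class)."""

    _techniques: dict = {}

    @classmethod
    def register(cls, name: str, description: str) -> None:
        cls._techniques[name] = description

    @classmethod
    def names(cls) -> list:
        return sorted(cls._techniques)


class ToyScenario:
    """Scenario whose constructor needs the registry to be populated."""

    def __init__(self) -> None:
        names = TechniqueRegistry.names()
        if not names:
            raise RuntimeError("no attack techniques registered")
        self.strategies = names


def catalog_entry(scenario_cls):
    """Mimic a catalog GET handler: instantiate the class, map failures to 500."""
    try:
        scenario = scenario_cls()
    except Exception as exc:
        return 500, {"error": str(exc)}
    return 200, {"name": scenario_cls.__name__, "strategies": scenario.strategies}


def technique_initializer() -> None:
    """Stand-in for a startup initializer that fills the registry."""
    TechniqueRegistry.register("single_turn", "illustrative technique")
    TechniqueRegistry.register("multi_turn", "illustrative technique")


status_before, _ = catalog_entry(ToyScenario)      # registry still empty
technique_initializer()                            # what the config change enables
status_after, body_after = catalog_entry(ToyScenario)
```

In this model the first call fails inside the constructor and the second succeeds, which mirrors the ordering problem: the initializer has to run at backend startup, before any catalog request arrives.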

Fixes:

  • tests/end_to_end/conftest.py (new): session-scoped autouse fixture that launches pyrit_backend via ServerLauncher and tears it down on exit. Idempotent if a backend is already healthy.
  • tests/end_to_end/test_config.yaml: declare scenario_technique as a backend startup initializer so the catalog endpoint can instantiate scenarios that rely on AttackTechniqueRegistry.
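The idempotent start/stop behaviour of the fixture can be sketched as follows. This is a minimal illustration under assumptions, not the code in conftest.py: the `ServerLauncher` API is not shown in this description, so `start` and `stop` are placeholder callables, and `is_healthy` and `backend_session` are hypothetical helper names.

```python
from contextlib import contextmanager
from typing import Callable, Iterator
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 1.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response all count as unhealthy.
        return False


@contextmanager
def backend_session(
    healthy: Callable[[], bool],
    start: Callable[[], None],
    stop: Callable[[], None],
) -> Iterator[bool]:
    """Start the backend only if none is healthy; stop only what was started here."""
    started_here = False
    if not healthy():
        start()
        started_here = True
    try:
        yield started_here
    finally:
        if started_here:
            stop()


# In conftest.py this could be wrapped roughly as (sketch, assumes pytest):
#
# @pytest.fixture(scope="session", autouse=True)
# def backend():
#     with backend_session(healthy, start, stop):
#         yield
```

Tracking `started_here` is what makes the helper safe to use when a developer already has a backend running locally: the session reuses it and leaves it running on exit.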

Verified locally end-to-end with a dummy API key: the catalog endpoint succeeds, the scenario runs both strategies (2/2 attacks), and only the final OpenAI request fails with 401 (fake key). On AzDO with real Key Vault credentials the scenario is expected to pass.

Partner integration test (AzDO build #11908)

test_scorer_identifier_importable fails with:

ImportError: cannot import name 'ScorerIdentifier' from 'pyrit.identifiers'

(The AzDO "exit code 2" is the script wrapper, not a pytest collection error — pytest itself reported 1 failed / 97 passed / 3 skipped.)

PR #1387 collapsed ScorerIdentifier/AttackIdentifier/ConverterIdentifier/TargetIdentifier into ComponentIdentifier with no deprecation alias. azure-ai-evaluation's _rai_scorer.py still does "from pyrit.identifiers import ScorerIdentifier" and uses it as a return-type annotation (verified against the live partner source), so the test correctly flagged a real partner contract break.

Fix:

  • pyrit/identifiers/__init__.py: PEP 562 __getattr__ returns ComponentIdentifier for the name ScorerIdentifier and emits print_deprecation_message(removed_in="0.16.0") per the project deprecation policy. No partner code change required to keep azure-ai-evaluation working; the alias buys them a normal deprecation window to migrate.
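For readers unfamiliar with PEP 562, the alias mechanism can be sketched as below. This is an illustration under assumptions, not the code in this PR: it uses the standard `warnings.warn` in place of the project's print_deprecation_message helper (whose signature is not shown here), `ComponentIdentifier` is an empty stand-in class, and `identifiers_demo` is a throwaway module object used only to keep the example self-contained.

```python
import types
import warnings


class ComponentIdentifier:
    """Stand-in for the real class; for illustration only."""


# Old name -> (replacement object, version in which the alias goes away).
_DEPRECATED_ALIASES = {"ScorerIdentifier": (ComponentIdentifier, "0.16.0")}


def _module_getattr(name: str):
    """PEP 562 hook: called only when normal module attribute lookup fails."""
    if name in _DEPRECATED_ALIASES:
        target, removed_in = _DEPRECATED_ALIASES[name]
        warnings.warn(
            f"{name} is deprecated and will be removed in {removed_in}; "
            f"use {target.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return target
    raise AttributeError(f"module 'identifiers_demo' has no attribute {name!r}")


# In a real package the function is simply named __getattr__ at module level
# in __init__.py; a module object built here avoids touching the filesystem.
identifiers_demo = types.ModuleType("identifiers_demo")
identifiers_demo.ComponentIdentifier = ComponentIdentifier
identifiers_demo.__getattr__ = _module_getattr
```

Because the hook only fires for names that are missing from the module namespace, existing imports of ComponentIdentifier are unaffected, and unknown names still raise AttributeError so that `hasattr` checks and failed `from ... import` statements keep their usual behaviour.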

Verification

  • tests/partner_integration: 98 passed, 3 skipped (was 97 passed, 1 failed, 3 skipped)
  • tests/unit/identifiers: 210 passed
  • tests/unit/cli/test_pyrit_scan.py + test_scenario_service: clean
  • tests/end_to_end/test_scenarios.py[airt.cyber] locally: backend starts, catalog 200s, scenario runs to LLM call (failed only on fake-key 401)

Out of scope

The 3 test_all_datasets.py HuggingFace/GitHub fetch failures in build #11909 (flaky 3rd-party HTTP) are unrelated to either root cause and are being handled in a separate session.

romanlutz and others added 2 commits May 30, 2026 14:10
Two unrelated CI failures on main, surfaced together because the
GitHub check-run annotations only carried wrapper messages.

End-to-end scenario tests (AzDO build #11909)
---------------------------------------------
Every test_scenario_with_pyrit_scan[*] parametrization fails with
"Server not available at http://localhost:8000" because PR microsoft#1545
turned pyrit_scan into a thin client of a separate pyrit_backend
server, and the e2e test never started one. The "airt.cyber" attribution
in the GitHub annotation is just whichever scenario pytest printed last;
every airt scenario, benchmark.adversarial, garak.encoding, etc., fail
the same way.

Fixing the server launch surfaced a second latent failure: PR microsoft#1785
(scenario technique consolidation) made the catalog endpoint
(GET /api/scenarios/catalog/<name>) instantiate every scenario class.
Cyber.__init__ then calls _build_cyber_strategy() which requires the
AttackTechniqueRegistry to be populated, but the scenario_technique
initializer was not in the test config — so the catalog GET returned
500 before any scenario run could begin.

Fixes:
- tests/end_to_end/conftest.py (new): session-scoped autouse fixture
  that launches pyrit_backend via ServerLauncher and tears it down on
  exit. Idempotent if a backend is already healthy.
- tests/end_to_end/test_config.yaml: declare scenario_technique as a
  backend startup initializer so the catalog endpoint can instantiate
  scenarios that rely on AttackTechniqueRegistry.

Verified locally end-to-end with a dummy API key: the catalog endpoint
succeeds, the scenario runs both strategies (2/2 attacks), and only the
final OpenAI request fails with 401 (fake key). On AzDO with real
Key Vault credentials the scenario will pass.

Partner integration test (AzDO build #11908)
--------------------------------------------
test_scorer_identifier_importable fails with
  ImportError: cannot import name 'ScorerIdentifier' from 'pyrit.identifiers'
(the AzDO "exit code 2" is the script wrapper, not a pytest collection
error — pytest itself reported 1 failed, 97 passed, 3 skipped).

PR microsoft#1387 collapsed ScorerIdentifier/AttackIdentifier/ConverterIdentifier/
TargetIdentifier into ComponentIdentifier with no deprecation alias.
azure-ai-evaluation's _rai_scorer.py still does
"from pyrit.identifiers import ScorerIdentifier" and uses it as a
return-type annotation (verified against the live partner source),
so the test correctly flagged a real partner contract break.

Fix:
- pyrit/identifiers/__init__.py: PEP 562 __getattr__ returns
  ComponentIdentifier for the name ScorerIdentifier and emits
  print_deprecation_message(removed_in="0.16.0") per the project
  deprecation policy. No partner code change required to keep
  azure-ai-evaluation working; the alias buys them a normal
  deprecation window to migrate.

Verification
------------
- tests/partner_integration: 98 passed, 3 skipped (was 97 passed,
  1 failed, 3 skipped)
- tests/unit/identifiers: 210 passed
- tests/unit/cli/test_pyrit_scan.py + test_scenario_service: clean
- tests/end_to_end/test_scenarios.py[airt.cyber] locally: backend
  starts, catalog 200s, scenario runs to LLM call (failed only on
  fake-key 401)

Out of scope: the 3 test_all_datasets.py HuggingFace/GitHub fetch
failures in build #11909 (flaky 3rd-party HTTP, unrelated to either
root cause).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds unit tests that exercise the new module-level `__getattr__` in pyrit/identifiers/__init__.py: the deprecated ScorerIdentifier alias resolves to ComponentIdentifier and emits a DeprecationWarning mentioning the 0.16.0 removal version, and unknown attributes raise AttributeError. This restores diff-cover (>=90%) on the changed lines in pyrit/identifiers/__init__.py (was 44.4%).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>