351 group intents by categories#409

Merged
Nicola Franco (franconicola) merged 4 commits into main from 351-group-intents-by-categories on May 29, 2026

Conversation

@marcorusso97
Contributor

Summary

This PR adds support for selecting attack goals from an intents taxonomy (OmniSafeBench-MM) and propagates explicit category/subcategory labels to tracking and dashboard outputs.

What Changed

  1. Added intents as a supported input source (alongside goals and dataset).
  2. Implemented input precedence:
    1. goals
    2. intents
    3. dataset
  3. Added intents selection support with three formats:
    1. Full label strings
    2. Enums (IntentCategory, IntentSubcategory)
    3. Label codes (for example: A, A1, A2)
  4. When intents provide explicit labels:
    1. Category-classifier preflight is skipped.
    2. Canonical labels are used directly in results metadata.
  5. Fixed tracking/index alignment for deferred and batched flows by preserving global goal index offsets.
  6. Added new intents dataset loader and public exports.
  7. Extended shared attack config schema with the intents field.
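
The input precedence in item 2 can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: `resolve_goal_source` and the plain-dict config shape are hypothetical, and only the ordering (goals, then intents, then dataset) comes from this PR.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)

# Precedence order stated in the PR: goals > intents > dataset.
SOURCE_PRECEDENCE = ("goals", "intents", "dataset")


def resolve_goal_source(config: dict[str, Any]) -> Optional[str]:
    """Return the name of the input source that wins (hypothetical helper).

    A source counts as "provided" when its key is present and not None.
    """
    provided = [key for key in SOURCE_PRECEDENCE if config.get(key) is not None]
    if not provided:
        return None
    winner, ignored = provided[0], provided[1:]
    if ignored:
        # Lower-priority sources are ignored, with a warning.
        logger.warning("Multiple sources provided. Using %r, ignoring %s.", winner, ignored)
    return winner
```

For example, a config carrying both `intents` and `dataset` resolves to `intents`, and `dataset` is only used when it is the sole source present.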

Default Behavior

  1. If subcategories is omitted, all subcategories of the selected category are used.
  2. If samples_per_subcategory is omitted, the default is 1 for each selected subcategory.
  3. If both are omitted, one sample is drawn for every subcategory in the selected category.
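
A small sketch of how these defaults could be resolved. Everything here is assumed for illustration: the `TAXONOMY` mapping is a placeholder rather than the real OmniSafeBench-MM taxonomy, `expand_selection` is a hypothetical helper, and `samples_per_subcategory` is treated as a single integer.

```python
from typing import Optional

# Placeholder taxonomy. The codes follow the "A, A1, A2" pattern mentioned in
# this PR, but the mapping itself is invented for the example.
TAXONOMY: dict[str, list[str]] = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
}


def expand_selection(
    category: str,
    subcategories: Optional[list[str]] = None,
    samples_per_subcategory: Optional[int] = None,
) -> dict[str, int]:
    """Map each selected subcategory code to the number of samples to draw."""
    available = TAXONOMY[category]
    # Default 1: omitted subcategories means "all subcategories of the category".
    selected = available if subcategories is None else subcategories
    unknown = [sub for sub in selected if sub not in available]
    if unknown:
        raise ValueError(f"Unknown subcategories for {category!r}: {unknown}")
    # Default 2: omitted samples_per_subcategory means one sample each.
    count = 1 if samples_per_subcategory is None else samples_per_subcategory
    return {sub: count for sub in selected}
```

With this sketch, `expand_selection("A")` yields one sample for each of `A1`, `A2`, and `A3`, which matches the third rule above.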

Documentation

  1. Updated Datasets root page with intents taxonomy usage.
  2. Added a dedicated page: Selecting intent categories.
  3. Added an early mention in Getting Started Datasets tutorial.
  4. Added explicit source citation for OmniSafeBench-MM.
  5. Replaced the taxonomy list with a structured table including:
    1. Code
    2. Name
    3. Enum ID
    4. Samples
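
To illustrate how a table row with a code, a name, and an enum ID relates to the three accepted selection formats (full label, enum, label code), here is a hedged sketch. `DemoIntentCategory`, its members, its label format, and `parse_category` are all invented for this example; the real `IntentCategory` enum and its parsing logic may look different.

```python
from enum import Enum
from typing import Union


class DemoIntentCategory(Enum):
    # Placeholder members: names and labels are illustrative only.
    EXAMPLE_ALPHA = "A. Example Alpha"
    EXAMPLE_BETA = "B. Example Beta"

    def __str__(self) -> str:
        # Print the full label instead of "DemoIntentCategory.EXAMPLE_ALPHA".
        return str(self.value)

    @property
    def code(self) -> str:
        # Assumes labels are formatted as "<code>. <name>".
        return self.value.split(".", 1)[0]


def parse_category(selector: Union[str, DemoIntentCategory]) -> DemoIntentCategory:
    """Accept an enum member, a full label string, or a label code."""
    if isinstance(selector, DemoIntentCategory):
        return selector
    text = selector.strip()
    for member in DemoIntentCategory:
        if text == member.value or text.upper() == member.code:
            return member
    raise ValueError(f"Unknown intent category: {selector!r}")
```

All three of `parse_category("A")`, `parse_category("A. Example Alpha")`, and `parse_category(DemoIntentCategory.EXAMPLE_ALPHA)` resolve to the same member in this sketch.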

Tests and Validation

  1. Added/updated unit tests for:
    1. Intents parsing and mapping
    2. Orchestrator precedence
    3. Classifier bypass with explicit labels
    4. Tracking coordinator offset behavior
  2. Documentation build passes successfully (npm run build).
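
The tracking coordinator offset behavior listed above can be pictured with a short sketch. It assumes goals are processed in fixed-size batches and that each batch must report indices relative to the full goal list rather than restarting at zero; the function names are hypothetical and do not come from the tracker module.

```python
from typing import Iterator, Sequence


def iter_batches_with_offsets(
    goals: Sequence[str], batch_size: int
) -> Iterator[tuple[int, Sequence[str]]]:
    """Yield (global_offset, batch) pairs so callers can keep global indices."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, len(goals), batch_size):
        yield offset, goals[offset:offset + batch_size]


def global_indices(goals: Sequence[str], batch_size: int) -> list[tuple[int, str]]:
    """Pair every goal with its index in the full list, even when batched."""
    pairs: list[tuple[int, str]] = []
    for offset, batch in iter_batches_with_offsets(goals, batch_size):
        for local_index, goal in enumerate(batch):
            # Without adding the offset, every batch would restart at index 0
            # and tracking records would collide across batches.
            pairs.append((offset + local_index, goal))
    return pairs
```

A unit test in this style would check that the indices stay contiguous across batch boundaries, including a final batch that is smaller than `batch_size`.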

Backward Compatibility

  1. Existing goals and dataset flows remain unchanged.
  2. New behavior is opt-in via the intents config path.

Source Attribution

Intent taxonomy and intents list are based on OmniSafeBench-MM:
https://github.com/jiaxiaojunQAQ/OmniSafeBench-MM/

@marcorusso97 Marco Russo (marcorusso97) linked an issue May 29, 2026 that may be closed by this pull request
@franconicola Nicola Franco (franconicola) temporarily deployed to 351-group-intents-by-categories - Docs PR #409 May 29, 2026 12:17 — with Render Destroyed

if intents_config is not None and dataset_config is not None:
    logger.warning("Both 'intents' and 'dataset' provided. Using 'intents'.")
    dataset_config = None

def __str__(self) -> str:
    return str(self.value)

pass
@codecov

codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 78.46154% with 56 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| hackagent/datasets/intents.py | 78.71% | 43 Missing ⚠️ |
| hackagent/attacks/orchestrator.py | 70.96% | 9 Missing ⚠️ |
| hackagent/router/tracking/tracker.py | 81.81% | 4 Missing ⚠️ |

Member

LGTM

@franconicola Nicola Franco (franconicola) merged commit d5106c7 into main May 29, 2026
24 checks passed
@franconicola Nicola Franco (franconicola) deleted the 351-group-intents-by-categories branch May 29, 2026 16:41
Development

Successfully merging this pull request may close these issues.

Group intents by categories

2 participants