FEAT Add OR-Bench dataset loader #1423

Merged
romanlutz merged 18 commits into Azure:main from romanlutz:romanlutz/add-or-bench-dataset
Mar 3, 2026

@romanlutz

Add remote dataset loader for OR-Bench (bench-llm/OR-Bench), an over-refusal benchmark that tests whether language models wrongly refuse safe prompts. Supports both or-bench-hard-1k and or-bench-toxic configurations.

Copilot AI review requested due to automatic review settings March 1, 2026 14:25
romanlutz force-pushed the romanlutz/add-or-bench-dataset branch from fea917e to 5b341d2 on March 1, 2026 14:26

Copilot AI left a comment

Pull request overview

Adds a new remote dataset loader for the HuggingFace OR-Bench benchmark (bench-llm/OR-Bench) so it can be discovered via SeedDatasetProvider and loaded as SeedDataset seeds, with support for both the or-bench-hard-1k and or-bench-toxic configurations.

Changes:

  • Introduces _ORBenchDataset remote loader that fetches OR-Bench from HuggingFace and converts rows into SeedPrompts.
  • Registers the new loader for automatic discovery and documents the new dataset name in the datasets loading notebook output.
  • Adds unit tests covering default loading and the toxic config path.
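The row-to-seed conversion described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SeedPromptSketch` is a hypothetical stand-in for PyRIT's `SeedPrompt`, the `prompt` and `category` column names are assumptions about the OR-Bench schema, and dropping blank prompts and empty categories is a choice made for this sketch only.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class SeedPromptSketch:
    """Hypothetical, simplified stand-in for PyRIT's SeedPrompt."""

    value: str
    dataset_name: str
    harm_categories: tuple[str, ...] = ()


def rows_to_seed_prompts(
    rows: Iterable[Mapping[str, Optional[str]]], *, dataset_name: str
) -> list[SeedPromptSketch]:
    """Convert raw dataset rows into seed prompts.

    Assumes each row has a "prompt" column and an optional "category" column.
    """
    seeds = []
    for row in rows:
        text = (row.get("prompt") or "").strip()
        if not text:
            continue  # skip rows with no usable prompt text
        category = (row.get("category") or "").strip()
        seeds.append(
            SeedPromptSketch(
                value=text,
                dataset_name=dataset_name,
                # Only record a category when the row actually provides one.
                harm_categories=(category,) if category else (),
            )
        )
    return seeds


# Made-up rows for illustration; these are not taken from OR-Bench.
example_rows = [
    {"prompt": "How do I safely dispose of old medication?", "category": "harmful"},
    {"prompt": "   ", "category": "privacy"},
    {"prompt": "What is a phishing email?", "category": ""},
]
seeds = rows_to_seed_prompts(example_rows, dataset_name="or_bench")
```

Keeping the conversion a pure function of the rows means it can be exercised without any HuggingFace access.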

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `pyrit/datasets/seed_datasets/remote/or_bench_dataset.py` | Implements the OR-Bench HuggingFace-backed dataset loader and maps records into SeedPrompts. |
| `pyrit/datasets/seed_datasets/remote/__init__.py` | Imports/exports _ORBenchDataset to trigger provider registration and expose it from the remote loaders package. |
| `tests/unit/datasets/test_or_bench_dataset.py` | Adds unit tests validating prompt mapping and config propagation to the HuggingFace fetch helper. |
| `doc/code/datasets/1_loading_datasets.ipynb` | Updates the displayed list of available datasets to include or_bench. |
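As a rough illustration of what "config propagation to the HuggingFace fetch helper" could look like in a unit test, the sketch below injects a mock in place of the fetch helper and checks the arguments it receives. `ORBenchLoaderSketch`, its `fetch` parameter, and the keyword names `dataset_name` and `config` are all hypothetical; treating `or-bench-hard-1k` as the default is also an assumption.

```python
from unittest.mock import MagicMock


class ORBenchLoaderSketch:
    """Hypothetical loader; the fetch helper is injected so tests stay offline."""

    DATASET_ID = "bench-llm/OR-Bench"

    def __init__(self, fetch, config="or-bench-hard-1k"):
        self._fetch = fetch
        self._config = config

    def load(self):
        # Forward the dataset id and the selected config to the fetch helper.
        rows = self._fetch(dataset_name=self.DATASET_ID, config=self._config)
        return [row["prompt"] for row in rows]


def test_default_config_is_forwarded():
    fetch = MagicMock(return_value=[{"prompt": "p1"}])
    assert ORBenchLoaderSketch(fetch).load() == ["p1"]
    fetch.assert_called_once_with(
        dataset_name="bench-llm/OR-Bench", config="or-bench-hard-1k"
    )


def test_toxic_config_is_forwarded():
    fetch = MagicMock(return_value=[{"prompt": "p2"}])
    assert ORBenchLoaderSketch(fetch, config="or-bench-toxic").load() == ["p2"]
    fetch.assert_called_once_with(
        dataset_name="bench-llm/OR-Bench", config="or-bench-toxic"
    )
```

Asserting on the mock's call arguments checks the propagation path directly, independent of what the real dataset contains.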

romanlutz force-pushed the romanlutz/add-or-bench-dataset branch from 5b341d2 to 264aec8 on March 2, 2026 13:05
Copilot AI review requested due to automatic review settings March 2, 2026 13:43

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Copilot AI review requested due to automatic review settings March 2, 2026 14:06

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Copilot AI review requested due to automatic review settings March 2, 2026 15:08

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

romanlutz and others added 7 commits March 2, 2026 11:22
Add remote dataset loader for OR-Bench (bench-llm/OR-Bench), an over-refusal
benchmark that tests whether language models wrongly refuse safe prompts.
Supports both or-bench-hard-1k and or-bench-toxic configurations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…empty categories

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each OR-Bench config gets its own loader class with a custom
description, sharing common fetch logic via _ORBenchBaseDataset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…afety_tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
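The split described in the commit message above (one loader class per OR-Bench config, with shared fetch logic in `_ORBenchBaseDataset`) might be shaped roughly like this. Every class name, attribute, and description string here is hypothetical and chosen for illustration; the real base class and its registration with SeedDatasetProvider are not shown in this conversation.

```python
from typing import Callable, ClassVar, Iterable

# A fetcher takes (dataset id, config name) and yields row dicts.
Fetcher = Callable[[str, str], Iterable[dict]]


class ORBenchBaseSketch:
    """Hypothetical base class holding the fetch logic shared by all configs."""

    hf_dataset: ClassVar[str] = "bench-llm/OR-Bench"
    config: ClassVar[str]
    description: ClassVar[str]

    def fetch_rows(self, fetch: Fetcher) -> list[dict]:
        # Subclasses only choose the config; the fetch path is identical.
        return list(fetch(self.hf_dataset, self.config))


class ORBenchHard1KSketch(ORBenchBaseSketch):
    config = "or-bench-hard-1k"
    description = "Illustrative text: safe prompts that models may wrongly refuse."


class ORBenchToxicSketch(ORBenchBaseSketch):
    config = "or-bench-toxic"
    description = "Illustrative text: the toxic configuration of OR-Bench."


def fake_fetch(dataset_id: str, config: str) -> Iterable[dict]:
    """Offline stand-in for a real fetcher, used only for this example."""
    return [{"prompt": f"{dataset_id}:{config}"}]


hard_rows = ORBenchHard1KSketch().fetch_rows(fake_fetch)
toxic_rows = ORBenchToxicSketch().fetch_rows(fake_fetch)
```

With this shape, adding another configuration is a matter of declaring one more small subclass, and each subclass can carry its own description.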
romanlutz force-pushed the romanlutz/add-or-bench-dataset branch from 50762a2 to 014b274 on March 2, 2026 19:25
Copilot AI review requested due to automatic review settings March 2, 2026 22:49

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings March 2, 2026 23:25

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

romanlutz and others added 2 commits March 2, 2026 16:41
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 03:57

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

romanlutz and others added 2 commits March 2, 2026 20:34
- Fix pyproject.toml per-file-ignore comment (import-location not ordering)
- Regenerate 1_loading_datasets.ipynb with latest dataset list
- Fix E712, E722, E731 violations from merged main
- Fix mypy cross-platform issue in attack_manager.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 04:43

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

romanlutz merged commit b265532 into Azure:main on Mar 3, 2026
38 checks passed
romanlutz deleted the romanlutz/add-or-bench-dataset branch on March 3, 2026 04:58

3 participants