FEAT Add BeaverTails dataset loader #1424

Merged
romanlutz merged 17 commits into Azure:main from romanlutz:romanlutz/add-beaver-tails-dataset
Mar 4, 2026

@romanlutz
Contributor

Add remote dataset loader for BeaverTails (PKU-Alignment/BeaverTails), containing 330k+ QA pairs annotated across 14 harm categories for safety alignment research. Filters to unsafe entries by default.

Copilot AI review requested due to automatic review settings March 1, 2026 14:28
@romanlutz romanlutz force-pushed the romanlutz/add-beaver-tails-dataset branch from 7b635d9 to b652d70 Compare March 1, 2026 14:28
Contributor

Copilot AI left a comment

Pull request overview

Adds a new remote seed dataset loader for the BeaverTails HuggingFace dataset, making it discoverable via SeedDatasetProvider and documenting its availability.

Changes:

  • Introduces _BeaverTailsDataset remote loader with optional unsafe_only filtering (default: unsafe only).
  • Registers the loader in the remote datasets module and adds unit tests for filtering behavior.
  • Updates the “Loading Built-in Datasets” notebook output to include the new dataset name.
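
The default filtering described in this overview can be sketched in plain Python. The row layout (a `prompt` string and an `is_safe` flag) and the helper name `select_prompts` are assumptions made for illustration; this is not the loader's actual code.

```python
from typing import Iterable


def select_prompts(rows: Iterable[dict], unsafe_only: bool = True) -> list[str]:
    """Return prompt texts, dropping rows flagged safe unless unsafe_only is False."""
    selected = []
    for row in rows:
        # Assumed schema: each row carries a "prompt" string and an "is_safe" bool.
        if unsafe_only and row.get("is_safe", False):
            continue
        selected.append(row["prompt"])
    return selected


# Two made-up rows standing in for dataset entries.
rows = [
    {"prompt": "benign question", "is_safe": True},
    {"prompt": "harmful question", "is_safe": False},
]
unsafe_prompts = select_prompts(rows)
all_prompts = select_prompts(rows, unsafe_only=False)
```

With the default, only the row flagged unsafe survives; passing `unsafe_only=False` keeps both rows.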

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pyrit/datasets/seed_datasets/remote/beaver_tails_dataset.py | New HuggingFace-backed loader that converts BeaverTails rows into SeedPrompts (unsafe-only by default). |
| pyrit/datasets/seed_datasets/remote/__init__.py | Imports/exports the new loader so it’s auto-registered/discoverable. |
| tests/unit/datasets/test_beaver_tails_dataset.py | Adds unit tests covering unsafe-only vs all-entries behavior and dataset naming. |
| doc/code/datasets/1_loading_datasets.ipynb | Notebook updated to reflect the new dataset in the available list (but now includes executed outputs/metadata). |

@romanlutz romanlutz force-pushed the romanlutz/add-beaver-tails-dataset branch 2 times, most recently from 9741ae3 to 1fd2ef7 Compare March 2, 2026 13:02
Copilot AI review requested due to automatic review settings March 2, 2026 13:02
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

Copilot AI review requested due to automatic review settings March 2, 2026 13:56
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Copilot AI review requested due to automatic review settings March 2, 2026 15:07
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

romanlutz and others added 7 commits March 2, 2026 13:48
Add remote dataset loader for BeaverTails (PKU-Alignment/BeaverTails), containing
330k+ QA pairs annotated across 14 harm categories for safety alignment research.
Filters to unsafe entries by default.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The HF dataset identifier is now a class constant HF_DATASET_NAME
instead of a constructor parameter, consistent with other loaders.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
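
A minimal sketch of the pattern this commit message describes, where the dataset identifier lives on the class rather than in the constructor signature. The class name `ExampleRemoteLoader` and the `source` property are hypothetical; only `HF_DATASET_NAME`, `unsafe_only`, and the dataset identifier come from the PR text.

```python
class ExampleRemoteLoader:
    """Hypothetical loader; not the PyRIT class itself."""

    # Fixed identifier shared by every instance; it is not a constructor argument.
    HF_DATASET_NAME = "PKU-Alignment/BeaverTails"

    def __init__(self, *, unsafe_only: bool = True) -> None:
        # Only behavior that legitimately varies per instance is a parameter.
        self.unsafe_only = unsafe_only

    @property
    def source(self) -> str:
        return self.HF_DATASET_NAME


loader = ExampleRemoteLoader(unsafe_only=False)
```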
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
For a 330k-row dataset, this avoids hundreds of thousands of
redundant string/list allocations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 2, 2026 21:50
@romanlutz romanlutz force-pushed the romanlutz/add-beaver-tails-dataset branch from 8a9dccb to a91052f Compare March 2, 2026 21:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

romanlutz and others added 2 commits March 2, 2026 16:48
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 04:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

romanlutz and others added 2 commits March 2, 2026 21:01
Copilot AI review requested due to automatic review settings March 3, 2026 05:05
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

pyrit/prompt_converter/braille_converter.py:133

  • In _get_braile, is_number is still reset to False after processing any character that isn’t in numberPunctuations (line 132). Since digits aren’t in numberPunctuations, this resets the number-mode after every digit, causing the Braille number indicator (characterUnicodes['num']) to be emitted before each digit instead of once per digit sequence. Consider resetting number-mode only when leaving a numeric run (e.g., when is_number is True and the current char is neither a digit nor an allowed number punctuation).
        is_number = False
        for char in text:
            if char in escapeCharacters:
                output += char
            elif char.isupper():
                if char.lower() in characterUnicodes:
                    output += characterUnicodes["caps"]
                    output += characterUnicodes[char.lower()]
            elif char in characterUnicodes:
                if char.isdigit() and not is_number:
                    is_number = True
                    output += characterUnicodes["num"]
                output += characterUnicodes[char]
            if is_number and char not in numberPunctuations:
                is_number = False
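
One way the suggested reset condition might look, as a standalone sketch. The tiny lookup tables and the function name `to_braille` are hypothetical stand-ins, not the converter's real data or method; the point is only that number mode ends when a character is neither a digit nor an allowed number punctuation.

```python
# Hypothetical, heavily reduced tables; the real converter's maps are larger.
NUM_INDICATOR = "⠼"
CHARACTER_UNICODES = {"1": "⠁", "2": "⠃", "3": "⠉", "a": "⠁", "b": "⠃"}
NUMBER_PUNCTUATIONS = {".", ","}


def to_braille(text: str) -> str:
    """Emit the number indicator once per digit run, not once per digit."""
    output = []
    in_number = False
    for char in text:
        if char.isdigit() and char in CHARACTER_UNICODES:
            if not in_number:
                # Entering a numeric run: emit the indicator exactly once.
                in_number = True
                output.append(NUM_INDICATOR)
            output.append(CHARACTER_UNICODES[char])
            continue
        # Leave number mode only when the numeric run really ends.
        if in_number and char not in NUMBER_PUNCTUATIONS:
            in_number = False
        # Characters without a mapping pass through unchanged in this sketch.
        output.append(CHARACTER_UNICODES.get(char, char))
    return "".join(output)
```

Under these assumptions, `to_braille("123")` contains a single number indicator, while `to_braille("12 3")` contains two because the space ends the first run.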

romanlutz and others added 2 commits March 2, 2026 21:21
Replaces isoformat().replace('+00:00', 'Z') with strftime('%Y-%m-%dT%H:%M:%SZ')
for second-resolution timestamps without microsecond noise.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
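
The difference between the two formatting calls named in this commit message can be seen on a fixed UTC timestamp. The sample datetime below is made up for illustration.

```python
from datetime import datetime, timezone

# Arbitrary UTC moment with a non-zero microsecond component.
moment = datetime(2026, 3, 3, 5, 21, 7, 123456, tzinfo=timezone.utc)

# Old approach: isoformat() keeps microseconds when they are non-zero.
old_style = moment.isoformat().replace("+00:00", "Z")

# New approach: strftime with an explicit format stops at whole seconds.
new_style = moment.strftime("%Y-%m-%dT%H:%M:%SZ")

print(old_style)  # 2026-03-03T05:21:07.123456Z
print(new_style)  # 2026-03-03T05:21:07Z
```

Note that the `strftime` format appends a literal `Z` without checking the timezone, so this sketch assumes the datetime is already in UTC.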
Copilot AI review requested due to automatic review settings March 3, 2026 12:43
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Contributor

@varunj-msft varunj-msft left a comment

Loader looks clean. Worth splitting the braille and markdown printer fixes into their own PRs? The braille one may need a follow-up on the reset condition.

romanlutz and others added 4 commits March 3, 2026 15:19
Merge latest main into the branch. Revert unrelated changes to
braille_converter.py and markdown_printer.py that don't belong
in the BeaverTails dataset PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wrap SeedPrompt construction in try/except TemplateSyntaxError to
gracefully skip prompts that contain Jinja2 syntax (e.g. endraw)
which would crash the template parser. Add test for this case.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
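
The skip-on-error pattern from this commit message, sketched without any dependencies. Everything here is a stand-in: `TemplateSyntaxError` is a local class imitating the Jinja2 exception of the same name, and `build_seed` is a hypothetical substitute for SeedPrompt construction that simply rejects one known-bad tag.

```python
class TemplateSyntaxError(Exception):
    """Local stand-in for the Jinja2 exception, so the sketch runs on its own."""


def build_seed(text: str) -> dict:
    """Hypothetical constructor that fails on a stray Jinja2-style block tag."""
    if "{% endraw %}" in text:
        raise TemplateSyntaxError("unexpected 'endraw' tag")
    return {"value": text}


def build_all(texts: list[str]) -> list[dict]:
    """Build a seed per text, skipping any text the constructor rejects."""
    seeds = []
    for text in texts:
        try:
            seeds.append(build_seed(text))
        except TemplateSyntaxError:
            # Skip the offending prompt instead of aborting the whole load.
            continue
    return seeds


seeds = build_all(["ok", "bad {% endraw %}", "fine"])
```

Catching only the specific exception keeps unrelated failures visible; a real loader would probably also log each skipped entry.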
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 self-assigned this Mar 4, 2026
@romanlutz romanlutz merged commit 799e981 into Azure:main Mar 4, 2026
37 checks passed
@romanlutz romanlutz deleted the romanlutz/add-beaver-tails-dataset branch March 4, 2026 01:36

4 participants