TEST: stop GCG unit tests from hitting HuggingFace by romanlutz · Pull Request #1886 · microsoft/PyRIT

romanlutz · 2026-06-02T01:48:25Z

Description

Five tests under tests/unit/auxiliary_attacks/gcg/ silently downloaded the gpt2 tokenizer at test time via AutoTokenizer.from_pretrained("gpt2"). When HuggingFace rate-limits (429), the dev_all matrix on windows-latest + Python 3.10 fails on otherwise-unrelated PRs (most recent example: 5 OSError: We couldn't connect to 'https://huggingface.co' ... failures on PR #1866). Per .github/instructions/test.instructions.md, unit tests must not hit the network.

This PR splits the offenders two ways based on what each test actually needs gpt2 for:

Mocked in place -- the lone tokenizer-edge-case test test_gcg_core.py::TestUpdateIdsErrorPaths::test_end_tok_returns_len_toks_when_target_is_at_prompt_end was rewritten to use a fully-mocked tokenizer where encoding.char_to_token returns None past the target's end. This mirrors the pattern already used by the two adjacent test_start_tok_* tests in the same class, so it both fits in and exercises the exact return len(toks) if tok is None else tok branch in end_tok.
Moved to integration -- four wiring tests that exist precisely to exercise the real chat-template pipeline end-to-end (constructing real IndividualPromptAttack / ProgressiveMultiPromptAttack and running them through _update_ids). Mocking the tokenizer richly enough to satisfy that pipeline would defeat the test's purpose. Destination matches the existing tests/integration/auxiliary_attacks/test_gcg_integration.py precedent (same gpt2 + custom chat-template pattern):
- the whole of tests/unit/auxiliary_attacks/gcg/test_attack_wiring.py (both tests in it)
- just the TestCreateAttackWiring class, extracted from tests/unit/auxiliary_attacks/gcg/test_generator.py
Both are consolidated into a new tests/integration/auxiliary_attacks/test_gcg_attack_wiring_integration.py. It carries no @pytest.mark.run_only_if_all_tests marker: that marker is for tests needing real API credentials, whereas these only need a HF tokenizer, and the precedent file doesn't use it either.
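For readers unfamiliar with the mocked-tokenizer pattern, here is a minimal sketch of the idea. The `end_tok` function and `make_mock_encoding` helper below are hypothetical stand-ins written for illustration; they are not the PyRIT implementation and only reproduce the `len(toks) if tok is None else tok` branch quoted above.

```python
from unittest.mock import MagicMock


def end_tok(encoding, toks, char_index):
    """Hypothetical stand-in for the branch under test (not PyRIT's code)."""
    tok = encoding.char_to_token(char_index)
    # When the character index falls past the last token, fall back to len(toks).
    return len(toks) if tok is None else tok


def make_mock_encoding(char_to_tok_map):
    """Build a mocked encoding whose char_to_token returns None for unmapped chars."""
    encoding = MagicMock()
    encoding.char_to_token.side_effect = lambda i: char_to_tok_map.get(i)
    return encoding


# Four made-up token ids; characters 0..5 map onto them, character 6 is past the end.
toks = [101, 102, 103, 104]
encoding = make_mock_encoding({0: 0, 1: 0, 2: 1, 3: 2, 4: 3, 5: 3})

past_end = end_tok(encoding, toks, 6)  # char_to_token returns None here
inside = end_tok(encoding, toks, 3)    # char_to_token returns a real index here
```

Because the encoding is a plain `MagicMock`, nothing is downloaded and the test controls exactly which character positions resolve to a token.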

GitHub Actions PR CI only runs make unit-test, so this fully removes the offenders from the PR-time matrix. The Azure DevOps integration pipeline still exercises them on push to main.

Tests and Documentation

Verified:

uv run pytest tests/unit/auxiliary_attacks/gcg/ -> 110 passed, no network
uv run --with pytest-socket pytest tests/unit/auxiliary_attacks/gcg/ --disable-socket --allow-hosts=127.0.0.1,localhost,::1 -> 110 passed (empirical proof: any off-loopback socket call would raise SocketBlockedError)
uv run pytest tests/integration/auxiliary_attacks/test_gcg_attack_wiring_integration.py -> 4 passed
rg 'from_pretrained\("gpt2"\)' tests/unit/ -> no matches
pre-commit (ruff format + ruff check) clean
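To illustrate why the `--disable-socket` run counts as proof, the following stdlib-only sketch shows the general idea of a loopback-only guard. It is an assumption-laden illustration, not how pytest-socket is implemented: `block_non_loopback` and `BlockedSocketError` are hypothetical names, and the guard only intercepts `connect`.

```python
import socket
from contextlib import contextmanager

# Same hosts as the --allow-hosts list used in the verification run.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


class BlockedSocketError(RuntimeError):
    """Hypothetical error type for this sketch (pytest-socket raises its own)."""


@contextmanager
def block_non_loopback():
    """Temporarily make socket.connect raise for any non-loopback host."""
    original_connect = socket.socket.connect

    def guarded_connect(self, address):
        host = address[0] if isinstance(address, tuple) else address
        if host not in LOOPBACK_HOSTS:
            # Raise before any DNS lookup or network traffic happens.
            raise BlockedSocketError(f"blocked connection to {host!r}")
        return original_connect(self, address)

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        socket.socket.connect = original_connect


blocked = False
with block_non_loopback():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect(("huggingface.co", 443))
    except BlockedSocketError:
        blocked = True
    finally:
        s.close()
```

With a guard of this kind active, a test that reached for a remote tokenizer would fail loudly instead of passing or flaking depending on network conditions.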

No documentation changes needed; this is a test-only refactor.

Out of scope but noted

tests/unit/prompt_converter/test_pdf_converter.py::test_filename_extension_existing_pdf makes a real requests.get to raw.githubusercontent.com/.../fake_CV.pdf -- same class of bug but not in today's failing job. Worth a separate follow-up.

5 tests under `tests/unit/auxiliary_attacks/gcg/` were silently downloading the gpt2 tokenizer at test time via `AutoTokenizer.from_pretrained("gpt2")`, which flakes the dev_all CI matrix when HuggingFace rate-limits (e.g. 5 OSError failures on windows-latest+py3.10+dev_all in PR microsoft#1866). Per `.github/instructions/test.instructions.md`, unit tests must not hit the network.

Two paths taken:

- **Mocked in place** the lone tokenizer-edge-case test whose adjacent siblings in the same class already use a fully-mocked tokenizer pattern: `test_gcg_core.py::TestUpdateIdsErrorPaths::test_end_tok_returns_len_toks_when_target_is_at_prompt_end`.
- **Moved to integration tier** four wiring tests that exist specifically to exercise the real chat-template pipeline end-to-end. Mocking the tokenizer richly enough to satisfy `_update_ids` would defeat the test's purpose. Destination matches the existing `tests/integration/auxiliary_attacks/test_gcg_integration.py` precedent (same gpt2 + custom chat-template pattern; no marker needed, since these run in `make integration-test`, not in the PR-time `make unit-test` matrix):
  - `test_attack_wiring.py::TestAttackClassWiring::*` (whole file)
  - `test_generator.py::TestCreateAttackWiring::*` (just the class)

  → consolidated into `tests/integration/auxiliary_attacks/test_gcg_attack_wiring_integration.py`.

Verification:

    uv run pytest tests/unit/auxiliary_attacks/gcg/ # 110 passed
    uv run pytest tests/integration/auxiliary_attacks/... # 4 passed
    rg 'from_pretrained\("gpt2"\)' tests/unit/ # no matches

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…sts-no-network

romanlutz and others added 2 commits June 1, 2026 17:26

Merge remote-tracking branch 'origin/main' into romanlutz/gcg-unit-te…

70b017f

…sts-no-network

rlundeen2 approved these changes Jun 2, 2026

romanlutz added this pull request to the merge queue Jun 2, 2026

Merged via the queue into microsoft:main with commit 9bb005f Jun 2, 2026
47 checks passed

romanlutz deleted the romanlutz/gcg-unit-tests-no-network branch June 2, 2026 19:22

romanlutz merged 2 commits into microsoft:main from romanlutz:romanlutz/gcg-unit-tests-no-network