Skip to content

fix: route large and renamed Flax MessagePack checkpoints#1280

Merged
mldangelo-oai merged 9 commits into
mainfrom
mdangelo/codex/fix-renamed-flax-msgpack-routing
May 26, 2026
Merged

fix: route large and renamed Flax MessagePack checkpoints#1280
mldangelo-oai merged 9 commits into
mainfrom
mdangelo/codex/fix-renamed-flax-msgpack-routing

Conversation

@mldangelo-oai
Copy link
Copy Markdown
Contributor

@mldangelo-oai mldangelo-oai commented May 24, 2026

Summary

  • route renamed Flax/JAX MessagePack checkpoints to the existing flax_msgpack scanner when bounded structural inspection exposes a checkpoint-state root (params, opt_state, or model_state)
  • walk large MessagePack maps with fixed depth/node budgets while seeking past oversized scalar metadata, so roots after large metadata cannot be hidden
  • retain conservative routing when those budgets are exhausted, but return explicit incomplete coverage before full MessagePack unpacking instead of a possible clean result
  • add direct, directory, filter, scanner, aggregate-exit, and cache regressions plus user-facing documentation

Why

A malicious MessagePack checkpoint saved as payload.jpg was silently skipped. During adversarial review, an otherwise identical renamed map larger than 1 MiB with a harmless metadata value before params was also skipped, while .flax reported the existing critical __reduce__ finding.

A further audit found a second failure mode in the draft: a generic 1.1 MiB MessagePack map with no checkpoint-state root could exhaust the bounded routing walk, be routed as flax_msgpack, fully unpacked, and return a clean result with exit code 0.

The repaired path reaches later top-level checkpoint roots without materializing large scalar values. If a renamed structure cannot be classified within the routing budget, it is still retained for safety but now returns inconclusive with reason flax_msgpack_routing_probe_limit_exceeded, is not cached, and yields exit code 2 without msgpack.unpackb.

Validation

  • env VIRTUAL_ENV=/Users/mdangelo/code/modelaudit/.venv UV_CACHE_DIR=/tmp/modelaudit-uv-cache uv run --active --no-sync ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • env VIRTUAL_ENV=/Users/mdangelo/code/modelaudit/.venv UV_CACHE_DIR=/tmp/modelaudit-uv-cache uv run --active --no-sync ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • env VIRTUAL_ENV=/Users/mdangelo/code/modelaudit/.venv UV_CACHE_DIR=/tmp/modelaudit-uv-cache uv run --active --no-sync mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ (Success: no issues found in 451 source files)
  • env npm_config_cache=/private/tmp/modelaudit-npm-cache npx prettier --check CHANGELOG.md README.md docs/user/compatibility-matrix.md
  • focused routing/scanner/core suite: 327 passed, 3 warnings
  • direct regression proof: budget-exhausting renamed generic map now returns flax_msgpack, success=False, scan_outcome=inconclusive, reason flax_msgpack_routing_probe_limit_exceeded, exit code 2
  • env PYTHONPATH=/private/tmp/modelaudit-flax-routing VIRTUAL_ENV=/Users/mdangelo/code/modelaudit/.venv UV_CACHE_DIR=/tmp/modelaudit-uv-cache PROMPTFOO_DISABLE_TELEMETRY=1 uv run --active --no-sync pytest -n auto -m "not slow and not integration" --maxfail=1 (5705 passed, 16 skipped, 21 warnings)

Recognize bounded, structurally valid Flax/JAX MessagePack payloads when they use misleading suffixes, while leaving generic MessagePack state maps skipped. Add routing, filtering, fail-closed, and malicious payload regression coverage plus user-facing documentation.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 24, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 683.47ms -> 695.76ms (+1.8%).

Workload Benchmark Target Size Files Baseline Current Change Status
suspicious-pickle-intake tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake suspicious-intake 183.8 KiB 4 89.70ms 95.95ms +7.0% stable
clean-training-checkpoint tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint safe_large 278.2 KiB 1 16.02ms 15.00ms -6.4% stable
warm-cache-rescan tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan release-candidate 547.3 KiB 32 55.16ms 57.04ms +3.4% stable
duplicate-heavy-registry tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot registry-snapshot 915.2 KiB 13 189.46ms 195.19ms +3.0% stable
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] nested_base64 98 B 1 497.1us 489.0us -1.6% stable
direct-malicious-upload tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload malicious_reduce 52 B 1 1.61ms 1.59ms -1.4% stable
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] nested_raw 78 B 1 472.3us 478.4us +1.3% stable
single-checkpoint-preflight tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load single_checkpoint.pkl 183.0 KiB 1 36.95ms 36.69ms -0.7% stable
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] nested_hex 130 B 1 492.8us 489.4us -0.7% stable
padded-multi-stream-upload tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload multi_stream_padded 4.1 KiB 1 1.68ms 1.67ms -0.6% stable
chunked-upload-stream tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream chunked_stream 278.2 KiB 1 18.76ms 18.73ms -0.2% stable
mixed-model-repository tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository release-candidate 547.3 KiB 32 272.66ms 272.44ms -0.1% stable

@mldangelo-oai mldangelo-oai changed the title fix: route renamed Flax MessagePack checkpoints fix: route large and renamed Flax MessagePack checkpoints May 24, 2026
@mldangelo-oai
Copy link
Copy Markdown
Contributor Author

Independent follow-up audit found and repaired a false-clean path in the large renamed MessagePack routing logic.

Reproduction before 61e74e77: a renamed ~1.1 MiB generic MessagePack map with no params/opt_state/model_state root, but enough nested metadata nodes to exhaust the 4,096-node routing budget, was classified as flax_msgpack, fully unpacked, and completed cleanly with aggregate exit code 0.

Repair: bounded routing now distinguishes confirmed roots from probe exhaustion. An ambiguous renamed map remains conservatively routed, but FlaxMsgpackScanner returns explicit incomplete coverage (flax_msgpack_routing_probe_limit_exceeded) before msgpack.unpackb, with exit code 2 and no cache entry. Confirmed large malicious renamed checkpoints still receive full security analysis.

Validation:

  • focused routing/scanner/core suite: 327 passed, 3 warnings
  • full Ruff format/check clean; Prettier clean; mypy clean for 451 source files
  • full non-slow/non-integration suite with rebuilt native picklescan extension: 5705 passed, 16 skipped, 21 warnings

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves Flax/JAX MessagePack detection so renamed checkpoints can be routed to the Flax scanner using bounded structural inspection, and ambiguous oversized renamed maps fail closed as inconclusive instead of being treated as clean.

Changes:

  • Adds bounded MessagePack probing for renamed Flax/JAX checkpoint roots such as params, opt_state, and model_state.
  • Updates Flax MessagePack scanning and registry routing to handle renamed or header-detected MessagePack checkpoints.
  • Adds regression coverage and user-facing documentation for renamed/large MessagePack checkpoint behavior.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
modelaudit/utils/file/detection.py Adds bounded MessagePack structure probing and renamed Flax checkpoint detection.
modelaudit/scanners/flax_msgpack_scanner.py Routes can_handle through structural detection and adds fail-closed inconclusive handling.
modelaudit/scanner_registry_metadata.py Registers flax_msgpack as a header-routed format.
tests/utils/file/test_filetype.py Adds format-detection regressions for large renamed Flax MessagePack maps.
tests/utils/file/test_file_filter.py Adds skip-filter regression coverage for disguised Flax checkpoints.
tests/test_directory_file_filtering.py Verifies directory scans preserve large disguised malicious Flax MessagePack files.
tests/test_core.py Adds scan, exit-code, dependency, and cache regressions for renamed/ambiguous MessagePack checkpoints.
tests/scanners/test_flax_msgpack_scanner.py Adds scanner-level routing and no-unpack fail-closed regressions.
README.md Documents renamed Flax/JAX MessagePack recognition.
docs/user/compatibility-matrix.md Documents compatibility behavior for renamed Flax/JAX MessagePack maps.
CHANGELOG.md Records the bug fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread modelaudit/scanners/flax_msgpack_scanner.py Outdated
Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

# Conflicts:
#	docs/user/compatibility-matrix.md
#	modelaudit/utils/file/detection.py
#	tests/test_core.py
#	tests/test_directory_file_filtering.py
#	tests/utils/file/test_file_filter.py
#	tests/utils/file/test_filetype.py
# Conflicts:
#	modelaudit/core.py
#	modelaudit/scanners/archive_dispatch.py
#	tests/utils/file/test_filetype.py
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4b69a72d31

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/utils/file/detection.py
Comment thread modelaudit/utils/file/detection.py
Comment thread modelaudit/utils/file/detection.py Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5bfe75091b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/utils/file/detection.py
Comment thread modelaudit/utils/file/detection.py
@mldangelo-oai mldangelo-oai merged commit 40766c4 into main May 26, 2026
29 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-renamed-flax-msgpack-routing branch May 26, 2026 23:51
@mldangelo-oai
Copy link
Copy Markdown
Contributor Author

Follow-up review remediation is now in #1379.

  • Document/text suffix routing is corrected for structurally confirmed Flax checkpoints, including .markdown.
  • Protocol-0 GLOBAL and binary pickle-shaped prefix overlaps now retain Flax as primary while also composing pickle findings.
  • The JSON-prefix concern is addressed conservatively: the unbounded trailing-whitespace read is removed, but unseen trailing bytes remain fail-closed because they can conceal a later malicious MessagePack checkpoint object.
  • The checkpoint-suffix ownership concern is already fixed in merged main and covered by the existing .ckpt, .checkpoint, and .orbax-checkpoint regressions.

Validation on the follow-up branch includes independent review, focused routing/scanner/core tests (663 passed), the full required non-slow/non-integration lane (6352 passed, 16 skipped), and live CLI probes for each routing class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants