fix: redact catboost evidence secrets #1428

Merged
mldangelo-oai merged 45 commits into main from mdangelo/codex/fix-catboost-evidence-redaction-c001
Jun 6, 2026

Conversation

@mldangelo-oai

Contributor

Summary

  • Redact non-URL secrets from CatBoost suspicious-fragment excerpts before they are stored in check details, JSON/cacheable results, or SARIF properties.
  • Add a bounded shared evidence redactor for sensitive assignments, Authorization/Bearer values, escaped/serialized strings, computed sensitive keys, command-bearing sensitive expressions, command option credentials, and reversible encoded evidence.
  • Report sanitized decoded CatBoost base64/hex payload evidence instead of publishing reversible encoded blobs.
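
A minimal sketch of the kind of bounded redaction the summary describes, covering only two of the listed cases (sensitive assignments and Bearer values). The key list, the `[REDACTED]` marker, the `max_chars` bound, and the function name `redact_evidence` are assumptions for illustration; the actual shared redactor is not shown in this thread and is certainly broader.

```python
import re

REDACTED = "[REDACTED]"

# Hypothetical key vocabulary; the real redactor's list is not shown here.
_SENSITIVE_KEY = r"(?:api[_-]?key|secret|token|password|passwd)"

# key = value or key: value, with optional matching quotes around the value.
_ASSIGNMENT = re.compile(
    rf"(?i)\b({_SENSITIVE_KEY})(\s*[:=]\s*)(['\"]?)([^\s'\"&;]+)(\3)"
)
# "Bearer <token>": keep the scheme, drop the credential.
_BEARER = re.compile(r"(?i)\b(Bearer\s+)([A-Za-z0-9._~+/-]+=*)")


def redact_evidence(excerpt: str, max_chars: int = 4096) -> str:
    """Redact a bounded prefix of an evidence excerpt before it is stored."""
    bounded = excerpt[:max_chars]  # bound the work done on hostile input
    bounded = _ASSIGNMENT.sub(
        lambda m: f"{m[1]}{m[2]}{m[3]}{REDACTED}{m[5]}", bounded
    )
    return _BEARER.sub(lambda m: f"{m[1]}{REDACTED}", bounded)


print(redact_evidence("password = 'hunter2'; Authorization: Bearer abc.def-123"))
```

The sketch keeps the key name and the `Bearer` scheme so the finding stays readable while the value itself never reaches check details, JSON, or SARIF.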

Validation

  • PYTHONPATH=/private/tmp/modelaudit-c001 UV_CACHE_DIR=/private/tmp/modelaudit-uv-cache PROMPTFOO_DISABLE_TELEMETRY=1 /Users/mdangelo/code/modelaudit/.venv/bin/python -m pytest tests/scanners/test_evidence_redaction.py tests/scanners/test_catboost_scanner.py tests/integrations/test_sarif_formatter.py tests/scanners/test_llamafile_scanner.py::test_llamafile_scanner_redacts_sensitive_runtime_evidence -q
  • /Users/mdangelo/code/modelaudit/.venv/bin/ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • /Users/mdangelo/code/modelaudit/.venv/bin/ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • /Users/mdangelo/code/modelaudit/.venv/bin/mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • git diff --check HEAD --
  • PYTHONPATH=/private/tmp/modelaudit-c001 UV_CACHE_DIR=/private/tmp/modelaudit-uv-cache PROMPTFOO_DISABLE_TELEMETRY=1 /Users/mdangelo/code/modelaudit/.venv/bin/python -m pytest -n auto -m "not slow and not integration" --maxfail=1

Full lane result: 7743 passed, 16 skipped, 21 warnings.

@mldangelo-oai

Contributor Author

@codex review

Comment thread modelaudit/scanners/_evidence_redaction.py Fixed
@github-actions

github-actions Bot commented May 30, 2026

Contributor

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 6 regressions, 0 improved, 6 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.335s -> 1.351s (+1.2%).

Top regressions:

  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload +394.5% (85.5us -> 422.8us, direct-malicious-upload, malicious_reduce, size=52 B, files=1)
  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] +302.9% (110.2us -> 443.9us, nested-payload-review, nested_raw, size=78 B, files=1)
  • tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] +289.3% (120.3us -> 468.3us, nested-payload-review, nested_base64, size=98 B, files=1)

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
|---|---|---|---|---|---|---|---|---|
| direct-malicious-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload | malicious_reduce | 52 B | 1 | 85.5us | 422.8us | +394.5% | regression |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] | nested_raw | 78 B | 1 | 110.2us | 443.9us | +302.9% | regression |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] | nested_base64 | 98 B | 1 | 120.3us | 468.3us | +289.3% | regression |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] | nested_hex | 130 B | 1 | 122.2us | 470.5us | +284.9% | regression |
| padded-multi-stream-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload | multi_stream_padded | 4.1 KiB | 1 | 152.9us | 508.9us | +232.8% | regression |
| suspicious-pickle-intake | tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake | suspicious-intake | 183.8 KiB | 4 | 101.43ms | 119.62ms | +17.9% | regression |
| warm-cache-rescan | tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan | release-candidate | 547.3 KiB | 32 | 88.65ms | 84.10ms | -5.1% | stable |
| clean-training-checkpoint | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint | safe_large | 278.2 KiB | 1 | 111.16ms | 111.36ms | +0.2% | stable |
| mixed-model-repository | tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository | release-candidate | 547.3 KiB | 32 | 462.39ms | 462.81ms | +0.1% | stable |
| duplicate-heavy-registry | tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot | registry-snapshot | 915.2 KiB | 13 | 385.52ms | 385.75ms | +0.1% | stable |
| single-checkpoint-preflight | tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load | single_checkpoint.pkl | 183.0 KiB | 1 | 70.92ms | 70.88ms | -0.1% | stable |
| chunked-upload-stream | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream | chunked_stream | 278.2 KiB | 1 | 114.05ms | 114.09ms | +0.0% | stable |
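
For readers checking the figures, the Change and Status columns are consistent with a plain relative-change calculation against the 15% threshold. The sketch below is an assumption about how the report classifies rows, not the benchmark tool's actual code; in particular, the symmetric "improved" rule is a guess, since no row in this run shows an improvement.

```python
def classify(baseline: float, current: float, threshold_pct: float = 15.0):
    """Return (percent change rounded to 0.1, status) for one benchmark row."""
    change_pct = (current - baseline) / baseline * 100.0
    if change_pct > threshold_pct:
        status = "regression"
    elif change_pct < -threshold_pct:  # assumed symmetric rule
        status = "improved"
    else:
        status = "stable"
    return round(change_pct, 1), status


# Rows taken from the table above (units cancel, so us and ms both work).
print(classify(85.5, 422.8))     # direct-malicious-upload
print(classify(101.43, 119.62))  # suspicious-pickle-intake
print(classify(88.65, 84.10))    # warm-cache-rescan
```

The first call reproduces the reported +394.5%; note that the absolute cost behind it is roughly 337us on a 52 B input, which is why the aggregate median moves by only +1.2%.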

Comment thread modelaudit/scanners/_evidence_redaction.py Fixed
Comment thread modelaudit/scanners/_evidence_redaction.py Fixed
Comment thread modelaudit/scanners/_evidence_redaction.py Fixed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd76338f5a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
@mldangelo-oai

Contributor Author

@codex review

Comment thread modelaudit/scanners/_evidence_redaction.py Fixed
Comment thread modelaudit/scanners/_evidence_redaction.py Fixed
Comment thread modelaudit/scanners/_evidence_redaction.py Fixed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a26e4dd1a

Comment thread modelaudit/scanners/catboost_scanner.py
@mldangelo-oai

Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 347edd0c34

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
@mldangelo-oai

Contributor Author

@codex review

@mldangelo-oai mldangelo-oai marked this pull request as ready for review May 31, 2026 04:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f006a3f433

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7c1971554

Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c498158b15

Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_evidence_redaction.py
Comment thread modelaudit/scanners/_evidence_redaction.py Outdated
…-c001

Keep current shared evidence redaction intact while isolating CatBoost's AST-aware command redactor and addressing the remaining review findings.
@mldangelo-oai

Contributor Author

Review/QA update: merged current main and resolved the redaction conflict by preserving the current shared redactor unchanged while moving CatBoost’s AST-aware command evidence logic into a scanner-specific internal module. This prevents the older PR implementation from regressing newer R/Keras/SavedModel/shared redaction behavior.

Addressed all three remaining review findings with regressions: bare curl/wget/nc credential options are redacted, known Authorization schemes redact only the credential token while preserving command context (with full-value handling retained for Digest/AWS4), and already-redacted URL query assignments no longer consume trailing /bin/sh evidence.

Focused validation: 133 passed across the CatBoost redactor, current shared redactor, CatBoost scanner, and SARIF coverage; targeted Ruff format/check and mypy are clean. All review threads are resolved.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6dee4443af

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated

Contributor Author

Review fixes pushed in 5b57e2c9 after syncing current main.

  • decode and redact standard and URL-safe base64 evidence down to minimum provider-token lengths
  • redact semicolon-delimited, bracketed, and encoded nested query secrets while preserving benign query context
  • cover documented curl/wget TLS, proxy, OAuth, FTP, and HTTP credential options with a benign near-match guard
  • resolve all outstanding review threads
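
The first bullet (decode, then redact, then report the sanitized decoded text rather than the reversible blob) could look roughly like the sketch below. The candidate pattern, the 16-character minimum, the secret pattern, and the `decoded:` prefix are all assumptions made for illustration; the real module's thresholds and provider-token rules are not shown in this thread.

```python
import base64
import binascii
import re
from typing import Optional

REDACTED = "[REDACTED]"
# Hypothetical patterns for illustration only.
_B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/_-]{16,}={0,2}")
_SECRET = re.compile(r"(?i)\b(api_key|token|password)=([^\s&;]+)")


def _try_decode(blob: str) -> Optional[str]:
    """Decode standard or URL-safe base64 to printable text, else None."""
    normalized = blob.rstrip("=").replace("-", "+").replace("_", "/")
    padded = normalized + "=" * (-len(normalized) % 4)
    try:
        text = base64.b64decode(padded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    return text if text.isprintable() else None


def sanitize_encoded(excerpt: str) -> str:
    """Replace base64 blobs that hide a secret with redacted decoded text."""

    def replace(match: "re.Match[str]") -> str:
        decoded = _try_decode(match.group(0))
        if decoded is None or not _SECRET.search(decoded):
            return match.group(0)  # benign or undecodable: leave as-is
        return "decoded:" + _SECRET.sub(lambda m: f"{m[1]}={REDACTED}", decoded)

    return _B64_CANDIDATE.sub(replace, excerpt)


blob = base64.b64encode(b"curl http://x/?api_key=hunter2").decode()
print(sanitize_encoded("payload " + blob))
```

Leaving benign candidates untouched is what keeps ordinary long identifiers from being rewritten; only blobs whose decoded form contains a sensitive value are replaced.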

Focused validation:

  • tests/scanners/test_catboost_evidence_redaction.py: 45 passed
  • tests/scanners/test_catboost_scanner.py: 37 passed
  • targeted Ruff format/check and mypy: clean

The prior Python 3.10 CI job's tests passed; the job was cancelled at the old 30-minute timeout during post-job cache cleanup. Current main raises that timeout to 35 minutes and is now merged into this branch.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b57e2c997

Comment thread modelaudit/scanners/catboost_scanner.py Outdated

Contributor Author

Follow-up security QA addressed all five new review findings:

  • Redacts overlapping subprocess/list argv option-value pairs such as ['curl', '--password', '...'].
  • Decodes shorter bounded base64 candidates only when the decoded text contains a real sensitive value.
  • Covers compact AWS assignment aliases without broadening benign keys such as pass_word.
  • Decodes bounded bytes-literal collections before evidence is stored.
  • Decodes short \\x.. runs in context so escaped provider-token prefixes cannot bypass redaction.
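
As an illustration of the first bullet, the sketch below redacts the value that follows a credential option in an argv list, in both the separated and the `--option=value` forms. The option set and the helper name `redact_argv` are hypothetical; the PR's redactor works on source text through an AST-aware pass, which this list-level sketch does not attempt to reproduce.

```python
REDACTED = "[REDACTED]"

# Hypothetical option set; the real list of covered options is much longer.
_SECRET_OPTIONS = {"--password", "--proxy-password", "--oauth2-bearer"}


def redact_argv(argv: list) -> list:
    """Return a copy of argv with credential option values redacted."""
    out = list(argv)
    for i, arg in enumerate(argv):
        name, sep, _value = arg.partition("=")
        if name not in _SECRET_OPTIONS:
            continue  # exact match only, so near-matches stay visible
        if sep:
            out[i] = f"{name}={REDACTED}"  # attached form: --password=value
        elif i + 1 < len(argv):
            out[i + 1] = REDACTED  # separated form: --password value
    return out


print(redact_argv(["curl", "--password", "hunter2", "https://example.com"]))
print(redact_argv(["tool", "--password-policy", "strict"]))
```

Matching the option name exactly is one simple way to get the benign near-match behavior described above: `--password-policy` is left alone because it is not in the set.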

Added positive and benign near-match regressions plus an integrated findings/SARIF regression. Associated validation: 374 passed; scoped Ruff, format check, and mypy all pass; git diff --check is clean. All 79 review threads are resolved.

Pushed as a268f625bce4065be03d61240e4d2bbb5b46cabe; moving on while CI runs.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a268f625bc

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Follow-up redaction QA pushed in ca854e68.

  • fixed attached, combined, serialized, and argv curl cookie forms, including separated ['-b', 'value'] arguments
  • added compact prefixed credential aliases while preserving detoken, retoken, tokenizer, policy, URL-option-lookalike, and unrelated bash -bc near-matches
  • verified the new values remain absent from CatBoost SARIF output

Focused validation: 395 passed across the two associated CatBoost test files; Ruff format/check and mypy are clean for the three modified files.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

"shell interpreter invocation",

P2 Badge Tighten IPv6 token boundaries before flagging networks

With ordinary metadata that contains C++-style names, this pattern can match a one-character IPv6 substring inside the word. For example, metric::name yields c::, which ipaddress accepts as a global IPv6 address, so the analyzer reports a public-IP network indicator and can even create a command/network correlation if the same fragment also mentions a command; require real address boundaries before adding the network match.
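
One way to require real address boundaries, sketched under assumptions: the candidate regex and the helper name `find_global_ipv6` are hypothetical and are not the scanner's actual pattern. The lookarounds refuse any candidate that touches a word character, colon, or dot, so the `c::` inside `metric::name` is never handed to `ipaddress`.

```python
import ipaddress
import re

# Candidate must not be glued to identifier characters, colons, or dots.
_IPV6_CANDIDATE = re.compile(r"(?<![\w:.])[0-9A-Fa-f:]{2,}(?![\w:.])")


def find_global_ipv6(text: str) -> list:
    """Return standalone IPv6 literals in text that ipaddress calls global."""
    hits = []
    for match in _IPV6_CANDIDATE.finditer(text):
        try:
            addr = ipaddress.IPv6Address(match.group(0))
        except ValueError:
            continue  # hex-looking words, timestamps, and so on
        if addr.is_global:
            hits.append(match.group(0))
    return hits


print(find_global_ipv6("metric::name"))
print(find_global_ipv6("connect 2001:4860:4860::8888 now"))
```

The boundary check only addresses the substring problem reported here; it does not by itself filter multicast or reserved ranges.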

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/catboost_scanner.py Outdated

Contributor Author

QA follow-up pushed in fa1855f. Fixed the remaining reversible percent-decoding leak by completing bounded decoding before sanitization, and restricted IP indicators to globally routable unicast addresses so multicast/reserved IPv4 and IPv6 literals cannot create false command/network correlations. Added direct regressions for the reported foo%3Dapi_key%253Dhunter2 leak plus ff02::1, 64:ff9b::192, IPv4 multicast, and reserved ranges. Validation: 404 associated CatBoost tests passed after syncing current main; scoped Ruff and mypy are clean. The existing Windows CI failure is the separate dangling-symlink base issue being repaired in #1463; this PR will rerun against the updated base once that lands.
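
The percent-decoding fix can be pictured as "decode to a fixed point, within a bound, and only then sanitize". The sketch below is an assumption about the shape of that logic, using the reported `foo%3Dapi_key%253Dhunter2` input; the round limit, the secret pattern, and the choice to redact everything when decoding does not converge are illustrative, not taken from the PR's code.

```python
import re
from typing import Optional
from urllib.parse import unquote

REDACTED = "[REDACTED]"
# Hypothetical secret pattern for illustration only.
_SECRET = re.compile(r"(?i)\b(api_key|token|password)=([^\s&;]+)")


def fully_unquote(text: str, max_rounds: int = 5) -> Optional[str]:
    """Percent-decode repeatedly until stable; None if the bound is hit."""
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            return decoded
        text = decoded
    return None  # still changing after max_rounds: treat as hostile


def sanitize_query(text: str) -> str:
    decoded = fully_unquote(text)
    if decoded is None:
        return REDACTED  # fail closed rather than publish a reversible blob
    return _SECRET.sub(lambda m: f"{m[1]}={REDACTED}", decoded)


print(sanitize_query("foo%3Dapi_key%253Dhunter2"))
```

Sanitizing after a single decoding round would leave `api_key%3Dhunter2` in the output, which a reader can decode back to the secret; completing the decoding first is what closes that gap.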

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa1855fddd

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Reviewed and pushed f30ac803.

Addressed all four fresh review findings:

  • curl inline config credentials (oauth2-bearer, ftp-account);
  • MySQL attached and separated -p passwords, with prompt/port near-matches;
  • registry-scoped npmrc auth assignments;
  • Twine upload short password arguments.

Validation: 328 associated redaction and CatBoost-to-SARIF tests passed; scoped Ruff, format, mypy, and git diff --check are clean. All open review threads are resolved.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f30ac80362

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Handle curl config heredocs, Docker password stdin producers, and sshpass env/process-substitution sources, then sync the branch with current main.

Contributor Author

QA update for cfc7b5c8:

  • Synced with current main.
  • Added bounded heredoc handling for curl --config - and --netrc-file - stdin.
  • Redacted literal pipe, here-string, and heredoc producers for Docker login --password-stdin, while preserving dynamic aws ecr get-login-password input.
  • Redacted sshpass default/custom env assignments and inline process-substitution password files, while preserving external file paths and unrelated env assignments.
  • Added CatBoost/SARIF coverage for all three source classes.
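
A rough sketch of the Docker case only, limited to the literal pipe form: a literal `echo`/`printf` value feeding `docker login --password-stdin` is redacted, while a dynamic producer such as `aws ecr get-login-password` is left visible. The regex and the helper name `redact_docker_stdin` are assumptions; here-strings, heredocs, and the sshpass sources listed above are not handled.

```python
import re

# Literal producer piped into docker login --password-stdin.
# Values starting with $ or a backtick are treated as dynamic and skipped.
_DOCKER_STDIN = re.compile(
    r"""(?x)
    \b(echo|printf)\s+
    (['"]?)([^'"|\s$`]+)\2
    (\s*\|\s*docker\s+login\b[^|;\n]*--password-stdin)
    """
)


def redact_docker_stdin(command: str) -> str:
    """Redact the literal value piped to docker login --password-stdin."""
    return _DOCKER_STDIN.sub(
        lambda m: f"{m[1]} {m[2]}[REDACTED]{m[2]}{m[4]}", command
    )


print(redact_docker_stdin("echo hunter2 | docker login -u bob --password-stdin"))
print(redact_docker_stdin(
    "aws ecr get-login-password | docker login -u AWS --password-stdin registry"
))
```

Keeping the `docker login` portion intact preserves the command context that makes the finding useful, which matches the intent stated in the bullets.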

Validation: redaction helper module 338 passed; exact SARIF regression passed; Ruff and mypy clean.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cfc7b5c875

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Follow-up pushed at c1d08159 after reviewing the concurrent fix on cfc7b5c8. It closes two remaining edges: wrapped Docker stdin pipelines such as os.system("echo ... | docker login --password-stdin") now redact the producer value, and sshpass environment-variable matching is case-sensitive so foo=... -eFOO is not falsely treated as the password source. Associated QA: both complete CatBoost test modules (439 passed), focused regressions after formatting (8 passed), scoped Ruff check/format, mypy, and git diff --check.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1d08159ba

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Addressed the five fresh findings in de1ef51. Inline secret sources now share emitter, here-string, and heredoc body redaction for sshpass -f and command process substitutions; sshpass -d binds only its declared descriptor to the matching here-string; npm scoped auth keys support both key=value and key value; and already-redacted attached argv options retain their closing literal and trailing URL context. Added mismatched-fd, dynamic cookie source, npm hint-key, non-command prose, and password-policy negatives. Validation: all 354 focused redactor tests plus the end-to-end CatBoost details/SARIF regression pass; scoped Ruff and mypy are clean.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: de1ef5191c

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Addressed the five remaining review findings in 2d5e360:

  • Azure Storage assignment aliases now redact separated and compact forms.
  • az login -p, Poetry credential config, GitHub CLI inline token stdin, and OpenSSL enc passphrases are covered.
  • Added plain-command and SARIF regressions plus benign near-match coverage; notably, gh auth command text and dynamic stdin sources remain intact.

Scoped QA:

  • tests/scanners/test_catboost_evidence_redaction.py: 371 passed
  • test_catboost_sarif_redacts_follow_up_reversible_secret_variants: 1 passed
  • Ruff format/check and mypy passed for the three touched files

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d5e360eb6

Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/catboost_scanner.py Outdated
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py
Comment thread modelaudit/scanners/_catboost_evidence_redaction.py Outdated

Contributor Author

Addressed the four latest CatBoost evidence findings in 3a3acbd7:

  • Named Unicode escapes now decode through bounded unicodedata.lookup before provider-token redaction.
  • Percent-decoded evidence uses the hidden-redaction fail-closed pass, including secrets beyond the first displayed window.
  • Static subprocess input= values are redacted only for literal Docker/GitHub stdin-auth commands; dynamic inputs and near matches remain visible.
  • Separated curl argv user/certificate values retain the username/certificate while redacting only the password.
  • Executable paths and Windows .exe forms are covered.
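
For the first bullet, a bounded decode of `\N{...}` escapes might look like the sketch below. `unicodedata.lookup` is the standard-library call named above; the input-length bound, the 80-character name limit, and the helper name `decode_named_escapes` are assumptions for illustration.

```python
import re
import unicodedata

# Unicode character names use uppercase letters, digits, spaces, and hyphens.
_NAMED_ESCAPE = re.compile(r"\\N\{([A-Z0-9 -]{1,80})\}")


def decode_named_escapes(text: str, max_chars: int = 4096) -> str:
    """Resolve \\N{NAME} escapes in a bounded prefix of text."""

    def replace(match: "re.Match[str]") -> str:
        try:
            return unicodedata.lookup(match.group(1))
        except KeyError:
            return match.group(0)  # unknown name: keep the escape verbatim

    return _NAMED_ESCAPE.sub(replace, text[:max_chars])


escaped = r"\N{LATIN SMALL LETTER S}\N{LATIN SMALL LETTER K}-abc"
print(decode_named_escapes(escaped))
```

Running this before provider-token matching means a prefix spelled out as named escapes is seen in its plain form, so the token pattern has a chance to match it.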

Scoped QA: both CatBoost redaction modules passed together (490 tests), the expanded SARIF regression passed, and Ruff format/check plus mypy passed for the four touched files.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a3acbd7ab

Comment thread modelaudit/scanners/_catboost_evidence_redaction.py

Contributor Author

Pushed ac3e294b with the remaining review and QA fixes:

  • redacts Docker/GitHub CLI stdin credentials supplied through Bash process substitution while preserving dynamic cat sources;
  • handles Azure storage/federated credential options and Poetry --local/-- credential forms;
  • preserves OpenSSL dynamic passphrase descriptors while redacting literal passphrases;
  • handles Python print(...) stdin emitters and piped heredocs;
  • prevents end-of-segment percent-encoded secrets from escaping when the redaction marker would otherwise be truncated.
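
The last bullet is terse, so the following is only one possible reading of it: when an already-sanitized excerpt is cut to a display limit, the cut should not land inside a redaction marker. The marker text and the helper name `truncate_excerpt` are hypothetical, and the PR's actual handling of percent-encoded segment ends may differ.

```python
REDACTED = "[REDACTED]"


def truncate_excerpt(sanitized: str, limit: int) -> str:
    """Cut sanitized text to roughly limit chars without splitting a marker."""
    if len(sanitized) <= limit:
        return sanitized
    size = len(REDACTED)
    # Look for a marker that starts before the limit and ends after it.
    start = sanitized.rfind(REDACTED, max(0, limit - size + 1), limit + size - 1)
    if start != -1 and start < limit < start + size:
        return sanitized[:start] + REDACTED  # keep the whole marker
    return sanitized[:limit]


print(truncate_excerpt("api_key=[REDACTED]&x=1", 12))
```

The result may exceed `limit` by a few characters; this sketch accepts that in exchange for never emitting a partial marker that later passes could misread.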

Validation:

  • 510 passed across test_catboost_evidence_redaction.py and test_catboost_scanner.py
  • scoped Ruff check and format check
  • scoped mypy
  • git diff --check

@mldangelo-oai mldangelo-oai merged commit 5e56d8a into main Jun 6, 2026
27 of 28 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-catboost-evidence-redaction-c001 branch June 6, 2026 11:21
2 participants