fix: scan generic archive python handlers #1047
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c93337cabd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Reviewed commit: 67490754a0
Reviewed commit: 124db27959
Address three review findings on the generic ZIP/TAR Python-member scan:

1. Rule-code attribution. Previously every high-risk finding was emitted under S104 (eval/exec) regardless of the actual call, so SARIF export, dashboards, and per-rule severity overrides lost the distinction between `os.system` (S101), `subprocess.run` (S103), `importlib.import_module` (S107), `pickle.load`/`pickle.loads` (S213), bare `__import__` (S106), and `eval`/`exec` (S104). Emit one finding per rule code, with the rule code derived from the resolved call name.

2. Config override. The generic ZIP path silently ignored `max_mar_python_analysis_bytes`, so operators who raised or lowered the cap for MAR archives never saw it applied to ordinary ZIPs. Extract a `_max_python_member_analysis_bytes()` helper that the MAR path also consumes, keeping both paths in sync.

3. Code duplication. Move the ~60-line dispatch body that was copy-pasted between `zip_scanner._scan_generic_member_security` and `tar_scanner._scan_generic_member_security` into a shared `scan_archive_member_for_known_risks` helper in `archive_member_security.py`, parameterised by archive kind, analysis cap, and inconclusive-reason string.

Also pass source bytes to `ast.parse` directly so PEP 263 encoding declarations (`# -*- coding: latin-1 -*-`) are honored instead of being lossily replaced before parsing. This is verified with a regression test that embeds a non-UTF-8 byte in a comment.

Adds a TAR executable-member parity test, parametrized rule-code tests for ZIP and TAR, a mixed-risk test asserting one finding per rule code, a config-override test, a PEP 263 test, and short comments on the deliberate `getattr` string-concat limitation and the `Starred` binding-loss case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
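As a rough illustration of the rule-code attribution in item 1, the sketch below groups resolved call names by rule code so that each code yields one finding. The call-to-code pairs are the ones listed in the commit message; the names `RULE_CODE_BY_CALL` and `group_calls_by_rule` are hypothetical and are not taken from the PR, and the real scanner may cover more calls.

```python
from collections import defaultdict

# Call-name -> rule-code pairs as listed in the commit message.
# The table and function names here are hypothetical, for illustration only.
RULE_CODE_BY_CALL = {
    "os.system": "S101",
    "subprocess.run": "S103",
    "eval": "S104",
    "exec": "S104",
    "__import__": "S106",
    "importlib.import_module": "S107",
    "pickle.load": "S213",
    "pickle.loads": "S213",
}


def group_calls_by_rule(resolved_calls):
    """Return {rule_code: sorted unique call names} for known risky calls."""
    grouped = defaultdict(set)
    for name in resolved_calls:
        code = RULE_CODE_BY_CALL.get(name)
        if code is not None:  # calls without a rule code are ignored
            grouped[code].add(name)
    return {code: sorted(names) for code, names in sorted(grouped.items())}


if __name__ == "__main__":
    print(group_calls_by_rule(["eval", "os.system", "exec", "print"]))
```

Each key of the returned mapping would correspond to one emitted finding, so a member that calls both `eval` and `os.system` produces two findings rather than a single S104 entry.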
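The PEP 263 point can be seen with a small standalone example. It assumes nothing about the PR's code beyond the use of `ast.parse`; the helper name `first_string_constant` and the sample source are made up for illustration. When `ast.parse` receives bytes, the encoding declaration is honored, whereas decoding with replacement first loses the original character.

```python
import ast

# Latin-1 encoded source with a PEP 263 declaration; \xe9 is "é" in Latin-1
# and is not valid UTF-8 on its own.
SOURCE_BYTES = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"


def first_string_constant(tree: ast.Module) -> str:
    """Return the string assigned by the first statement of the module."""
    assign = tree.body[0]
    assert isinstance(assign, ast.Assign)
    assert isinstance(assign.value, ast.Constant)
    return assign.value.value


# Passing bytes lets the parser apply the declared encoding.
from_bytes = first_string_constant(ast.parse(SOURCE_BYTES))

# Decoding up front with errors="replace" substitutes U+FFFD for the byte.
lossy_text = SOURCE_BYTES.decode("utf-8", errors="replace")
from_lossy = first_string_constant(ast.parse(lossy_text))

if __name__ == "__main__":
    print(repr(from_bytes), repr(from_lossy))
```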
Reviewed commit: 60211ccfcb
Reviewed commit: 86609d64d9
…neric-archive-python-handlers # Conflicts: # CHANGELOG.md
Reviewed commit: bfd96c6149
…neric-archive-python-handlers # Conflicts: # CHANGELOG.md
…neric-archive-python-handlers # Conflicts: # CHANGELOG.md
Reviewed commit: 1d1825024e
Reviewed commit: 94f9dddc04
    def visit_Assign(self, node: ast.Assign) -> None:
        self.visit(node.value)
        for target in node.targets:
            self._bind_target_to_value(target, node.value)
            self.visit(target)
Handle match-case alias branches before mutating alias scope
`_HighRiskPythonCallVisitor` merges aliases for `if`/`while`/`try`, but not for `match`. Case bodies are visited sequentially, so `visit_Assign` mutates the aliases as if every case had executed. For example, given `match __name__: case 'a': import subprocess as sp; case _: import os as sp; sp.run(...)`, `sp` ends up bound to `os` and `subprocess.run` is missed, which creates a false-negative bypass in the archive Python-member security checks.
Summary
Finding
Fixes finding 5: generic archives pass Python handlers as unknown.
Validation