fix: anchor exclude dirs to absolute paths so bandit honors --exclude #616

Open
elfensky wants to merge 2 commits into peteromallet:main from elfensky:fix/security-exclude-not-honored

Conversation

@elfensky
Contributor

@elfensky elfensky commented Jun 1, 2026

Problem

collect_exclude_dirs() is documented as returning "All exclusion directories as absolute paths, for passing to external tools", but it returned str(scan_root / p), which is relative whenever scan_root is relative.

The Python security detector derives scan_root from os.path.commonpath() of the discovered file list (scan_root_from_files), which yields Path(".") for any scan launched with a relative --path (the common case).

bandit is invoked with -r <absolute scan root> and matches --exclude entries against the absolute paths it walks. A relative entry like .worktrees therefore matched nothing, and bandit re-scanned excluded directories anyway.
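The relative result can be reproduced with the standard library alone. The snippet below is a minimal sketch, not the project's code: `scan_root_sketch` is a hypothetical stand-in for scan_root_from_files (whose implementation is not shown in this PR), and the file names are made up for illustration.

```python
import os
from pathlib import Path


def scan_root_sketch(files):
    """Hypothetical stand-in for scan_root_from_files: common parent of the files."""
    return Path(os.path.commonpath(files))


# Relative file list, as produced by a scan launched with a relative --path.
files = ["desloppify/app.py", "tests/test_app.py"]

# The files share no leading component, so commonpath returns "" and
# Path("") normalizes to Path(".").
scan_root = scan_root_sketch(files)

# Joining an exclusion name onto a relative root keeps it relative.
exclude_entry = str(scan_root / ".worktrees")

print(repr(str(scan_root)), repr(exclude_entry), os.path.isabs(exclude_entry))
```

The joined entry is a bare relative name, which is what ended up in bandit's --exclude argument.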

Concrete impact

On a repo that uses git worktrees (.worktrees/ added to exclude), bandit re-scanned every worktree copy of the source tree, producing ~2000 phantom B101 "assert detected" findings from the duplicated test files — even though .worktrees was correctly excluded from every other detector (file count, duplication, smells, …). Security dropped from 100% to ~50% purely on noise.

Root cause

desloppify/base/discovery/source.py::collect_exclude_dirs joins exclusion names onto an unresolved, often-relative scan_root.

Only bandit surfaces this:

  • file-discovery detectors (structural / smells / duplication) filter an already-relative file list via matches_exclusion's path-component match — they never hand a root to an external tool;
  • ruff's --exclude tolerates relative entries;
  • jscpd uses only Path(dirname).name;
  • bandit receives a directory root and does its own recursive walk, so it relies entirely on --exclude — which was malformed.
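To illustrate why the file-discovery detectors and jscpd were unaffected, here is a rough sketch of a path-component match and a name-only lookup. `matches_exclusion_sketch` is a hypothetical approximation; the real matches_exclusion may handle more cases (globs, nested patterns) than shown here.

```python
from pathlib import PurePosixPath


def matches_exclusion_sketch(rel_path, exclusions):
    """Hypothetical component match: True if any path component equals an exclusion."""
    parts = PurePosixPath(rel_path).parts
    return any(name in parts for name in exclusions)


files = [
    "desloppify/base/discovery/source.py",
    ".worktrees/feature-x/desloppify/base/discovery/source.py",
]

# A component match works on relative paths, so no root is needed at all.
kept = [f for f in files if not matches_exclusion_sketch(f, [".worktrees"])]

# Taking only the final name gives the same answer for relative and absolute entries.
name_from_relative = PurePosixPath(".worktrees").name
name_from_absolute = PurePosixPath("/abs/repo/.worktrees").name
```

Neither approach depends on whether the exclusion entry is absolute, which is why only the tool that walks a directory root on its own exposed the bug.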

Verified against bandit 1.9.4: an absolute --exclude /abs/.worktrees excludes correctly; a relative .worktrees does not.

Fix

Anchor a relative scan_root to its absolute form before joining the exclusion names — matching the .resolve() bandit already applies to its own scan target. Already-absolute roots are left byte-stable, so existing callers and the test_returns_absolute_paths expectation are unchanged. (7 lines.)
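The shape of the change can be sketched as follows. This is an illustration under assumptions, not the actual diff: the function name, signature, and the `exclusions` parameter are hypothetical, since the real collect_exclude_dirs body is not reproduced in this description.

```python
from pathlib import Path


def collect_exclude_dirs_sketch(scan_root: Path, exclusions: list[str]) -> list[str]:
    """Hypothetical sketch: join exclusion names onto an absolute scan root."""
    # Anchor only relative roots; an absolute root is used exactly as given,
    # so results for existing absolute callers do not change.
    if not scan_root.is_absolute():
        scan_root = scan_root.resolve()
    return [str(scan_root / name) for name in exclusions]


relative_result = collect_exclude_dirs_sketch(Path("."), [".worktrees"])
absolute_result = collect_exclude_dirs_sketch(Path("/abs/repo"), [".worktrees"])
```

Resolving only when the root is relative is what keeps already-absolute inputs byte-stable: they skip the resolve step entirely, so symlinks in an absolute root are not rewritten.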

Testing

  • New regression test TestCollectExcludeDirs::test_relative_scan_root_yields_absolute_paths — fails before, passes after.
  • Existing TestCollectExcludeDirs, ruff/bandit exclude-flag, and bandit-adapter suites stay green.
  • Full suite: 5810 passed, 6 skipped.
  • End-to-end on the original repro: the security detector goes from 1961 phantom findings to clean (33 files scanned).
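A regression test along these lines might look like the sketch below. It is not the test added in this PR: `collect_exclude_dirs_stub` is a hypothetical stand-in for the patched function, included only so the example runs on its own.

```python
import os
import tempfile
from pathlib import Path


def collect_exclude_dirs_stub(scan_root, exclusions):
    """Hypothetical stand-in for the patched function under test."""
    root = scan_root if scan_root.is_absolute() else scan_root.resolve()
    return [str(root / name) for name in exclusions]


def test_relative_scan_root_yields_absolute_paths():
    previous = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        # Run from a known directory so the expected absolute path is predictable.
        os.chdir(tmp)
        try:
            result = collect_exclude_dirs_stub(Path("."), [".worktrees"])
            expected = str(Path(tmp).resolve() / ".worktrees")
        finally:
            os.chdir(previous)
    assert result == [expected]
    assert all(Path(p).is_absolute() for p in result)
```

Against the pre-fix behavior (a plain join onto the relative root) the first assertion would fail, because the result would be the bare name rather than a path under the working directory.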

Follow-up (intentionally not in this PR)

The per-detector security cache (review_cache.detectors.security) is keyed by _file_fingerprint, which hashes the file set but not the exclusion config. So on an already-scanned project, changing exclude doesn't invalidate the cached security result until the file set changes (fresh scans are unaffected). A natural fix is to feed the active exclusions into the existing salt= parameter of _file_fingerprint — happy to send that as a separate PR if you'd like.
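As a rough sketch of that follow-up idea: the functions below are hypothetical and assume a fingerprint that hashes file paths plus a salt. The real _file_fingerprint may hash different inputs; only its salt= parameter is taken from the description above.

```python
import hashlib


def file_fingerprint_sketch(files, salt=""):
    """Hypothetical stand-in for _file_fingerprint: hash the salt plus the sorted file set."""
    digest = hashlib.sha256()
    digest.update(salt.encode("utf-8"))
    for path in sorted(files):
        digest.update(b"\0")  # separator so adjacent names cannot run together
        digest.update(path.encode("utf-8"))
    return digest.hexdigest()


def exclusion_salt(exclusions):
    """Order-insensitive salt derived from the active exclusions (assumed format)."""
    return "exclude:" + ",".join(sorted(exclusions))


files = ["desloppify/app.py", "tests/test_app.py"]
before = file_fingerprint_sketch(files, salt=exclusion_salt([]))
after = file_fingerprint_sketch(files, salt=exclusion_salt([".worktrees"]))
reordered_a = exclusion_salt([".worktrees", "build"])
reordered_b = exclusion_salt(["build", ".worktrees"])
```

With the exclusions folded into the salt, the same file set produces a different fingerprint once the exclude config changes, so a stale cached security result would no longer be reused.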

🤖 Generated with Claude Code

elfensky and others added 2 commits June 1, 2026 22:36
collect_exclude_dirs() documents itself as returning "absolute paths" but
returned str(scan_root / p), which is relative whenever scan_root is. The
Python security detector derives scan_root from os.path.commonpath of the
file list (scan_root_from_files), so it is Path(".") for any scan launched
with a relative --path.

bandit matches --exclude against the absolute paths it walks, so a relative
entry like ".worktrees" matched nothing and bandit re-scanned excluded
directories — producing thousands of phantom B101 "assert detected" findings
from worktree copies of a repo, even though ".worktrees" was in the exclude
config. The file-discovery detectors were unaffected (they filter an
already-relative file list via matches_exclusion's path-component match), and
ruff tolerates relative excludes; only bandit, which receives a directory
root and does its own walk, surfaced the bug.

Anchor a relative scan_root to its absolute form (matching the .resolve()
bandit already applies to its own scan target). Already-absolute roots are
left byte-stable, so existing callers and the test_returns_absolute_paths
expectation are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…w prompt glob

## The problem

CI has been red on `main` since 2026-04-06, and on every PR since,
because of 5 pre-existing test failures that are unrelated to feature
work. They block green builds without any way to fix them inside a
feature PR. Two distinct root causes:

  1. Three bash tests in
     `desloppify/tests/lang/common/test_bash_unused_imports.py` call
     `detect_unused_imports`, which parses via tree-sitter. When
     `tree-sitter-language-pack` is not installed (the case in the
     `tests-core` job, but not `tests-full`), the function returns
     `[]`. The tests then assert specific findings and fail.

  2. Two review tests in
     `desloppify/tests/review/test_review_commands.py` and its
     integration counterpart use
     `list(runs_dir.glob("*/prompts/batch-*.md"))` without sorting.
     The glob order is filesystem-dependent: on macOS it happens to
     return `batch-1.md` (the batch with `historical_issue_focus`)
     first, on Linux it returns `batch-2.md` first. The assertion
     `"Previously flagged issues" in prompt_text` then fails on
     Linux because that string only appears in `batch-1.md`.

## The solution

Two small fixes:

  1. Gate the bash test file on tree-sitter availability with the
     existing pattern from
     `desloppify/tests/lang/common/test_treesitter.py`:

     ```python
     pytestmark = pytest.mark.skipif(
         not is_available(),
         reason="tree-sitter-language-pack not installed",
     )
     ```

     Without tree-sitter the tests skip cleanly; with it they pass
     as before.

  2. Wrap the review test's glob in `sorted(...)`:

     ```python
     prompt_files = sorted(runs_dir.glob("*/prompts/batch-*.md"))
     ```

     This makes the test deterministic across operating systems.
     `batch-1.md` always sorts first; the assertion that depends on
     its content is reliable.

## Why this is in the same PR

These two test fixes are independent of the exclude-dir anchoring in
this PR (peteromallet#616) — they are pre-existing `main` failures that could be
cherry-picked back to `main` separately. Bundling them here lets the PR
actually go green without waiting for a separate maintenance PR. The fix
was originally authored on the framework-support branch; see the
cherry-pick trailer below.

(cherry picked from commit 9baba25)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>