Skip to content

Stale per-section reports left behind when discovery reassigns ASVS sections to different domains #16

@andrewmusselman

Description

@andrewmusselman

Summary

When asvs_orchestrate re-audits the same commit but asvs_discover produces a different domain partition than the previous run, per-section reports end up duplicated across multiple domain folders. The consolidated report is correct (consolidate only reads from the current run's domain dirs), but stale .md files from the previous run's domain assignment remain in the repo indefinitely. Nothing in the pipeline lists or deletes them.

Steps to reproduce

  1. Run asvs_orchestrate against apache/mahout at commit 245aad3 with discover=true. Suppose discovery assigns ASVS section 6.1.1 to the auth_identity domain. The pipeline pushes ASVS/reports/mahout/245aad3/auth_identity/6.1.1.md.
  2. Re-run asvs_orchestrate against the same commit. Discovery is non-deterministic (Sonnet temperature 0.7) and this time decides 6.1.1 belongs in session_management. The pipeline pushes ASVS/reports/mahout/245aad3/session_management/6.1.1.md.
  3. Inspect the repo — both files now exist:
    • ASVS/reports/mahout/245aad3/auth_identity/6.1.1.md (stale, from run 1)
    • ASVS/reports/mahout/245aad3/session_management/6.1.1.md (current, from run 2)

Why this happens

asvs_push_github overwrites individual files in place (GET sha → PUT with sha = update). It only touches paths it's explicitly told about. Nothing in the pipeline lists a directory and deletes orphans.

asvs_orchestrate builds the consolidator's report_directories list from the current run's discovery output:

# asvs_orchestrate.py
seen_pass_dirs = set()
for pass_def, _ in work_items:
    pn = pass_def.get("name", "unknown")
    d = f"{push_directory}/{pn}" if push_directory else pn
    if d not in seen_pass_dirs:
        report_directories.append(d)
        seen_pass_dirs.add(d)

So asvs_consolidate reads only from the current domain dirs and produces a correct consolidated.md. The stale file is invisible to consolidation but remains in the repo.

Impact

  • Repo bloat. Each re-audit of the same commit can accumulate orphan files in old domain folders. Over many reruns, the run directory grows arbitrarily large.
  • QA tooling false negatives. The README's "check for missing reports" snippet lists every .md file in the run directory and matches by section ID. With duplicates across two domain folders, both register as "present" — even if one is stale and contradicts the consolidated report.
  • Confusion when reading per-section reports directly. A user clicking through GitHub to read a per-section finding may land on the orphan and read a stale audit, with no indication that a newer version exists in a different folder.
  • Doesn't break consolidation today, but it's a quiet correctness hazard. If the orphan happens to contain different findings (different prompt, different Opus reasoning trace, different domain context), the per-section file disagrees with the consolidated report.

Proposed fix

Add a new orchestrator parameter cleanStaleReports (default "false" for safety; opt-in until proven). When "true", after the audit phase but before consolidation, the orchestrator:

  1. Builds the set of expected paths under outputDirectory/<repo>/<commit>/ from the current run's work_items.
  2. Lists the run directory on GitHub via the Trees API (GET /repos/{owner}/{repo}/git/trees/{sha}?recursive=1, filtered to the run path).
  3. Identifies any .md file under a domain folder whose path isn't in the expected set AND whose filename matches ^\d+\.\d+\.\d+\.md$ (so we never accidentally delete consolidated.md, issues.md, README files, or anything outside the section-report convention).
  4. Deletes each orphan via DELETE /repos/{owner}/{repo}/contents/{path} with the file's SHA.
  5. Logs a summary: how many orphans were found and deleted.

If the run aborted or was partial, cleanStaleReports=true should be skipped automatically — only run cleanup after a successful audit phase. (Ideally we should never delete based on a partial work_items list, since a missing section in this run could just mean the audit failed and shouldn't trigger deletion of last run's good copy.)

For the carve-out flow, cleanup runs against both repos: the private repo (full reports) and the public repo (redacted reports), each with its own token.

Behavior matrix

cleanStaleReports Audit phase status Action
"false" (default) any No cleanup. Existing behavior. Orphans accumulate.
"true" All work items succeeded Delete orphans matching \d+\.\d+\.\d+\.md not in expected set
"true" Partial failures Skip cleanup; print warning. Orphans preserved (they may correspond to sections the audit failed to re-audit)
"true" All failed Skip cleanup

What is not deleted

  • consolidated.md, issues.md, and any consolidated-*.md / issues-*.md variants
  • Anything not matching the strict \d+\.\d+\.\d+\.md section-report pattern
  • Anything outside outputDirectory/<repo>/<commit>/
  • Anything in subdirectories named rerun/ (preserve manual rerun results — these are intentionally kept across runs per the README)
  • The <commit>/ directory itself

Why opt-in

Defaulting to "false" lets us roll out the change without surprising existing automation. Once it's been exercised against a few real runs and we're confident in the safety bounds, we can flip the default to "true" in a follow-up.

Alternatives considered

Always delete the entire run directory before pushing new files. Brute force; clean. Downside: if the new audit fails partway through, you've lost the previous good run with nothing to fall back on. The opt-in surgical approach above preserves the previous run as a safety net until the new run completes successfully.

Detect and warn only. Add a check that lists orphans and prints a warning, but doesn't delete them. Punts the problem to the user. Useful as a debugging aid; not a fix.

Make discovery deterministic. Lower discovery temperature to 0, or cache the discovery output keyed by commit hash so re-audits get the same partition. Helps for re-audits but not for legitimate "I tweaked the discovery prompt and want to re-run" cases. And doesn't address the core issue that the pipeline is silent about leftover files.

Acceptance criteria

  • New cleanStaleReports parameter on asvs_orchestrate (string, defaults to "false")
  • When enabled and the audit phase fully succeeded, orphan section-report files (matching \d+\.\d+\.\d+\.md) are deleted from the run directory
  • consolidated.md, issues.md, the <commit>/ dir itself, and rerun/ subdirs are never deleted
  • Cleanup is skipped (with a logged warning) when audit phase had failures
  • In carve-out mode, cleanup applies to both private and public repos using their respective tokens
  • Cleanup logs a summary line: Cleanup: deleted N orphan section reports from <repo>/<dir>
  • Tested against a re-audit of apache/mahout at the same commit with discovery seeded to produce a different partition; verify orphans are removed and consolidated.md content is unaffected
  • README updated: new param documented in the orchestrate input table; new troubleshooting note about discovery non-determinism and how to enable cleanup

Out of scope for this issue

  • Cross-commit cleanup (e.g., deleting old <commit_hash>/ directories that are no longer interesting). Different problem, different policy.
  • Cleanup of stale entries in the audit-cache:* data store namespaces. Unrelated; those caches are content-derived (file-set hash) and self-invalidate on real changes.
  • Making discovery deterministic. Worth its own discussion, but doesn't substitute for cleanup.

Related

  • The existing --consolidate-only mode in rerun-sections.sh doesn't trigger this issue because it doesn't re-audit anything; it only reads existing per-section reports. Cleanup should NOT run in --consolidate-only mode either.
  • README QA section: the "check for missing reports" snippet should be updated once cleanup ships, since it currently has no way to detect duplicates.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ASVSFor ASVS auditsbugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions