Skip to content

Development#60

Merged
cemililik merged 4 commits into
mainfrom
development
May 22, 2026
Merged

Development#60
cemililik merged 4 commits into
mainfrom
development

Conversation

@cemililik
Copy link
Copy Markdown
Collaborator

@cemililik cemililik commented May 22, 2026

Summary by Sourcery

Introduce opt-in YAML-based CVE ignore support to the pip-audit severity gate, with schema-validated suppressions and project-specific ignore configuration, and update tests, docs, and CI nightly workflow to enforce the new policy.

New Features:

  • Add a --ignores PATH option to check_pip_audit.py to apply schema-validated CVE suppressions from a YAML file while keeping deployer runs unsuppressed by default.

Enhancements:

  • Extend pip-audit severity bucketing to track and log suppressed findings, including id/alias matching and GitHub Actions ::notice:: annotations.
  • Add robust parsing and validation for pip-audit ignore YAML files, including alias indexing and error reporting for invalid schema or unreadable files.

CI:

  • Update the nightly workflow to use the checked-in tools/pip_audit_ignores.yaml with check_pip_audit.py instead of passing CVE ignores directly to pip-audit.

Documentation:

  • Document the new --ignores YAML suppression workflow and schema in English and Turkish reference and operations guides, clarifying that deployers do not inherit ForgeLM's project ignore list.

Tests:

  • Add unit tests covering ignore-by-id/alias behaviour, non-matching ignores, schema validation failures, invalid/missing ignore files, default no-ignore behaviour, and validation of the checked-in project ignore file.

Chores:

  • Add and populate tools/pip_audit_ignores.yaml with ForgeLM's own vetted CVE suppressions for transformers, torch, and markdown.

Summary by CodeRabbit

Release Notes

  • New Features

    • CVE suppression mechanism now uses centralized YAML-based configuration files, replacing inline command-line flags for improved vulnerability exception management.
  • Documentation

    • Updated operational guides with instructions for the new YAML-based CVE suppression workflow, including required fields and validation rules.
  • Tests

    • Added comprehensive test coverage validating CVE suppression functionality and configuration schema validation.

Review Change Stack

cemililik and others added 3 commits May 21, 2026 15:45
The 2026-05-21 nightly (run 26210539455) failed on `pip-audit` because
the OSV / GHSA databases published 9 new torch advisories and 1 markdown
advisory overnight, none of which have an upstream fix.  Without an
ignore, `tools/check_pip_audit.py` fails closed on UNKNOWN severity
(pip-audit's JSON does not serialise OSV severity) and the nightly stays
red, masking real future breakage.

Triage (issue #58):

- torch PYSEC-2025-191..197, PYSEC-2025-210, PYSEC-2026-139: all
  require a LOCAL attacker passing malformed inputs to specific torch
  APIs (`jit.script`, `lstm_cell`, `cuda.memory.*`, `pt2` loader, etc.).
  ForgeLM is a local-CLI tool; an attacker with that access is already
  inside the trust boundary.  None of the affected APIs are called with
  attacker-controlled arguments in `forgelm/`.

- markdown PYSEC-2026-89: OSV affected-range misclassification — the
  advisory description states the fix shipped in markdown==3.8.1 but
  the range record has no `fixed` event, so every version is flagged.
  Installed 3.10.2 is post-fix.

Each ignore is documented inline in `.github/workflows/nightly.yml`
with the surface, the threat-model carve-out, and the condition for
re-evaluating (per `docs/reference/supply_chain_security.md`).  Also
drops the stale "Issue #37 tracks the active set" reference — #37 was
a closed nightly-failure issue, not a tracker.

Refs: #58

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address PR #59 review comment: extract the growing --ignore-vuln list
from .github/workflows/nightly.yml into tools/pip_audit_ignores.yaml,
consumed by tools/check_pip_audit.py via a new opt-in --ignores flag.
New CVE suppressions no longer require editing the workflow.

Design — opt-in, not opt-out
============================

docs/reference/supply_chain_security.md explicitly tells deployers that
ForgeLM does NOT ship a default project-level ignore list. The new
flag preserves that contract: without --ignores PATH, check_pip_audit.py
applies no suppressions, so a deployer running

    pip install forgelm[security]
    python3 tools/check_pip_audit.py /tmp/pip-audit.json

still sees the full unfiltered severity gate. The project's own nightly
opts in explicitly:

    python3 tools/check_pip_audit.py /tmp/pip-audit.json \
        --ignores tools/pip_audit_ignores.yaml

Schema enforcement
==================

Every entry in pip_audit_ignores.yaml must carry six required fields
(id, package, reason, threat_model, verified_at, reevaluate_after);
missing any one fails the gate with an ::error:: that names the gap.
This blocks the "stick a bare id: in and forget" pattern — every
suppression now carries a written justification + re-evaluate trigger
as required by docs/reference/supply_chain_security.md.

Matching uses {id} ∪ aliases on both sides, so an ignore listing the
CVE alias still matches a pip-audit finding emitted under its PYSEC
primary id (and vice versa). Each match is logged as a ::notice::
annotation so the run summary surfaces the audit trail; suppressions
don't disappear into the workflow log silently.

Migrations
==========

All 11 ignores in nightly.yml (CVE-2026-1839 transformers + 9 torch
PYSEC-2025-191..197/210 + PYSEC-2026-139 + 1 markdown PYSEC-2026-89)
moved verbatim into the YAML file with the justifications expanded
into structured fields. The workflow's pip-audit step shrinks from
~95 lines of inline comments + per-CVE --ignore-vuln args to a
~12-line block pointing at the YAML.

Tests
=====

Extends tests/test_check_pip_audit.py with 9 new cases covering:
- suppression by primary id and by alias
- no false-positive match on unrelated CVEs
- schema validation (each required field individually named on failure)
- missing / invalid YAML files fail closed
- default (no --ignores) is unchanged — deployer-safe
- the checked-in tools/pip_audit_ignores.yaml itself passes schema
  validation (regression guard so the workflow never breaks on its
  own ignore file)

Docs
====

Updates docs/reference/supply_chain_security.md and the TR mirror plus
docs/usermanuals/{en,tr}/operations/supply-chain.md with the new
deployer workflow (write your own ignores.yaml, pass via --ignores).
Bilingual parity verified by tools/check_bilingual_parity.py --strict.

Refs: #58, #59 (review comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(supply-chain): ignore 10 no-fix CVEs that broke nightly 2026-05-21
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented May 22, 2026

Reviewer's Guide

Adds opt-in YAML-based vulnerability ignore support to tools/check_pip_audit.py, wires it into the nightly workflow, documents the policy in reference and operations docs (EN/TR), and introduces tests plus a checked-in project ignore file to enforce schema and behaviour around CVE suppressions.

Sequence diagram for pip-audit gating with optional YAML ignores

sequenceDiagram
    actor GitHubActions
    participant pip_audit
    participant check_pip_audit_py as check_pip_audit.py
    participant _load_report
    participant _load_ignores
    participant _bucket_findings

    GitHubActions->>pip_audit: pip-audit --format json --output /tmp/pip-audit.json || true
    GitHubActions->>check_pip_audit_py: python3 tools/check_pip_audit.py /tmp/pip-audit.json --ignores tools/pip_audit_ignores.yaml

    check_pip_audit_py->>_load_report: _load_report(args.report)
    _load_report-->>check_pip_audit_py: report or None
    alt [report is None]
        check_pip_audit_py-->>GitHubActions: exit 1
    else [report loaded]
        check_pip_audit_py->>_load_ignores: _load_ignores(args.ignores)
        _load_ignores-->>check_pip_audit_py: ignores or None
        alt [ignores is None]
            check_pip_audit_py-->>GitHubActions: exit 1
        else [ignores loaded]
            check_pip_audit_py->>_bucket_findings: _bucket_findings(report, ignores)
            _bucket_findings-->>check_pip_audit_py: high, medium, unknown, suppressed
            loop for each suppressed
                check_pip_audit_py-->>GitHubActions: ::notice::pip-audit suppressed ...
            end
            opt [medium/high/unknown findings]
                check_pip_audit_py-->>GitHubActions: ::warning:: / ::error:: annotations
                check_pip_audit_py-->>GitHubActions: exit 1 or 0
            end
        end
    end
Loading

File-Level Changes

Change Details Files
Introduce YAML-based ignore file loading, validation, and suppression logic to the pip-audit gate helper.
  • Add _IGNORE_REQUIRED_KEYS to define required fields for ignore entries and treat missing fields as policy violations.
  • Implement _load_ignores to read a YAML file, validate its structure and required fields, index entries by id and aliases, and emit ::error:: or ::warning:: annotations on problems.
  • Add _vuln_identifiers helper to normalize vulnerability ids and aliases for matching against ignores.
  • Extend _bucket_findings to accept an optional ignores map, filter matching findings into a suppressed list, and return high, medium, unknown, and suppressed buckets.
  • Introduce _parse_argv to parse the report path and optional --ignores flag, replacing the previous manual argv length check.
  • Update main to use _parse_argv, load the report and optional ignores (failing closed on ignore load issues), handle the new suppressed bucket by printing ::notice:: annotations, and adjust error messages to reference args.report.
tools/check_pip_audit.py
Add tests covering ignore file behaviour and schema enforcement, including the checked-in project ignore file.
  • Introduce _write_ignores and _valid_ignore_entry helpers to generate ignore files in tests.
  • Add tests for suppression by primary id and alias, ensuring notices are emitted and severity headers are absent when ignored.
  • Add a test verifying that unrelated CVEs are not suppressed by ignore entries, guarding against permissive matching.
  • Add a test that missing required fields in an ignore entry cause the tool to fail and list the missing field names.
  • Add tests for missing ignore file and invalid YAML both causing failures with appropriate error messages.
  • Add a test to ensure default behaviour without --ignores remains unfiltered for deployers.
  • Add a structural test that tools/pip_audit_ignores.yaml passes schema validation and contains an expected historical transformers ignore entry.
tests/test_check_pip_audit.py
Document the new --ignores mechanism and project-vs-deployer suppression policy in reference and operations manuals (EN and TR).
  • Extend check_pip_audit.py module docstring with an explanation of the optional --ignores PATH flag, suppression semantics (id ∪ aliases, ::notice:: logging), and deployer behaviour without ignores.
  • Update docs/reference/supply_chain_security.md to describe opt-in ignores, required YAML fields, logging of suppressions, presence and purpose of tools/pip_audit_ignores.yaml, and its non-inheritance by deployers.
  • Mirror the same explanation and policy in docs/reference/supply_chain_security-tr.md for Turkish readers.
  • Update docs/usermanuals/en/operations/supply-chain.md to replace pip-audit --ignore-vuln usage with a YAML ignore file passed via --ignores, including a minimal example and explanation of required fields and project/deployer separation.
  • Update docs/usermanuals/tr/operations/supply-chain.md similarly for Turkish, including example YAML and policy language.
tools/check_pip_audit.py
docs/reference/supply_chain_security.md
docs/reference/supply_chain_security-tr.md
docs/usermanuals/en/operations/supply-chain.md
docs/usermanuals/tr/operations/supply-chain.md
Wire the new project ignore file into nightly CI and remove inline ignore flags from pip-audit.
  • Adjust the nightly workflow step for pip-audit to document that tools/check_pip_audit.py now applies the severity policy and consumes tools/pip_audit_ignores.yaml via --ignores.
  • Remove the explicit --ignore-vuln CVE-2026-1839 flag from the pip-audit invocation, delegating suppression to the YAML ignore file.
  • Pass --ignores tools/pip_audit_ignores.yaml to check_pip_audit.py in the nightly workflow, and update comments to describe the schema and policy around that file.
.github/workflows/nightly.yml
Add a checked-in project-level pip-audit ignore file encoding ForgeLM’s internal accepted-risk CVEs with a strict schema.
  • Create tools/pip_audit_ignores.yaml with a header explaining that it is only for ForgeLM’s own nightly, not deployers, and documenting the schema (id, aliases, package, reason, threat_model, verified_at, reevaluate_after, references).
  • Populate the ignore list with one transformers CVE (CVE-2026-1839) including detailed reason, threat model, and reevaluation conditions.
  • Add nine torch-related PYSEC/CVE entries with shared local-attacker threat model justification and per-entry reasons, dates, and reevaluation rules.
  • Add a markdown PYSEC entry documenting an OSV misclassification case, including references to the upstream advisory database and explanation of why the installed version is post-fix.
  • Ensure entries use consistent structure so _load_ignores can validate and index them by id and aliases.
tools/pip_audit_ignores.yaml

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ba4464c5-448a-4b27-a5e9-d8a57736759f

📥 Commits

Reviewing files that changed from the base of the PR and between 80bd450 and 2a28f8c.

📒 Files selected for processing (8)
  • .github/workflows/nightly.yml
  • docs/reference/supply_chain_security-tr.md
  • docs/reference/supply_chain_security.md
  • docs/usermanuals/en/operations/supply-chain.md
  • docs/usermanuals/tr/operations/supply-chain.md
  • tests/test_check_pip_audit.py
  • tools/check_pip_audit.py
  • tools/pip_audit_ignores.yaml

📝 Walkthrough

Walkthrough

The pull request refactors CVE suppression in pip-audit from inline workflow flags to a YAML-based opt-in system. The tool gains --ignores argument support with strict schema validation; the workflow passes tools/pip_audit_ignores.yaml to enforce project policy; tests validate suppression and schema enforcement; and documentation guides deployers on the new mechanism.

Changes

CVE Suppression Policy Refactor

Layer / File(s) Summary
Tool enhancement: --ignores option and suppression logic
tools/check_pip_audit.py
check_pip_audit.py adds --ignores argument parsing, strict YAML schema validation (required fields: id, package, reason, threat_model, verified_at, reevaluate_after; optional aliases list), and suppression matching by primary id or aliases. Suppressed findings emit ::notice:: annotations before severity bucketing. The tool fails closed on any ignore-file read, parse, or schema error.
Test suite for --ignores behavior and schema enforcement
tests/test_check_pip_audit.py
Comprehensive test coverage validates suppression by primary id and aliases, confirms non-matches fail the gate, enforces schema rules (field types, required/optional), handles file read/YAML parse errors as hard failures, and validates the repository's checked-in ignore file.
Project's pip-audit ignore policy file
tools/pip_audit_ignores.yaml
New file defines ForgeLM's internal suppression list for nightly CI: entries for transformers (CVE-2026-1839), multiple torch vulnerabilities (PYSEC-2025-191 through PYSEC-2026-139), and markdown (PYSEC-2026-89), each with required schema fields and threat-model justification. Clarifies ignores are not inherited externally.
Nightly workflow integration
.github/workflows/nightly.yml
The supply-chain-security job's pip-audit step now invokes tools/check_pip_audit.py --ignores tools/pip_audit_ignores.yaml, applying the project's suppression policy and replacing prior inline --ignore-vuln list.
Documentation updates: references and operations guides
docs/reference/supply_chain_security.md, docs/reference/supply_chain_security-tr.md, docs/usermanuals/en/operations/supply-chain.md, docs/usermanuals/tr/operations/supply-chain.md
Reference and operational guides document the YAML schema, command syntax, "fail closed" validation, notice annotation emission, and clarify that no default ignore list is provided; deployers must explicitly pass --ignores to enable suppression.

Sequence Diagram

sequenceDiagram
  participant AuditJSON as pip-audit JSON
  participant CheckScript as check_pip_audit.py
  participant IgnoresYAML as --ignores YAML
  participant IgnoresValidator as Schema Validator
  participant SuppressChecker as Suppression Matcher
  participant Output as Severity Gate & Notices

  AuditJSON->>CheckScript: audit findings (json)
  CheckScript->>IgnoresYAML: load file (--ignores path)
  IgnoresYAML->>IgnoresValidator: parse YAML & validate schema
  IgnoresValidator->>CheckScript: ignores index {id+aliases}
  CheckScript->>SuppressChecker: match findings vs ignores
  SuppressChecker->>Output: suppressed findings as ::notice::
  SuppressChecker->>Output: remaining findings for severity gate
Loading

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 A YAML file hops into view,
With ignores all properly vetted,
The gate is now strict—
Suppression opt-in picked,
So security stays well-protected! 🔐

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The PR title 'Development' is vague and does not clearly summarize the main change; it does not convey the opt-in YAML-based CVE ignore support implementation. Use a descriptive title such as 'Add opt-in YAML-based CVE suppression to pip-audit gate' to clearly communicate the primary change.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description provided is detailed and comprehensive, covering all major changes; however, it deviates from the template structure (Summary, Changes, Type, Testing, Checklist).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch development

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 22, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 56 complexity · 4 duplication

Metric Results
Complexity 56
Duplication 4

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an opt-in suppression mechanism for pip-audit via a new --ignores flag in the check_pip_audit.py tool. The changes include updated documentation in both English and Turkish, a new YAML-based ignore list for internal project use, and comprehensive unit tests. The implementation enforces a strict schema for ignore entries to ensure documented justification for every suppressed CVE. Feedback was provided to add explicit type validation for the aliases field in the YAML parser to prevent incorrect matching or runtime errors if a non-list type is provided.

Comment thread tools/check_pip_audit.py
# match without the workflow having to know which form pip-audit
# emits this week. Last write wins on duplicates, but we surface
# the dup so the file stays clean.
ids = {primary_id, *(entry.get("aliases") or [])}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The aliases field is expected to be a list, but the current implementation lacks type validation for it. If a user provides a string (e.g., aliases: "CVE-2024-XXXX") instead of a list, Python will iterate over the string's characters when unpacking, leading to incorrect matching logic. Additionally, providing a non-iterable type would cause the script to crash. Adding an explicit type check ensures the YAML adheres to the expected schema and fails gracefully with a clear error message.

Suggested change
ids = {primary_id, *(entry.get("aliases") or [])}
aliases = entry.get("aliases")
if aliases is not None and not isinstance(aliases, list):
print(
f"::error::pip-audit ignore entry #{index} in {ignores_path} "
f"'aliases' must be a list.",
file=sys.stderr,
)
return None
ids = {primary_id, *(aliases or [])}

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • In _load_ignores, consider validating that aliases (when present) is a list of strings rather than accepting arbitrary types; as written, a YAML entry like aliases: CVE-... will be treated as an iterable of characters and pollute the id index.
  • In _bucket_findings, you repeatedly intersect _vuln_identifiers(vuln) with ignores.keys(); computing a ignore_ids = set(ignores) once and using that would simplify the code and avoid rebuilding the keys view on every finding.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_load_ignores`, consider validating that `aliases` (when present) is a list of strings rather than accepting arbitrary types; as written, a YAML entry like `aliases: CVE-...` will be treated as an iterable of characters and pollute the id index.
- In `_bucket_findings`, you repeatedly intersect `_vuln_identifiers(vuln)` with `ignores.keys()`; computing a `ignore_ids = set(ignores)` once and using that would simplify the code and avoid rebuilding the keys view on every finding.

## Individual Comments

### Comment 1
<location path="tools/check_pip_audit.py" line_range="237" />
<code_context>
+        # match without the workflow having to know which form pip-audit
+        # emits this week.  Last write wins on duplicates, but we surface
+        # the dup so the file stays clean.
+        ids = {primary_id, *(entry.get("aliases") or [])}
+        for ident in ids:
+            if ident in by_id and by_id[ident] is not entry:
</code_context>
<issue_to_address>
**issue (bug_risk):** Validate that `aliases` is a list of strings to avoid accidental character-splitting and non-string ids.

If `aliases` is given as a string or other iterable, this will iterate over its elements (e.g., characters) and add invalid IDs, and it also permits non-string values. Consider validating `aliases` before building `ids`:

```python
aliases = entry.get("aliases")
if aliases is None:
    aliases_list: list[str] = []
elif isinstance(aliases, list) and all(isinstance(a, str) for a in aliases):
    aliases_list = aliases
else:
    print(
        f"::error::pip-audit ignore entry #{index} in {ignores_path} 'aliases' must be a list of strings.",
        file=sys.stderr,
    )
    return None

ids = {primary_id, *aliases_list}
```

This keeps the ignore index well-formed and prevents unexpected matches from malformed `aliases`.
</issue_to_address>

### Comment 2
<location path="tools/check_pip_audit.py" line_range="154-155" />
<code_context>
+# Required keys per entry in the ignore file.  Missing any of them is a
+# policy violation (an undocumented suppression), so the gate fails closed
+# rather than silently accepting CVEs with no recorded justification.
+_IGNORE_REQUIRED_KEYS: frozenset[str] = frozenset(
+    {"id", "package", "reason", "threat_model", "verified_at", "reevaluate_after"}
+)
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider validating basic types/shape of required ignore fields, not just their presence.

The current check only verifies key presence, not that values are of the expected type/format. That allows cases like `verified_at` being an int/None or `package`/`reason`/`threat_model` being non-strings to slip through, weakening the guarantees this gate provides. Please add minimal validation (e.g., required fields are non-empty strings and `verified_at` roughly matches `YYYY-MM-DD`), and fail closed when values are invalid.

Suggested implementation:

```python
# Required keys per entry in the ignore file.  Missing any of them is a
# policy violation (an undocumented suppression), so the gate fails closed
# rather than silently accepting CVEs with no recorded justification.
_IGNORE_REQUIRED_KEYS: frozenset[str] = frozenset(
    {"id", "package", "reason", "threat_model", "verified_at", "reevaluate_after"}
)


def _validate_ignore_entry(entry: "Mapping[str, object]", *, location: str) -> None:
    """Validate the required keys and basic types/shape for an ignore entry.

    This is intentionally strict and fails closed: any missing or malformed
    required field is treated as a policy violation.
    """
    missing = _IGNORE_REQUIRED_KEYS.difference(entry)
    if missing:
        raise ValueError(
            f"{location}: ignore entry is missing required field(s): "
            f"{', '.join(sorted(missing))}"
        )

    def _require_non_empty_str(field: str) -> str:
        value = entry[field]
        if not isinstance(value, str) or not value.strip():
            raise TypeError(
                f"{location}: ignore entry field '{field}' must be a non-empty string"
            )
        return value.strip()

    # Required string fields
    for field in ("id", "package", "reason", "threat_model"):
        _require_non_empty_str(field)

    # Date-like fields must roughly match YYYY-MM-DD
    date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")
    for field in ("verified_at", "reevaluate_after"):
        raw = _require_non_empty_str(field)
        if not date_pattern.match(raw):
            raise ValueError(
                f"{location}: ignore entry field '{field}' must be in YYYY-MM-DD format"
            )


Used in ``.github/workflows/nightly.yml`` after the ``pip-audit`` step.

```

To fully wire this up you will also need to:

1. **Imports**
   - Add the following imports near the top of `tools/check_pip_audit.py`, alongside existing imports:
     - `import re`
     - `from collections.abc import Mapping`
   - Since the snippet above uses a forward reference `"Mapping[str, object]"` in the type hint, this will work even if `Mapping` hasn’t been imported yet at the point where the function is defined, but you still need the import for type checking and other uses.

2. **Call the validator**
   - Locate the code that loads/parses the ignore YAML file (likely something like `yaml.safe_load` producing a list/dict of ignore entries).
   - For each ignore entry, call:
     ```python
     _validate_ignore_entry(entry, location=f"{ignore_file_path}:{idx}")
     ```
     where `ignore_file_path` is the path or logical name of the ignore file, and `idx` is the entry index or some identifier you already have. The `location` string is just for clearer error messages; adapt as appropriate for your context.
   - Ensure that any exception raised by `_validate_ignore_entry` is allowed to propagate and cause the gate to fail (i.e., do not swallow it), so invalid values result in a failed check rather than silently passing.

3. **Optional refinement**
   - If `reevaluate_after` is intended to allow special sentinel values (e.g., `"never"`), adjust the date validation for that field accordingly (e.g., skip the regex check when the field equals the sentinel value) while still enforcing type and non-emptiness.
</issue_to_address>

### Comment 3
<location path="docs/reference/supply_chain_security.md" line_range="123-128" />
<code_context>
+
+Each entry in the YAML file must carry `id`, `package`, `reason`,
+`threat_model`, `verified_at`, and `reevaluate_after` (optional:
+`aliases`, `references`); missing any required field fails the gate
+closed, so an undocumented suppression cannot land silently. Every
+match is logged as a `::notice::` annotation in the run summary so
+the audit trail stays visible.
</code_context>
<issue_to_address>
**suggestion (typo):** Clarify the phrase "fails the gate closed" for better grammar and readability.

The clause "missing any required field fails the gate closed" is awkward and may confuse readers. Please rephrase to clearer English, e.g. "missing any required field causes the gate to fail closed" or similar.

```suggestion
Each entry in the YAML file must carry `id`, `package`, `reason`,
`threat_model`, `verified_at`, and `reevaluate_after` (optional:
`aliases`, `references`); missing any required field causes the gate
to fail closed, so an undocumented suppression cannot land silently.
Every match is logged as a `::notice::` annotation in the run summary
so the audit trail stays visible.
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread tools/check_pip_audit.py
Comment thread tools/check_pip_audit.py Outdated
Comment thread docs/reference/supply_chain_security.md Outdated
…view)

Address review feedback on the pip_audit_ignores.yaml loader:

- Validate `aliases` is a list of strings before unpacking it into the
  id index. Previously `aliases: CVE-2025-2953` (a bare string) would
  be iterated character-by-character, polluting the index with
  single-char "ids" and silently breaking matching; a non-iterable
  would crash. Now fails closed with a clear ::error::.

- Validate value shape of required fields, not just presence: id,
  package, reason, threat_model, reevaluate_after must be non-empty
  strings, and verified_at must be a 'YYYY-MM-DD' string (quote it in
  YAML so it is not parsed as a datetime.date). An empty reason was a
  documented-justification gap the old presence-only check let through.

  Note: reevaluate_after is deliberately NOT date-validated — it is a
  free-text retirement condition ("Each release cycle, or when …"), so
  the review's suggested date-regex on it would have rejected every
  real entry. Date validation applies to verified_at only.

- Hoist `ignore_ids = set(ignores)` once in _bucket_findings instead of
  intersecting against `ignores.keys()` per finding.

- Doc grammar: "fails the gate closed" -> "causes the gate to fail
  closed" in both EN mirrors, and document the new malformed-value
  rejection in the reference + user-manual (EN + TR).

Adds 7 tests: aliases-as-string, aliases-with-non-string-element,
null-aliases-accepted, empty-required-string, malformed verified_at,
unquoted-YAML-date rejection, and free-text reevaluate_after accepted.
23 tests pass.

Refs: #59 (review)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

@cemililik cemililik merged commit a2bdb82 into main May 22, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant