Skip to content

ci(security): add advanced CodeQL setup with path-scoped query filters#860

Merged
jrusso1020 merged 1 commit into
mainfrom
05-15-ci_security_add_advanced_codeql_setup_with_path-scoped_query_filters
May 15, 2026
Merged

ci(security): add advanced CodeQL setup with path-scoped query filters#860
jrusso1020 merged 1 commit into
mainfrom
05-15-ci_security_add_advanced_codeql_setup_with_path-scoped_query_filters

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 commented May 15, 2026

What

Switch CodeQL from default setup to advanced setup, with two new files:

  • .github/workflows/codeql.yml — workflow scanning javascript-typescript, python, and actions (mirroring what default setup was configured with).
  • .github/codeql/codeql-config.yml — per-rule path filters that silence the known-noisy rules on the file shapes where they're always false positives, without excluding those paths from analysis entirely.

Stacked on top of #856 (the actual code fixes for the critical + 2 medium findings).

Why

The CodeQL dashboard hit 170 open alerts, and ~157 of them were duplicates of the same 4 patterns across either generated test fixtures or functional regex that CodeQL misreads as sanitizers. That noise level made it nearly impossible to spot a real finding land — which is exactly the regression mode we don't want for an OSS repo that takes external contributions.

Naive fix: paths-ignore for packages/producer/tests/. That would also blind CodeQL to a malicious contributor smuggling a real execSync(user_input) into a "test fixture." Wrong tradeoff.

Better fix: silence specific rules on specific paths, scoped narrowly enough that anything outside the pattern still trips the alarms.

How

The config has four rule-scoped exclusions, with comments explaining why each path-shape is genuinely a false positive for that one rule:

Rule Excluded paths Why
js/incomplete-sanitization packages/producer/tests/**/output/compiled.html, packages/producer/tests/**/failures/*.html Generated golden baselines, not shipped
js/functionality-from-untrusted-source packages/producer/tests/**, skills/**/test-corpus/**, skills/**/assets/test-corpus/** CDN scripts intentionally unpinned in test fixtures
js/bad-tag-filter packages/cli/src/capture/index.ts, packages/cli/src/whisper/normalize.ts, packages/core/src/lint/utils.ts, packages/producer/src/services/htmlCompiler.ts Functional HTML cleanup regex, output goes to Puppeteer not a user-facing DOM
js/incomplete-multi-character-sanitization packages/cli/src/capture/index.ts, packages/cli/src/whisper/normalize.ts Text normalization, no DOM emission

Key property: rules stay on for everything else. js/command-line-injection, js/code-injection, js/xss-through-dom, etc. all still scan packages/producer/tests/ — so a real exploit slipped into a test fixture would still get caught.

The two cleanup-regex files are named individually, not by glob, so any new file in cli/, core/, or producer/ that looks like a sanitizer still trips the rules. Loosening a path means touching this file — which shows up in PR review the same way a CODEOWNERS change does.

Required action before merge

This repo is currently on CodeQL default setup. Both default and advanced setup can't run simultaneously — a maintainer needs to disable default setup before this workflow can take effect:

  1. Go to Settings → Security → Code scanning.
  2. Find the existing "CodeQL analysis" default-setup config and click Set up → Advanced (or "Switch to advanced setup").
  3. GitHub will note that the repo already has an advanced workflow committed.
  4. Confirm and merge this PR.

If we merge before doing the toggle, the workflow file is harmless (it'll just sit unrun) — but the path-filtered scan won't actually start until default setup is off.

Companion: bulk dismissal of existing noise

In tandem with this PR, 157 of the 170 existing alerts were dismissed via the GitHub Code Scanning API with per-category reason strings:

  • 145 dismissed as used in tests (test fixtures, golden baselines, *.test.ts files, perf test scenarios)
  • 11 dismissed as false positive (functional regex, scheme check that already filters data:/javascript:)

Each dismissal carries a one-line audit comment visible on the Security tab. Open count: 170 → 13. The 13 that remain:

  • 3 close automatically when fix(security): close CodeQL critical command-line-injection and bad-code-sanitization #856 merges (the actual code fixes)
  • 9 js/polynomial-redos (real but lower-risk — build-time regex on developer-authored input, except studio-api/routes/preview.ts which is server-facing and worth a closer look)
  • 1 js/incomplete-sanitization in studio/src/captions/parser.ts:283 (real but low-risk single-quote → JSON normalization for caption pastes)

These were intentionally left open as a small to-do list visible on the dashboard rather than buried in a comment thread.

Test plan

  • bunx oxfmt --check clean on both new files.
  • Action SHAs verified against gh api repos/{actions/checkout,github/codeql-action}/git/refs/tags/{v4,v3} — using same checkout SHA the rest of the repo pins (34e114876b…) and the resolved tip of codeql-action@v3 (7fd177fa68…).
  • CodeQL workflow run on this PR confirms the config parses and the matrix executes (this is the gate that proves nothing's broken).
  • After merge + default-setup toggle: re-scan should produce only 13 open alerts, and any new PR that adds a non-test-path command-injection or xss sink should still trip CodeQL.

What this PR is not doing

  • Not changing what queries run (security-extended was already enabled by default setup).
  • Not adding paths-ignore — every path is still analyzed, only specific rules are silenced where they're known false-positive.
  • Not auto-resolving alerts — dismissals were a one-time API action with comments, separate from this PR.

🤖 Generated with Claude Code

Copy link
Copy Markdown
Collaborator Author

jrusso1020 commented May 15, 2026

Base automatically changed from 05-15-fix_security_close_codeql_critical_bad-code-sanitization to main May 15, 2026 06:50
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, well-scoped security policy change. Approved.

Verified:

  • actions/checkout SHA (34e114876b…) matches all 9 existing workflows in the repo
  • All 4 excluded source files exist on main: capture/index.ts, whisper/normalize.ts, lint/utils.ts, htmlCompiler.ts
  • Test fixture globs match real paths (packages/producer/tests/**/output/compiled.html, skills/**/test-corpus/**)
  • No existing CodeQL workflow or config in .github/ — this is additive, no conflict
  • query-filters use per-rule exclude + exact file paths (not directory globs) for source files, so new files in the same directories still get scanned

Design:
The key insight — silencing specific rules on specific paths rather than paths-ignore on whole directories — is the right tradeoff. A malicious execSync(user_input) in a test fixture still trips js/command-line-injection because only the false-positive rules are filtered out. Good separation of the config file from the workflow file too (policy changes show up in codeql-config.yml diffs).

One note for the maintainer toggle: the PR body documents this well, but worth double-checking that the schedule trigger (cron: "39 14 * * 1") fires correctly after the default→advanced switch — sometimes GitHub requires a manual workflow dispatch or a push to main before scheduled runs start.

Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean security-policy change with the right tradeoff (per-rule path filters, not blanket paths-ignore). Magi already covered the structural review (SHA pin matches the rest of the repo, all 4 source files + fixture globs exist on main, additive — no conflict with existing .github/codeql/ setup, and the design rationale). Adding a few gap-fill findings rather than restating his points.

Strengths

  • .github/workflows/codeql.yml:18-22 — least-privilege permissions: block (only security-events: write; everything else read). This is the right shape for code-scanning workflows and matches what GitHub's hardening guide recommends; easy to get wrong on hand-written advanced setup.
  • .github/codeql/codeql-config.yml:22-28 and :46-50 — scoping the two cleanup-regex rule exclusions to exact file paths rather than directory globs is the load-bearing detail. A new file under packages/cli/src/ that looks like a sanitizer will still trip CodeQL, so the policy doesn't silently widen as the codebase grows.
  • Stacking on #856 (the actual code fixes) is the right ordering — #856 is already merged at 06:50:25Z, so the 3 alerts referenced as "close automatically when #856 merges" will resolve on the first advanced-setup scan rather than being open noise at switchover.

important

  • Independent SHA verification of github/codeql-action@7fd177fa680c9881b53cdab4d346d32574c9f7f4 — confirmed against gh api repos/github/codeql-action/git/refs/tags/v3: the v3 annotated tag dereferences to exactly that commit (committed 2026-05-08 on backport-v3.35.4). So the pin is live-correct, not just internally consistent. Worth recording because the next time v3 rolls, this pin needs a follow-up. Consider adding a comment in the workflow naming the v3 sub-version (v3.35.4 per the merge commit message) so the bump cadence is grep-able.
  • Unchecked test-plan items #3 + #4 in the PR body are the actual gate. "CodeQL workflow run on this PR confirms the config parses and the matrix executes" — that's still a [ ]. Item #4 ("re-scan should produce only 13 open alerts") is post-merge so it can stay unchecked, but item #3 needs to land green before merge — otherwise a malformed query-filters: key (e.g. a typo'd rule id, or a path glob the parser rejects) sails through. Easy mitigation: a dry-run on this PR's commit before flipping the default→advanced toggle, or pre-flighting the YAML against CodeQL's schema. Today's mergeable_state is blocked because required CI checks are still in progress (Typecheck, Test, CLI smoke, Windows render) — none failed — but none of them exercise this workflow file.

nit

  • The skills/**/test-corpus/** glob (without the assets/ infix) in the js/functionality-from-untrusted-source exclusion currently matches no path on main — the only test-corpus that exists is skills/remotion-to-hyperframes/assets/test-corpus/ (matched by the next line). Harmless and forward-looking, but the PR-body table presents both as active surface; either one is fine and could be dropped, or both kept for the explicit "we cover both layouts" intent. Cite if the second variant is intentionally pre-provisioned for an upcoming skill.

notes (not actionable on this PR)

  • Rule 7 scan: the local composite at .github/actions/install-ffmpeg-windows/action.yml does Invoke-WebRequest on a BtbN/FFmpeg-Builds zip without SHA256 verification on the downloaded artifact. Pre-existing, not introduced here — out of scope for this PR — but worth a follow-up ticket on the same security-hardening theme. A pinned URL alone doesn't prevent a tag/asset rebuild on the upstream side.
  • The CodeQL workflow's schedule: cron: "39 14 * * 1" does not fire on forks/branches — it only runs on the default branch. Magi's note about possibly needing a manual workflow_dispatch after the default→advanced toggle is right; consider adding workflow_dispatch: to the on: block so the toggle can be tested without waiting for Monday or a main push.

Verdict: APPROVE
Reasoning: Policy file is well-scoped, SHA pins are independently verified, design tradeoff (per-rule path filter vs. paths-ignore) is the right one. The two important items are operational follow-ups, not code defects; the nit is forward-looking. No blockers.

Review by Vai

@jrusso1020 jrusso1020 merged commit 952d265 into main May 15, 2026
26 checks passed
@jrusso1020 jrusso1020 deleted the 05-15-ci_security_add_advanced_codeql_setup_with_path-scoped_query_filters branch May 15, 2026 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants