Add CodeQL static analysis to CI for automated security scanning #8

@atkaridarshan04

What problem does this solve?

There is currently no automated security analysis on pull requests. As the codebase grows and new contributors add code, entire classes of vulnerabilities (path traversal, unsafe deserialization, sensitive data in logs) can be introduced and merged without any automated check catching them.

This is a process gap, not a specific known bug. CodeQL closes it by running static analysis on every PR and flagging dangerous data-flow patterns before they reach main.

This project has specific areas where this matters:

  • CLI writes files to paths derived from user input (--name and --version map to models/<name>/<version>/): path traversal class
  • CLI loads .pkl files via pickle and runs subprocess-based artifact inspection: unsafe deserialization and command injection classes
  • Auth, API key parsing, and rate limiting logic: sensitive data handling class

Proposed solution

Add .github/workflows/codeql.yml using GitHub's native github/codeql-action. This is free for public repositories, requires no external service or account, and runs entirely within GitHub Actions.

What it does on every PR and push to main:

  • Analyses Python source for known vulnerability patterns using CodeQL's standard security query suite
  • Posts findings directly on the PR as inline annotations
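  • Fails the check if high or critical severity findings are present (this threshold is a code scanning setting on the repository, not something the workflow file controls)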

Files to add:

| File | What |
| --- | --- |
| .github/workflows/codeql.yml | CodeQL analysis workflow: github/codeql-action/init@v3 configured with the python language and the security-extended query suite, followed by github/codeql-action/analyze@v3 |
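
A minimal sketch of what this workflow file could look like. The branch name, job name, cron time, and the actions/checkout version are assumptions for illustration, not settled choices; the language and query suite are set on the init step, and analyze runs the queries and uploads the results.

```yaml
# .github/workflows/codeql.yml (sketch; names and schedule are assumptions)
name: CodeQL

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: "30 3 * * 1"  # weekly, Mondays 03:30 UTC (arbitrary choice)

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # needed to check out the repository
      security-events: write  # needed to upload results to code scanning
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: python
          queries: security-extended

      - name: Run analysis and upload results
        uses: github/codeql-action/analyze@v3
        with:
          category: "/language:python"
```

Python needs no build step, so no autobuild step is included here. Whether a finding fails the PR check is decided by the repository's code scanning settings rather than by anything in this file.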

Notes:

  • Use the security-extended query suite rather than security-and-quality. security-and-quality is a superset that adds maintainability and reliability queries; security-extended covers the vulnerability classes relevant to this project without that noise.
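  • CodeQL will flag subprocess calls in the inspector and validator as potential injection points. These calls are intentional, so each alert should be reviewed and then dismissed with a recorded reason, not fixed blindly. As far as I know, GitHub code scanning does not act on inline comments such as # codeql-suppress out of the box; alerts are dismissed in the code scanning UI, or a query can be excluded through query-filters in a CodeQL config file.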
  • Schedule a weekly scan in addition to PR triggers (schedule: cron) — catches newly published queries against existing code.
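
For the subprocess alerts, one option besides dismissing them in the UI is a CodeQL config file with a query filter. The sketch below is an assumption-heavy illustration: the file path is hypothetical, and py/command-line-injection is only a guess at which rule will fire, so check the rule ID shown on the actual alerts first.

```yaml
# .github/codeql/codeql-config.yml (hypothetical path)
name: "CodeQL config (sketch)"

queries:
  - uses: security-extended

# Blunt instrument: this drops the query for the whole repository,
# not just for the reviewed subprocess calls.
query-filters:
  - exclude:
      id: py/command-line-injection  # assumed rule ID; confirm against real alerts
```

The file would be referenced from the init step through its config-file input. Because the filter applies repository-wide, dismissing individual alerts after review is the more precise choice if only a handful of call sites are affected.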

Alternatives considered

No response

Area

Deployment / infrastructure

Metadata

Labels

enhancement: New feature or request
