[codex] Add DeepSec PR scanning#2110

Open
riderx wants to merge 7 commits into main from codex/setup-deepsec-pr-gate

riderx (Member) commented May 10, 2026

Summary (AI generated)

  • Add a .deepsec workspace for Capgo with project-specific scan context and priority paths.
  • Add a DeepSec PR workflow that scans same-repo and fork PR diffs through pull_request_target without executing PR code.
  • Use the existing OPENAI_API_TOKEN secret through a local OpenAI proxy; no AI Gateway upstream or token is required.
  • Route fork PR scans through a separate deepsec-fork-pr environment so they can be maintainer-approved/on-demand while still producing the mandatory Scan PR changes check.

Motivation (AI generated)

DeepSec should block net-new security issues in PRs without requiring contributors to remember a manual scan step. Fork PRs are the highest-risk contribution path, so the workflow must create the same required check there while keeping secrets behind an approval gate and a local proxy.

Business Impact (AI generated)

This adds an automated security gate for changes to Capgo's backend, Cloudflare Workers, database, frontend services, and CLI code. It should reduce the chance of shipping auth, storage, update-delivery, or credential-handling regressions.

Test Plan (AI generated)

  • bun install --frozen-lockfile in .deepsec
  • bun --cwd .deepsec deepsec scan --project-id capgo
  • Focused DeepSec process pass on .github/workflows/deepsec.yml and .github/scripts/openai-token-proxy.mjs returned zero findings
  • Proxy smoke test for health, unauthorized request rejection, unknown path rejection, and model-policy rejection
  • git diff --check
  • go run github.com/rhysd/actionlint/cmd/actionlint@latest .github/workflows/deepsec.yml
  • bunx eslint .github/scripts/openai-token-proxy.mjs
  • bun lint
  • Commit hook ran bun run cli:build && vue-tsc --noEmit
  • PR CI is passing after the latest review-fix commit

Checklist (AI generated)

  • Code style follows the repository conventions.
  • Documentation was updated for the DeepSec setup and workflow.
  • Security validation covered the CI proxy, fork approval gate, and workflow scan target handling.
  • Manual smoke testing covered the local proxy health, auth, path, and model rejection behavior.

Summary by CodeRabbit

  • New Features

    • Integrated automated DeepSec security scanning into PR workflows with safe handling and automated findings posted to PRs.
  • Documentation

    • Added workspace and project setup guides, project-specific security review notes, and onboarding/setup instructions for scanning.
  • Chores

    • Configured scan defaults, prioritized scan paths, CI workflow protections, local request-proxying to limit/validate scanning requests, and workspace packaging metadata.

coderabbitai Bot (Contributor) commented May 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds DeepSec security scanning infrastructure: a Bun workspace with DeepSec config, project threat model and guidance, operational docs and setup, a local OpenAI proxy that enforces budgets and request validation, and a GitHub Actions workflow that runs sanitized scans on pull requests and publishes findings.

Changes

DeepSec Security Scanning Setup

| Layer / File(s) | Summary |
| --- | --- |
| Workspace and Project Foundation: .deepsec/package.json, .deepsec/deepsec.config.ts, .deepsec/data/capgo/config.json, .deepsec/.gitignore | Package workspace pinned to deepsec@^2.0.8 with Bun metadata, root config declaring capgo project and codex agent, priorityPaths for scan focus, and gitignore rules for generated scan outputs while preserving INFO.md and SETUP.md. |
| Capgo Security Threat Model and Review Guidance: .deepsec/data/capgo/INFO.md | Documents repo components and expected authentication middleware shapes (middlewareAuth, middlewareV2, middlewareKey, middlewareAPISecret), threat model focused on unauthorized mutations and cross-tenant access, review patterns to flag (auth, DB/RLS, plugin endpoints, storage derivation), and known false-positives. |
| DeepSec Operational Documentation: .deepsec/AGENTS.md, .deepsec/README.md, .deepsec/data/capgo/SETUP.md | Workspace and agent documentation, setup steps, daily command sequence (scan, process, revalidate, export), CI vs local token guidance (OPENAI_API_TOKEN / OPENAI_API_KEY / codex login), PR-check behavior, and per-project setup guide. |
| AI Gateway Proxy and Token Management: .github/scripts/openai-token-proxy.mjs | Local HTTP proxy executable enforcing client authorization, restricting endpoints to /v1/chat/completions and /v1/responses, validating models and capping output-token fields, applying per-request and global byte/count budgets, and streaming upstream responses with timeout/abort handling. |
| GitHub Actions DeepSec Workflow: .github/workflows/deepsec.yml | pull_request_target workflow that validates SHAs, computes changed files, builds a sanitized scan-target/ (path safety and exclusion rules), installs DeepSec deps, starts the local proxy, runs bun deepsec process inside a container with proxy environment, uploads comment.md artifact, and upserts a PR comment with a fixed marker. |

Sequence Diagrams

sequenceDiagram
  participant GitHubWF
  participant DeepSec
  participant Proxy
  participant OpenAI
  GitHubWF->>DeepSec: Start deepsec process (container) with PROXY env
  DeepSec->>Proxy: POST /v1/chat/completions (Authorization: Bearer clientToken)
  Proxy->>Proxy: Validate client token & model, cap output tokens
  Proxy->>OpenAI: Forward request with OpenAI token
  OpenAI-->>Proxy: Stream response
  Proxy-->>DeepSec: Stream capped response back
sequenceDiagram
  participant PR as Pull Request
  participant WF as GitHub Actions
  participant Scanner as Trusted Scanner Checkout
  participant Target as PR Checkout (scan-target)
  participant Proxy as AI Proxy
  participant DeepSec
  participant Comments as PR Comments
  WF->>Scanner: Checkout scanner at base SHA
  WF->>Target: Checkout PR head and build sanitized scan-target
  WF->>Proxy: Start proxy and wait for /health
  WF->>DeepSec: Run deepsec process in container (proxy configured)
  DeepSec->>Proxy: Request completions/responses
  Proxy->>DeepSec: Stream capped responses
  DeepSec->>WF: Emit comment.md
  WF->>Comments: Upload artifact and upsert PR comment with marker

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A rabbit read the repo’s needs and crept,

Through configs, docs, and proxies it leapt,
It wired a scanner, tidy and keen,
That builds a safe target and keeps PRs clean,
Hopping home with badges bright and green.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[codex] Add DeepSec PR scanning' clearly and concisely summarizes the main change: introducing DeepSec-based PR scanning infrastructure. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering summary, motivation, business impact, a detailed test plan with completed checkmarks, and a complete checklist addressing code style, documentation updates, security validation, and manual testing. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

codspeed-hq Bot (Contributor) commented May 10, 2026

Merging this PR will not alter performance

✅ 43 untouched benchmarks
⏩ 2 skipped benchmarks [1]


Comparing codex/setup-deepsec-pr-gate (7780420) with main (7e89b68)

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@riderx riderx marked this pull request as ready for review May 10, 2026 23:10
chatgpt-codex-connector commented

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
.github/workflows/deepsec.yml (1)

83-93: 💤 Low value

Consider adding a brief diagnostic message before failing.

If the proxy fails to start within the 5-second polling window, the log file is dumped, but there's no explicit message indicating what went wrong. Adding a brief diagnostic could help with troubleshooting.

Suggested improvement
           if ! curl -fsS http://127.0.0.1:8787/health >/dev/null 2>&1; then
+            echo "AI gateway proxy failed to start within 5 seconds."
             cat "$RUNNER_TEMP/ai-gateway-proxy.log"
             exit 1
           fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/deepsec.yml around lines 83 - 93, When the health check
loop times out, add a concise diagnostic log message before dumping the log and
exiting: detect the failure branch after the retry loop that checks
http://127.0.0.1:8787/health and emit a clear message (e.g., "ai-gateway-proxy
failed to become healthy after 5s; dumping log at
$RUNNER_TEMP/ai-gateway-proxy.log") to stderr or echo so the reason is explicit,
then proceed to cat "$RUNNER_TEMP/ai-gateway-proxy.log" and exit 1; update the
failure branch around the existing curl check and log file dump accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.deepsec/AGENTS.md:
- Around line 12-14: Update the command example that currently reads "deepsec
init-project <root>" to use the workspace-consistent invocation "bun deepsec
init-project <root>" (matching the pattern used in `.deepsec/README.md`); search
for the exact token "deepsec init-project <root>" and replace it with "bun
deepsec init-project <root>" so documentation is consistent across files.

In @.deepsec/README.md:
- Around line 53-64: The fenced code block in .deepsec/README.md is missing a
language tag (triggering MD040); update the opening fence from ``` to include a
language like ```text so the block becomes fenced with a language specifier;
modify the fenced block that starts at the snapshot containing
"deepsec.config.ts        Project list (one entry per scanned repo)" to use
```text and keep the rest of the block unchanged.

In @.github/scripts/ai-gateway-proxy.mjs:
- Around line 83-110: The upstream fetch (the call that assigns upstreamResponse
using fetch(upstreamUrl, { method: req.method, headers, body })) must be made
cancellable and timeout-able: create an AbortController, pass its signal to
fetch, set a configurable timeout from an env var (e.g.,
AI_GATEWAY_UPSTREAM_TIMEOUT_MS defaulting to 30000) that calls
controller.abort() when fired, and register req.on('close', ...) to call
controller.abort() if the client disconnects; ensure you clear the timeout timer
on both successful response/stream completion and on errors, and use the same
controller to abort the reader/stream handling (references: upstreamResponse,
fetch(..., { signal }), upstreamUrl, headers, body, res, reader).
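
A minimal sketch of the cancellation wiring that the last inline comment asks for. This is not the proxy's actual code: `createUpstreamCall`, its option names, and the 30000 ms default are illustrative assumptions. Only `AbortController`, `setTimeout`/`clearTimeout`, and the `signal` option of `fetch` are standard. Which object and event reliably signal a client disconnect depends on the runtime, so `client` is treated here as any emitter with `on`/`off`.

```javascript
// Hypothetical helper; names and defaults are assumptions, not taken from the PR.
// Returns a signal to pass to fetch() plus a cleanup function.
function createUpstreamCall({ timeoutMs = 30000, client = null } = {}) {
  const controller = new AbortController();

  // Abort the upstream request if it runs longer than the timeout.
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  // Abort as well when the client side signals that it went away.
  const onClose = () => controller.abort();
  if (client) client.on('close', onClose);

  // Call this once the response has been fully streamed, or on error.
  const cleanup = () => {
    clearTimeout(timer);
    if (client) client.off('close', onClose);
  };

  return { signal: controller.signal, cleanup };
}

// Usage shape inside an async request handler:
//   const call = createUpstreamCall({ timeoutMs, client: req });
//   try {
//     const upstreamResponse = await fetch(upstreamUrl, {
//       method: req.method, headers, body, signal: call.signal,
//     });
//     // ...stream upstreamResponse.body to res...
//   } finally {
//     call.cleanup();
//   }
```

Keeping the timer and the listener behind a single `cleanup` function makes it harder to leak either one on the error path, which is the failure mode the comment warns about.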


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b7d7c74f-a0c7-49af-b782-d1bfda9e1d3f

📥 Commits

Reviewing files that changed from the base of the PR and between cd9f618 and 00457af.

⛔ Files ignored due to path filters (1)
  • .deepsec/bun.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • .deepsec/.gitignore
  • .deepsec/AGENTS.md
  • .deepsec/README.md
  • .deepsec/data/capgo/INFO.md
  • .deepsec/data/capgo/SETUP.md
  • .deepsec/data/capgo/config.json
  • .deepsec/deepsec.config.ts
  • .deepsec/package.json
  • .github/scripts/ai-gateway-proxy.mjs
  • .github/workflows/deepsec.yml

Comment thread .deepsec/AGENTS.md Outdated
Comment thread .deepsec/README.md Outdated
Comment thread .github/scripts/ai-gateway-proxy.mjs Outdated
albercr3 commented

I think the workflow currently leaves the highest-risk PRs outside the new security gate. deepsec.yml runs on pull_request_target, checks out the trusted scanner from base.sha, and uses a local proxy so the PR code only gets passed to DeepSec as target files. But the analyze job is gated with:

if: github.event.pull_request.head.repo.full_name == github.repository

So every forked contributor PR skips the scan entirely, and the comment job also does nothing because needs.analyze.result is skipped, not failure. For an OSS repo, those fork PRs are the main untrusted contribution path and the place where a PR security scanner is most valuable.

If the scanner path is safe to run on fork content as data, I would remove that same-repo guard and keep using the base-branch checkout plus proxy hardening. If forks must remain excluded, I would add a visible fallback check/comment that says DeepSec was intentionally skipped for fork PRs; otherwise maintainers may assume the new gate covered a PR when it never ran.
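
The gap described above can be modeled in a few lines. This is an illustrative sketch of the two conditions quoted in the comment, not code from the workflow; the function names and the `owner/repo` and `contributor/repo` values are placeholders.

```javascript
// Models the gating described above; the function names are hypothetical.
// `scanOutcome` is what the analyze job would report if it actually ran.
function analyzeResult(pr, repository, scanOutcome) {
  // Mirrors: if: github.event.pull_request.head.repo.full_name == github.repository
  const sameRepo = pr.head.repo.full_name === repository;
  return sameRepo ? scanOutcome : 'skipped';
}

// Mirrors the comment job condition: needs.analyze.result == 'failure'
function commentJobRuns(result) {
  return result === 'failure';
}

const repository = 'owner/repo'; // placeholder
const forkPr = { head: { repo: { full_name: 'contributor/repo' } } };

// Even if the scan would have failed, a fork PR never reaches it.
const forkResult = analyzeResult(forkPr, repository, 'failure');
console.log(forkResult, commentJobRuns(forkResult)); // prints: skipped false
```

Because `skipped` is neither success nor failure, nothing downstream reports on the fork PR, which is why a visible fallback check or comment is being requested.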

KCDaemon left a comment

Rechecked the latest head (6883a57). The workflow is clean/green, but I think the current gating leaves the most important PR class outside the new DeepSec check.

deepsec.yml runs on pull_request_target and carefully checks out the trusted scanner from base.sha, but the analyze job has if: github.event.pull_request.head.repo.full_name == github.repository. That means every fork PR skips analyze entirely. The comment job also only runs when needs.analyze.result == 'failure', so a skipped fork scan produces no visible fallback comment/check explaining that DeepSec did not run.

For an OSS repo, fork PRs are the main untrusted contribution path and the place where a security PR scanner is most useful. If the scanner is safe because PR code is checked out as target data while secrets stay behind the local proxy, I would remove the same-repo guard. If fork scans must stay disabled, the workflow should create a visible skipped/fallback result so maintainers do not assume DeepSec covered those PRs.

git diff --check origin/main...origin/pr-2110 passes locally.

socket-security Bot commented May 15, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | deepsec@2.0.8 | 81 | 100 | 100 | 96 | 80 |

riderx (Member, Author) commented May 15, 2026

Addressed across 5f7d35d, 9203d1d, 932bd22, and 7780420.

  • Fork PRs now create the same Scan PR changes job. Forks route through the deepsec-fork-pr environment, so required reviewers can make the scan maintainer-approved/on-demand while branch protection can still require the check.
  • Removed the AI Gateway path entirely. The workflow uses the existing OPENAI_API_TOKEN secret through a local OpenAI proxy; DeepSec only receives a per-run local token.
  • Removed untrusted PR checkout from pull_request_target. The workflow installs DeepSec from the trusted base checkout, gets the changed file list from GitHub PR files API, fetches the PR head commit by SHA, and copies regular blobs into scan-target.
  • Hardened the OpenAI proxy with strict request schemas, model/path allowlists, output-token caps for both max_tokens and max_completion_tokens, request/response budgets, timeout/cancel handling, and rejection of cost-multiplying built-in tools/options.
  • Added per-file and cumulative scan-target blob limits, and made the DeepSec findings comment reconcile on clean runs so stale finding comments are removed.
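
The last bullet combines two small pieces of logic, sketched below under stated assumptions. The workflow itself is described in review as doing the size checks in shell; this JavaScript version only illustrates the decision logic. `selectWithinBudget`, `planCommentAction`, the input shapes, and the byte caps used in the example are hypothetical. The marker string is the one quoted in the review comments.

```javascript
// Hypothetical sketch: pick files for scan-target under a per-file cap
// and a cumulative cap. `files` is assumed to be [{ path, size }] with
// sizes already looked up (for example from the blob sizes in git).
function selectWithinBudget(files, { maxFileBytes, maxTotalBytes }) {
  const selected = [];
  const skipped = [];
  let totalBytes = 0;
  for (const { path, size } of files) {
    if (size > maxFileBytes) {
      skipped.push({ path, reason: 'file exceeds per-file cap' });
      continue;
    }
    if (totalBytes + size > maxTotalBytes) {
      skipped.push({ path, reason: 'would exceed cumulative cap' });
      continue;
    }
    totalBytes += size;
    selected.push(path);
  }
  return { selected, skipped, totalBytes };
}

const MARKER = '<!-- capgo-deepsec-findings -->';

// Hypothetical sketch: decide what to do with the marker comment.
// `findingsBody` is the content of comment.md, or null on a clean run.
function planCommentAction({ findingsBody, existingComments }) {
  const existing = existingComments.find((c) => c.body.includes(MARKER));
  if (findingsBody) {
    const body = `${MARKER}\n${findingsBody}`;
    return existing
      ? { action: 'update', id: existing.id, body }
      : { action: 'create', body };
  }
  // Clean run: remove a stale findings comment if one exists.
  return existing ? { action: 'delete', id: existing.id } : { action: 'none' };
}
```

In this sketch an oversized file is skipped rather than failing the run, and a later, smaller file can still fit after an earlier one was rejected for the cumulative cap. Whether the real workflow skips, warns, or fails in those cases is not stated here.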

Local validation passed: the focused DeepSec process run returned 0 findings after the latest workflow/proxy changes; the proxy smoke test covered auth, path, model, and cost-policy rejection; and git diff --check, actionlint, bunx eslint .github/scripts/openai-token-proxy.mjs, bun lint, and the commit hook all passed. PR CI is green, all review threads are resolved, and CodeRabbit no longer shows a review in progress.

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/scripts/openai-token-proxy.mjs:
- Around line 75-78: The proxy currently only caps 'max_completion_tokens' (via
capTokenField) for non-/v1/responses requests, letting clients pass 'max_tokens'
for chat completions; update the else branch where pathname !== '/v1/responses'
to call capTokenField(json, 'max_tokens') before capTokenField(json,
'max_completion_tokens') so the proxy enforces the output-token budget for
OpenAI Chat Completions requests (refer to the pathname variable and
capTokenField function to locate the change).

In @.github/workflows/deepsec.yml:
- Around line 41-55: The current step builds the changed-files list using
BASE_SHA (github.event.pull_request.base.sha), which is the branch tip at PR
creation and can cause incorrect diffs; replace the left side of the diff with
the PR merge base instead: compute MERGE_BASE by calling git merge-base against
the target branch tip (e.g., origin/${{ github.base_ref }} and HEAD) or use the
GitHub PR files API to enumerate files, then use that MERGE_BASE in the git diff
command that writes to changed-files.txt (update references to BASE_SHA in the
git cat-file/fetch/diff logic accordingly and remove the brittle
fetch-from-origin fallback).
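
The first inline comment above (output-token capping) can be sketched as follows. This is an assumption-laden illustration, not the proxy's implementation: only the name `capTokenField`, its `(json, field)` call shape, and the `/v1/responses` path check come from the review text. The cap value, the validation rule, the wrapper `capOutputTokens`, and the use of `max_output_tokens` for the Responses path are assumptions.

```javascript
// Hypothetical budget; the real value is not stated in the PR discussion.
const MAX_OUTPUT_TOKENS = 4096;

// Clamp one output-token field in place if the client supplied it.
function capTokenField(json, field, cap = MAX_OUTPUT_TOKENS) {
  const value = json[field];
  if (value === undefined) return;
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`${field} must be a positive integer`);
  }
  json[field] = Math.min(value, cap);
}

// Apply the cap to every output-token field the endpoint accepts.
function capOutputTokens(pathname, json) {
  if (pathname === '/v1/responses') {
    capTokenField(json, 'max_output_tokens'); // assumed field for this path
  } else {
    // Chat Completions accepts both fields, so both must be capped.
    capTokenField(json, 'max_tokens');
    capTokenField(json, 'max_completion_tokens');
  }
  return json;
}
```

Note that this sketch leaves a request uncapped when it omits every token field; a stricter proxy could inject a default in that case.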

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d115cb30-7fbf-48b9-bcd4-8740d6cc1dcd

📥 Commits

Reviewing files that changed from the base of the PR and between 6883a57 and 5f7d35d.

📒 Files selected for processing (3)
  • .deepsec/README.md
  • .github/scripts/openai-token-proxy.mjs
  • .github/workflows/deepsec.yml
✅ Files skipped from review due to trivial changes (1)
  • .deepsec/README.md

Comment thread .github/scripts/openai-token-proxy.mjs Outdated
Comment thread .github/workflows/deepsec.yml Outdated
Comment thread .github/workflows/deepsec.yml Fixed
Comment thread .github/workflows/deepsec.yml Fixed
coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/deepsec.yml:
- Around line 217-235: The job 'comment' is currently gated by "if: always() &&
needs.analyze.result == 'failure'", which prevents reconciliation when analyze
later succeeds; remove the "needs.analyze.result == 'failure'" part so the
'comment' job runs on every analyze completion (keep always()), keep the
existing step with id download-comment to fetch the artifact and keep the Upsert
PR comment step conditioned on "if: steps.download-comment.outcome == 'success'"
to upsert when comment.md exists, and add a new step (e.g., "Clear PR comment"
or similar) that runs when download-comment.outcome != 'success' to delete or
clear the existing <!-- capgo-deepsec-findings --> marker comment (use the same
permissions and PR write actions) so the marker is removed when no comment.md
artifact was produced.
- Around line 111-128: The script currently copies every regular file into
scan-target via git -C target show "$HEAD_SHA:$file" without size checks; before
writing each blob, use git -C target cat-file -s "$HEAD_SHA:$file" to get the
blob size and enforce a per-file cap (e.g., MAX_FILE_BYTES) and track a
cumulative counter (e.g., total_bytes) to enforce a total cap (e.g.,
MAX_TOTAL_BYTES); if the file exceeds per-file or would push total over the cap,
skip it and log a warning, and only write the blob and append to scan-files.txt
when both checks pass (affecting the commands around git -C target show
"$HEAD_SHA:$file", mkdir -p "scan-target/$(dirname "$file")", and printf '%s\n'
"$file" >> scan-files.txt).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d89e2e81-bb2c-40ed-9a30-c1feed35a179

📥 Commits

Reviewing files that changed from the base of the PR and between 9203d1d and 932bd22.

📒 Files selected for processing (3)
  • .deepsec/README.md
  • .github/scripts/openai-token-proxy.mjs
  • .github/workflows/deepsec.yml
✅ Files skipped from review due to trivial changes (1)
  • .deepsec/README.md

Comment thread .github/workflows/deepsec.yml
Comment thread .github/workflows/deepsec.yml

4 participants