feat(risk): default to recent-N drain; opt-in full backfill #2889

Merged
mfbx9da4 merged 11 commits into main from risk-analysis-limit-backfill on May 18, 2026

Conversation

@mfbx9da4
Contributor

@mfbx9da4 mfbx9da4 commented May 17, 2026

Closes AGE-2212 and AGE-2378.

Why

Every new chat message and every policy edit triggered a full drain of all unanalyzed messages for that policy. The work isn't redundant (the query already filters by risk_policy_version), but it's expensive: each message runs through gitleaks + Presidio + optional prompt-injection scanning, and a single new message could kick off thousands of analyses on the hot path.

AGE-2212 asks us to default the ingest path to "scan recent traffic, not all history". Operators who want a historical re-scan opt in explicitly from the Progress tab.

What changed

  • Default recent-N drain. Ingest (OnMessagesStored), CreateRiskPolicy, and UpdateRiskPolicy cap their drain at the most recent 100 unanalyzed messages (DefaultRecentMessagesBudget).
  • Explicit backfill. TriggerRiskAnalysis accepts an optional limit (defaults to 100). Pass 0 to backfill everything. The Progress sheet has both Backfill all messages (sends 0) and Backfill last N with a number input. Each click still bumps the policy version (re-analysis semantic preserved); successive bounded clicks therefore replace rather than accumulate, which matches "trigger means re-do".
  • Workflow budget. Adds DrainRiskAnalysisParams.MaxMessages and SignalNewMessagesPayload.MaxMessages. A run with MaxMessages>0 fetches up to that many messages and stops; MaxMessages=0 drains until no unanalyzed messages remain. Signals that arrive during a bounded run escalate the budget of the next cycle (0 wins; larger positive wins over smaller).
  • No self-loop on SignalWithStart. SignalWithStart leaves the triggering signal in the channel for the new run. The workflow now drains it at the top so it isn't mistaken for "new work arrived" at end-of-cycle. Without this, bounded runs would ContinueAsNew forever with the same params. Covered by TestDrainWorkflow_StartSignalDoesNotSelfLoop (fails on the pre-fix code, passes with the fix) and TestDrainWorkflow_SignalDuringDrainContinuesAsNew (mid-cycle escalation still works).
  • Parallel signaling. OnMessagesStored fans out per-policy signals so the chat hot path isn't N × 20ms.
  • Index-only scan. FetchUnanalyzedMessageIDs orders by cm.id DESC. uuidv7 is k-sortable so the existing composite (project_id, id) index satisfies the order via Index Only Scan Backward — no Sort node, no new index. EXPLAIN ANALYZE: LIMIT 100 over 15k rows runs in ~2ms.
  • Auto-naming (AGE-2378). generatePolicyName translates sources to user-facing category labels before prompting the model, and the prompt instructs the model not to mention internal tool or library names. fallbackPolicyName uses the same mapping.
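
The escalation rule from the Workflow budget item ("0 wins; larger positive wins over smaller") can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: `mergeBudget` is a hypothetical name, and it assumes budgets are non-negative integers where 0 means unbounded.

```go
package main

import "fmt"

// mergeBudget combines the budget of the current run with the budget carried
// by a newly arrived signal. Hypothetical helper for illustration only.
// A budget of 0 means "unbounded" and always wins; otherwise the larger
// positive budget wins. Budgets are assumed to be non-negative.
func mergeBudget(current, incoming int) int {
	if current == 0 || incoming == 0 {
		return 0
	}
	if incoming > current {
		return incoming
	}
	return current
}

func main() {
	fmt.Println(mergeBudget(100, 400)) // 400: larger positive wins
	fmt.Println(mergeBudget(400, 100)) // 400: smaller never shrinks the budget
	fmt.Println(mergeBudget(100, 0))   // 0: unbounded wins
}
```

Keeping the rule in one pure function makes it easy to unit test separately from the workflow plumbing.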

Demo

Default behavior + full backfill on a fresh policy

A new Demo Secrets Only policy is created via the API. The create signal fires the default recent-N drain so the Progress sheet opens at exactly 100 of 15,300 analyzed. Clicking Backfill all messages then drains to completion — counter climbs all the way to 15,300 / 15,300.

https://github.com/speakeasy-api/gram/releases/download/_gh-attach-assets/recording2-y4vhbb.mp4

Bounded explicit backfill

Two clicks of Backfill last N on a fresh policy, first at N=100 and then at N=400. Trigger bumps the policy version on each call, so the second click re-scans under a new version. The final state settles at exactly 400 of 15,300 analyzed at version 3, rather than draining all 15,300, which demonstrates that the bound is honored.

https://github.com/speakeasy-api/gram/releases/download/_gh-attach-assets/recording-gkptb5.mp4

Test plan

  • go test ./internal/background/... ./internal/risk/...
  • mise lint:server, cd client/dashboard && pnpm lint
  • Manual via dashboard (see demos above)
  • EXPLAIN ANALYZE confirms Index Only Scan Backward on the composite index for both LIMIT 100 and LIMIT 20000

Previously every new chat message and every policy edit triggered a
full backfill of every unanalyzed message for every enabled policy.
That work scales with project history and is wasted effort when the
operator only cares about ongoing scanning.

Now ingest signals and policy create/update default to the most recent
100 unanalyzed messages per policy. Explicit backfill is opt-in via the
existing Trigger endpoint, which accepts a new optional `limit` field
(omit or 0 for unbounded). The Progress sheet exposes both
"Backfill all messages" and "Backfill last N".

The workflow gained a MaxMessages param and a typed signal payload so
concurrent signals can escalate a bounded run to unbounded mid-flight
(0 wins; larger positive wins over smaller). Pending signals at end of
a cycle are folded into the next ContinueAsNew.

FetchUnanalyzedMessageIDs now orders by cm.id DESC. uuidv7 is
k-sortable, so the PK btree gives us "recent first" without a new
index and Postgres can stop scanning early once LIMIT is satisfied.
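
To illustrate why "uuidv7 is k-sortable" gives recent-first ordering, here is a minimal sketch that builds UUIDv7-shaped values by hand and compares them byte-wise, which is how a btree over the raw 16 bytes would order them. `newUUIDv7Like` and `isOlder` are hypothetical helpers written for this example; they are not part of the PR or of any UUID library.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// newUUIDv7Like builds a 16-byte value laid out like a UUIDv7: a 48-bit
// big-endian Unix millisecond timestamp, then version and variant bits,
// with the remaining bits filled from crypto/rand.
// Hypothetical helper for illustration only.
func newUUIDv7Like(unixMillis uint64) [16]byte {
	var id [16]byte
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], unixMillis)
	copy(id[0:6], ts[2:8]) // keep the low 48 bits of the timestamp
	if _, err := rand.Read(id[6:]); err != nil {
		panic(err)
	}
	id[6] = (id[6] & 0x0f) | 0x70 // version 7
	id[8] = (id[8] & 0x3f) | 0x80 // RFC variant
	return id
}

// isOlder reports whether a sorts before b in byte-wise order.
func isOlder(a, b [16]byte) bool {
	return bytes.Compare(a[:], b[:]) < 0
}

func main() {
	older := newUUIDv7Like(1_747_000_000_000)
	newer := newUUIDv7Like(1_747_000_000_001)
	fmt.Println(isOlder(older, newer)) // true: the timestamp prefix decides the order
}
```

Because the timestamp occupies the leading bytes, scanning an index on the ID backward visits the newest rows first, so a LIMIT can stop early.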
@mfbx9da4 mfbx9da4 requested review from a team as code owners May 17, 2026 16:01

@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@changeset-bot

changeset-bot Bot commented May 17, 2026

⚠️ No Changeset found

Latest commit: 1b1861b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@vercel

vercel Bot commented May 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
gram-docs-redirect | Ready | Preview, Comment | May 18, 2026 10:29am

@github-actions github-actions Bot added the preview Spawn a preview environment label May 17, 2026
@speakeasybot
Collaborator

speakeasybot commented May 17, 2026

🚀 Preview Environment (PR #2889)

Preview URL: https://pr-2889.dev.getgram.ai

Component | Status | Details | Updated (UTC)
⏳ Database | Pending | Waiting for db-init job | 2026-05-18 10:41:27
✅ Images | Available | Container images ready | 2026-05-18 10:41:24

Gram Preview Bot

…aming

Two related risk-policy polish items.

1. triggerRiskAnalysis: default `limit` to 100 (the recent-N drain
   budget). Callers can pass 0 explicitly to request a full backfill.
   The previous "omit = unbounded" was a footgun because the most
   common Progress-tab use case is "scan recent messages, not all
   history".

2. Auto-naming: stop leaking detection library names. Policy authors
   think in what is detected (secrets, PII, prompt injection) not how
   (gitleaks, presidio). The LLM prompt was passing raw source
   identifiers, so generated names regularly regurgitated the library
   name (e.g. "Block Gitleaks Secrets"). Translate sources to user
   facing category labels and tell the model not to mention internal
   tool names. Closes AGE-2378.

Also: tighten the FetchUnanalyzedMessageIDs comment. The previous
wording credited the primary-key btree for the recent-first ordering,
but EXPLAIN ANALYZE shows the composite (project_id, id) index is what
satisfies ORDER BY cm.id DESC via an Index Only Scan Backward (no Sort
node, ~2ms for LIMIT 100 on 15k rows).
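
The auto-naming change in item 2 above amounts to translating source identifiers before they reach the prompt. A minimal sketch follows, assuming made-up identifiers and labels: `categoryLabels`, `userFacingCategories`, and the `"prompt_injection"` key are hypothetical and may not match what generatePolicyName actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// categoryLabels maps internal detection sources to user-facing category
// labels. Keys and values are assumptions made for this example.
var categoryLabels = map[string]string{
	"gitleaks":         "secrets",
	"presidio":         "PII",
	"prompt_injection": "prompt injection",
}

// userFacingCategories translates sources to labels, preserving input order
// and dropping duplicates. Unknown sources are skipped so that an internal
// identifier never reaches the prompt verbatim.
func userFacingCategories(sources []string) []string {
	seen := make(map[string]bool)
	var labels []string
	for _, s := range sources {
		label, ok := categoryLabels[s]
		if !ok || seen[label] {
			continue
		}
		seen[label] = true
		labels = append(labels, label)
	}
	return labels
}

func main() {
	labels := userFacingCategories([]string{"gitleaks", "presidio", "gitleaks"})
	fmt.Println(strings.Join(labels, ", ")) // secrets, PII
}
```

Sharing one mapping between the prompt path and the fallback name keeps both from drifting apart.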
@linear-code

linear-code Bot commented May 17, 2026

AGE-2212

AGE-2378

Each SignalNewMessages call round-trips to Temporal (~20ms). The
observer fires on the chat-message hot path, and a project with N
enabled risk policies was paying N × 20ms serially. Fan out the
per-policy signals so total latency is the slowest signal, not the
sum.
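
The fan-out described above can be sketched with a WaitGroup. This is an illustration under assumptions, not the PR's implementation: `signalFunc` and `signalAll` are hypothetical names, and the real signal call would take a context and talk to Temporal rather than sleep.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// signalFunc stands in for one per-policy signal round trip.
// Hypothetical type for illustration only.
type signalFunc func(policyID string) error

// signalAll sends one signal per policy concurrently and returns the errors
// keyed by policy ID. It waits for every signal to finish, so total latency
// is roughly that of the slowest signal rather than the sum.
func signalAll(policyIDs []string, signal signalFunc) map[string]error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs = make(map[string]error)
	)
	for _, id := range policyIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			if err := signal(id); err != nil {
				mu.Lock()
				errs[id] = err
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	return errs
}

func main() {
	// Simulate a ~20ms round trip per signal.
	slow := func(policyID string) error {
		time.Sleep(20 * time.Millisecond)
		return nil
	}
	start := time.Now()
	errs := signalAll([]string{"a", "b", "c", "d", "e"}, slow)
	fmt.Println("errors:", len(errs), "elapsed:", time.Since(start).Round(time.Millisecond))
}
```

Collecting errors per policy instead of failing fast means one failing signal does not prevent the others from being sent.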

The dashboard Input wrapper passes (value: string) to onChange, not
the raw React event. Caught by tsc in CI.

Goa's Default(100) on triggerRiskAnalysis.limit converts an omitted
field into 100 on the server. The Backfill all messages button was
calling onTrigger() with no argument, which serializes the field as
undefined, so the server applied the default and the "all messages"
button silently behaved as "last 100".

Pass 0 explicitly to request the unbounded backfill.
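
The omitted-versus-zero distinction can be reproduced with plain encoding/json. The sketch below only imitates the defaulting behavior described above and does not use Goa: `triggerRequest`, `effectiveLimit`, and `defaultLimit` are hypothetical names for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultLimit mirrors the recent-N budget of 100 described in this PR.
const defaultLimit = 100

// triggerRequest uses a pointer so an omitted field (nil) can be told apart
// from an explicit 0. Hypothetical type for illustration only.
type triggerRequest struct {
	Limit *int `json:"limit,omitempty"`
}

// effectiveLimit returns the limit the server would act on: the default when
// the field is omitted, otherwise the value as sent (0 meaning unbounded).
func effectiveLimit(body []byte) (int, error) {
	var req triggerRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return 0, err
	}
	if req.Limit == nil {
		return defaultLimit, nil
	}
	return *req.Limit, nil
}

func main() {
	for _, body := range []string{`{}`, `{"limit":0}`, `{"limit":400}`} {
		n, err := effectiveLimit([]byte(body))
		fmt.Println(body, n, err)
	}
	// {} 100 <nil>
	// {"limit":0} 0 <nil>
	// {"limit":400} 400 <nil>
}
```

A client that means "everything" therefore has to send 0 on the wire; leaving the field out is indistinguishable from asking for the default.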
Comment thread .speakeasy/out.openapi.yaml Outdated
Comment thread server/internal/background/drain_risk_analysis.go Outdated
Comment thread server/internal/background/drain_risk_analysis.go
Daniel caught this in PR review: SignalWithStart leaves the triggering
signal in the new run's channel. Our end-of-cycle drain consumed it
and treated it as "new work arrived", so a bounded run with
MaxMessages=100 would ContinueAsNew forever with the same params and
process ~N×100 messages per click instead of exactly 100.

Drain the channel at the top of the workflow so the start signal is
absorbed into the initial budget. End-of-cycle drain only picks up
signals that arrived *after* this run started — those are real "new
work" requests that warrant ContinueAsNew with a refreshed budget.

Added TestDrainWorkflow_StartSignalDoesNotSelfLoop, which fails on
the pre-fix code (workflow exits via ContinueAsNew with only the
start signal in the channel) and passes after this change. Replaced
the older TestDrainWorkflow_EmptyDrainWaitsForSignal with
TestDrainWorkflow_SignalDuringDrainContinuesAsNew which uses a >0
delay so the signal arrives after the start drain — modeling the real
"signal landed mid-cycle" case the original test was trying to cover.

Also trim the verbose comment paragraphs flagged in the review.
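
The start-of-run drain can be modeled with an ordinary buffered Go channel. This is an analogy only: the real workflow uses Temporal signal channels, and `drainPending`, `runCycle`, and `mergeBudget` are hypothetical names. The sketch assumes each signal carries just a MaxMessages value.

```go
package main

import "fmt"

// mergeBudget applies "0 wins; larger positive wins over smaller".
func mergeBudget(current, incoming int) int {
	if current == 0 || incoming == 0 {
		return 0
	}
	if incoming > current {
		return incoming
	}
	return current
}

// drainPending empties the channel without blocking, folds every pending
// signal into the budget, and reports how many signals it consumed.
func drainPending(signals <-chan int, budget int) (int, int) {
	consumed := 0
	for {
		select {
		case incoming := <-signals:
			consumed++
			budget = mergeBudget(budget, incoming)
		default:
			return budget, consumed
		}
	}
}

// runCycle models one workflow run. It returns the budget for the next run
// and whether the run should continue as new. midCycle simulates work during
// which further signals may arrive.
func runCycle(signals chan int, budget int, midCycle func()) (int, bool) {
	// Absorb the signal that started this run so it is not counted as new work.
	budget, _ = drainPending(signals, budget)
	if midCycle != nil {
		midCycle() // fetch and analyze up to budget messages here
	}
	// Only signals that arrived after the start drain warrant another run.
	next, arrived := drainPending(signals, budget)
	return next, arrived > 0
}

func main() {
	signals := make(chan int, 8)

	signals <- 100 // the signal that started the run
	fmt.Println(runCycle(signals, 100, nil)) // 100 false

	signals <- 100
	fmt.Println(runCycle(signals, 100, func() { signals <- 0 })) // 0 true
}
```

Without the first drainPending call, the start signal would still be in the channel at the end-of-cycle check, and every bounded run would continue as new.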

Each call to TriggerRiskAnalysis bumped the policy version. The
analyzed-count display filters by current version, so a 'Backfill last
100' followed by 'Backfill last 400' would show 400 analyzed — not the
expected 500 — because the first 100 results now sit under v=N-1 while
the second batch writes to v=N.

The version bump existed because Trigger used to be the only way to
force re-analysis. Now that we have an explicit limit parameter
(limit=0 means full backfill), the bump is the wrong default: it
silently overwrites the user's mental model of accumulating backfills.

Policy edits that change detection rules still bump the version via
UpdateRiskPolicy's WHEN clause, which is when re-analysis is actually
warranted.

Renamed TestTriggerRiskAnalysis_BumpsVersion to ...DoesNotBumpVersion
and extended it to cover two sequential triggers.
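
The 400-versus-500 arithmetic above can be checked with a small simulation. It is a sketch under assumptions: `analysis`, `trigger`, and `analyzedCount` are hypothetical, message IDs are modeled as increasing integers, and only positive limits are handled.

```go
package main

import "fmt"

// analysis records that one message was analyzed under one policy version.
type analysis struct {
	messageID int
	version   int
}

// analyzedCount counts the results visible at the current policy version,
// mirroring a display that filters by version.
func analyzedCount(results []analysis, currentVersion int) int {
	n := 0
	for _, r := range results {
		if r.version == currentVersion {
			n++
		}
	}
	return n
}

// trigger analyzes the `limit` most recent messages (highest IDs first) that
// have no result at the given version. Only positive limits are modeled.
func trigger(results []analysis, version, total, limit int) []analysis {
	done := make(map[int]bool)
	for _, r := range results {
		if r.version == version {
			done[r.messageID] = true
		}
	}
	for id := total; id >= 1 && limit > 0; id-- {
		if done[id] {
			continue
		}
		results = append(results, analysis{messageID: id, version: version})
		limit--
	}
	return results
}

func main() {
	const total = 15300

	// Each trigger bumps the version: the second batch starts from scratch.
	var bumped []analysis
	bumped = trigger(bumped, 2, total, 100)
	bumped = trigger(bumped, 3, total, 400)
	fmt.Println(analyzedCount(bumped, 3)) // 400

	// No version bump: the second batch extends the first.
	var kept []analysis
	kept = trigger(kept, 1, total, 100)
	kept = trigger(kept, 1, total, 400)
	fmt.Println(analyzedCount(kept, 1)) // 500
}
```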
…test

- Split the drain_risk_analysis const block so each group has matching
  typing (SA9004: only the first constant was explicitly typed).
- Drive the mid-cycle signal from inside the fetch activity so it is
  reliably queued before the workflow's end-of-cycle drain check. The
  prior 1ms RegisterDelayedCallback raced with activity completion in
  the test scheduler and failed in CI.
@mfbx9da4
Contributor Author

@claude review once


@claude claude Bot left a comment

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit at claude.ai/admin-settings/claude-code.

Once credits are available, comment @claude review on this pull request to trigger a review.

@claude

claude Bot commented May 18, 2026

Claude encountered an error after 0s —— View job


I'll analyze this and get back to you.

@mfbx9da4 mfbx9da4 enabled auto-merge May 18, 2026 11:56
@mfbx9da4 mfbx9da4 added this pull request to the merge queue May 18, 2026
Merged via the queue into main with commit e76e63b May 18, 2026
30 checks passed
@mfbx9da4 mfbx9da4 deleted the risk-analysis-limit-backfill branch May 18, 2026 12:01
@github-actions github-actions Bot locked and limited conversation to collaborators May 18, 2026
Labels

preview Spawn a preview environment

3 participants