Problem
Aperio surfaces findings, but operators still triage one row at a time, reading evidence JSON to understand "what happened, why it matters, what to do." Aperio already has the foundation for AI-driven investigation — Agent, AgentTask, AgentMessage, AgentProposal — and the safety primitive every other vendor is bolting on awkwardly: every state-changing action must pass through AgentProposal with human approval. We're under-using it.
Goals
- Natural-language search over findings, identities, OAuth grants, and audit-log events.
- Auto-generated investigation narratives that stitch related signals (SecurityFinding + OauthAppGrant + TenantAuditLog + SaasIdentity) into a single coherent story.
- Plain-English risk explanations per finding ("This OAuth app holds gmail.modify AND drive.readonly; the combination enables exfil via auto-forward.").
- Detection-rule authoring from prose — operator describes a condition, Aperio generates a YAML rule (depends on the detection-as-code issue) plus tests.
- Weekly posture-drift digest that tells the security team what materially changed.
All inference outputs go through the existing AgentProposal flow whenever they would mutate provider state or org policy. Read-only narratives don't require approval.
Non-goals
- No silent autonomous actions on customer infrastructure. Every provider-side write stays gated behind AgentProposal.APPROVED by a human with the right role.
- Not building our own model — pluggable provider interface only (OpenAI, Anthropic, Bedrock, Azure OpenAI, Ollama for self-hosted).
- Not promising perfect recall on prose queries — surface confidence and let the operator audit the underlying query.
Proposed design
Provider abstraction
// packages/ai/src/provider.ts
export interface AiProvider {
  readonly id: "openai" | "anthropic" | "bedrock" | "azure-openai" | "ollama";
  embed(text: string): Promise<Float32Array>;
  complete(prompt: AiPrompt, options?: AiCompleteOptions): Promise<AiCompletion>;
  redact(text: string): string; // PII / secret scrubbing before the model sees data
}
Provider selection per-org via:
APERIO_AI_PROVIDER=openai|anthropic|bedrock|azure-openai|ollama
APERIO_AI_API_KEY=...
APERIO_AI_MODEL=gpt-4o-mini (or per-feature override)
APERIO_AI_ENABLED=true
APERIO_AI_REDACT_PII=true
AI features are disabled by default. The audit log records every model invocation (prompt hash, tokens used, latency, finding/task linkage) so admins can prove "no customer data left the cluster" when the provider is ollama.
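A minimal sketch of how the variables above might be parsed into a typed config. Only the APERIO_AI_* names and the provider ids come from this proposal; `AiConfig`, `loadAiConfig`, and the choice to return `null` when AI is off are hypothetical. The environment is passed in as a plain record so the sketch does not depend on any runtime.

```typescript
// Hypothetical config loader; only the APERIO_AI_* variable names come from the proposal.
type AiProviderId = "openai" | "anthropic" | "bedrock" | "azure-openai" | "ollama";

interface AiConfig {
  provider: AiProviderId;
  model: string;
  apiKey?: string;
  redactPii: boolean;
}

const PROVIDER_IDS: readonly AiProviderId[] = [
  "openai", "anthropic", "bedrock", "azure-openai", "ollama",
];

function isProviderId(value: string): value is AiProviderId {
  return (PROVIDER_IDS as readonly string[]).includes(value);
}

// Returns null unless APERIO_AI_ENABLED is exactly "true" (disabled by default).
function loadAiConfig(env: Record<string, string | undefined>): AiConfig | null {
  if (env.APERIO_AI_ENABLED !== "true") return null;

  const provider = env.APERIO_AI_PROVIDER ?? "";
  if (!isProviderId(provider)) {
    throw new Error(`Unknown APERIO_AI_PROVIDER: "${provider}"`);
  }

  const model = env.APERIO_AI_MODEL;
  if (!model) throw new Error("APERIO_AI_MODEL is required when AI is enabled");

  return {
    provider,
    model,
    apiKey: env.APERIO_AI_API_KEY,
    // Assumption: redaction is on only when explicitly set to "true".
    redactPii: env.APERIO_AI_REDACT_PII === "true",
  };
}
```

Failing fast on an unknown provider id keeps a typo in deployment config from silently routing prompts somewhere unexpected.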
Feature 1 — Natural-language finding search
New ConnectRPC SearchFindings(query: string, scope) -> SearchResult.
Pipeline:
- Embed the query.
- Hybrid search: semantic over a new finding_embeddings table + BM25 / Postgres pg_trgm over SecurityFinding.title/description/evidence + structured filter extraction from the prose ("critical", "last 7 days", "production GitHub repos").
- Return findings with relevance scores and a 1-line explanation per hit.
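The pipeline above does not say how the semantic and lexical result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than a decided design. `fuseRankings` and the constant `k = 60` are illustrative, and structured filters are assumed to have been applied before either list was produced.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per finding.
// This is one possible fusion method; the proposal does not name one.
interface FusedHit {
  findingId: string;
  score: number;
}

function fuseRankings(rankings: string[][], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((findingId, index) => {
      // index is 0-based, so the 1-based rank is index + 1.
      const contribution = 1 / (k + index + 1);
      scores.set(findingId, (scores.get(findingId) ?? 0) + contribution);
    });
  }
  return Array.from(scores, ([findingId, score]) => ({ findingId, score }))
    .sort((a, b) => b.score - a.score);
}

// Usage: ids ordered best-first by each retriever.
const semanticHits = ["f1", "f2", "f3"];
const lexicalHits = ["f2", "f4", "f1"];
const fused = fuseRankings([semanticHits, lexicalHits]);
```

RRF only needs ranks, so it avoids having to normalize cosine similarity against BM25 or trigram scores, which live on different scales.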
Embeddings live in a new table:
model FindingEmbedding {
  findingId   String   @id @map("finding_id")
  embedding   Bytes    // pgvector once a migration adds it
  model       String   @db.VarChar(60)
  generatedAt DateTime @default(now()) @map("generated_at")
  finding     SecurityFinding @relation(...)

  @@map("finding_embeddings")
}
Backfill job hydrates embeddings for existing findings; ingestion worker generates them at finding-creation time.
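Until pgvector is adopted, a Bytes column implies serializing each vector and scoring in application code. The sketch below shows one way that could look: raw float32 bytes in platform byte order (an assumption; a real schema would want to pin endianness) and a brute-force cosine scan. All function names here are hypothetical.

```typescript
// Serialize a vector as raw float32 bytes for a Bytes column.
function encodeEmbedding(vec: Float32Array): Uint8Array {
  // slice() copies, so the result owns exactly the vector's bytes.
  return new Uint8Array(vec.buffer, vec.byteOffset, vec.byteLength).slice();
}

function decodeEmbedding(bytes: Uint8Array): Float32Array {
  if (bytes.byteLength % 4 !== 0) {
    throw new Error("embedding byte length must be a multiple of 4");
  }
  // Copy into a fresh buffer so the float view starts at an aligned offset.
  const copy = bytes.slice();
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / Math.sqrt(normA * normB);
}

interface StoredEmbedding {
  findingId: string;
  embedding: Uint8Array;
}

// Brute-force scan: score every row, keep the k best.
function topK(query: Float32Array, rows: StoredEmbedding[], k: number) {
  return rows
    .map((row) => ({
      findingId: row.findingId,
      score: cosineSimilarity(query, decodeEmbedding(row.embedding)),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

A linear scan like this is only reasonable for modest per-org finding counts; it is meant as a stopgap, not a replacement for an indexed vector store.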
Feature 2 — Investigation narratives
When an operator opens a finding (or clicks "Generate investigation"), spawn an AgentTask with taskType = "investigation.summarize.v1". The agent assembles context:
- The finding itself + evidence JSON.
- Sibling findings sharing the same assetId, integrationId, or dedupeKey prefix.
- Related OauthAppGrant rows for the affected user(s).
- Last 90 days of TenantAuditLog for the actor(s).
- The actor's Person profile (if the identity-graph issue is shipped).
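The sibling rule and the audit-log window from the list above could be expressed roughly as follows. This is a sketch under stated assumptions: the proposal does not define what a dedupeKey "prefix" is, so the code assumes a ":"-delimited key whose prefix is everything before the last segment. `FindingRef`, `dedupePrefix`, `findSiblings`, and `auditLogCutoff` are hypothetical names.

```typescript
interface FindingRef {
  id: string;
  assetId: string | null;
  integrationId: string | null;
  dedupeKey: string;
}

// Assumption: dedupeKey is ":"-delimited; the prefix drops the last segment.
function dedupePrefix(dedupeKey: string): string {
  const cut = dedupeKey.lastIndexOf(":");
  return cut === -1 ? dedupeKey : dedupeKey.slice(0, cut);
}

// A sibling shares assetId, integrationId, or dedupeKey prefix with the target.
function findSiblings(target: FindingRef, candidates: FindingRef[]): FindingRef[] {
  const prefix = dedupePrefix(target.dedupeKey);
  return candidates.filter(
    (c) =>
      c.id !== target.id &&
      ((target.assetId !== null && c.assetId === target.assetId) ||
        (target.integrationId !== null && c.integrationId === target.integrationId) ||
        dedupePrefix(c.dedupeKey) === prefix),
  );
}

// Start of the 90-day TenantAuditLog lookback window.
const AUDIT_LOOKBACK_DAYS = 90;

function auditLogCutoff(now: Date): Date {
  return new Date(now.getTime() - AUDIT_LOOKBACK_DAYS * 24 * 60 * 60 * 1000);
}
```

Matching on integrationId alone can pull in every finding from a busy integration, so a real implementation would likely cap or rank siblings before they reach the prompt.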
Produces an AgentMessage with messageType = "investigation.narrative.v1" containing:
- 1-paragraph executive summary.
- Timeline of related events.
- Plain-English risk explanation.
- Recommended remediation chain (which generates AgentProposal rows the operator can approve).
Feature 3 — Per-finding risk explanations
Inline UI: every finding card gets a "Why this is risky" expandable that renders an LLM-generated explanation cached against the finding's dedupeKey. Cache is invalidated when evidence changes. No write actions involved, so no AgentProposal gate.
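One way to get "cached against dedupeKey, invalidated when evidence changes" is to store a fingerprint of the evidence next to each cached explanation and treat a fingerprint mismatch as a miss. The sketch below uses an in-memory map and a canonical JSON string as the fingerprint; both are assumptions for illustration (a real store would more likely persist a hash), and `ExplanationCache` is a hypothetical name.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Canonical serialization: object keys are sorted so that equal evidence
// always yields the same string regardless of key order.
function canonicalize(value: Json): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const body = keys.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}

class ExplanationCache {
  private entries = new Map<string, { fingerprint: string; explanation: string }>();

  // Returns the cached explanation only if the evidence is unchanged.
  lookup(dedupeKey: string, evidence: Json): string | undefined {
    const hit = this.entries.get(dedupeKey);
    if (!hit || hit.fingerprint !== canonicalize(evidence)) return undefined;
    return hit.explanation;
  }

  // Overwrites any previous entry for this dedupeKey.
  store(dedupeKey: string, evidence: Json, explanation: string): void {
    this.entries.set(dedupeKey, { fingerprint: canonicalize(evidence), explanation });
  }
}
```

On a miss, the caller would invoke the model and then call `store`, so a changed evidence blob naturally replaces the stale explanation.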
Feature 4 — Rule-from-prose
New flow in the rule-authoring UI (depends on detection-as-code issue):
- Operator types: "alert me when any external user joins a private repo".
- Agent emits a draft YAML rule + matching test fixtures.
- Operator runs the rule in backtest mode against the last 30 days of IngestedEvent data.
- Approval gate: the rule does not become active until the operator clicks Approve, which writes an AgentProposal with action = "rule.activate".
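The approval gate in the last step could be enforced with a guard like the one below. Only the APPROVED status and the "rule.activate" action come from this proposal; the other statuses, the `targetId` and `approvedBy` fields, and the `DraftRule` shape are hypothetical stand-ins for whatever the real AgentProposal model carries.

```typescript
// Only APPROVED appears in the proposal; PENDING and REJECTED are assumed.
type ProposalStatus = "PENDING" | "APPROVED" | "REJECTED";

interface AgentProposalLike {
  id: string;
  action: string;
  status: ProposalStatus;
  targetId: string;    // hypothetical: id of the rule the proposal refers to
  approvedBy?: string; // hypothetical: id of the approving human
}

interface DraftRule {
  id: string;
  yaml: string;
  active: boolean;
}

// Returns an activated copy of the rule, or throws if the gate is not satisfied.
function activateRule(rule: DraftRule, proposal: AgentProposalLike): DraftRule {
  if (proposal.action !== "rule.activate") {
    throw new Error(`proposal ${proposal.id} is not a rule.activate proposal`);
  }
  if (proposal.targetId !== rule.id) {
    throw new Error(`proposal ${proposal.id} does not target rule ${rule.id}`);
  }
  if (proposal.status !== "APPROVED" || !proposal.approvedBy) {
    throw new Error(`rule ${rule.id} stays inactive until a human approves`);
  }
  return { ...rule, active: true };
}
```

Returning a copy instead of mutating the draft keeps the backtested rule and the activated rule distinguishable in whatever audit trail records the change.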
Feature 5 — Weekly posture-drift digest
Cron'd AgentTask (taskType = "digest.weekly.v1") per org. Inputs:
- New findings opened in the last 7d.
- Resolved findings.
- Net change in Organization.criticalRiskThreshold exceedances.
- New / removed OauthAppGrant rows.
- New / dormant Person rows.
Output: a digest artifact stored in AgentMessage, plus an email (Resend, already wired) and optional Slack post.
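As a rough illustration of the input-gathering step, the sketch below buckets findings and grants into a trailing 7-day window. The timestamp field names (`openedAt`, `resolvedAt`, `createdAt`, `revokedAt`) are hypothetical, "resolved findings" is assumed to mean resolved within the same window, and the threshold and Person inputs are left out.

```typescript
interface FindingRow {
  id: string;
  openedAt: Date;
  resolvedAt: Date | null;
}

interface GrantRow {
  id: string;
  createdAt: Date;
  revokedAt: Date | null;
}

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Collects ids of rows whose relevant timestamp falls in [now - 7d, now).
function buildDigestInputs(findings: FindingRow[], grants: GrantRow[], now: Date) {
  const end = now.getTime();
  const start = end - WEEK_MS;
  const within = (d: Date | null): boolean =>
    d !== null && d.getTime() >= start && d.getTime() < end;

  return {
    windowStart: new Date(start),
    windowEnd: new Date(end),
    newFindingIds: findings.filter((f) => within(f.openedAt)).map((f) => f.id),
    resolvedFindingIds: findings.filter((f) => within(f.resolvedAt)).map((f) => f.id),
    newGrantIds: grants.filter((g) => within(g.createdAt)).map((g) => g.id),
    removedGrantIds: grants.filter((g) => within(g.revokedAt)).map((g) => g.id),
  };
}
```

Computing these buckets deterministically and handing only the result to the model keeps the digest's numbers auditable independently of the generated prose.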
Safety / privacy controls
- PII redaction toggle (APERIO_AI_REDACT_PII=true) strips emails / IP addresses / numeric IDs before any prompt that goes to a remote provider.
- Per-org allowlist for which features can call out to a remote model vs. require Ollama self-host.
- Audit log entries (ai.invocation) recording timestamp, feature, prompt hash, model, tokens, latency, request initiator.
- No autonomy beyond proposals: every state change goes through AgentProposal.APPROVED.
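The redaction control could start from a regex pass like the sketch below. The patterns are illustrative assumptions, not a vetted scrubber: IPv6 addresses are not covered, and "numeric ID" is taken to mean a standalone run of six or more digits. `redactPii` and `prepareForProvider` are hypothetical names, and ollama is treated as the only non-remote provider.

```typescript
// Order matters: emails first, so digits inside an address are not
// separately rewritten by the later patterns.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const IPV4_RE = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
const NUMERIC_ID_RE = /\b\d{6,}\b/g; // assumption: 6+ digit runs are identifiers

function redactPii(text: string): string {
  return text
    .replace(EMAIL_RE, "[EMAIL]")
    .replace(IPV4_RE, "[IP]")
    .replace(NUMERIC_ID_RE, "[ID]");
}

// Redact only when the toggle is on and the prompt would leave the cluster.
function prepareForProvider(text: string, providerId: string, redactEnabled: boolean): string {
  const isRemote = providerId !== "ollama";
  return redactEnabled && isRemote ? redactPii(text) : text;
}

const samplePrompt = "alice@example.com logged in from 10.0.0.12 as user 48291733";
const scrubbed = prepareForProvider(samplePrompt, "openai", true);
// scrubbed === "[EMAIL] logged in from [IP] as user [ID]"
```

Fixed placeholders keep the sentence structure intact for the model while removing the values themselves; mapping each value to a stable pseudonym would be a natural extension if narratives need to distinguish actors.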
Phasing
| Phase | Scope |
| --- | --- |
| P1 | Provider abstraction; per-finding risk explanations; PII redaction; AI invocation audit log |
| P2 | Embeddings + hybrid SearchFindings; UI for prose query |
| P3 | Investigation narratives via AgentTask (requires identity-graph for full context) |
| P4 | Rule-from-prose + backtest harness (depends on detection-as-code issue) |
| P5 | Weekly posture digest + email/Slack delivery |
Open questions
- Where do embeddings live before we adopt pgvector: Bytes column + brute-force cosine, or external (Qdrant / Weaviate)?
- What's the right default for prompt-content redaction: strict (strip all emails) or operator-tunable per-feature?
- How do we let operators "thumbs down" a narrative and feed the signal back? Just a quality_score column on AgentMessage?
- Token-budget controls per org to prevent runaway spend.
References
- Reused primitives: Agent, AgentTask, AgentMessage, AgentProposal, TenantAuditLog, SecurityFinding, OauthAppGrant, IngestedEvent.
- Depends on: identity-graph (for richer narrative context) and detection-as-code (for rule-from-prose).
- Inspiration: GitHub Copilot autofix (PR-style approval gate), Dropzone AI investigations, Microsoft Security Copilot (we want the value without the proprietary lock-in).