AI-native investigation, triage & posture narratives #46

@dcoln25-writer

Problem

Aperio surfaces findings, but operators still triage one row at a time, reading evidence JSON to understand "what happened, why it matters, what to do." Aperio already has the foundation for AI-driven investigation — Agent, AgentTask, AgentMessage, AgentProposal — and the safety primitive every other vendor is bolting on awkwardly: every state-changing action must pass through AgentProposal with human approval. We're under-using it.

Goals

  1. Natural-language search over findings, identities, OAuth grants, and audit-log events.
  2. Auto-generated investigation narratives that stitch related signals (SecurityFinding + OauthAppGrant + TenantAuditLog + SaasIdentity) into a single coherent story.
  3. Plain-English risk explanations per finding ("This OAuth app holds gmail.modify AND drive.readonly — the combination enables exfil via auto-forward.").
  4. Detection-rule authoring from prose — operator describes a condition, Aperio generates a YAML rule (depends on the detection-as-code issue) plus tests.
  5. Weekly posture-drift digest that tells the security team what materially changed.

All inference outputs go through the existing AgentProposal flow whenever they would mutate provider state or org policy. Read-only narratives don't require approval.

Non-goals

  • No silent autonomous actions on customer infrastructure. Every provider-side write stays gated behind AgentProposal.APPROVED by a human with the right role.
  • Not building our own model — pluggable provider interface only (OpenAI, Anthropic, Bedrock, Azure OpenAI, Ollama for self-hosted).
  • Not promising perfect recall on prose queries — surface confidence and let the operator audit the underlying query.

Proposed design

Provider abstraction

// packages/ai/src/provider.ts
export interface AiProvider {
  readonly id: "openai" | "anthropic" | "bedrock" | "azure-openai" | "ollama";
  embed(text: string): Promise<Float32Array>;
  complete(prompt: AiPrompt, options?: AiCompleteOptions): Promise<AiCompletion>;
  redact(text: string): string; // PII / secret scrubbing before the model sees data
}

Provider selection per-org via:

APERIO_AI_PROVIDER=openai|anthropic|bedrock|azure-openai|ollama
APERIO_AI_API_KEY=...
APERIO_AI_MODEL=gpt-4o-mini  # or a per-feature override
APERIO_AI_ENABLED=true
APERIO_AI_REDACT_PII=true

AI features are disabled by default. The audit log records every model invocation (prompt hash, tokens used, latency, finding/task linkage) so admins can prove "no customer data left the cluster" when the provider is ollama.
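A minimal sketch of how an invocation could be wrapped so that redaction and audit recording always happen together. The names AiInvocationAudit, CompletionResult, hashPrompt and auditedComplete are hypothetical, and the completion function is reduced to a plain string-in, text-out callback rather than the AiProvider interface above. It assumes the hash is SHA-256 over the redacted prompt, which the issue does not specify.

```typescript
import { createHash } from "node:crypto";

// Hypothetical audit record; field names follow the list in this issue,
// but the concrete types are assumptions.
interface AiInvocationAudit {
  timestamp: string;
  feature: string;
  promptHash: string; // hex SHA-256 of the redacted prompt (assumed choice)
  model: string;
  tokens: number;
  latencyMs: number;
  initiator: string;
  findingId?: string;
}

// Simplified stand-in for a provider completion result.
interface CompletionResult {
  text: string;
  tokens: number;
}

export function hashPrompt(prompt: string): string {
  return createHash("sha256").update(prompt, "utf8").digest("hex");
}

export async function auditedComplete(
  complete: (prompt: string) => Promise<CompletionResult>,
  redact: (text: string) => string,
  record: (entry: AiInvocationAudit) => void,
  args: { feature: string; model: string; prompt: string; initiator: string; findingId?: string },
): Promise<CompletionResult> {
  // Redact first so neither the model nor the hash ever sees raw PII.
  const safePrompt = redact(args.prompt);
  const started = Date.now();
  const result = await complete(safePrompt);
  record({
    timestamp: new Date(started).toISOString(),
    feature: args.feature,
    promptHash: hashPrompt(safePrompt),
    model: args.model,
    tokens: result.tokens,
    latencyMs: Date.now() - started,
    initiator: args.initiator,
    findingId: args.findingId,
  });
  return result;
}
```

Storing only the hash keeps prompt content out of the audit log while still letting an admin check whether two invocations sent the same text.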

Feature 1 — Natural-language finding search

New ConnectRPC SearchFindings(query: string, scope) -> SearchResult.

Pipeline:

  1. Embed the query.
  2. Hybrid search: semantic similarity over a new finding_embeddings table + lexical matching (BM25-style ranking or Postgres pg_trgm trigram similarity) over SecurityFinding.title/description/evidence + structured filter extraction from the prose ("critical", "last 7 days", "production GitHub repos").
  3. Return findings with relevance scores and a 1-line explanation per hit.
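The pipeline above does not say how the semantic and lexical rankings are merged. One common option is reciprocal rank fusion (RRF), sketched here as an illustration rather than the chosen design. The function name is hypothetical, and k = 60 is a commonly used default, not a value from this issue.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// where rank is 1-based. Items ranked well in several lists score highest.
export function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: finding IDs ordered by each retriever, best match first.
const semanticHits = ["f-a", "f-b", "f-c"];
const lexicalHits = ["f-b", "f-c", "f-a"];
const fused = reciprocalRankFusion([semanticHits, lexicalHits]);
```

RRF only needs ranks, so cosine similarities and trigram scores never have to be normalized onto a shared scale. Structured filters extracted from the prose would presumably be applied as hard constraints before fusion.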

Embeddings live in a new table:

model FindingEmbedding {
  findingId   String   @id @map("finding_id")
  embedding   Bytes    // pgvector once a migration adds it
  model       String   @db.VarChar(60)
  generatedAt DateTime @default(now()) @map("generated_at")
  finding     SecurityFinding @relation(...)
  @@map("finding_embeddings")
}

Backfill job hydrates embeddings for existing findings; ingestion worker generates them at finding-creation time.
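Until pgvector is adopted, the Bytes column could hold the raw Float32Array bytes and similarity could be computed in application code. The sketch below shows that brute-force route under stated assumptions: the helper names are hypothetical, vectors are stored in the platform's native byte order, and every row for the org is loaded into memory.

```typescript
// Serialize a vector to bytes for a Bytes column (copy, native byte order).
export function encodeEmbedding(vector: Float32Array): Uint8Array {
  return new Uint8Array(vector.buffer, vector.byteOffset, vector.byteLength).slice();
}

// Rebuild the vector. Copying first guarantees a 4-byte-aligned buffer,
// which a pooled Buffer returned by a database driver may not provide.
export function decodeEmbedding(bytes: Uint8Array): Float32Array {
  if (bytes.byteLength % 4 !== 0) {
    throw new Error("embedding byte length must be a multiple of 4");
  }
  const copy = new Uint8Array(bytes);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

export function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors match nothing
}

// Brute-force top-k scan: O(rows * dimensions) per query.
export function nearestFindings(
  query: Float32Array,
  rows: Array<{ findingId: string; embedding: Uint8Array }>,
  k: number,
): Array<{ findingId: string; score: number }> {
  return rows
    .map((row) => ({
      findingId: row.findingId,
      score: cosineSimilarity(query, decodeEmbedding(row.embedding)),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

A linear scan like this is likely fine for thousands of findings per org. Beyond that, an index (pgvector or an external store) becomes the practical choice.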

Feature 2 — Investigation narratives

When an operator opens a finding (or clicks "Generate investigation"), Aperio spawns an AgentTask with taskType = "investigation.summarize.v1". The agent assembles context:

  • The finding itself + evidence JSON.
  • Sibling findings sharing the same assetId, integrationId, or dedupeKey prefix.
  • Related OauthAppGrant rows for the affected user(s).
  • Last 90 days of TenantAuditLog for the actor(s).
  • The actor's Person profile (if the identity-graph issue is shipped).

Produces an AgentMessage with messageType = "investigation.narrative.v1" containing:

  • 1-paragraph executive summary.
  • Timeline of related events.
  • Plain-English risk explanation.
  • Recommended remediation chain (which generates AgentProposal rows the operator can approve).
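The context-assembly step can be sketched as two small helpers: one that decides whether another finding counts as a sibling, and one that computes the start of the 90-day audit-log window. The FindingRef shape is hypothetical, and the issue does not define what a dedupeKey "prefix" is. This sketch assumes it is everything before the last ":" in the key.

```typescript
// Hypothetical minimal view of a SecurityFinding for sibling matching.
interface FindingRef {
  id: string;
  assetId: string | null;
  integrationId: string | null;
  dedupeKey: string;
}

// Assumption: the prefix is the key up to its last delimiter.
export function dedupePrefix(dedupeKey: string, delimiter = ":"): string {
  const idx = dedupeKey.lastIndexOf(delimiter);
  return idx === -1 ? dedupeKey : dedupeKey.slice(0, idx);
}

// A sibling shares the asset, the integration, or the dedupeKey prefix.
export function isSibling(target: FindingRef, candidate: FindingRef): boolean {
  if (candidate.id === target.id) return false;
  const sameAsset = target.assetId !== null && candidate.assetId === target.assetId;
  const sameIntegration =
    target.integrationId !== null && candidate.integrationId === target.integrationId;
  const samePrefix = dedupePrefix(candidate.dedupeKey) === dedupePrefix(target.dedupeKey);
  return sameAsset || sameIntegration || samePrefix;
}

// Lower bound for the TenantAuditLog query (90 days by default).
export function auditWindowStart(now: Date, days = 90): Date {
  return new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
}
```

Matching on integrationId alone may pull in a large number of findings for busy integrations, so a cap on sibling count is probably needed to keep the prompt within the model's context window.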

Feature 3 — Per-finding risk explanations

Inline UI: every finding card gets a "Why this is risky" expandable that renders an LLM-generated explanation cached against the finding's dedupeKey. Cache is invalidated when evidence changes. No write actions involved, so no AgentProposal gate.
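One way to get "invalidated when evidence changes" without explicit invalidation logic is to fold a hash of the evidence into the cache key, so changed evidence simply misses the cache. This is a sketch of that idea, not the committed design. canonicalJson and explanationCacheKey are hypothetical names, and truncating the SHA-256 digest to 16 hex characters is an arbitrary choice.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so logically equal evidence always
// produces the same string, whatever the original key order.
export function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalJson(item)).join(",")}]`;
  }
  const keys = Object.keys(value).sort();
  const fields = keys.map((key) => `${JSON.stringify(key)}:${canonicalJson(value[key] ?? null)}`);
  return `{${fields.join(",")}}`;
}

// Cache key = dedupeKey plus a short digest of the canonical evidence.
export function explanationCacheKey(dedupeKey: string, evidence: Json): string {
  const digest = createHash("sha256").update(canonicalJson(evidence), "utf8").digest("hex");
  return `${dedupeKey}:${digest.slice(0, 16)}`;
}
```

Stale entries are never read again under this scheme, so they can be left to expire via a TTL instead of being deleted eagerly.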

Feature 4 — Rule-from-prose

New flow in the rule-authoring UI (depends on detection-as-code issue):

  1. Operator types: "alert me when any external user joins a private repo".
  2. Agent emits a draft YAML rule + matching test fixtures.
  3. Operator runs the rule in backtest mode against the last 30 days of IngestedEvent data.
  4. Approval gate: the rule does not become active until the operator clicks Approve, which writes an AgentProposal with action = "rule.activate".
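The approval gate in step 4 can be expressed as a guard that refuses to activate a rule without a matching approved proposal. The types below are hypothetical simplifications: only the APPROVED status and the "rule.activate" action come from this issue, while PENDING, REJECTED, and the approvedBy field are assumed for illustration.

```typescript
// Assumed status set; the issue only names APPROVED.
type ProposalStatus = "PENDING" | "APPROVED" | "REJECTED";

interface RuleActivationProposal {
  action: "rule.activate";
  ruleId: string;
  status: ProposalStatus;
  approvedBy?: string; // hypothetical: the human who approved
}

interface DraftRule {
  id: string;
  yaml: string;
  active: boolean;
}

// Returns an activated copy of the rule, or throws if the gate is not met.
export function activateRule(rule: DraftRule, proposal: RuleActivationProposal): DraftRule {
  if (proposal.ruleId !== rule.id) {
    throw new Error("proposal does not reference this rule");
  }
  if (proposal.status !== "APPROVED" || !proposal.approvedBy) {
    throw new Error("rule.activate requires a human-approved proposal");
  }
  return { ...rule, active: true };
}
```

Keeping the check inside the only function that flips active to true means the agent path and the UI path cannot bypass the gate by accident.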

Feature 5 — Weekly posture-drift digest

A scheduled (cron) AgentTask with taskType = "digest.weekly.v1" runs per org. Inputs:

  • New findings opened in the last 7d.
  • Resolved findings.
  • Net change in Organization.criticalRiskThreshold exceedances.
  • New / removed OauthAppGrant rows.
  • New / dormant Person rows.

Output: a digest artifact stored in AgentMessage, plus an email (Resend, already wired) and optional Slack post.
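Several of the digest inputs ("new / removed" rows) reduce to a set difference between last week's snapshot and the current one. A minimal sketch, with a hypothetical diffIds helper and made-up grant IDs, assuming each snapshot is available as a list of row IDs:

```typescript
// Compare two snapshots of row IDs and report what appeared or disappeared.
export function diffIds(
  previous: string[],
  current: string[],
): { added: string[]; removed: string[] } {
  const before = new Set(previous);
  const after = new Set(current);
  return {
    added: [...after].filter((id) => !before.has(id)),
    removed: [...before].filter((id) => !after.has(id)),
  };
}

// Example: OauthAppGrant IDs at the start and end of the week.
const grantDrift = diffIds(["grant-1", "grant-2"], ["grant-2", "grant-3"]);
```

Computing the drift deterministically and handing only the result to the model keeps the digest's numbers auditable. The model then only has to phrase them.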

Safety / privacy controls

  • PII redaction toggle (APERIO_AI_REDACT_PII=true) strips emails / IP addresses / numeric IDs before any prompt that goes to a remote provider.
  • Per-org allowlist for which features can call out to a remote model vs. require Ollama self-host.
  • Audit log entries (ai.invocation) recording timestamp, feature, prompt hash, model, tokens, latency, request initiator.
  • No autonomy beyond proposals — every state change goes through AgentProposal.APPROVED.
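A rough sketch of what the redaction pass could look like for the three categories named above. The regular expressions are deliberately simple and are assumptions, not a vetted PII detector: the IPv4 pattern will also match version strings such as 1.2.3.4, and "numeric ID" is assumed to mean a run of six or more digits.

```typescript
// Illustrative patterns only; production redaction needs broader coverage
// (IPv6, phone numbers, tokens and API keys, and so on).
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
const NUMERIC_ID = /\b\d{6,}\b/g; // assumption: 6+ digits counts as an ID

export function redactPii(text: string): string {
  // Order matters: emails first, so digits inside an address are not
  // partially rewritten by the later patterns.
  return text
    .replace(EMAIL, "[EMAIL]")
    .replace(IPV4, "[IP]")
    .replace(NUMERIC_ID, "[ID]");
}

const redactedSample = redactPii("alice@example.com logged in from 10.0.0.12 as user 12345678");
// redactedSample === "[EMAIL] logged in from [IP] as user [ID]"
```

Fixed placeholders make redacted prompts hash identically for different users, which weakens the audit trail slightly. Per-value pseudonyms (for example a salted hash per email) would preserve distinctness at the cost of more complexity.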

Phasing

Phase  Scope
P1     Provider abstraction; per-finding risk explanations; PII redaction; AI invocation audit log
P2     Embeddings + hybrid SearchFindings; UI for prose query
P3     Investigation narratives via AgentTask (requires identity-graph for full context)
P4     Rule-from-prose + backtest harness (depends on detection-as-code issue)
P5     Weekly posture digest + email/Slack delivery

Open questions

  • Where do embeddings live before we adopt pgvector — Bytes column + brute-force cosine, or external (Qdrant / Weaviate)?
  • What's the right default for prompt-content redaction — strict (strip all emails) or operator-tunable per-feature?
  • How do we let operators "thumbs down" a narrative and feed the signal back? Just a quality_score column on AgentMessage?
  • What token-budget controls do we need per org to prevent runaway spend?

References

  • Reused primitives: Agent, AgentTask, AgentMessage, AgentProposal, TenantAuditLog, SecurityFinding, OauthAppGrant, IngestedEvent.
  • Depends on: identity-graph (for richer narrative context) and detection-as-code (for rule-from-prose).
  • Inspiration: GitHub Copilot autofix (PR-style approval gate), Dropzone AI investigations, Microsoft Security Copilot (we want the value without the proprietary lock-in).

Labels: ai (AI / LLM-driven features), enhancement (New feature or request), tier-1-differentiator (Tier 1: high-impact strategic differentiator)
