SIEMTriage

An AI SOC triage agent for Microsoft Sentinel and Defender XDR, built on the Claude Agent SDK. It does the work of a Tier-1 analyst — pulls the incident, enriches the entities, runs hunting queries, and produces a verdict with a full evidence trail — then hands off to a human analyst with a proposed deep-dive plan.

The agent never closes incidents or takes response actions on its own. It produces verdicts; analysts decide.

What it does

Given a Sentinel or Defender XDR incident ID, the agent:

Pulls the incident envelope, alerts, and entities (Microsoft Graph Security API).
Enriches entities — IPs/hashes/domains through VirusTotal, GreyNoise, AbuseIPDB, MS Threat Intel; users through Entra ID risky-user score + MFA state; devices through MDE timeline.
Runs targeted KQL hunts in both Sentinel Log Analytics and Defender XDR Advanced Hunting.
Returns a strictly typed verdict (zod-validated): classification, confidence, evidence chain, deep-dive plan, recommended actions.
Streams every step to the UI so analysts can watch and intervene.
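
The verdict fields listed above can be sketched as a small validator. This is a dependency-free stand-in for the project's zod schema, not the schema itself: the property names (`evidence`, `deepDivePlan`, `recommendedActions`), the `Benign` label, the `{ claim, source }` evidence shape, and the rule that evidence must be non-empty are all assumptions made for illustration.

```typescript
// Hypothetical shape: the project's real schema lives in packages/shared and may differ.
type Classification = "TruePositive" | "FalsePositive" | "Benign";

interface Evidence {
  claim: string;  // what the agent asserts
  source: string; // the tool call or log row that backs the claim
}

interface Verdict {
  classification: Classification;
  confidence: number; // 0..1
  evidence: Evidence[];
  deepDivePlan: string[];
  recommendedActions: string[];
}

type ParseResult =
  | { ok: true; verdict: Verdict }
  | { ok: false; errors: string[] };

const CLASSIFICATIONS: readonly string[] = ["TruePositive", "FalsePositive", "Benign"];

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isEvidence(value: unknown): value is Evidence {
  if (typeof value !== "object" || value === null) return false;
  const e = value as Record<string, unknown>;
  return typeof e.claim === "string" && typeof e.source === "string";
}

// Collects every problem instead of stopping at the first one.
function parseVerdict(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["verdict must be an object"] };
  }
  const v = input as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof v.classification !== "string" || !CLASSIFICATIONS.includes(v.classification)) {
    errors.push("classification must be one of: " + CLASSIFICATIONS.join(", "));
  }
  if (typeof v.confidence !== "number" || !(v.confidence >= 0 && v.confidence <= 1)) {
    errors.push("confidence must be a number between 0 and 1");
  }
  // Assumed rule: a verdict with no evidence is rejected outright.
  if (!Array.isArray(v.evidence) || v.evidence.length === 0 || !v.evidence.every(isEvidence)) {
    errors.push("evidence must be a non-empty list of { claim, source }");
  }
  if (!isStringArray(v.deepDivePlan)) errors.push("deepDivePlan must be a list of strings");
  if (!isStringArray(v.recommendedActions)) errors.push("recommendedActions must be a list of strings");

  return errors.length === 0
    ? { ok: true, verdict: input as Verdict }
    : { ok: false, errors };
}
```

Rejecting an empty evidence list at the schema level is one way to make "evidence with every claim" a structural property rather than a prompt instruction.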

Three example incidents shipped with the repo demonstrate the spectrum:

Fixture	Truth	Agent verdict	What it shows
`inc-001-failed-signins-fp`	False positive	`FalsePositive` 95%	Failed sign-ins resolved by a self-service password reset. The agent spots the AuditLog row between the failures and the success and clears it.
`inc-002-brute-force-investigate`	True positive	`TruePositive` 97% · deep dive	Password spray from a known-bad IP succeeds against a no-MFA service account, followed by SharePoint downloads. The agent independently flags the CA-policy gap and the PII-breach angle.
`inc-003-malware-tp`	True positive	`TruePositive` 98% · deep dive	Wacatac trojan with C2 callout. The agent notes Defender's remediation was partial and proposes memory forensics + tenant-wide IOC sweep.

Architecture

┌─────────────────────┐    ┌──────────────────┐    ┌────────────────────┐
│ Sentinel automation │──▶ │ /api/ingest/     │──▶ │ Redis queue        │
│  → Logic App        │    │   sentinel       │    │  (BullMQ)          │
└─────────────────────┘    └──────────────────┘    └─────────┬──────────┘
                                                              │
                                                              ▼
                ┌─────────────────────────────────────────────────────┐
                │ Worker service — Claude Agent SDK                    │
                │   • Triage Agent (Sonnet 4.6, read-only tools)       │
                │   • SOC MCP tools → Microsoft Graph + Log Analytics  │
                │   • submit_verdict (zod-validated)                   │
                │   • Streams every step to Redis pub/sub + Postgres   │
                └────────────────────────┬────────────────────────────┘
                                          │
                                          ▼
┌────────────────────┐    ┌──────────────────────────────────────────┐
│ Postgres           │◀── │ Next.js UI                                │
│   • incidents      │    │   • Queue (verdict pills, status)         │
│   • triage_runs    │───▶│   • Incident detail (verdict + evidence + │
│   • tool_calls     │    │     deep-dive plan + decision panel)      │
│   • agent_thoughts │    │   • SSE stream of live agent reasoning    │
│   • decisions      │    │   • Eval report (accuracy, FN rate)       │
└────────────────────┘    └──────────────────────────────────────────┘
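
The "SSE stream" box in the diagram implies that each agent step is serialized as a Server-Sent Events frame. A minimal sketch of that serialization follows; the `AgentStepEvent` shape and the event kinds are hypothetical, and only the wire format (`id:`, `event:`, `data:` lines ended by a blank line) comes from the SSE specification.

```typescript
// Hypothetical event shape; the project's real event types may differ.
interface AgentStepEvent {
  runId: string;
  kind: "thought" | "tool_call" | "verdict";
  payload: unknown;
}

// Serializes one event as an SSE frame. A frame is a set of "field: value"
// lines terminated by an empty line.
function toSseFrame(event: AgentStepEvent, id?: number): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event.kind}`);
  // Compact JSON.stringify output contains no raw newlines, so a single
  // data line is enough.
  lines.push(`data: ${JSON.stringify(event)}`);
  return lines.join("\n") + "\n\n";
}
```

Sending an `id:` with each frame lets a reconnecting browser report the last event it saw through the `Last-Event-ID` header, which is useful if steps are also persisted and can be replayed.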

Quickstart — demo mode (no infrastructure)

This is the fastest way to see the UI. No Postgres, no Redis, no Azure access — the UI reads pre-computed verdicts from replay-output/.

git clone https://github.com/rod-trent/SIEMTriage.git
cd SIEMTriage
npm install
cp .env.example .env                       # add ANTHROPIC_API_KEY for the replay step
npm run replay -- --all                    # generates replay-output/*.json
echo "TRIAGE_DEMO_MODE=1" > apps/web/.env.local
npm run dev:web                            # http://localhost:3000

Pages:

/ — incident queue
/incidents/<id> — verdict, evidence, deep-dive plan, decision panel, live-streaming agent reasoning
/eval — accuracy, false-negative rate, confusion matrix, per-case results

Quickstart — full stack (local DB + queue)

For end-to-end with persistence, ingestion, and analyst decisions:

# 1. Bring up Postgres + Redis
docker compose up -d

# 2. Env
cp .env.example .env
# Fill in ANTHROPIC_API_KEY, leave TRIAGE_BACKEND=fixture for now

# 3. Apply schema
(cd packages/db && npx drizzle-kit push)   # subshell: you stay in the repo root even if the push fails

# 4. Run the worker in one terminal
npm run worker

# 5. Run the web app in another
npm run dev:web

# 6. Enqueue a fixture incident
curl -X POST http://localhost:3000/api/ingest/sentinel \
  -H "content-type: application/json" \
  -d '{"id":"inc-001-failed-signins-fp","title":"Failed sign-ins","severity":"Medium","source":"Sentinel","createdTimeUtc":"2026-05-10T14:32:00Z"}'

The worker picks it up, runs the agent, streams every step to the incident page in real time, and stops at awaiting_decision for you to click Close or Proceed.
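
For scripted use, the curl call in step 6 can be built in TypeScript. The sketch below only constructs the request; the envelope fields are the ones in the curl example, while `buildIngestRequest` and the optional secret parameter are hypothetical helpers, not part of the repo.

```typescript
// Fields taken from the curl example; the real envelope may carry more.
interface IncidentEnvelope {
  id: string;
  title: string;
  severity: string;
  source: string;
  createdTimeUtc: string; // ISO 8601, UTC
}

interface IngestRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the POST without sending it, so it can be
// inspected or tested before any network call.
function buildIngestRequest(
  baseUrl: string,
  incident: IncidentEnvelope,
  ingestSecret?: string,
): IngestRequest {
  const headers: Record<string, string> = { "content-type": "application/json" };
  // The secret header is only needed when the endpoint is configured with one.
  if (ingestSecret) headers["x-ingest-secret"] = ingestSecret;
  return {
    url: baseUrl.replace(/\/+$/, "") + "/api/ingest/sentinel",
    init: { method: "POST", headers, body: JSON.stringify(incident) },
  };
}
```

The result can be passed straight to `fetch(url, init)` in Node 18+ or a browser.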

Wiring to live Microsoft Sentinel + Defender XDR

# In .env
TRIAGE_BACKEND=azure
AZURE_TENANT_ID=<your-tenant>
AZURE_LOG_ANALYTICS_WORKSPACE_ID=<workspace-guid>

# Optional threat-intel keys — missing keys mean the source is silently skipped
VIRUSTOTAL_API_KEY=
GREYNOISE_API_KEY=
ABUSEIPDB_API_KEY=
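
The "silently skipped" behavior for missing keys can be sketched as a filter over the configured sources. The environment variable names are the ones above; `enabledIntelSources` and the source labels are hypothetical.

```typescript
// Env var names come from the .env snippet; everything else is illustrative.
const INTEL_SOURCES = [
  { name: "VirusTotal", envVar: "VIRUSTOTAL_API_KEY" },
  { name: "GreyNoise", envVar: "GREYNOISE_API_KEY" },
  { name: "AbuseIPDB", envVar: "ABUSEIPDB_API_KEY" },
] as const;

type Env = Record<string, string | undefined>;

// Returns the sources that have a non-blank key; the rest are skipped
// without raising an error.
function enabledIntelSources(env: Env): string[] {
  return INTEL_SOURCES
    .filter((source) => (env[source.envVar] ?? "").trim() !== "")
    .map((source) => source.name);
}
```

Calling `enabledIntelSources(process.env)` at startup and logging the result is one way to make a skipped source visible to whoever operates the worker.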

Auth: the Azure backend uses DefaultAzureCredential. Locally, run az login; in Azure, use a managed identity that has been granted:

Microsoft Graph: SecurityIncident.Read.All, SecurityAlert.Read.All, ThreatHunting.Read.All, User.Read.All, IdentityRiskyUser.Read.All, UserAuthenticationMethod.Read.All
Log Analytics workspace: Log Analytics Reader

Sentinel ingestion: in the Sentinel portal create an automation rule that triggers a Logic App on incident creation. The Logic App posts the incident envelope to https://<your-host>/api/ingest/sentinel with header X-Ingest-Secret: <INGEST_SECRET>.
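
On the receiving side, the X-Ingest-Secret header has to be compared against the configured secret. The sketch below shows one way to do that without an early exit on the first mismatching character; `secretsMatch` and `isAuthorizedIngest` are hypothetical names, and the repo's actual check may differ. JavaScript engines give no hard timing guarantees, so in Node the usual choice is `crypto.timingSafeEqual` on equal-length buffers.

```typescript
// Compares two strings without returning at the first differing character.
function secretsMatch(provided: string | undefined, expected: string): boolean {
  // An unset secret never authorizes anything.
  if (provided === undefined || expected.length === 0) return false;
  // Fold a length mismatch into the result instead of returning early.
  let diff = provided.length ^ expected.length;
  const n = Math.max(provided.length, expected.length);
  for (let i = 0; i < n; i++) {
    // charCodeAt returns NaN past the end of a string; NaN becomes 0 under XOR.
    diff |= provided.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}

// HTTP header names are case-insensitive, so normalize before the lookup.
function isAuthorizedIngest(
  headers: Record<string, string | undefined>,
  expectedSecret: string,
): boolean {
  const entry = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === "x-ingest-secret",
  );
  return secretsMatch(entry ? entry[1] : undefined, expectedSecret);
}
```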

Eval — measure agent agreement against analyst-closed incidents

Before letting the agent auto-close anything, replay it against your historical analyst-closed incidents and measure how often its verdicts agree with the analysts' classifications:

TRIAGE_BACKEND=azure npm run eval -- --since 2026-02-01 --until 2026-05-01 --max 200

Output:

Total incidents:     187
Overall accuracy:    164/187 (87.7%)
False negatives:     2/64 real TPs (3.1%)  ← the number that gates autonomy
No verdict:          1/187 (0.5%)
Mean tool calls:     10.3
Mean duration:       86.7s

=== Auto-close gate ===
Cases meeting auto-close criteria (FP/Benign + !deepDive + conf >= 0.9): 47
  …of which were actually TPs: 0  ← MUST be zero before enabling auto-close

The dangerous auto-close count (cases that met the auto-close criteria but were actually true positives) must be zero before any production auto-closure of incidents. The eval CLI exits non-zero when it isn't.
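
The arithmetic behind the report can be sketched as follows. The auto-close criteria (FP/Benign, no deep dive, confidence >= 0.9) and the `dangerousAutoCloses` name come from this README; the type names, the field names, and the choices to count a missing verdict as neither correct nor a false negative and to require an exact label match for "correct" are assumptions.

```typescript
type Classification = "TruePositive" | "FalsePositive" | "Benign";

interface AgentVerdict {
  classification: Classification;
  confidence: number; // 0..1
  deepDive: boolean;
}

interface EvalCase {
  truth: Classification;        // how the analyst closed the incident
  verdict: AgentVerdict | null; // null when the agent returned no verdict
}

interface EvalSummary {
  total: number;
  correct: number;
  accuracy: number;
  realTruePositives: number;
  falseNegatives: number;
  falseNegativeRate: number;
  noVerdict: number;
  autoCloseCandidates: number;
  dangerousAutoCloses: number;
}

// Criteria as printed in the report: FP/Benign + !deepDive + conf >= 0.9.
function meetsAutoCloseCriteria(v: AgentVerdict): boolean {
  const benignLike = v.classification === "FalsePositive" || v.classification === "Benign";
  return benignLike && !v.deepDive && v.confidence >= 0.9;
}

function summarize(cases: EvalCase[]): EvalSummary {
  let correct = 0, realTruePositives = 0, falseNegatives = 0;
  let noVerdict = 0, autoCloseCandidates = 0, dangerousAutoCloses = 0;
  for (const c of cases) {
    const isRealTp = c.truth === "TruePositive";
    if (isRealTp) realTruePositives++;
    if (c.verdict === null) { noVerdict++; continue; }
    if (c.verdict.classification === c.truth) correct++;
    if (isRealTp && c.verdict.classification !== "TruePositive") falseNegatives++;
    if (meetsAutoCloseCriteria(c.verdict)) {
      autoCloseCandidates++;
      if (isRealTp) dangerousAutoCloses++; // would have closed a real incident
    }
  }
  const ratio = (n: number, d: number): number => (d === 0 ? 0 : n / d);
  return {
    total: cases.length, correct, accuracy: ratio(correct, cases.length),
    realTruePositives, falseNegatives,
    falseNegativeRate: ratio(falseNegatives, realTruePositives),
    noVerdict, autoCloseCandidates, dangerousAutoCloses,
  };
}

// Mirrors the gate described above: any dangerous auto-close fails the run.
function exitCodeFor(summary: EvalSummary): number {
  return summary.dangerousAutoCloses === 0 ? 0 : 1;
}
```

Note that the false-negative rate uses real true positives as its denominator, matching the "2/64 real TPs" line in the sample output, rather than the total incident count.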

Repo layout

SIEMTriage/
├── packages/
│   ├── shared/              Zod schemas: Incident, Verdict, Evidence, KQL, Enrichment
│   ├── mcp-soc-tools/       SDK MCP server. Backend interface with two impls:
│   │   ├── backend/fixture  Reads from packages/mcp-soc-tools/fixtures/
│   │   └── backend/azure    Microsoft Graph + Log Analytics + XDR + Entra + MDE
│   ├── agent-core/          Triage Agent (system prompt + submit_verdict gate)
│   ├── db/                  Drizzle ORM schema for incidents/runs/tools/decisions
│   └── queue/               BullMQ + Redis pub/sub for SSE event streaming
├── apps/
│   ├── replay/              CLI: run the agent against fixture incidents
│   ├── eval/                CLI: replay against closed incidents + metrics
│   ├── worker/              Long-running service that consumes the queue
│   └── web/                 Next.js 15 UI (queue, incident detail, eval)
├── docker-compose.yml       Local Postgres + Redis
└── .env.example

Key design decisions

Two-agent split (Triage + Investigation). The Triage Agent runs fast and cheap on every incident; the Investigation Agent, which is not implemented yet, is meant to run only when an analyst clicks "Proceed deep dive." This keeps per-incident cost down when most incidents close at triage (the design assumes roughly 80%), and means the analyst is approving "go deeper," not "do whatever."
submit_verdict as the completion signal — the agent doesn't free-text its answer; it calls a tool whose zod schema is the verdict. Schema validation enforces evidence-with-every-claim by construction.
tools: [] in the agent config strips all built-in tools. The agent can only call SOC MCP tools and submit_verdict — no file system, no shell, no surprises.
Read-only at the agent layer. Response actions (isolate host, disable user, revoke sessions) are recommended in the verdict but only executed by an analyst clicking through the UI's decision panel. The audit trail is the analyst's, not the agent's.
Cite numbers, not adjectives. The system prompt drills this in: "8 failed sign-ins followed by a successful auth at 14:30:08 UTC" beats "had some failures then succeeded." Tier-2 needs to reproduce the evidence.
Eval gates autonomy. The dangerousAutoCloses metric in the eval harness must be zero before enabling any auto-closure path in production.
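
The "read-only at the agent layer" decision means every state change after triage is attributed to an analyst. A minimal sketch of that transition logic is below. Only the `awaiting_decision` status and the `deep_dive` job name appear in this README; the other status names, `applyDecision`, and the `Transition` shape are hypothetical.

```typescript
// Only "awaiting_decision" is taken from the README; the rest are illustrative.
type RunStatus = "awaiting_decision" | "closed" | "deep_dive_queued";

interface Decision {
  action: "close" | "proceed";
  analyst: string; // who clicked the button; this is what the audit trail records
}

interface Transition {
  status: RunStatus;
  enqueue?: "deep_dive"; // job to put on the queue, if any
  decidedBy: string;
}

// The agent never calls this; only the UI's decision panel does.
function applyDecision(current: RunStatus, decision: Decision): Transition {
  if (current !== "awaiting_decision") {
    throw new Error(`run is ${current}; no decision is pending`);
  }
  const decidedBy = decision.analyst.trim();
  if (decidedBy === "") {
    throw new Error("a decision must be attributed to an analyst");
  }
  return decision.action === "close"
    ? { status: "closed", decidedBy }
    : { status: "deep_dive_queued", enqueue: "deep_dive", decidedBy };
}
```

Rejecting decisions on runs that are not awaiting one keeps a double click or a stale browser tab from enqueueing a second deep dive.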

What's stubbed

Entra SSO — apps/web/src/lib/auth.ts returns a dev user from a cookie. Replace with NextAuth + the Azure AD provider for production. Every UI/API surface reads through getCurrentUser(), so it's literally one file.
Investigation Agent — the second-phase agent (broader tools, runs after an analyst clicks "Proceed deep dive") isn't implemented yet. The decision panel already calls getTriageQueue().add("deep_dive", ...) so the wiring is there.
Production deployment — no infra-as-code yet. The intended target is Azure Container Apps for both worker and web, Azure Database for PostgreSQL, Azure Cache for Redis, managed identity for auth.

License

MIT
