sourcehawk/triagent

Triagent

Agentic incident investigation, driven from your browser.

Triagent is a localhost web app that pairs the Claude reasoning agent with read-only Kubernetes access, an extensible MCP catalog (Prometheus, Slack, GitHub, incident.io, your own), a guided playbook walker, and a persistent team wiki. You run triagent start, hand it the symptom, and it drives a focused diagnosis you can paste into a ticket when it's done.

Every tool call stays visible, so you can audit the chain or interrupt at any point. Finished sessions can be shared so the next operator starts from where you ended, not from the alert.

📚 Read the full documentation →

Live investigation: sidebar of past sessions, the agent's diagnosis in the main pane, and the activity panel streaming every tool call on the right.

What it does

Kubernetes triage isn't a kubectl command. It's a multi-tab scramble across half a dozen surfaces. Triagent collapses that scramble into one conversation against one audit trail:

  • The agent reads the procedure, doesn't memorise it. Domain knowledge lives in playbooks loaded at runtime, not in a system prompt or a fine-tune. Updating what the system can diagnose is a YAML edit.
  • The tools are a typed catalog, not a shell. Every action the agent can take is a curated MCP tool with a schema'd input. The agent can't go off-piste, and the catalog doubles as documentation.
  • Knowledge accumulates as data. Each investigation can deposit a playbook (procedural) or a wiki entry (factual). Tomorrow's recall is a single tool call instead of a Slack archaeology dig.

With watches on your signal sources (Slack channels, GitHub issue queries, more on the way), the launcher pre-classifies new items and proposes investigations on its own. With auto mode on, routine ones run end-to-end before you've read the page. You can take over at any moment.

What's in the box

Four surfaces, each documented in depth on the docs site:

  • Investigations: the live triage view. Hand the agent a symptom and a context (cluster, Slack thread, incident.io link, notes), watch the walker drive the diagnosis, ship the markdown summary.
  • Playbooks: the YAML-defined guided walker the agent follows. Author them in-browser with an AI co-editor.
  • Wiki: the team's persistent knowledge base of failure patterns and prior art, queryable by the agent.
  • Watches: polling rules that turn Slack messages, GitHub issues, or alerts into proposed investigations.

Tool catalog

Typed tool catalog, not a shell. Every action the agent can take is a schema'd MCP call. The same surface the agent reads is the surface you author against.

Playbook editor

Playbooks as data. YAML graphs the walker follows, authored in-browser with an AI co-editor and shipped as PRs to the playbooks repo.

Wiki editor

Wiki that compounds. Every finished investigation can deposit an entry; tomorrow's recall is a single tool call instead of a Slack archaeology dig.

Watches

Watches close the loop. Slack channels and GitHub queries become pre-classified signals. Routine ones auto-spawn an investigation before the pager fires.

Quick start

Requirements

  • claude CLI on $PATH, authenticated. See Claude Code.
  • A working kubeconfig with read access to the namespace you want to triage. Triagent talks to the cluster via client-go, so kubectl is not required, though most operators have it.
  • tsh if you use Teleport-backed cluster discovery (optional).
  • Kubernetes permissions to read pods and pod logs in the target namespace. Triagent does not create RBAC; it refuses to start if your existing permissions are insufficient.
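If you do have kubectl installed, you can check the read access described above by hand before launching. The following is a minimal sketch, not part of triagent: the function name check_triage_access is hypothetical, and the two verbs it checks (list pods, get pod logs) are an assumption drawn from the requirements above, not a statement of what triagent's own preflight verifies through client-go.

```shell
# Hypothetical helper, not a triagent command.
# Asks the API server whether the current kubeconfig identity can
# list pods and read pod logs in the given namespace.
check_triage_access() {
  ns="$1"
  rc=0
  # --quiet suppresses the yes/no output; only the exit code is used.
  kubectl auth can-i list pods --namespace "$ns" --quiet ||
    { echo "missing: list pods in $ns"; rc=1; }
  # Pod logs are a subresource of pods, hence --subresource=log.
  kubectl auth can-i get pods --subresource=log --namespace "$ns" --quiet ||
    { echo "missing: get pods/log in $ns"; rc=1; }
  return "$rc"
}
```

Calling check_triage_access my-namespace prints one line per missing permission and returns non-zero if anything is missing. Granting the missing access remains a job for whoever manages RBAC in your cluster.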

Install

macOS / Linux:

curl -fsSL https://sourcehawk.github.io/triagent/install.sh | sh

Windows (PowerShell):

irm https://sourcehawk.github.io/triagent/install.ps1 | iex

Homebrew (macOS):

brew install --cask sourcehawk/tap/triagent

Manual download: grab the archive for your OS/arch from the latest release and put triagent + triagent-mcp somewhere on your $PATH.

The install script downloads both triagent (the launcher) and triagent-mcp (the MCP multiplexer) to ~/.local/bin (or %LOCALAPPDATA%\Programs\triagent on Windows). The launcher locates triagent-mcp adjacent to itself or anywhere on $PATH. The Next.js frontend is embedded in the launcher, so the runtime ships as two self-contained executables with no separate frontend to install.
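After a manual download, or if ~/.local/bin is not on your $PATH, it can help to confirm that both binaries resolve. A small sketch, assuming a POSIX shell; check_triagent_binaries is a hypothetical helper and not something triagent ships. It only covers the $PATH lookup, not the "adjacent to the launcher" case.

```shell
# Hypothetical post-install check, not a triagent command.
# Reports where each binary resolves on $PATH, or that it is missing.
check_triagent_binaries() {
  rc=0
  for bin in triagent triagent-mcp; do
    if path="$(command -v "$bin")"; then
      echo "found $bin at $path"
    else
      echo "missing $bin on PATH"
      rc=1
    fi
  done
  return "$rc"
}
```

Run check_triagent_binaries in the same shell you will launch from. A non-zero return means at least one binary was not found on $PATH.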

Build from source (requires Node 20+ and Go; see .tool-versions):

make build

Run

triagent start

This boots a localhost HTTP server, prints its URL with a per-launch token, and opens your browser to it. Press Ctrl-C to stop. It works out of the box on the embedded default profile; see Customising the profile below to teach the agent your stack and wire upstream repos.

In the browser:

  1. Pick a cluster: directly from kubeconfig, or via Teleport.
  2. Log in if prompted (SSO/2FA prompts go to the launcher terminal).
  3. Enter the namespace and optional notes, Slack channel, or incident URL.
  4. Preflight runs: it checks that the namespace exists and that you can list pods. If anything's missing, the launcher tells you why and stops.
  5. Investigate: the agent walks the playbook, calls tools, and writes a summary you can copy or push upstream as a PR (once you've wired an upstream repo; see below).

A few useful commands

triagent start                       # boot the launcher
triagent start --profile my-profile  # use a custom embedded profile by name
triagent start --profile ./my-prof   # use an on-disk profile (dir or yaml path)
triagent create-profile my-team      # fork the embedded default into ./my-team/ for editing
triagent clean                       # reset launcher caches (sessions, clones, etc.)
triagent clean --dry-run             # show what would be deleted

--profile accepts either an embedded profile name or a filesystem path; TRIAGENT_PROFILE is the env-var equivalent.

Customising the profile

A profile is the deployment-specific config that fits triagent to your platform: which playbooks the agent walks, which MCPs attach, what the preflight form asks for, and what the agent already knows about your stack before it starts. The embedded default runs as-is but is platform-neutral. Customising the profile is the highest-leverage step in a triagent setup. Two overrides matter most:

  • architecture.md: the prompt the agent reads before every triage. Teach it your platform's CRDs, namespace conventions, dependency direction, and recurring failure modes. Every investigation starts informed instead of rediscovering your stack.
  • Upstream repos (defaults.playbooks_repo, defaults.wiki_repo, defaults.sessions_repo): the GitHub repos backing the playbook set, team wiki, and committed session transcripts. Wiring these enables sync-from-upstream and push-as-PR; without them, edits stay local-only. Each repo is independent; wire any subset.

The recommended setup is a tiny overlay that inherits from default and only spells out what you're overriding:

mkdir -p ~/.config/triagent/profile
cat > ~/.config/triagent/profile/profile.yaml <<'YAML'
name: my-team
base: default

defaults:
  playbooks_repo: my-org/triagent-playbooks   # GitHub OWNER/REPO
  wiki_repo:      my-org/triagent-wiki
  sessions_repo:  my-org/triagent-sessions

prompt_files:
  architecture.md: architecture.md
YAML

$EDITOR ~/.config/triagent/profile/architecture.md     # describe your platform
triagent start --profile ~/.config/triagent/profile

Everything you leave out (paths, other prompts, investigation inputs, kinds.json, extra MCPs, Prometheus, model selection, auth) is inherited from default. See Profiles for the full schema, alternative layouts (full fork via triagent create-profile, air-gapped mode), and the longer narrative on each block.

Contributing

PRs welcome. See DEVELOPER_GUIDE.md for the full contributor setup, CLAUDE.md for the durable conventions, and open issues for ideas worth picking up.

Quick loop:

make test    # Go race tests + frontend vitest (wholesale)
make lint    # Go lint
make build   # frontend bundle + both binaries

# UI dev loop (no Go rebuild for frontend changes):
go run . start                       # terminal 1
cd frontend && npm run dev           # terminal 2, proxies /api/* to :8080

License

Apache 2.0
