Pluggable read-only cloud-context MCP (GCP and AWS)

## Problem

When triaging a Kubernetes incident, the operator agent can inspect the cluster (`triagent-k8s`) and reach it through Teleport (`triagent-teleport`), but it is blind to the cloud layer the cluster sits on. Reachability, permissions, managed-cluster config, logs, and "what changed right before this broke" all live in the cloud, not the cluster, and the two clouds we run on (GCP and AWS) expose that information through different APIs.

This epic adds one read-only cloud-context MCP that gives the agent that context without ever being able to mutate cloud state or escalate its own privilege. Adding coverage is a config edit, not new Go; adding a cloud is a new provider behind one interface, not a parallel MCP.

## In scope

- A `pkg/mcp/cloud/` package, provider-selected via `--kind=cloud --provider=<gcp|aws>`, aliased `triagent-cloud-<alias>`; thin typed tools (`list_inventory`, `session_status`) plus a gated read-only `run_cli` + `list_allowed_commands` for the long tail.
- A bypass-resistant command harness: argv-only input, direct `execve` (no shell), a profile-overridable allowlist, a hardcoded deny floor (subcommands, flags, arg-prefixes) the config cannot re-enable, scope validation, and output truncation.
- A GCP provider and an AWS provider implementing the interface over `gcloud` / `aws`.
- Launcher integration and pre-session auth visibility: profile `cloud:` config, per-session aliasing + pinned-identity env injection, a preflight identity probe with visible degrade, and a read-only cloud status pill in the connections panel.
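The harness bullet above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `Policy`, `Rejection`, and `Validate` are hypothetical names, and the example deny entries are placeholders. It checks the hardcoded floor before the allowlist, so a profile override that adds a dangerous prefix still cannot re-enable it.

```go
package main

import (
	"fmt"
	"strings"
)

// Policy is a hypothetical shape for the harness rules; field names are illustrative.
type Policy struct {
	Allowed      [][]string // allowlisted subcommand prefixes (profile-overridable)
	DenySubcmds  []string   // floor: tokens never permitted anywhere in argv
	DenyFlags    []string   // floor: flags never permitted, with or without "=value"
	DenyPrefixes []string   // floor: argument prefixes never permitted
}

// Rejection explains why an argv was refused.
type Rejection struct {
	Floor bool   // true when the hardcoded deny floor matched
	Token string // offending token; empty for an allowlist miss
}

func (r *Rejection) Error() string {
	if r.Floor {
		return fmt.Sprintf("token %q is on the deny floor", r.Token)
	}
	return "command is not on the allowlist"
}

// Validate checks the floor first, so no allowlist entry can re-enable it.
func Validate(p Policy, argv []string) error {
	for _, tok := range argv {
		if onFloor(p, tok) {
			return &Rejection{Floor: true, Token: tok}
		}
	}
	for _, prefix := range p.Allowed {
		if hasPrefix(argv, prefix) {
			return nil
		}
	}
	return &Rejection{}
}

// onFloor reports whether a single token hits any deny rule.
// Matching subcommands anywhere in argv is deliberately conservative.
func onFloor(p Policy, tok string) bool {
	flag, _, _ := strings.Cut(tok, "=") // "--flag=value" is matched as "--flag"
	for _, f := range p.DenyFlags {
		if flag == f {
			return true
		}
	}
	for _, s := range p.DenySubcmds {
		if tok == s {
			return true
		}
	}
	for _, pre := range p.DenyPrefixes {
		if strings.HasPrefix(tok, pre) {
			return true
		}
	}
	return false
}

// hasPrefix reports whether argv starts with the given non-empty token prefix.
func hasPrefix(argv, prefix []string) bool {
	if len(prefix) == 0 || len(argv) < len(prefix) {
		return false
	}
	for i := range prefix {
		if argv[i] != prefix[i] {
			return false
		}
	}
	return true
}
```

Because the input is a token slice rather than a command string, there is nothing for a shell to interpret; quoting and metacharacter tricks reduce to ordinary tokens that either match a rule or do not.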

## Out of scope

- Any write, create, update, or delete operation against either cloud. Read-only by construction and by harness.
- Clouds beyond GCP and AWS.
- Reading secrets, downloading bucket objects, shelling into instances, or agent-chosen identity impersonation — all on the hardcoded deny floor.
- OAuth / SSO login flows inside triagent (deferred as a future enhancement) and the static-key connection realization (deferred as a fallback).
- Billing, cost, or quota reporting.

## Risks & mitigations

- **The agent bypasses the command safety net.** Structural defenses, not string filtering: no shell ever (argv + direct `execve`); a deny floor over subcommands, flags, and argument prefixes; scope validation. The deployment's read-only IAM grant on the pinned identity is an independent backstop.
- **Advertised commands drift from enforced commands.** `list_allowed_commands` and `run_cli` read one config — the single source of truth.
- **The agent picks its own identity.** The pinned identity and command allowlist load server-side from the profile; the agent can read them, never mutate them. Impersonation is pinned in harness-controlled env, never agent argv.
- **Raw CLI output blows the context budget.** Output truncation plus typed tools for the orientation path.
- **Soft-degrade is new preflight behavior.** Cloud-source-scoped and explicit; the existing k8s block-on-failure is unchanged.
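As one illustration of the output-truncation mitigation listed above, the sketch below caps CLI output at a byte budget and appends a visible marker. `Truncate` and the marker wording are assumptions for illustration; the issue does not specify the real limit or format.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Truncate caps out at max bytes without splitting a UTF-8 sequence and
// reports whether anything was cut. The marker text is a hypothetical format.
func Truncate(out []byte, max int) (string, bool) {
	if max < 0 {
		max = 0
	}
	if len(out) <= max {
		return string(out), false
	}
	cut := max
	// Back up until the cut lands on the first byte of a rune.
	for cut > 0 && !utf8.RuneStart(out[cut]) {
		cut--
	}
	return fmt.Sprintf("%s\n[truncated: %d of %d bytes shown]", out[:cut], cut, len(out)), true
}
```

Keeping the marker inside the returned text means the agent sees that output was cut and can narrow its query instead of reasoning over a silently partial result.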

## Design overview

One package (`pkg/mcp/cloud/`), one `case "cloud"` in `cmd/triagent-mcp/serve.go` (ADR-0001), parameterized by `--provider`, aliased `triagent-cloud-<alias>` at the `mcpconfig.go` wiring layer (ADR-0003) — the git-MCP pattern with a cloud provider as the bound target. Deployment config loads from the runtime profile (ADR-0008).

Provider behaviour sits behind an injectable `Provider` interface (the teleport pattern); `gcp` and `aws` implementations live in subpackages wired by `serve.go`. All cloud access invokes the provider CLI through one exec core, with no shell and no cloud SDK dependency, so auth and impersonation stay uniform. The command allowlist mirrors `pkg/mcp/k8s`'s `LoadAllowlist`: embedded default, profile-overridable, with a hardcoded floor the override can never re-enable (the way `Secret` is always filtered).
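A rough sketch of how the interface and the single exec core might fit together follows. The method set, `Select`, and `BuildCmd` are hypothetical; only the binary names and the two environment variables come from this issue. A real harness would likely also attach a context and timeout.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// Provider is a hypothetical shape for the injectable interface; the real
// method set is not specified in this issue.
type Provider interface {
	Name() string                       // "gcp" or "aws"
	Binary() string                     // the one binary the harness may exec
	IdentityEnv(pinned string) []string // env entries that pin the read-only identity
}

type gcpProvider struct{}

func (gcpProvider) Name() string   { return "gcp" }
func (gcpProvider) Binary() string { return "gcloud" }
func (gcpProvider) IdentityEnv(serviceAccount string) []string {
	return []string{"CLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT=" + serviceAccount}
}

type awsProvider struct{}

func (awsProvider) Name() string   { return "aws" }
func (awsProvider) Binary() string { return "aws" }
func (awsProvider) IdentityEnv(profile string) []string {
	return []string{"AWS_PROFILE=" + profile}
}

// Select maps the --provider value to an implementation and rejects other clouds.
func Select(name string) (Provider, error) {
	switch name {
	case "gcp":
		return gcpProvider{}, nil
	case "aws":
		return awsProvider{}, nil
	}
	return nil, fmt.Errorf("unsupported provider %q", name)
}

// BuildCmd is the single exec core: a fixed binary, argv passed verbatim
// with no shell, and the pinned identity appended to the environment last.
// When Env holds duplicate keys, os/exec uses the last value, so the pinned
// entries override anything ambient.
func BuildCmd(p Provider, pinned string, argv []string) *exec.Cmd {
	cmd := exec.Command(p.Binary(), argv...)
	cmd.Env = append(os.Environ(), p.IdentityEnv(pinned)...)
	return cmd
}
```

Since the identity travels only in `cmd.Env`, nothing the agent places in argv selects the principal, which is the property the deny floor on impersonation flags is meant to protect.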

The pinned read-only identity is a deployment-chosen principal the agent can neither select nor authenticate. v1 realizes it via operator-ambient base auth plus harness-pinned impersonation injected through `cmd.Env` (`CLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT` for GCP; `AWS_PROFILE` with an assume-role profile for AWS); Workload Identity / IRSA falls out of the same env path for server deployments. A single whoami probe validates the identity chain and feeds three surfaces that therefore cannot disagree: the read-only connections pill (pre-session visibility), `preflight.Run()` (the gate), and the `session_status` tool. A failed cloud probe degrades the cloud source visibly rather than blocking the session, so Kubernetes triage proceeds.
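The "one probe, three surfaces" idea can be sketched as a single result value that every surface renders from. All names below (`ProbeResult`, `Pill`, `Preflight`, `SessionStatus`, `ErrProbeFailed`) and the output strings are hypothetical; the point is only that the pill, the gate, and the tool derive from the same value, and that a cloud failure degrades rather than blocks.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrProbeFailed is a placeholder error for a failed whoami probe.
var ErrProbeFailed = errors.New("whoami probe failed")

// ProbeResult is the single whoami outcome shared by every surface.
type ProbeResult struct {
	Provider  string
	Principal string // empty when the probe failed
	Err       error
}

// OK reports whether the identity chain validated.
func (r ProbeResult) OK() bool { return r.Err == nil && r.Principal != "" }

// Pill renders the read-only connections-panel label.
func Pill(r ProbeResult) string {
	if r.OK() {
		return fmt.Sprintf("%s: %s (read-only)", r.Provider, r.Principal)
	}
	return fmt.Sprintf("%s: degraded", r.Provider)
}

// Gate is the preflight decision.
type Gate struct {
	Proceed      bool // whether the session may start
	CloudEnabled bool // whether the cloud source is usable
}

// Preflight keeps the existing rule that a k8s failure blocks, while a
// failed cloud probe only disables the cloud source.
func Preflight(k8sErr error, cloud ProbeResult) Gate {
	if k8sErr != nil {
		return Gate{}
	}
	return Gate{Proceed: true, CloudEnabled: cloud.OK()}
}

// SessionStatus is what a session_status tool could report to the agent.
func SessionStatus(r ProbeResult) map[string]string {
	status := map[string]string{"provider": r.Provider, "state": "ready", "principal": r.Principal}
	if !r.OK() {
		status["state"] = "degraded"
	}
	return status
}
```

With a failed probe for `aws`, this sketch would show a degraded pill, let `Preflight` proceed with the cloud source disabled, and have `SessionStatus` report the same degraded state, so Kubernetes triage continues.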

```mermaid
flowchart TD
 operator[operator agent] --> typed["typed tools<br/>list_inventory · session_status"]
 operator --> disc["list_allowed_commands"]
 operator --> cli["run_cli (argv tokens only)"]
 typed --> iface{{Provider interface}}
 cli --> harness["safe harness<br/>no shell · fixed binary · allowlist + deny floor (subcommands & flags) + scope check + truncate"]
 cfg[("command allowlist<br/>embedded default, profile-overridable")] --> harness
 cfg --> disc
 harness --> iface
 iface --> gcp["gcp provider<br/>gcloud + defaults"]
 iface --> aws["aws provider<br/>aws + defaults"]
 id[("pinned read-only identity<br/>impersonated via harness env")] -.outer floor.-> gcp
 id -.outer floor.-> aws
```

## Sub-issues

_Linked below: scaffold + harness, GCP provider, AWS provider, launcher integration._
