fix(cloud): address review findings in the harness, allowlist, and providers#58
Merged
sourcehawk merged 8 commits intoMay 30, 2026
Conversation
execCLI used cmd.Output(), discarding the child's stderr where gcloud and aws write their error context. A non-zero run_cli returned an empty stdout and no explanation. Capture stderr into a capped buffer, surface it as CLIResult.Stderr (json "stderr,omitempty"), and truncate it at the same byte limit as stdout. The no-shell, minimal-env, closed-stdin guarantees and the "non-zero exit is a normal result" contract are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…character tokens Allows exact-matched the positional subcommand path, so an allowlisted "compute instances describe" rejected "compute instances describe my-vm" — the resource operand made the path unequal, leaving most describe/get commands advertised but unusable. Match the allowlisted path as a token-wise prefix of argv's leading positionals so trailing resource operands ride through. There is no shell, so a trailing token is an inert argument. As defense in depth, validateArgv now rejects any argv token that is or contains a shell-control sequence (`;`, `|`, `&`, backtick, `$(`, `>`, `<`, newline). A metacharacter token like ["...","list",";","rm"] is refused by this check rather than by the allowlist; a literal resource name or a key=value filter passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…account `gcloud auth list` reports the operator's base active account, not the SA selected by CLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT, so the old active==expected check marked correctly-configured impersonation invalid. Validity now means "impersonation is pinned to the expected SA and the pin works": read the in-process impersonation env, confirm it equals the expected target, then run a minimal impersonated read (gcloud auth print-access-token) to prove the grant is active. AssumedIdentity is the SA on success; failures degrade through Valid/Hint with the captured stderr. NOTE: needs verification against a live gcloud before relying on the exact print-access-token shape (flagged in a code comment). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
execCLI returns a non-zero exit as CLIResult{ExitCode:n} with err==nil, so
a failed `gcloud projects list` was JSON-parsed and surfaced as a
misleading parse error. Check ExitCode before unmarshalling and return the
exit code plus captured stderr, mirroring the AWS provider. (The gcp
identity probe added in the prior commit already checks ExitCode.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…vailable Inventory fell back to the single-account projection on ANY non-zero exit or transport error, masking throttling, network faults, and other real failures. Now that stderr is captured, fall back only when the stderr names an Organizations-unavailable condition (AccessDenied, "not a member of an organization", or AWSOrganizationsNotInUseException). Any other non-zero exit returns the exit code plus stderr; a transport error is surfaced rather than silently degrading. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
assumedRoleARN hardcoded arn:aws:sts:: and Cut at the first slash, so GovCloud (arn:aws-us-gov:) / China (arn:aws-cn:) ARNs and roles carrying an IAM path (assumed-role/path/to/Role/session) misparsed. Accept any partition, and split so the role path-and-name is everything between assumed-role/ and the final /<session>, rebuilding the IAM role ARN under the same partition. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The doc comment said the agent "cannot pivot to an un-allowlisted project, account, or region", but allowedFor only maps --project and --region/--zone — Accounts is not argv-enforced. State it accurately: project and region/zone are enforced against argv; account reach is constrained by the pinned identity/role and the deny-floored --account / --profile flags, not validated here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parseCloudScope swallowed a malformed TRIAGENT_CLOUD_SCOPE and returned an empty (unconstrained) ScopeAllowlist, failing OPEN and silently widening run_cli. It now returns an error, and runCloud parses the scope before resolving the provider and aborts startup on a malformed value, so a misconfigured scope can never silently drop the deployment's restrictions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Towards #44
Addresses verified review findings in the read-only cloud-context MCP's parent package, GCP/AWS providers, and serve wiring. The headline correctness issue: most describe/get commands were advertised but unusable because the allowlist exact-matched the verb chain and rejected the resource operand, and a non-zero CLI exit surfaced an empty result with no error context. This PR makes the harness report stderr, the allowlist accept resource operands while rejecting shell-control tokens, the identity probes validate the actually-pinned identity, and the scope parser fail closed.
Changes
CLIResultso a non-zerorun_clicarries the provider's error context instead of an empty stdout. Stderr is capped at the same byte limit as stdout (json:"stderr,omitempty").compute instances describe my-vmpasses (the operand is an inert argument — there is no shell). Defense in depth:validateArgvnow rejects any argv token that is or contains a shell-control sequence (;|&backtick$(><newline).gcloud auth print-access-token) to prove the pin took effect.gcloud projects listreturns the exit code + stderr, not a misleading parse error.AWSOrganizationsNotInUseException); any other failure (throttling, transport) is surfaced, not masked.aws,aws-us-gov,aws-cn) and roles carrying IAM paths (assumed-role/path/to/Role/session).parseCloudScopefails closed: a malformedTRIAGENT_CLOUD_SCOPEaborts cloud-server startup instead of silently returning an unconstrained scope.ScopeAllowlistno longer claims account-axis argv enforcement it doesn't perform.Challenges
The GCP impersonation probe is the one change that can't be fully proven in unit tests:
gcloud auth listreports the base account, not the impersonated SA, so the prioractive == expectedcheck marked correct impersonation invalid. The new probe readsCLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT, confirms it equals the expected target, then mints a token to prove the grant is live. The exactprint-access-tokenshape is flagged in a code comment as needing verification against a live gcloud.Testing
make test-go(race,-count=1) green across the whole module;make lintreports 0 issues. Each finding landed test-first: stderr capture and truncation, prefix + metacharacter rejection (including embedded$(), GCP impersonation (no-target / mismatched-pin / impersonated-read success / failure), exit-code-before-parse for both providers, AWS Organizations-unavailable vs. throttling vs. transport fallback behavior, ARN parsing across partitions and IAM paths, and the malformed-scope fail-closed path inrunCloud.frontend/untouched.🤖 Generated with Claude Code