features: notification sinks, kubernetes RBAC, remediation, lambda policy audit by stxkxs · Pull Request #2 · nanohype/cloudgov

stxkxs · 2026-05-26T23:02:47Z

Summary

Phase 3 of the maintenance + enhancement plan. Four independent additions, stacked on top of #1 (modernization). Diff shown here is only the Phase 3 work — review #1 first, this PR will retarget to main after that lands.

| Commit | Surface | What |
| --- | --- | --- |
| `c40189d` | `matlock audit --sink ...` | Notification sinks: Slack / PagerDuty / generic webhook delivery after each audit. Best-effort (one bad sink doesn't block others); PagerDuty only fires when there's at least one critical/high to avoid alert fatigue. |
| `46114fe` | `matlock k8s rbac` | New Kubernetes domain: over-privileged ClusterRoles (wildcard verbs/resources, dangerous verbs on wildcards) and broad ClusterRoleBindings (system:authenticated, system:masters, cluster-admin to specific users). Built on `k8s.io/client-go`; skips kubernetes-system default ClusterRoles so output focuses on user-introduced risk. |
| `51c6c8f` | `matlock network audit --fix`, `matlock remediate` | Remediation depth: `network audit --fix` mirrors the existing `storage audit --fix` (emit shell scripts per provider). New top-level `matlock remediate --type {storage,network} --from report.json` unlocks the offline "scan → review → apply" workflow CI pipelines want. |
| `a33585f` | `matlock lambda audit` | Lambda resource-policy scanner, AWS only. Inspects each function's `lambda:GetPolicy` for `Principal: "*"` (public invoke), cross-account principals, service principals without `aws:SourceAccount` condition (confused-deputy risk), and wildcard actions. Complements the identity-based IAM scan with the resource-based attack surface. |

New CLI surface

```

# new commands
matlock k8s rbac
matlock remediate --type {storage,network} --from <report.json>
matlock lambda audit

# new flags on existing commands
matlock audit --sink <kind>:<url-or-key> --report-url <url>
matlock network audit --fix --out <dir>

```

New dependencies

k8s.io/client-go, k8s.io/api, k8s.io/apimachinery at v0.36.1 — pulled in by the K8s domain. ~50MB of transitive deps; standard for any Go K8s integration. go mod tidy applied.

Detection rules (highlights)

Sinks: spec form <kind>:<url-or-key>, repeatable. PagerDuty Events API v2 dedup key is matlock-<source>-<provider>.
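
A minimal sketch of how a `<kind>:<url-or-key>` spec could be split and validated. `SinkSpec`, `ParseSinkSpec`, and `DedupKey` are hypothetical names chosen for illustration, not the identifiers used in `internal/output/sinks/`; the only behavior taken from this PR is the spec shape, the three kinds, and the `matlock-<source>-<provider>` dedup key format.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SinkSpec is a hypothetical parsed form of one --sink value.
type SinkSpec struct {
	Kind   string // "slack", "webhook", or "pagerduty"
	Target string // webhook URL or PagerDuty routing key
}

// knownKinds lists the three sink kinds named in the PR description.
var knownKinds = map[string]bool{"slack": true, "webhook": true, "pagerduty": true}

// ParseSinkSpec splits on the first colon only, so the colons inside
// "https://..." stay part of the target.
func ParseSinkSpec(raw string) (SinkSpec, error) {
	kind, target, found := strings.Cut(strings.TrimSpace(raw), ":")
	kind = strings.TrimSpace(kind)
	target = strings.TrimSpace(target)
	if !found || target == "" {
		return SinkSpec{}, errors.New("sink spec must look like <kind>:<url-or-key>")
	}
	if !knownKinds[kind] {
		return SinkSpec{}, fmt.Errorf("unknown sink kind %q", kind)
	}
	return SinkSpec{Kind: kind, Target: target}, nil
}

// DedupKey builds a key in the matlock-<source>-<provider> shape.
func DedupKey(source, provider string) string {
	return "matlock-" + source + "-" + provider
}

func main() {
	inputs := []string{
		"slack:https://hooks.example.com/T000/B000", // placeholder URL
		"pagerduty:abc123",
		"email:me@example.com", // rejected: unknown kind
	}
	for _, raw := range inputs {
		spec, err := ParseSinkSpec(raw)
		fmt.Println(spec, err)
	}
	fmt.Println(DedupKey("audit", "aws"))
}
```

Splitting on the first colon is the design point worth noting: a naive split on every colon would break any target that is itself a URL.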

K8s RBAC:

- ClusterRoles: verbs:["*"] + resources:["*"] → HIGH; wildcard verb alone → HIGH; dangerous verb on wildcard resource → HIGH; read-only verbs on wildcards → silent
- ClusterRoleBindings: bound to system:authenticated/system:unauthenticated/system:anonymous/system:serviceaccounts → CRITICAL; to system:masters → HIGH (legitimately used by kubeadm); to cluster-admin with any specific subject → HIGH
- Built-in cluster-admin, admin, edit, view, system:*, kubeadm:* are skipped
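
The ClusterRole rules above can be sketched as a small classifier. This is an illustration under assumptions, not the code in `internal/cloud/k8s/`: `PolicyRule` is a local stand-in for the verbs/resources part of the Kubernetes RBAC rule type, `ClassifyRule` and `IsBuiltinRole` are hypothetical names, and the returned strings reuse the finding type names listed in the commit message for this domain. Every non-empty result corresponds to a HIGH finding.

```go
package main

import (
	"fmt"
	"strings"
)

// PolicyRule is a local stand-in holding only the two fields the rules inspect.
type PolicyRule struct {
	Verbs     []string
	Resources []string
}

// dangerousVerbs are the mutating verbs the commit message calls out.
var dangerousVerbs = []string{"create", "update", "patch", "delete"}

func contains(list []string, want string) bool {
	for _, item := range list {
		if item == want {
			return true
		}
	}
	return false
}

// IsBuiltinRole reports whether a ClusterRole name is on the skip list.
func IsBuiltinRole(name string) bool {
	switch name {
	case "cluster-admin", "admin", "edit", "view":
		return true
	}
	return strings.HasPrefix(name, "system:") || strings.HasPrefix(name, "kubeadm:")
}

// ClassifyRule returns a finding type for one rule, or "" when the rule is silent.
func ClassifyRule(r PolicyRule) string {
	wildVerb := contains(r.Verbs, "*")
	wildRes := contains(r.Resources, "*")
	switch {
	case wildVerb && wildRes:
		return "CLUSTER_ADMIN"
	case wildVerb:
		return "WILDCARD_PERMISSION"
	case wildRes:
		for _, v := range dangerousVerbs {
			if contains(r.Verbs, v) {
				return "WILDCARD_PERMISSION"
			}
		}
	}
	return "" // includes read-only verbs on wildcard resources
}

func main() {
	rules := []PolicyRule{
		{Verbs: []string{"*"}, Resources: []string{"*"}},
		{Verbs: []string{"*"}, Resources: []string{"pods"}},
		{Verbs: []string{"get", "delete"}, Resources: []string{"*"}},
		{Verbs: []string{"get", "list"}, Resources: []string{"*"}},
	}
	for _, r := range rules {
		fmt.Printf("%v on %v -> %q\n", r.Verbs, r.Resources, ClassifyRule(r))
	}
	fmt.Println(IsBuiltinRole("system:node"), IsBuiltinRole("ci-deployer"))
}
```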

Lambda policy:

- Principal: "*" or Principal: {"AWS": "*"} → CRITICAL public invoke
- Principal: {"AWS": "arn:...other-account..."} → HIGH cross-account
- Principal: {"Service": "..."} without aws:SourceAccount/aws:SourceArn condition → HIGH confused deputy
- Action: "*" or Action: "lambda:*" → HIGH wildcard
- Same-account principals, Deny statements, and functions without a resource policy → silent
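
To make two of these rules concrete, here is a hedged sketch that decodes a resource policy and flags public-invoke principals and wildcard actions. The awkward part is that `Principal` and `Action` can each be a string or a list, and `Principal` can also be an object, so the sketch uses custom `UnmarshalJSON` methods. All type and function names (`stringOrList`, `principal`, `Classify`) are hypothetical, the sketch assumes `Statement` is always an array, and the cross-account and confused-deputy rules are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringOrList decodes a policy field that may be "x" or ["x", "y"].
type stringOrList []string

func (s *stringOrList) UnmarshalJSON(b []byte) error {
	var one string
	if err := json.Unmarshal(b, &one); err == nil {
		*s = []string{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(b, &many); err != nil {
		return err
	}
	*s = many
	return nil
}

// principal decodes either the bare string "*" or an object with AWS/Service keys.
type principal struct {
	Any     bool
	AWS     stringOrList
	Service stringOrList
}

func (p *principal) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err == nil {
		p.Any = s == "*"
		return nil
	}
	var obj struct {
		AWS     stringOrList `json:"AWS"`
		Service stringOrList `json:"Service"`
	}
	if err := json.Unmarshal(b, &obj); err != nil {
		return err
	}
	p.AWS, p.Service = obj.AWS, obj.Service
	return nil
}

type statement struct {
	Effect    string
	Principal principal
	Action    stringOrList
}

type policy struct {
	Statement []statement
}

func has(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// Classify returns "SEVERITY TYPE" strings for the public-invoke and
// wildcard-action rules only; Deny statements are ignored.
func Classify(policyJSON string) ([]string, error) {
	var p policy
	if err := json.Unmarshal([]byte(policyJSON), &p); err != nil {
		return nil, err
	}
	var out []string
	for _, st := range p.Statement {
		if st.Effect != "Allow" {
			continue
		}
		if st.Principal.Any || has(st.Principal.AWS, "*") {
			out = append(out, "CRITICAL PUBLIC_INVOKE")
		}
		if has(st.Action, "*") || has(st.Action, "lambda:*") {
			out = append(out, "HIGH WILDCARD_ACTION")
		}
	}
	return out, nil
}

func main() {
	doc := `{"Statement":[{"Effect":"Allow","Principal":"*","Action":"lambda:InvokeFunction"}]}`
	findings, err := Classify(doc)
	fmt.Println(findings, err)
}
```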

Test plan

- CI green (build, test, vet, lint, coverage floor)
- `./matlock k8s rbac --help` and `./matlock lambda audit --help` render
- `./matlock audit --sink slack:fake-url` parses; spec validation rejects unknown kinds
- `./matlock remediate --type network --from /tmp/net.json --out /tmp/fixes/` produces an executable shell script
- (manual, optional) Slack/PagerDuty integration sanity-check against a real webhook

Squash-merge message (for use at merge time)

```
features: notification sinks, kubernetes RBAC, remediation, lambda policy scanner

Four independent additions completing Phase 3 of the maintenance plan:

- matlock audit --sink: Slack, PagerDuty, and generic webhook notifications
  delivered after each audit run. Best-effort (one bad sink doesn't block
  the others); PagerDuty only fires when the digest contains at least one
  critical or high finding to avoid alert fatigue.
- matlock k8s rbac: new Kubernetes domain. Scans cluster-scoped ClusterRoles
  and ClusterRoleBindings for wildcard verbs/resources, dangerous verbs on
  wildcard resources, and bindings to broad subject groups (system:authenticated,
  system:masters). Built on k8s.io/client-go (new dep); skips built-in
  default ClusterRoles so output focuses on user-introduced risk.
- matlock network audit --fix + matlock remediate: emit shell remediation
  scripts per provider, either inline during a scan or from a saved JSON
  report. Mirrors and extends the existing storage audit --fix. Unlocks
  the offline "scan -> review -> apply" workflow CI pipelines want.
- matlock lambda audit: scan AWS Lambda resource-based policies for
  Principal:"*" (public invoke), cross-account principals, service
  principals without aws:SourceAccount condition (confused-deputy risk),
  and wildcard actions. Complements the identity-based IAM scan with
  resource-based policy analysis.

Adds k8s.io/client-go (and k8s.io/api, k8s.io/apimachinery) at v0.36.1
to support the Kubernetes domain.
```

… into matlock audit

─── New package ───

internal/output/sinks/ implements three concrete sinks behind a single Sink interface (Name, Send). Each accepts a normalized Digest containing counts by severity, the run's domain list, the top 10 highest-severity findings, and an optional report URL. Spec parsing accepts the form "<kind>:<url-or-key>", repeatable on the CLI.

- `slack:<webhook-url>`: Block Kit message with severity-coded header, context line, domain list, top-10 findings as bullet points, optional report link
- `webhook:<url>`: generic POST of the raw Digest JSON; receivers parse it however they like
- `pagerduty:<routing-key>`: PagerDuty Events API v2 trigger. Only fires when the digest has at least one critical or high finding so we don't page on-call about LOW noise

SendAll fires all configured sinks in sequence and aggregates their errors via errors.Join. One bad sink does not block the others, since audit should always succeed even if notification delivery fails.

─── CLI integration ───

matlock audit gains two flags:

- `--sink`: repeatable; "<kind>:<url-or-key>" form
- `--report-url`: optional URL embedded in every sink notification

After the audit completes (and any --output-file is written), sinks are parsed and fired. Failures are logged to stderr but never affect the audit's exit code, so the user sees the full report regardless.

The audit-to-Digest mapping is the responsibility of cmd/audit.go (auditDigest / topAuditFindings / digestProvider / sortFindingsBySeverity) rather than the sinks package, so sinks stay generic and the audit-specific shaping lives next to the audit command.

─── Tests ───

internal/output/sinks/sinks_test.go uses httptest to spin up local servers and verify:

- Parse handles every supported kind, rejects unknown kinds and missing URLs/keys, tolerates whitespace
- SendAll keeps firing good sinks past a failing one and aggregates errors
- WebhookSink rejects invalid URLs and surfaces 5xx responses
- SlackSink produces valid JSON Block Kit with expected content markers, truncates Top findings to 10 even when given 25
- PagerDutySink suppresses medium-only digests, fires on high/critical, rejects missing routing key, sends valid Events API v2 shape

Build green, all tests pass, go vet clean.

─── Why ───

matlock's mission is cloud-hygiene, and Kubernetes is now part of every platform team's cloud surface. RBAC misconfiguration is the most common real-world incident driver for K8s clusters: wildcard verbs, dangerous verbs on wildcard resources, and bindings to broad subject groups (system:authenticated et al.) are the patterns CTF writeups and pentests keep finding in production.

─── New CLI surface ───

- `matlock k8s`: parent command (no behavior of its own)
- `matlock k8s rbac`: scan ClusterRoles + ClusterRoleBindings

Connection follows the standard kubeconfig chain: --kubeconfig flag, then $KUBECONFIG, then ~/.kube/config, then in-cluster service-account token. Provider.Detect() returns true if any of those is present.

─── New package ───

internal/cloud/k8s/ holds the provider; built on k8s.io/client-go with the same adapter+mockable pattern as the other clouds:

- rbacAPI interface: ListClusterRoles, ListClusterRoleBindings
- rbacAdapter: wraps `*kubernetes.Clientset.RbacV1()`
- Provider.ScanRBAC(ctx): returns []cloud.K8sFinding

internal/cloud/provider.go now defines:

- K8sFindingType: CLUSTER_ADMIN, WILDCARD_PERMISSION, BINDING_TOO_BROAD, DANGEROUS_VERB
- K8sFinding: kind/name/namespace/context/detail/remediation
- KubernetesProvider: base interface (Name, Detect, ContextName)
- K8sRBACProvider: adds ScanRBAC

─── Detection rules ───

ClusterRoles:

- verbs:["*"] + resources:["*"] → HIGH CLUSTER_ADMIN
- verbs:["*"] on any resource list → HIGH WILDCARD_PERMISSION
- dangerous verb (create/update/patch/delete) on resources:["*"] → HIGH WILDCARD_PERMISSION
- read-only verbs on resources:["*"] → silent
- default ClusterRoles (cluster-admin, admin, edit, view, system:*, kubeadm:*) → skipped entirely

ClusterRoleBindings:

- binds anything to system:authenticated / system:unauthenticated / system:anonymous / system:serviceaccounts → CRITICAL BINDING_TOO_BROAD
- binds anything to system:masters → HIGH BINDING_TOO_BROAD (legitimately used by kubeadm)
- binds cluster-admin to any specific subject → HIGH CLUSTER_ADMIN

These are conservative; we don't try to be a CNAPP. The goal is "if you ran this in CI you'd catch the misconfigs that show up in pentest reports."

─── Output renderers ───

internal/output/table.go gains K8sFindings (severity-colored table with severity counts summary). internal/output/json.go gains WriteK8sFindings (wrapped report).

─── Tests ───

internal/cloud/k8s/rbac_test.go covers each detection rule with a hand-written mockRBAC: wildcard-everything, wildcard-verb-only, wildcard-resource with dangerous verb, wildcard-resource with read-only (silent), system-role skipping, broad-group binding (CRITICAL), system:masters binding (HIGH), cluster-admin binding to user (HIGH), specific-group binding (silent), and error paths for both list calls.

─── Dependencies ───

Added k8s.io/client-go, k8s.io/api, k8s.io/apimachinery at v0.36.1 plus their transitive Kubernetes ecosystem deps. go mod tidy applied.

Build green, all tests pass, go vet clean.
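
The ClusterRoleBinding rules can be sketched in the same spirit. `Binding`, `Finding`, and `ClassifyBinding` are hypothetical local types and names rather than the ones in `internal/cloud/k8s/`, and the precedence used here (broad group first, then system:masters, then cluster-admin) and the choice to return a single finding per binding are assumptions; the commit message does not say how overlapping matches are reported.

```go
package main

import "fmt"

// Binding is a local stand-in: the referenced ClusterRole name plus subject names.
type Binding struct {
	RoleName string
	Subjects []string
}

// Finding pairs a severity with a finding type from the commit message.
type Finding struct {
	Severity string
	Type     string
}

// broadGroups are the subject groups flagged as CRITICAL.
var broadGroups = map[string]bool{
	"system:authenticated":   true,
	"system:unauthenticated": true,
	"system:anonymous":       true,
	"system:serviceaccounts": true,
}

// ClassifyBinding returns the most severe finding for a binding, or false
// when the binding is silent. The precedence order is an assumption.
func ClassifyBinding(b Binding) (Finding, bool) {
	masters := false
	for _, s := range b.Subjects {
		if broadGroups[s] {
			return Finding{"CRITICAL", "BINDING_TOO_BROAD"}, true
		}
		if s == "system:masters" {
			masters = true
		}
	}
	if masters {
		return Finding{"HIGH", "BINDING_TOO_BROAD"}, true
	}
	if b.RoleName == "cluster-admin" && len(b.Subjects) > 0 {
		return Finding{"HIGH", "CLUSTER_ADMIN"}, true
	}
	return Finding{}, false
}

func main() {
	bindings := []Binding{
		{RoleName: "pod-reader", Subjects: []string{"system:authenticated"}},
		{RoleName: "ops", Subjects: []string{"system:masters"}},
		{RoleName: "cluster-admin", Subjects: []string{"alice"}},
		{RoleName: "pod-reader", Subjects: []string{"team-payments"}},
	}
	for _, b := range bindings {
		f, flagged := ClassifyBinding(b)
		fmt.Println(b.RoleName, b.Subjects, f, flagged)
	}
}
```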

…te command

─── network audit --fix ───

Mirrors the existing storage audit --fix pattern: group findings by provider, emit one fix-network-<provider>.sh per cloud with a bash shebang, set -euo pipefail, a header noting provider/timestamp/finding-count, and per-finding comment blocks (severity, type, resource, region, protocol/port/cidr, detail) followed by the Remediation command string that the network scanners already populate (aws ec2 revoke-security-group-ingress, gcloud compute firewall-rules delete, az network nsg rule update).

The generated scripts are intentionally cautious: they revoke rules, so the header warns the operator to review before running. Findings without a Remediation string are skipped (nothing to script).

internal/network/fix.go is the implementation; tests cover provider grouping, skipping findings missing remediation, file executable bit (0o755), and mkdir -p of nested output directories.

─── matlock remediate ───

New top-level command that reads a previously-saved JSON scan report (--output json on storage or network audit) and emits fix scripts.

- `--type`: storage | network (required)
- `--from`: path to JSON file (required)
- `--out`: output directory (default ".")
- `--severity`: minimum severity (default "LOW")

Unmarshals both the standard {findings:[...], total:N} envelope shape that matlock writes and a bare-array form for hand-crafted inputs. Filters by severity before dispatching to the existing WriteFixScripts in the storage and network packages, so no per-domain logic is duplicated here.

This unlocks the "scan in one job, review, then remediate in a separate gated job" workflow that CI pipelines actually want, instead of forcing scan+fix into a single non-reviewable invocation.

─── End-to-end verified ───

Synthesized a network scan JSON, ran matlock remediate against it, and confirmed the resulting fix-network-aws.sh has the expected bash header, finding comment block, and revoke-security-group-ingress command verbatim from the report.

Build green, all tests pass, go vet clean.
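
As a rough illustration of the two input shapes `matlock remediate` accepts, the sketch below peeks at the first non-whitespace byte to choose between the envelope and the bare array, then applies a minimum-severity filter. The `Finding` fields and their JSON keys, the function names, and the LOW < MEDIUM < HIGH < CRITICAL ranking table are assumptions for the example; the real report schema is not shown in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// Finding carries only the fields this sketch needs; the JSON keys are assumed.
type Finding struct {
	Severity    string `json:"severity"`
	Resource    string `json:"resource"`
	Remediation string `json:"remediation"`
}

// LoadFindings accepts either {"findings":[...],"total":N} or a bare [...] array.
func LoadFindings(data []byte) ([]Finding, error) {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) > 0 && trimmed[0] == '[' {
		var list []Finding
		err := json.Unmarshal(trimmed, &list)
		return list, err
	}
	var envelope struct {
		Findings []Finding `json:"findings"`
		Total    int       `json:"total"`
	}
	if err := json.Unmarshal(trimmed, &envelope); err != nil {
		return nil, err
	}
	return envelope.Findings, nil
}

// rank orders severities; an unknown severity ranks 0 and is filtered out
// by any known minimum.
var rank = map[string]int{"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

// FilterBySeverity keeps findings at or above the minimum severity.
func FilterBySeverity(in []Finding, min string) []Finding {
	var out []Finding
	for _, f := range in {
		if rank[strings.ToUpper(f.Severity)] >= rank[strings.ToUpper(min)] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	report := []byte(`{"findings":[
		{"severity":"HIGH","resource":"sg-1","remediation":"echo revoke sg-1"},
		{"severity":"LOW","resource":"sg-2","remediation":"echo revoke sg-2"}],"total":2}`)
	findings, err := LoadFindings(report)
	if err != nil {
		panic(err)
	}
	for _, f := range FilterBySeverity(findings, "HIGH") {
		fmt.Println(f.Resource, "->", f.Remediation)
	}
}
```

Checking the leading byte avoids a decode-then-retry loop and keeps the error message tied to the shape the input actually has.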

…and confused-deputy

─── Why ───

matlock's IAM scan analyzes identity-based permissions: what a role or user can do *from* the inside. Lambda functions also have resource-based policies (lambda:GetPolicy) that control who can invoke them *from* the outside. These are a separate attack surface, and they're the source of real incidents: internet-exposed Lambdas via API Gateway integrations with no source-account condition, functions inadvertently granted to a contractor account that was never revoked, AWS-service principals without confused-deputy mitigation.

This commit adds a focused scanner for that resource-based surface.

─── New CLI surface ───

- `matlock lambda`: parent
- `matlock lambda audit`: scan resource policies on every function

Currently AWS only. The LambdaPolicyProvider interface in internal/cloud/provider.go is designed so GCP/Azure equivalents can plug in later (Cloud Functions invokerBindings, Azure Functions auth keys).

─── Detection rules ───

- Principal: "*" → CRITICAL PUBLIC_INVOKE
- Principal: {"AWS": "*"} → CRITICAL PUBLIC_INVOKE
- Principal: {"AWS": "arn:...other-account..."} → HIGH CROSS_ACCOUNT_INVOKE
- Principal: {"Service": "..."} without aws:SourceAccount / aws:SourceArn condition → HIGH CONFUSED_DEPUTY_RISK
- Action: "*" or "lambda:*" → HIGH WILDCARD_ACTION

Deny statements are ignored. Same-account principals are silent (legitimate internal grants). Functions without a resource policy are silently skipped (they're identity-only, covered by IAM scan).

─── Implementation ───

internal/cloud/provider.go gains LambdaPolicyFinding / LambdaPolicyFindingType and the LambdaPolicyProvider interface.

internal/cloud/aws/tags.go's lambdaAPI is extended with GetPolicy; all four existing test mocks (mockLambda, invMockLambda, secretsMockLambda, quotaMockLambda) get the corresponding no-op stub.

internal/cloud/aws/lambda_policy.go is the new scanner:

- paginates ListFunctions
- GetPolicy per function (ResourceNotFoundException = no policy, silent)
- unmarshals the AWS policy JSON; classifyLambdaPolicy applies the rules
- extracts the function's own account ID from its ARN for cross-account comparison
- hasSourceCondition recognizes the standard aws:SourceAccount and aws:SourceArn keys under StringEquals / ArnLike condition operators

internal/output/{json,table}.go gain WriteLambdaPolicy and LambdaPolicyFindings respectively.

─── Tests ───

internal/cloud/aws/lambda_policy_test.go covers every rule plus negative cases: same-account principal silent, deny statement silent, service principal with SourceAccount condition silent, function with no resource policy silent. lambdaAccountFromArn and hasSourceCondition have table-driven unit tests.

End-to-end: matlock lambda audit --help renders correctly; build green, all tests pass, go vet clean.
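
The two helpers named in this commit message are small enough to sketch. What follows is an illustration under assumptions rather than the code in `lambda_policy.go`: the function bodies, the `Condition` type, the `isCrossAccount` helper, and the handling of bare 12-digit account IDs are all guesses at reasonable behavior. Key names are compared case-insensitively here because IAM treats condition key names that way; whether the real helper does the same is not stated.

```go
package main

import (
	"fmt"
	"strings"
)

// accountFromARN returns the account ID field of an ARN, a bare 12-digit
// account ID unchanged, or "" when neither shape matches.
func accountFromARN(s string) string {
	if isAccountID(s) {
		return s
	}
	parts := strings.Split(s, ":")
	if len(parts) < 5 || parts[0] != "arn" {
		return ""
	}
	return parts[4] // arn:partition:service:region:account:resource
}

func isAccountID(s string) bool {
	if len(s) != 12 {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// isCrossAccount compares the function's own account with the principal's.
func isCrossAccount(functionARN, principal string) bool {
	own, other := accountFromARN(functionARN), accountFromARN(principal)
	return own != "" && other != "" && own != other
}

// Condition models the policy Condition block: operator -> key -> value.
type Condition map[string]map[string]any

// hasSourceCondition looks for aws:SourceAccount or aws:SourceArn under the
// StringEquals and ArnLike operators only.
func hasSourceCondition(c Condition) bool {
	for _, op := range []string{"StringEquals", "ArnLike"} {
		for key := range c[op] {
			if strings.EqualFold(key, "aws:SourceAccount") || strings.EqualFold(key, "aws:SourceArn") {
				return true
			}
		}
	}
	return false
}

func main() {
	fn := "arn:aws:lambda:us-east-1:111111111111:function:demo"
	fmt.Println(accountFromARN(fn))
	fmt.Println(isCrossAccount(fn, "arn:aws:iam::222222222222:root"))
	fmt.Println(hasSourceCondition(Condition{
		"ArnLike": {"aws:SourceArn": "arn:aws:s3:::example-bucket"},
	}))
}
```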

stxkxs force-pushed the phase-3-features branch from a33585f to f7f77b0 Compare May 26, 2026 23:33

stxkxs force-pushed the phase-1-2-modernization branch from 0aeac88 to 5618a74 Compare May 26, 2026 23:37

stxkxs force-pushed the phase-3-features branch from f7f77b0 to c2df7d8 Compare May 26, 2026 23:38

stxkxs force-pushed the phase-1-2-modernization branch from 5618a74 to b2c957f Compare May 26, 2026 23:41

stxkxs force-pushed the phase-3-features branch from c2df7d8 to 935abbc Compare May 26, 2026 23:41

stxkxs force-pushed the phase-1-2-modernization branch from b2c957f to 19517fc Compare May 26, 2026 23:43

stxkxs force-pushed the phase-3-features branch from 935abbc to 2e618cc Compare May 26, 2026 23:44

stxkxs force-pushed the phase-1-2-modernization branch from 19517fc to 1e612da Compare May 26, 2026 23:47

stxkxs force-pushed the phase-3-features branch from 2e618cc to 9be0af7 Compare May 26, 2026 23:47

stxkxs changed the base branch from phase-1-2-modernization to main May 27, 2026 06:17

stxkxs added 4 commits May 26, 2026 23:18

stxkxs force-pushed the phase-3-features branch from 9be0af7 to 64c2494 Compare May 27, 2026 06:18

stxkxs merged commit cab3e3a into main May 27, 2026
2 of 4 checks passed

stxkxs deleted the phase-3-features branch May 27, 2026 06:20

stxkxs merged 4 commits into main from phase-3-features
