features: notification sinks, kubernetes RBAC, remediation, lambda policy audit #2
Merged
… into matlock audit
─── New package ───
internal/output/sinks/ implements three concrete sinks behind a single Sink
interface (Name, Send). Each accepts a normalized Digest containing counts
by severity, the run's domain list, the top 10 highest-severity findings,
and an optional report URL. Spec parsing accepts the form "<kind>:<url-or-key>",
repeatable on the CLI.
  slack:<webhook-url>        Block Kit message with severity-coded header,
                             context line, domain list, top-10 findings as
                             bullet points, optional report link
  webhook:<url>              generic POST of the raw Digest JSON; receivers
                             parse it however they like
  pagerduty:<routing-key>    PagerDuty Events API v2 trigger. Only fires when
                             the digest has at least one critical or high
                             finding so we don't page on-call about LOW noise
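A minimal sketch of how a "<kind>:<url-or-key>" spec could be parsed. The `Spec` type and `ParseSpec` name are hypothetical and not taken from the sinks package; the point is only that the split must happen on the first colon, since webhook URLs contain colons of their own.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Spec is a hypothetical parsed form of one --sink argument.
type Spec struct {
	Kind   string // "slack", "webhook", or "pagerduty"
	Target string // webhook URL or PagerDuty routing key
}

// ParseSpec splits "<kind>:<url-or-key>" on the first colon only, so the
// colons inside "https://..." stay part of the target. Surrounding
// whitespace is tolerated.
func ParseSpec(raw string) (Spec, error) {
	kind, target, ok := strings.Cut(strings.TrimSpace(raw), ":")
	kind, target = strings.TrimSpace(kind), strings.TrimSpace(target)
	if !ok || target == "" {
		return Spec{}, errors.New("sink spec must look like <kind>:<url-or-key>")
	}
	switch kind {
	case "slack", "webhook", "pagerduty":
		return Spec{Kind: kind, Target: target}, nil
	default:
		return Spec{}, fmt.Errorf("unknown sink kind %q", kind)
	}
}

func main() {
	// The URLs below are placeholders, not real endpoints.
	for _, raw := range []string{
		" slack:https://hooks.example.invalid/T000 ",
		"pagerduty:",
		"email:ops@example.invalid",
	} {
		spec, err := ParseSpec(raw)
		fmt.Printf("%q -> %+v, err=%v\n", raw, spec, err)
	}
}
```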
SendAll fires all configured sinks in sequence and aggregates their errors
via errors.Join — one bad sink does not block the others, since audit
should always succeed even if notification delivery fails.
─── CLI integration ───
matlock audit gains two flags:
  --sink          repeatable; "<kind>:<url-or-key>" form
  --report-url    optional URL embedded in every sink notification
After the audit completes (and any --output-file is written), sinks are
parsed and fired. Failures are logged to stderr but never affect the
audit's exit code — the user sees the full report regardless.
The audit-to-Digest mapping is the responsibility of cmd/audit.go
(auditDigest / topAuditFindings / digestProvider / sortFindingsBySeverity)
rather than the sinks package, so sinks stay generic and the audit-specific
shaping lives next to the audit command.
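For illustration, the severity ordering and top-N truncation could look roughly like this. The `Finding` type, the `topFindings` name, and the four severity labels are assumptions; the real helpers in cmd/audit.go may be shaped differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical minimal finding; the real type carries more.
type Finding struct {
	Severity string
	Resource string
}

// severityRank orders the assumed severity labels; unknown labels rank 0.
var severityRank = map[string]int{"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1}

// topFindings returns up to n findings, highest severity first. It sorts a
// copy so the caller's slice keeps its order, and uses a stable sort so
// findings of equal severity keep their relative order.
func topFindings(in []Finding, n int) []Finding {
	out := append([]Finding(nil), in...)
	sort.SliceStable(out, func(i, j int) bool {
		return severityRank[out[i].Severity] > severityRank[out[j].Severity]
	})
	if len(out) > n {
		out = out[:n]
	}
	return out
}

func main() {
	findings := []Finding{
		{Severity: "LOW", Resource: "bucket-a"},
		{Severity: "CRITICAL", Resource: "sg-1"},
		{Severity: "HIGH", Resource: "role-x"},
	}
	for _, f := range topFindings(findings, 2) {
		fmt.Println(f.Severity, f.Resource)
	}
}
```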
─── Tests ───
internal/output/sinks/sinks_test.go uses httptest to spin up local servers
and verify:
- Parse handles every supported kind, rejects unknown kinds and missing
URLs/keys, tolerates whitespace
- SendAll keeps firing good sinks past a failing one and aggregates errors
- WebhookSink rejects invalid URLs and surfaces 5xx responses
- SlackSink produces valid JSON Block Kit with expected content markers,
truncates Top findings to 10 even when given 25
- PagerDutySink suppresses medium-only digests, fires on high/critical,
rejects missing routing key, sends valid Events API v2 shape
Build green, all tests pass, go vet clean.
─── Why ───
matlock's mission is cloud-hygiene, and Kubernetes is now part of every
platform team's cloud surface. RBAC misconfiguration is the most common
real-world incident driver for K8s clusters — wildcard verbs, dangerous
verbs on wildcard resources, and bindings to broad subject groups
(system:authenticated et al.) are the patterns CTF writeups and pentests
keep finding in production.
─── New CLI surface ───
  matlock k8s         parent command (no behavior of its own)
  matlock k8s rbac    scan ClusterRoles + ClusterRoleBindings
Connection follows the standard kubeconfig chain: --kubeconfig flag,
then $KUBECONFIG, then ~/.kube/config, then in-cluster service-account
token. Provider.Detect() returns true if any of those is present.
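The lookup order can be sketched with the standard library alone. `resolveSource` is a hypothetical helper, not the provider's Detect; it treats $KUBECONFIG as an opaque string even though it may hold a list of paths, and it assumes the conventional in-cluster token mount path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// inClusterToken is the conventional path where Kubernetes mounts a pod's
// service-account token.
const inClusterToken = "/var/run/secrets/kubernetes.io/serviceaccount/token"

// resolveSource returns a label for the first credential source found, in
// the order flag, $KUBECONFIG, ~/.kube/config, in-cluster token. The getenv
// and exists parameters are injected so the order can be tested without
// touching the real environment.
func resolveSource(flagPath string, getenv func(string) string, home string, exists func(string) bool) (string, bool) {
	if flagPath != "" {
		return "flag:" + flagPath, true
	}
	if env := getenv("KUBECONFIG"); env != "" {
		return "env:" + env, true
	}
	if home != "" {
		if p := filepath.Join(home, ".kube", "config"); exists(p) {
			return "file:" + p, true
		}
	}
	if exists(inClusterToken) {
		return "in-cluster", true
	}
	return "", false
}

func main() {
	home, _ := os.UserHomeDir()
	exists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	src, ok := resolveSource("", os.Getenv, home, exists)
	fmt.Println("source:", src, "detected:", ok)
}
```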
─── New package ───
internal/cloud/k8s/ holds the provider; built on k8s.io/client-go with
the same adapter+mockable pattern as the other clouds:
  rbacAPI interface         ListClusterRoles, ListClusterRoleBindings
  rbacAdapter               wraps *kubernetes.Clientset.RbacV1()
  Provider.ScanRBAC(ctx)    returns []cloud.K8sFinding
internal/cloud/provider.go now defines:
  K8sFindingType        CLUSTER_ADMIN, WILDCARD_PERMISSION,
                        BINDING_TOO_BROAD, DANGEROUS_VERB
  K8sFinding            kind/name/namespace/context/detail/remediation
  KubernetesProvider    base interface (Name, Detect, ContextName)
  K8sRBACProvider       adds ScanRBAC
─── Detection rules ───
ClusterRoles:
  - verbs:["*"] + resources:["*"]                  → HIGH CLUSTER_ADMIN
  - verbs:["*"] on any resource list               → HIGH WILDCARD_PERMISSION
  - dangerous verb (create/update/patch/delete)
    on resources:["*"]                             → HIGH WILDCARD_PERMISSION
  - read-only verbs on resources:["*"]             → silent
  - default ClusterRoles (cluster-admin, admin,
    edit, view, system:*, kubeadm:*)               → skipped entirely
ClusterRoleBindings:
  - binds anything to system:authenticated /
    system:unauthenticated / system:anonymous /
    system:serviceaccounts                         → CRITICAL BINDING_TOO_BROAD
  - binds anything to system:masters               → HIGH BINDING_TOO_BROAD
    (legitimately used by kubeadm)
  - binds cluster-admin to any specific subject    → HIGH CLUSTER_ADMIN
These are conservative — we don't try to be a CNAPP. The goal is
"if you ran this in CI you'd catch the misconfigs that show up in
pentest reports."
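The ClusterRole rules above can be sketched as a small classifier. `PolicyRule` here is a local stand-in that keeps only the two fields the rules inspect, and `classifyRule` is a hypothetical name; every non-silent result is reported at HIGH per the table.

```go
package main

import (
	"fmt"
	"strings"
)

// PolicyRule keeps only the fields of an RBAC rule that the checks read.
type PolicyRule struct {
	Verbs     []string
	Resources []string
}

var dangerousVerbs = []string{"create", "update", "patch", "delete"}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// isDefaultRole reports whether the role ships with Kubernetes or kubeadm
// and should therefore be skipped entirely.
func isDefaultRole(name string) bool {
	switch name {
	case "cluster-admin", "admin", "edit", "view":
		return true
	}
	return strings.HasPrefix(name, "system:") || strings.HasPrefix(name, "kubeadm:")
}

// classifyRule returns the finding type for one rule, or "" when silent.
func classifyRule(roleName string, r PolicyRule) string {
	if isDefaultRole(roleName) {
		return ""
	}
	wildVerb := contains(r.Verbs, "*")
	wildRes := contains(r.Resources, "*")
	switch {
	case wildVerb && wildRes:
		return "CLUSTER_ADMIN"
	case wildVerb:
		return "WILDCARD_PERMISSION"
	case wildRes:
		for _, v := range dangerousVerbs {
			if contains(r.Verbs, v) {
				return "WILDCARD_PERMISSION"
			}
		}
	}
	return "" // read-only verbs on wildcard resources, or nothing notable
}

func main() {
	rule := PolicyRule{Verbs: []string{"get", "delete"}, Resources: []string{"*"}}
	fmt.Printf("custom-role: %q\n", classifyRule("custom-role", rule))
	fmt.Printf("system:node: %q\n", classifyRule("system:node", rule))
}
```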
─── Output renderers ───
internal/output/table.go gains K8sFindings (severity-colored table
with severity counts summary).
internal/output/json.go gains WriteK8sFindings (wrapped report).
─── Tests ───
internal/cloud/k8s/rbac_test.go covers each detection rule with a
hand-written mockRBAC: wildcard-everything, wildcard-verb-only,
wildcard-resource with dangerous verb, wildcard-resource with read-only
(silent), system-role skipping, broad-group binding (CRITICAL),
system:masters binding (HIGH), cluster-admin binding to user (HIGH),
specific-group binding (silent), and error paths for both list calls.
─── Dependencies ───
Added k8s.io/client-go, k8s.io/api, k8s.io/apimachinery at v0.36.1
plus their transitive Kubernetes ecosystem deps. go mod tidy applied.
Build green, all tests pass, go vet clean.
…te command
─── network audit --fix ───
Mirrors the existing storage audit --fix pattern: group findings by provider,
emit one fix-network-<provider>.sh per cloud with a bash shebang, set -euo
pipefail, a header noting provider/timestamp/finding-count, and per-finding
comment blocks (severity, type, resource, region, protocol/port/cidr,
detail) followed by the Remediation command string that the network
scanners already populate (aws ec2 revoke-security-group-ingress, gcloud
compute firewall-rules delete, az network nsg rule update).
The generated scripts are intentionally cautious — they revoke rules, so
the header warns the operator to review before running. Findings without
a Remediation string are skipped (nothing to script).
internal/network/fix.go is the implementation; tests cover provider grouping,
skipping findings missing remediation, file executable bit (0o755), and
mkdir -p of nested output directories.
─── matlock remediate ───
New top-level command that reads a previously-saved JSON scan report
(--output json on storage or network audit) and emits fix scripts.
  --type        storage | network (required)
  --from        path to JSON file (required)
  --out         output directory (default ".")
  --severity    minimum severity (default "LOW")
Unmarshals both the standard {findings:[...], total:N} envelope shape that
matlock writes and a bare-array form for hand-crafted inputs. Filters by
severity before dispatching to the existing WriteFixScripts in the storage
and network packages — no per-domain logic duplicated here.
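Accepting both input shapes and filtering by minimum severity could be done along these lines. The JSON field names on `Finding`, the helper names, and the severity ranking are assumptions for the sketch; only the envelope-or-bare-array behavior comes from the description above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Finding uses assumed JSON field names.
type Finding struct {
	Severity string `json:"severity"`
	Resource string `json:"resource"`
}

// envelope is the {findings:[...], total:N} wrapper.
type envelope struct {
	Findings []Finding `json:"findings"`
	Total    int       `json:"total"`
}

// decodeReport accepts either the envelope or a bare JSON array by peeking
// at the first non-space byte.
func decodeReport(data []byte) ([]Finding, error) {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) > 0 && trimmed[0] == '[' {
		var list []Finding
		if err := json.Unmarshal(trimmed, &list); err != nil {
			return nil, err
		}
		return list, nil
	}
	var env envelope
	if err := json.Unmarshal(trimmed, &env); err != nil {
		return nil, err
	}
	return env.Findings, nil
}

// rank orders the assumed severity labels; unknown labels rank 0.
var rank = map[string]int{"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

// atLeast keeps findings whose severity is at or above min.
func atLeast(findings []Finding, min string) []Finding {
	var out []Finding
	for _, f := range findings {
		if rank[f.Severity] >= rank[min] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	wrapped := []byte(`{"findings":[{"severity":"HIGH","resource":"sg-1"},{"severity":"LOW","resource":"sg-2"}],"total":2}`)
	bare := []byte(`[{"severity":"CRITICAL","resource":"sg-3"}]`)
	for _, doc := range [][]byte{wrapped, bare} {
		findings, err := decodeReport(doc)
		fmt.Println(len(atLeast(findings, "HIGH")), err)
	}
}
```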
This unlocks the "scan in one job, review, then remediate in a separate
gated job" workflow that CI pipelines actually want, instead of forcing
scan+fix into a single non-reviewable invocation.
─── End-to-end verified ───
Synthesized a network scan JSON, ran matlock remediate against it, and
confirmed the resulting fix-network-aws.sh has the expected bash header,
finding comment block, and revoke-security-group-ingress command verbatim
from the report.
Build green, all tests pass, go vet clean.
…and confused-deputy
─── Why ───
matlock's IAM scan analyzes identity-based permissions — what a role or user
can do *from* the inside. Lambda functions also have resource-based policies
(lambda:GetPolicy) that control who can invoke them *from* the outside.
These are a separate attack surface, and they're the source of real incidents:
internet-exposed Lambdas via API Gateway integrations with no source-account
condition, invoke access granted to a contractor account and never revoked,
AWS-service principals without confused-deputy mitigation.
This commit adds a focused scanner for that resource-based surface.
─── New CLI surface ───
  matlock lambda          parent
  matlock lambda audit    scan resource policies on every function
Currently AWS only — the LambdaPolicyProvider interface in
internal/cloud/provider.go is designed so GCP/Azure equivalents can plug
in later (Cloud Functions invokerBindings, Azure Functions auth keys).
─── Detection rules ───
  Principal: "*"                                   → CRITICAL PUBLIC_INVOKE
  Principal: {"AWS": "*"}                          → CRITICAL PUBLIC_INVOKE
  Principal: {"AWS": "arn:...other-account..."}    → HIGH CROSS_ACCOUNT_INVOKE
  Principal: {"Service": "..."} without
    aws:SourceAccount / aws:SourceArn condition    → HIGH CONFUSED_DEPUTY_RISK
  Action: "*" or "lambda:*"                        → HIGH WILDCARD_ACTION
Deny statements are ignored. Same-account principals are silent (legitimate
internal grants). Functions without a resource policy are silently skipped
(they're identity-only, covered by IAM scan).
─── Implementation ───
internal/cloud/provider.go gains LambdaPolicyFinding / LambdaPolicyFindingType
and the LambdaPolicyProvider interface.
internal/cloud/aws/tags.go's lambdaAPI is extended with GetPolicy; all four
existing test mocks (mockLambda, invMockLambda, secretsMockLambda,
quotaMockLambda) get the corresponding no-op stub.
internal/cloud/aws/lambda_policy.go is the new scanner:
- paginates ListFunctions
- GetPolicy per function (ResourceNotFoundException = no policy, silent)
- unmarshals the AWS policy JSON; classifyLambdaPolicy applies the rules
- extracts the function's own account ID from its ARN for cross-account
comparison
- hasSourceCondition recognizes the standard aws:SourceAccount and
aws:SourceArn keys under StringEquals / ArnLike condition operators
internal/output/{json,table}.go gain WriteLambdaPolicy and
LambdaPolicyFindings respectively.
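Extracting the account ID from an ARN for the cross-account comparison might look like the following. `accountFromARN` is a hypothetical stand-in for lambdaAccountFromArn; it relies only on the general ARN layout arn:partition:service:region:account-id:resource, and the ARNs in the table are made-up examples.

```go
package main

import (
	"fmt"
	"strings"
)

// accountFromARN returns the account-id segment of an ARN. Splitting into
// at most six parts keeps colons inside the resource segment intact.
func accountFromARN(arn string) (string, bool) {
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) < 6 || parts[0] != "arn" || parts[4] == "" {
		return "", false
	}
	return parts[4], true
}

func main() {
	// Table-driven walk over a few representative inputs.
	cases := []string{
		"arn:aws:lambda:us-east-1:123456789012:function:my-fn",
		"arn:aws:iam::210987654321:root",
		"arn:aws:s3:::some-bucket", // S3 bucket ARNs carry no account
		"not-an-arn",
	}
	for _, arn := range cases {
		account, ok := accountFromARN(arn)
		fmt.Printf("%-55s account=%q ok=%v\n", arn, account, ok)
	}
}
```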
─── Tests ───
internal/cloud/aws/lambda_policy_test.go covers every rule plus negative
cases: same-account principal silent, deny statement silent, service
principal with SourceAccount condition silent, function with no resource
policy silent. lambdaAccountFromArn and hasSourceCondition have table-driven
unit tests.
End-to-end: matlock lambda audit --help renders correctly; build green,
all tests pass, go vet clean.
Summary
Phase 3 of the maintenance + enhancement plan. Four independent additions, stacked on top of #1 (modernization). Diff shown here is only the Phase 3 work. Review #1 first; this PR will retarget to main after that lands.
- c40189d: matlock audit --sink ...
- 46114fe: matlock k8s rbac. Built on k8s.io/client-go; skips kubernetes-system default ClusterRoles so output focuses on user-introduced risk.
- 51c6c8f: matlock network audit --fix, matlock remediate. network audit --fix mirrors the existing storage audit --fix (emit shell scripts per provider). New top-level matlock remediate --type {storage,network} --from report.json unlocks the offline "scan → review → apply" workflow CI pipelines want.
- a33585f: matlock lambda audit. Scans lambda:GetPolicy for Principal: "*" (public invoke), cross-account principals, service principals without aws:SourceAccount condition (confused-deputy risk), and wildcard actions. Complements the identity-based IAM scan with the resource-based attack surface.
New CLI surface
```
new commands
matlock k8s rbac
matlock remediate --type {storage,network} --from <report.json>
matlock lambda audit
new flags on existing commands
matlock audit --sink <kind>:<url-or-key> --report-url <url>
matlock network audit --fix --out <dir>
```
New dependencies
k8s.io/client-go, k8s.io/api, k8s.io/apimachinery at v0.36.1, pulled in by the K8s domain. ~50MB of transitive deps; standard for any Go K8s integration. go mod tidy applied.
Detection rules (highlights)
Sinks: spec form <kind>:<url-or-key>, repeatable. PagerDuty Events API v2 dedup key is matlock-<source>-<provider>.
K8s RBAC:
- ClusterRoles: verbs:["*"] + resources:["*"] → HIGH; wildcard verb alone → HIGH; dangerous verb on wildcard resource → HIGH; read-only verbs on wildcards → silent
- Bindings to system:authenticated / system:unauthenticated / system:anonymous / system:serviceaccounts → CRITICAL; to system:masters → HIGH (legitimately used by kubeadm); to cluster-admin with any specific subject → HIGH
- Default ClusterRoles cluster-admin, admin, edit, view, system:*, kubeadm:* are skipped
Lambda policy:
- Principal: "*" or Principal: {"AWS": "*"} → CRITICAL public invoke
- Principal: {"AWS": "arn:...other-account..."} → HIGH cross-account
- Principal: {"Service": "..."} without aws:SourceAccount / aws:SourceArn condition → HIGH confused deputy
- Action: "*" or Action: "lambda:*" → HIGH wildcard
Test plan
- ./matlock k8s rbac --help and ./matlock lambda audit --help render
- ./matlock audit --sink slack:fake-url parses; spec validation rejects unknown kinds
- ./matlock remediate --type network --from /tmp/net.json --out /tmp/fixes/ produces an executable shell script
Squash-merge message (for use at merge time)
```
features: notification sinks, kubernetes RBAC, remediation, lambda policy scanner
Four independent additions completing Phase 3 of the maintenance plan:
matlock audit --sink: Slack, PagerDuty, and generic webhook notifications
delivered after each audit run. Best-effort (one bad sink doesn't block
the others); PagerDuty only fires when the digest contains at least one
critical or high finding to avoid alert fatigue.
matlock k8s rbac: new Kubernetes domain. Scans cluster-scoped ClusterRoles
and ClusterRoleBindings for wildcard verbs/resources, dangerous verbs on
wildcard resources, and bindings to broad subject groups (system:authenticated,
system:masters). Built on k8s.io/client-go (new dep); skips built-in
default ClusterRoles so output focuses on user-introduced risk.
matlock network audit --fix + matlock remediate: emit shell remediation
scripts per provider, either inline during a scan or from a saved JSON
report. Mirrors and extends the existing storage audit --fix. Unlocks
the offline "scan -> review -> apply" workflow CI pipelines want.
matlock lambda audit: scan AWS Lambda resource-based policies for
Principal:"*" (public invoke), cross-account principals, service
principals without aws:SourceAccount condition (confused-deputy risk),
and wildcard actions. Complements the identity-based IAM scan with
resource-based policy analysis.
Adds k8s.io/client-go (and k8s.io/api, k8s.io/apimachinery) at v0.36.1
to support the Kubernetes domain.
```