feat(compliance): U7 — S3 Object Lock anchor bucket Terraform module#917

Merged
ericodom merged 2 commits into main from feat/compliance-u7-anchor-bucket on May 7, 2026

@ericodom ericodom commented May 7, 2026

Summary

  • New terraform/modules/data/compliance-audit-bucket/ module provisions a WORM-protected S3 bucket (thinkwork-${stage}-compliance-anchors) for SOC2 Type 1 tamper-evident audit anchoring. It is the first Object Lock implementation in the org and the first real consumer of module.kms.aws_kms_key.main. Object Lock is enabled at create time with a GOVERNANCE-mode default (master plan Decision #2: flip to COMPLIANCE in prod tfvars at audit-engagement time, gated by a Terraform precondition on prod stages), 365-day retention, SSE-KMS via the existing thinkwork CMK with bucket_key_enabled = true, and a lifecycle rule scoped to anchors/ → Glacier IR @ 90d (no expiration; Object Lock is the deletion gate).
  • Bucket policy carries three Deny statements: aws:SecureTransport = false (HTTPS-only), s3:DeleteObject/s3:DeleteObjectVersion on object scope (defense-in-depth on top of Object Lock), and s3:DeleteBucket on bucket scope. Co-located IAM role (thinkwork-${stage}-compliance-anchor-lambda-role) with path-scoped Allow on anchors/* + proofs/*, KMS Allow for GenerateDataKey + Decrypt + DescribeKey, explicit Deny on s3:BypassGovernanceRetention + s3:PutObjectLegalHold (master plan line 517 — survives any future broadening of the role's grants), and aws:SourceAccount confused-deputy guard on the trust policy.
  • Inert in this PR. The IAM role exists but no Lambda assumes it until U8a (master plan Decision #9: inert→live seam swap). Three new default = "" variables in lambda-api/variables.tf reserve the U8a wiring shape; no resources reference them yet. force_destroy = false is hardcoded because Object Lock and force_destroy are incompatible by design; the dev cleanup playbook is documented in the module README.
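
For orientation, the core of the bucket described in the first bullet might look roughly like the sketch below. This is not the module's actual source. The resource name `aws_s3_bucket.anchor` and the input `var.retention_days` are assumptions; `var.stage`, `var.mode`, and the `aws_s3_bucket_object_lock_configuration.anchor` address are taken from the commit messages in this PR. Encryption, public access block, lifecycle, and the bucket policy are omitted for brevity.

```hcl
variable "stage" {
  type = string
}

variable "mode" {
  type    = string
  default = "GOVERNANCE"
}

# Hypothetical input name; the PR only states a 365-day default retention.
variable "retention_days" {
  type    = number
  default = 365
}

resource "aws_s3_bucket" "anchor" {
  bucket = "thinkwork-${var.stage}-compliance-anchors"

  # Object Lock is enabled when the bucket is created.
  object_lock_enabled = true

  # Hardcoded: force_destroy and Object Lock do not mix by design.
  force_destroy = false
}

# Object Lock requires versioning on the bucket.
resource "aws_s3_bucket_versioning" "anchor" {
  bucket = aws_s3_bucket.anchor.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Standalone Object Lock configuration with the default retention rule.
resource "aws_s3_bucket_object_lock_configuration" "anchor" {
  bucket = aws_s3_bucket.anchor.id

  rule {
    default_retention {
      mode = var.mode
      days = var.retention_days
    }
  }

  depends_on = [aws_s3_bucket_versioning.anchor]
}
```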

Part of the master plan: docs/plans/2026-05-06-011-feat-compliance-audit-event-log-plan.md (U7 of 11; U1–U6 already merged + deployed).

Plan: docs/plans/2026-05-07-009-feat-compliance-u7-anchor-bucket-plan.md.

Test plan

  • terraform init -backend=false && terraform validate succeeds in the new module directory in isolation.
  • terraform init -backend=false && terraform validate succeeds at the full greenfield composite root (terraform/examples/greenfield/).
  • terraform fmt -check -recursive clean across the module and the modified composite-root files.
  • Dev deploy is the integration test (repo pattern). After CI + merge, the deploy.yml workflow's terraform-apply against dev provisions the bucket. Operator-laptop smoke commands then verify configuration:
    • aws s3api get-object-lock-configuration --bucket thinkwork-dev-compliance-anchors --query 'ObjectLockConfiguration.Rule.DefaultRetention' → returns Mode: GOVERNANCE, Days: 365.
    • aws s3api get-bucket-versioning --bucket thinkwork-dev-compliance-anchors --query 'Status' → Enabled.
    • aws s3api get-public-access-block --bucket thinkwork-dev-compliance-anchors → all four flags true.
    • aws s3api get-bucket-encryption --bucket thinkwork-dev-compliance-anchors → SSE-KMS + BucketKeyEnabled: true + thinkwork CMK ARN.
    • aws s3api get-bucket-lifecycle-configuration --bucket thinkwork-dev-compliance-anchors → one rule scoped to anchors/ transitioning to GLACIER_IR at 90 days; no expiration.
    • aws s3api get-bucket-policy --bucket thinkwork-dev-compliance-anchors → contains EnforceHTTPS, DenyDeleteObject, and DenyBucketDelete Sids.
    • aws iam get-role --role-name thinkwork-dev-compliance-anchor-lambda-role + aws iam list-role-policies --role-name <role> → role exists; inline policies anchor-s3 and anchor-kms attached.
    • aws cloudtrail lookup-events --lookup-attributes AttributeKey=ResourceName,AttributeValue=thinkwork-dev-compliance-anchor-lambda-role → only the create event (role is unassumed; no Lambda wires it until U8a).
  • Subsequent terraform plan after apply shows zero diff (no provider drift on the standalone Object Lock configuration resource).

A scripted version of these smoke commands (scripts/smoke-compliance-anchor-bucket.sh mirroring the flue-smoke-test pattern) is deferred to a follow-up PR per the plan.

Residual Review Findings

ce-code-review autofix pass on a40fe64d applied 4 safe-auto fixes (committed as abf0af0d). The 5 residual findings below are all routed to downstream-resolver units (U8a/U8b) or deferred as architectural — none are merge-blocking.

  • #5 (P2): KMS key policy not pinned to anchor role. terraform/modules/foundation/kms/main.tf (cross-cutting). The module relies on the existing thinkwork CMK's default root-account statement. If a future PR tightens the key policy without naming the anchor role, anchor writes 403 silently, with no plan-time signal. Module README documents this risk. Owner: U8b downstream-resolver. Add the anchor Lambda role ARN to the KMS key resource policy explicitly when U8b ships.
  • #6 (P2): Trust policy aws:SourceArn pinning deferred. terraform/modules/data/compliance-audit-bucket/main.tf:227-244. aws:SourceAccount shipped in this PR (confused-deputy defense). aws:SourceArn requires the anchor Lambda function ARN, which is known-after-apply and doesn't exist until U8a. Owner: U8a downstream-resolver. Add aws:SourceArn = aws_lambda_function.anchor.arn to the trust-policy condition alongside the function definition.
  • #7 (P2): proofs/ prefix lifecycle billing trap. terraform/modules/data/compliance-audit-bucket/main.tf:148. The lifecycle rule is scoped to prefix = "anchors/". The proofs/ prefix has no transition rule; if U8b produces noncurrent versions of proof slices (overwrite, delete-marker creation), they accumulate in the Standard storage class indefinitely. Owner: U8b downstream-resolver. Declare proofs/ retention semantics and either add a separate lifecycle rule or set per-object expiration via U8b's PutObject calls.
  • #8 (P2): First-KMS-consumer integration smoke gap. terraform/modules/data/compliance-audit-bucket/main.tf (whole module). The module is the org's first real consumer of module.kms.aws_kms_key.main. CI never exercises the full Put/Get/SSE-KMS path until U8b. A future PR breaking the key policy or KMS access would surface only at U8b deploy. Owner: U8b downstream-resolver. The body-swap safety integration test must perform a real PutObject + GetObject through the anchor role, exercising both S3 and KMS, before declaring U8b success. Stub responses do not count.
  • #9 (P3): Post-365-day anchor disposition. terraform/modules/data/compliance-audit-bucket/main.tf:148-181. No expiration rule fires after Object Lock retention expires. At ~35k anchors/year/stage, the bucket grows indefinitely. The auditor question "how is post-retention metadata disposed?" has no current answer. Owner: human (architectural). Add an expiration rule once SOC2 auditor guidance shapes the retention story.
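
To make finding #6 concrete, here is a minimal sketch of a trust policy with the `aws:SourceAccount` condition that shipped in this PR. The data source and resource names are assumptions, as is the choice of `StringEquals`; the role name pattern and `var.account_id` come from the PR text. The `aws:SourceArn` condition is deliberately absent because the function ARN does not exist until U8a.

```hcl
variable "stage" {
  type = string
}

variable "account_id" {
  type = string
}

# Trust policy: only the Lambda service may assume the role, and only
# when the request originates from this AWS account.
data "aws_iam_policy_document" "anchor_lambda_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }

    # Confused-deputy guard added by the autofix pass.
    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [var.account_id]
    }
  }
}

resource "aws_iam_role" "anchor_lambda" {
  name               = "thinkwork-${var.stage}-compliance-anchor-lambda-role"
  assume_role_policy = data.aws_iam_policy_document.anchor_lambda_trust.json
}
```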

Run artifact: /tmp/compound-engineering/ce-code-review/20260507-141057-d5dc1879/.

🤖 Generated with Claude Code

ericodom and others added 2 commits May 7, 2026 14:10
New `terraform/modules/data/compliance-audit-bucket/` module provisions the
WORM-protected S3 bucket that U8a/U8b will write Merkle-anchor evidence into.
Object Lock enabled at create time (one-way commit per AWS), GOVERNANCE-mode
default (master plan Decision #2 — flip to COMPLIANCE in prod tfvars at
audit-engagement time), 365-day default retention, SSE-KMS via the existing
thinkwork CMK (org's first real KMS consumer), `bucket_key_enabled = true`,
public access fully blocked, lifecycle scoped to `anchors/` → Glacier IR @
90 days (no expiration; Object Lock retention is the deletion gate).
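
As a rough illustration of the encryption, public-access, and lifecycle settings listed above (a sketch under stated assumptions, not the module's code): `var.bucket_id` and `var.kms_key_arn` are hypothetical inputs standing in for the bucket and the thinkwork CMK, and the rule id is invented.

```hcl
variable "bucket_id" {
  type = string # hypothetical: id of the anchor bucket
}

variable "kms_key_arn" {
  type = string # hypothetical: ARN of the existing thinkwork CMK
}

# All four public-access flags on.
resource "aws_s3_bucket_public_access_block" "anchor" {
  bucket                  = var.bucket_id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# SSE-KMS with an S3 Bucket Key to cut KMS request volume.
resource "aws_s3_bucket_server_side_encryption_configuration" "anchor" {
  bucket = var.bucket_id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = var.kms_key_arn
    }
    bucket_key_enabled = true
  }
}

# Transition anchors/ to Glacier Instant Retrieval at 90 days.
# No expiration block: Object Lock retention is the deletion gate.
resource "aws_s3_bucket_lifecycle_configuration" "anchor" {
  bucket = var.bucket_id

  rule {
    id     = "anchors-to-glacier-ir" # hypothetical rule id
    status = "Enabled"

    filter {
      prefix = "anchors/"
    }

    transition {
      days          = 90
      storage_class = "GLACIER_IR"
    }
  }
}
```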

Bucket policy carries two Deny statements: `aws:SecureTransport = false`
(HTTPS-only) and `s3:DeleteObject`/`s3:DeleteObjectVersion` from any
principal (master plan line 518 — defense-in-depth on top of Object Lock).
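
The two Deny statements in this commit could be expressed along these lines. The Sids `EnforceHTTPS` and `DenyDeleteObject` appear in the PR's smoke checks; the data source name, the `s3:*` action scope on the HTTPS statement, and the `var.bucket_id` / `var.bucket_arn` inputs are assumptions for illustration.

```hcl
variable "bucket_id" {
  type = string # hypothetical: id of the anchor bucket
}

variable "bucket_arn" {
  type = string # hypothetical: ARN of the anchor bucket
}

data "aws_iam_policy_document" "anchor_bucket" {
  # Reject any request that does not arrive over TLS.
  statement {
    sid       = "EnforceHTTPS"
    effect    = "Deny"
    actions   = ["s3:*"]
    resources = [var.bucket_arn, "${var.bucket_arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }

  # Defense-in-depth on top of Object Lock: no principal may delete
  # objects or object versions.
  statement {
    sid       = "DenyDeleteObject"
    effect    = "Deny"
    actions   = ["s3:DeleteObject", "s3:DeleteObjectVersion"]
    resources = ["${var.bucket_arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }
  }
}

resource "aws_s3_bucket_policy" "anchor" {
  bucket = var.bucket_id
  policy = data.aws_iam_policy_document.anchor_bucket.json
}
```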

Co-located IAM role (`thinkwork-${stage}-compliance-anchor-lambda-role`)
that U8a's anchor Lambda will assume. Inline policies: path-scoped Allow on
`anchors/*` + `proofs/*` for Put/Get + Retention actions, bucket-side Allow
on `s3:GetBucketObjectLockConfiguration`, KMS Allow for `GenerateDataKey` +
`Decrypt` + `DescribeKey`, and **explicit Deny** on
`s3:BypassGovernanceRetention` + `s3:PutObjectLegalHold` so the deny
survives any future broadening of the role's IAM grants (master plan line
517).
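
A sketch of how the role's inline policies might be laid out, assuming the inline policy names `anchor-s3` and `anchor-kms` from the PR's smoke checks. The exact S3 action list behind "Put/Get + Retention" is a guess, the Sids are invented, and `var.role_name`, `var.bucket_arn`, and `var.kms_key_arn` are hypothetical inputs.

```hcl
variable "role_name" {
  type = string # hypothetical: name of the anchor Lambda role
}

variable "bucket_arn" {
  type = string # hypothetical: ARN of the anchor bucket
}

variable "kms_key_arn" {
  type = string # hypothetical: ARN of the thinkwork CMK
}

data "aws_iam_policy_document" "anchor_s3" {
  # Path-scoped object access (action list is an assumption).
  statement {
    sid    = "AllowAnchorObjects"
    effect = "Allow"
    actions = [
      "s3:PutObject",
      "s3:GetObject",
      "s3:PutObjectRetention",
      "s3:GetObjectRetention",
    ]
    resources = [
      "${var.bucket_arn}/anchors/*",
      "${var.bucket_arn}/proofs/*",
    ]
  }

  # Bucket-side read of the Object Lock configuration.
  statement {
    sid       = "AllowReadLockConfig"
    effect    = "Allow"
    actions   = ["s3:GetBucketObjectLockConfiguration"]
    resources = [var.bucket_arn]
  }

  # An explicit Deny wins over any Allow added to the role later.
  statement {
    sid       = "DenyRetentionBypass"
    effect    = "Deny"
    actions   = ["s3:BypassGovernanceRetention", "s3:PutObjectLegalHold"]
    resources = ["${var.bucket_arn}/*"]
  }
}

data "aws_iam_policy_document" "anchor_kms" {
  statement {
    effect    = "Allow"
    actions   = ["kms:GenerateDataKey", "kms:Decrypt", "kms:DescribeKey"]
    resources = [var.kms_key_arn]
  }
}

resource "aws_iam_role_policy" "anchor_s3" {
  name   = "anchor-s3"
  role   = var.role_name
  policy = data.aws_iam_policy_document.anchor_s3.json
}

resource "aws_iam_role_policy" "anchor_kms" {
  name   = "anchor-kms"
  role   = var.role_name
  policy = data.aws_iam_policy_document.anchor_kms.json
}
```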

Inert in this PR: the IAM role exists but no Lambda assumes it until U8a
(master plan Decision #9 — inert→live seam swap). Three new variables in
`lambda-api/variables.tf` (`default = ""`) reserve shape for U8a's wiring;
no resources reference them yet.
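
The reserved-shape pattern could look something like the following. The variable names are entirely hypothetical, since the PR text does not list them; only the `default = ""` convention and the fact that nothing references them yet come from the source.

```hcl
# Hypothetical names: the PR states only that three variables with
# default = "" were added to lambda-api/variables.tf.
variable "compliance_anchor_bucket_name" {
  description = "Anchor bucket name; empty until U8a wires the anchor Lambda."
  type        = string
  default     = ""
}

variable "compliance_anchor_bucket_arn" {
  description = "Anchor bucket ARN; empty until U8a."
  type        = string
  default     = ""
}

variable "compliance_anchor_lambda_role_arn" {
  description = "Anchor Lambda role ARN; empty until U8a."
  type        = string
  default     = ""
}
```

Because every default is an empty string and no resource consumes the values, adding the variables should produce no plan diff in the lambda-api module.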

`force_destroy = false` is hardcoded; Object Lock + force_destroy is
incompatible by design. Dev cleanup playbook documented in module README.

Composite-root wiring follows the established three-tier pattern:
`module.compliance_anchors` instantiation in `thinkwork/main.tf`, three
new outputs in `thinkwork/outputs.tf`, `compliance_anchor_object_lock_mode`
+ `compliance_anchor_retention_days` variables in `thinkwork/variables.tf`
with validation, and three forward-arguments into `module "api"`.
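
A minimal sketch of the composite-root wiring, assuming it lives in `terraform/modules/thinkwork/`. The two variable names and the `module.compliance_anchors` address come from this commit message; the validation conditions, the relative `source` path, and the `retention_days` input name are assumptions. KMS wiring is left out because the output name on `module.kms` is not given in the PR.

```hcl
variable "compliance_anchor_object_lock_mode" {
  type    = string
  default = "GOVERNANCE"

  validation {
    condition     = contains(["GOVERNANCE", "COMPLIANCE"], var.compliance_anchor_object_lock_mode)
    error_message = "The Object Lock mode must be GOVERNANCE or COMPLIANCE."
  }
}

variable "compliance_anchor_retention_days" {
  type    = number
  default = 365

  validation {
    condition     = var.compliance_anchor_retention_days >= 1
    error_message = "The retention period must be at least 1 day."
  }
}

module "compliance_anchors" {
  # Assumed relative path from terraform/modules/thinkwork/.
  source = "../data/compliance-audit-bucket"

  # var.stage and var.account_id are assumed to be declared already
  # in thinkwork/variables.tf.
  stage      = var.stage
  account_id = var.account_id

  mode           = var.compliance_anchor_object_lock_mode
  retention_days = var.compliance_anchor_retention_days # hypothetical input name
}
```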

Validation: `terraform init -backend=false` + `terraform validate` succeed
on both the module in isolation and the full greenfield composite root.
Post-deploy verification (R10) is operator-CLI: `aws s3api
get-object-lock-configuration --bucket thinkwork-dev-compliance-anchors`
returns `Mode: GOVERNANCE`, `Days: 365`.

Plan: `docs/plans/2026-05-07-009-feat-compliance-u7-anchor-bucket-plan.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Four safe-auto fixes applied from ce-code-review pass on a40fe64:

1. **Prod-stage GOVERNANCE guardrail (P0).** Added a `lifecycle.precondition`
   on `aws_s3_bucket_object_lock_configuration.anchor` that rejects plan when
   `var.stage ∈ {"prod", "production"}` AND `var.mode == "GOVERNANCE"`. This
   closes the "operator forgets the COMPLIANCE flip" failure mode that both
   ce-security-reviewer and ce-adversarial-reviewer flagged as the most
   plausible 3-month audit-finding path. Override at audit-engagement time
   via the composite-root tfvar `compliance_anchor_object_lock_mode =
   "COMPLIANCE"`.

2. **Bucket-policy DenyBucketDelete (P2).** Added `s3:DeleteBucket` deny on
   the bucket ARN — defense-in-depth for the post-retention window when
   objects become deletable. Did NOT add `s3:PutBucketPolicy` /
   `s3:DeleteBucketPolicy` denies because Terraform itself uses
   `s3:PutBucketPolicy` to manage this resource; denying them would lock
   the deploying principal out of subsequent applies. Policy-rewrite
   defense belongs at the IAM-policy layer on the deploying principal.

3. **Trust-policy `aws:SourceAccount` condition (P2).** Added
   `aws:SourceAccount = var.account_id` to the anchor Lambda role's
   trust policy. Confused-deputy defense; works without depending on
   U8a's not-yet-existent function ARN. New `var.account_id` input wired
   through the composite root from the existing `var.account_id` at
   `terraform/modules/thinkwork/variables.tf`. `aws:SourceArn` pinning
   waits for U8a (function ARN is known-after-apply).

4. **README dev-cleanup playbook accuracy (P3).** Corrected the playbook
   to acknowledge that no break-glass admin role with
   `s3:BypassGovernanceRetention` is currently provisioned in the repo
   — operators must grant themselves the action ad-hoc. Added a
   "Bucket-policy interaction" subsection explaining that the cleanup
   path requires temporarily replacing the bucket policy because
   `DenyDeleteObject` applies to all principals including admins. The
   prior README implied an admin role existed, which adversarial review
   confirmed it does not.
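
The prod-stage guardrail from fix 1 might be written roughly as below. This is a sketch, not the committed code: the error message wording, `var.bucket_id`, and `var.retention_days` are assumptions, while `var.stage`, `var.mode`, and the resource address come from the commit message. `precondition` blocks require Terraform 1.2 or later.

```hcl
variable "stage" {
  type = string
}

variable "mode" {
  type    = string
  default = "GOVERNANCE"
}

variable "retention_days" {
  type    = number
  default = 365 # hypothetical input name
}

variable "bucket_id" {
  type = string # hypothetical: id of the anchor bucket
}

resource "aws_s3_bucket_object_lock_configuration" "anchor" {
  bucket = var.bucket_id

  rule {
    default_retention {
      mode = var.mode
      days = var.retention_days
    }
  }

  lifecycle {
    # Fail the plan if a prod stage is still on GOVERNANCE mode.
    precondition {
      condition     = !(contains(["prod", "production"], var.stage) && var.mode == "GOVERNANCE")
      error_message = "Prod stages must set compliance_anchor_object_lock_mode = \"COMPLIANCE\"."
    }
  }
}
```

With this in place, a plan for `stage = "prod"` with the default mode should stop with the error message instead of provisioning a GOVERNANCE-mode bucket.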

Validation: `terraform validate` succeeds on both the module in
isolation and the full greenfield composite root after all fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ericodom ericodom merged commit 6531601 into main May 7, 2026
5 checks passed
@ericodom ericodom deleted the feat/compliance-u7-anchor-bucket branch May 7, 2026 19:38
ericodom added a commit that referenced this pull request May 8, 2026
…ance arc) (#953)

Knowledge-track architecture-pattern doc capturing the meta-pattern
that shipped the master compliance arc (~17 PRs over 2 days,
2026-05-07–2026-05-08). Extends the existing
inert-to-live-seam-swap-pattern-2026-04-25.md (Python-module scoped)
with two dimensions surfaced during the compliance arc:

1. Substrate-first multi-layer ordering — DB schema → Terraform/IAM
   → Lambda shell → consumer code. The 2026-04-25 doc covered factory
   closures + seam_fn defaults at the Python-module scope; this doc
   generalizes to multi-layer infrastructure arcs spanning Aurora,
   S3 Object Lock buckets, SQS queues, and admin SPA.

2. Throw-don't-no-op rule for stubs — the inert state must be
   operator-visible (DLQ depth alarm, smoke-test failure). Silent
   no-op stubs that ack messages without doing work were rejected
   explicitly in the U11.U2 plan.

Three case studies with verbatim PR + file citations:
- U7→U8a→U8b — WORM anchor bucket + inert Lambda body + live S3 write
  (#917, #921, #927)
- U10 backend → extensions → admin UI (#937, #939, #941)
- U11 four-PR sequence: mutation → Terraform + stub → live runner
  → admin Exports page (#944, #948, #950, #951)

Includes:
- Stable-seam invariant (body swaps, contracts don't)
- Body-swap forcing functions in integration tests with call-count
  assertions (not just return-shape) to catch sibling-function escape
- CloudWatch alarm posture mirroring inert/live state
  (treat_missing_data flips on the live PR)
- Independent revertibility — substrate alone leaves a known-good
  inert state

Frontmatter validated parser-safe via the plugin-bundled
validate-frontmatter.py.

Also adds a one-line backlink in the prior-art doc so a reader landing
on the 2026-04-25 doc finds the multi-layer extension.

Generated via /ce-compound full mode (3 parallel research subagents +
ce-session-historian foreground).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>