feat(bootstrap): dedicated AWS state backend + scoped IAM role via terraform-aws-template#5

Merged
JacobPEvans-personal merged 1 commit into main from feat/bootstrap-aws-state
May 31, 2026

@JacobPEvans-personal

Summary

terragrunt plan against this stack was failing with S3 403 because state was pointing at the terraform-proxmox-state-useast2-<account> bucket (cross-stack) and the day-to-day terraform IAM user lacks s3:ListBucket on that bucket. Rather than widen the existing user's S3 grants across stacks, follow the canonical dryvist/terraform-aws-template pattern: one bucket per stack, one scoped IAM role per stack, operator assumes the role with MFA via aws-vault.
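
To make the "scoped to one bucket" idea concrete, here is a minimal sketch of the S3 permissions a per-stack state role typically needs. It is not the template's actual policy: the variable name `state_bucket_name` and the data source name `state_access` are hypothetical, and the real module may grant a narrower or differently shaped set of actions.

```hcl
# Hypothetical sketch, not taken from dryvist/terraform-aws-template.
variable "state_bucket_name" {
  description = "Name of the per-stack state bucket, e.g. tfstate-github-<account>."
  type        = string
}

data "aws_iam_policy_document" "state_access" {
  # s3:ListBucket is granted on the bucket ARN itself. The absence of this
  # grant on the shared bucket is what surfaced as the S3 403 described above.
  statement {
    sid       = "ListStateBucket"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::${var.state_bucket_name}"]
  }

  # Object-level access covers the state object and the .tflock object that
  # S3 native locking creates and deletes next to it.
  statement {
    sid       = "ReadWriteStateObjects"
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws:s3:::${var.state_bucket_name}/*"]
  }
}
```

Because both statements name a single bucket, the role cannot read or list any other stack's state, which is the property the bucket-per-stack pattern is after.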

What lands

  • bootstrap/ — thin module call to dryvist/terraform-aws-template pinned to commit c85894b (the resolved SHA for v0.1.0, satisfying Checkov CKV_TF_1 which forbids tag-pinned module sources). project = "github" yields:

    • S3 bucket tfstate-github-<account> (AES256 SSE-S3, versioning, public-access-block, TLS-only bucket policy, 90-day noncurrent-version expiry, S3 native locking via use_lockfile = true — no DynamoDB).
    • IAM role tf-github with combined trust: GitHub Actions OIDC (repo:<github_org>/<github_repo> on push to main and on pull_request) plus MFA-gated AssumeRole from the operator IAM user.
    • IAM role policy scoped to that one bucket only.
    • Account-wide GitHub Actions OIDC provider (singleton; if it already exists from another stack, tofu import once per the README and re-apply).

    github_org, github_repo, and operator_user_arns are required vars with no defaults — operator supplies via gitignored terraform.tfvars. Source tree stays identity-free.

  • bootstrap/README.md — full operator walkthrough: prerequisites, apply, state migration into the new bucket at _bootstrap/terraform.tfstate, ~/.aws/config profile addition shape, verification, the OIDC-already-exists import path, and the don't-touch rule for ongoing changes.

  • Root terragrunt.hcl — switched bucket from terraform-proxmox-state-useast2-<acct> to tfstate-github-<acct>, key from terraform-github/terraform.tfstate to github/terraform.tfstate (matches the template's state_key_prefix formula), dropped dynamodb_table. No state to migrate — no apply has ever run against this stack.

  • Root versions.tf — bumped required_version to >= 1.10 for use_lockfile support.

  • AGENTS.md — new "State backend" section: bucket-per-stack rationale, the operator identity flow (operator IAM user → MFA AssumeRole → STS → terragrunt), the aws-vault profile naming convention (profile name == role name == tf-<project>), future CI via OIDC, and the rule "never run this stack with elevated bootstrap creds — only with tf-github STS".

  • Root README.md — replaces the manual tofu init -backend-config=... snippet with aws-vault exec tf-github -- terragrunt … and points to bootstrap/README.md for first-time setup.

  • .pre-commit-config.yaml: terraform_validate gains --hook-config=--retry-once-with-cleanup=true, the documented antonbabenko/pre-commit-terraform escape hatch for module sources pinned to commit SHAs. Without it the hook fails on the first call after any module-SHA bump (prior .terraform/modules/ install doesn't match the new SHA, validate fails, hook auto-inits but doesn't retry validate). With it the hook auto-cleans and re-inits before retrying validate, then passes cleanly.
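
For the bootstrap/ item in the list above, a thin module call pinned to a commit SHA might look roughly like the sketch below. Only `project = "github"`, the three required variable names, and the commit SHA come from this PR. The file name, the git source URL syntax, the variable types, and the assumption that the module's inputs share the variables' names are illustrative guesses.

```hcl
# bootstrap/main.tf (hypothetical file name)

# Required, no defaults: values come from a gitignored terraform.tfvars.
variable "github_org" { type = string }
variable "github_repo" { type = string }

variable "operator_user_arns" {
  type = list(string) # assumed type
}

module "bootstrap" {
  # Pinned to the full commit SHA rather than the v0.1.0 tag so that
  # Checkov CKV_TF_1 passes. The URL form and repo layout are assumptions.
  source = "git::https://github.com/dryvist/terraform-aws-template.git?ref=c85894b3667cc753a3d5ac07b50e9a7be9302331"

  project = "github"

  # Assumed to map one-to-one onto module inputs of the same name.
  github_org         = var.github_org
  github_repo        = var.github_repo
  operator_user_arns = var.operator_user_arns
}
```

Keeping the variables default-free means a missing terraform.tfvars fails fast at plan time instead of silently applying with placeholder identities.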

What this does NOT do

  • No tofu apply from this PR. The operator runs the one-time bootstrap apply with the elevated iam-user creds after merge (per bootstrap/README.md), then switches to tf-github via aws-vault for everything else.
  • No .github/workflows/terragrunt.yml. The role IS OIDC-ready for repo:<github_org>/<github_repo> on push to main and on pull_request, but CI wiring is a future PR.
  • Does not touch terraform-proxmox or terraform-aws state. They keep their existing shared bucket. Only this stack carves out its own.
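
As an illustration of what "OIDC-ready" plus MFA-gated AssumeRole means for the tf-github trust policy, here is a hedged sketch using the `aws_iam_policy_document` data source. It is not copied from the template. The variable `oidc_provider_arn` and the data source name `trust` are hypothetical, and the real module may use different condition operators or audiences.

```hcl
variable "github_org" { type = string }
variable "github_repo" { type = string }

variable "operator_user_arns" {
  type = list(string) # assumed type
}

variable "oidc_provider_arn" {
  description = "Hypothetical input: ARN of the account-wide GitHub Actions OIDC provider."
  type        = string
}

data "aws_iam_policy_document" "trust" {
  # Path 1: GitHub Actions, limited to pushes to main and pull_request runs.
  statement {
    sid     = "GitHubActionsOidc"
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values = [
        "repo:${var.github_org}/${var.github_repo}:ref:refs/heads/main",
        "repo:${var.github_org}/${var.github_repo}:pull_request",
      ]
    }
  }

  # Path 2: the operator IAM user, only when the session was MFA-authenticated.
  statement {
    sid     = "OperatorMfaAssumeRole"
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = var.operator_user_arns
    }

    condition {
      test     = "Bool"
      variable = "aws:MultiFactorAuthPresent"
      values   = ["true"]
    }
  }
}
```

With a trust policy of this shape in place, a later CI PR would only need to add the workflow; no IAM change should be required.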

Cost impact

Free. S3 for a < 1 KB state file plus a few noncurrent versions sits well under any free-tier ceiling. AES256 SSE-S3 is free (no KMS per-key or per-API-call charges). S3 native locking is free (no DynamoDB provisioned). IAM role + inline policies + OIDC provider are free. Lifecycle expiry, public-access block, TLS-only bucket policy: all free.

Verification

  • tofu init -backend=false && tofu validate in both root and bootstrap/ — green
  • pre-commit run --all-files from a cold .terraform/ — green (the --retry-once-with-cleanup config makes terraform_validate recover cleanly from the first-call module-SHA mismatch)
  • Checkov CKV_TF_1 satisfied (module source pinned to commit SHA, not tag name)
  • Commit GPG-signed
  • No identities in source tree (verified by grep; all identity values are required vars supplied via gitignored tfvars)
  • Operator runs the bootstrap apply with elevated creds (one-time)
  • Operator adds the tf-github profile to ~/.aws/config and verifies aws-vault exec tf-github -- aws sts get-caller-identity returns the assumed-role identity
  • Operator runs aws-vault exec tf-github -- terragrunt plan and reviews the full plan (the original goal; it should show 2 imports + 1 add for the rulesets in PR #4, "feat(rulesets): reverse-engineer org branch protection + add review gate, conventional commits")
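
The final plan step depends on the root terragrunt.hcl pointing at the new backend. A minimal sketch of that remote_state block follows. The bucket name pattern, key, and use_lockfile setting come from this PR. The region, the generate block, and the use of `get_aws_account_id()` to fill in the account ID are assumptions, and the sketch presumes a Terragrunt release that passes use_lockfile through to the S3 backend.

```hcl
# Root terragrunt.hcl (sketch)
remote_state {
  backend = "s3"

  # Assumption: Terragrunt writes the backend block into the working directory.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    # Assumption: account ID resolved from the active tf-github STS session.
    bucket = "tfstate-github-${get_aws_account_id()}"
    key    = "github/terraform.tfstate"
    region = "us-east-2" # assumed region; not stated in this PR

    # S3 native locking; replaces the dropped dynamodb_table setting.
    use_lockfile = true
  }
}
```

The `>= 1.10` bump in versions.tf goes with this block, since the PR raises required_version specifically for use_lockfile support.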

…rraform-aws-template

terragrunt plan against this stack was failing with S3 403 because state
was pointing at the terraform-proxmox bucket (cross-stack) and the
day-to-day terraform IAM user lacks ListBucket on that bucket. Rather
than widen the existing user's S3 grants across stacks, follow the
canonical terraform-aws-template pattern: one bucket per stack, one
scoped IAM role per stack, operator assumes the role with MFA via
aws-vault.

What lands

- bootstrap/ subdirectory containing a thin module call to
  dryvist/terraform-aws-template pinned to commit c85894b (v0.1.0)
  with project = github. Yields S3 bucket tfstate-github-<account>
  (AES256, versioning, public-access-block, TLS-only policy, 90-day
  noncurrent expiry, S3 native locking) and IAM role tf-github with
  combined OIDC + MFA-gated AssumeRole trust. github_org / github_repo
  / operator_user_arns are required vars supplied via gitignored
  terraform.tfvars - no identities in source tree. GitHub Actions
  OIDC provider also created (account-wide singleton; import path
  documented if it pre-exists).
- bootstrap/README.md walks the operator through one-time apply,
  state migration into the new bucket at key
  _bootstrap/terraform.tfstate, ~/.aws/config profile addition,
  verification.
- terragrunt.hcl switched from terraform-proxmox-state-useast2-<acct>
  to tfstate-github-<acct>, key to github/terraform.tfstate, dropped
  dynamodb_table (S3 native lock only). No state to migrate since no
  apply has ever run against this stack.
- versions.tf bumped required_version to >= 1.10 (for use_lockfile).
- AGENTS.md adds a State backend section: bucket-per-stack rationale,
  operator identity flow, aws-vault profile naming convention, future
  CI via OIDC, and the rule never run this stack with elevated
  bootstrap creds.
- README.md replaces the manual tofu init -backend-config snippet
  with aws-vault exec tf-github -- terragrunt and points to
  bootstrap/README.md for first-time setup.
- .pre-commit-config.yaml adds --hook-config=--retry-once-with-cleanup
  to terraform_validate, the documented antonbabenko/pre-commit-terraform
  escape hatch for module sources pinned to commit SHAs. Without it
  the hook fails on first call after a module-SHA bump because the
  prior .terraform/modules/ install no longer matches; with it the
  hook auto-cleans and re-inits before the validate retry.

Cost impact: free. S3 for a small state file plus a few noncurrent
versions sits well under any free-tier ceiling. AES256 SSE-S3 (no
KMS). S3 native locking (no DynamoDB). IAM role + inline policies +
OIDC provider are all free.

Verification

- tofu validate in both root and bootstrap -> green
- pre-commit run --all-files from a cold .terraform/ -> green
- Checkov CKV_TF_1 satisfied by pinning the template module source
  to commit c85894b3667cc753a3d5ac07b50e9a7be9302331 (v0.1.0 tag's
  resolved commit), not the tag name
- Commit GPG-signed
- No identities in source tree

Assisted-by: Claude <noreply@anthropic.com>
@JacobPEvans-personal merged commit 70624bf into main on May 31, 2026
3 checks passed