feat(bootstrap): dedicated AWS state backend + scoped IAM role via terraform-aws-template#5
Merged
…rraform-aws-template

terragrunt plan against this stack was failing with S3 403 because state was pointing at the terraform-proxmox bucket (cross-stack) and the day-to-day terraform IAM user lacks ListBucket on that bucket. Rather than widen the existing user's S3 grants across stacks, follow the canonical terraform-aws-template pattern: one bucket per stack, one scoped IAM role per stack, operator assumes the role with MFA via aws-vault.

What lands

- bootstrap/ subdirectory containing a thin module call to dryvist/terraform-aws-template pinned to commit c85894b (v0.1.0) with project = github. Yields S3 bucket tfstate-github-<account> (AES256, versioning, public-access-block, TLS-only policy, 90-day noncurrent expiry, S3 native locking) and IAM role tf-github with combined OIDC + MFA-gated AssumeRole trust. github_org / github_repo / operator_user_arns are required vars supplied via gitignored terraform.tfvars, so no identities live in the source tree. GitHub Actions OIDC provider also created (account-wide singleton; import path documented if it pre-exists).
- bootstrap/README.md walks the operator through one-time apply, state migration into the new bucket at key _bootstrap/terraform.tfstate, ~/.aws/config profile addition, verification.
- terragrunt.hcl switched from terraform-proxmox-state-useast2-<acct> to tfstate-github-<acct>, key to github/terraform.tfstate, dropped dynamodb_table (S3 native lock only). No state to migrate since no apply has ever run against this stack.
- versions.tf bumped required_version to >= 1.10 (for use_lockfile).
- AGENTS.md adds a State backend section: bucket-per-stack rationale, operator identity flow, aws-vault profile naming convention, future CI via OIDC, and the rule "never run this stack with elevated bootstrap creds".
- README.md replaces the manual tofu init -backend-config snippet with aws-vault exec tf-github -- terragrunt and points to bootstrap/README.md for first-time setup.
- .pre-commit-config.yaml adds --hook-config=--retry-once-with-cleanup to terraform_validate, the documented antonbabenko/pre-commit-terraform escape hatch for module sources pinned to commit SHAs. Without it the hook fails on first call after a module-SHA bump because the prior .terraform/modules/ install no longer matches; with it the hook auto-cleans and re-inits before the validate retry.

Cost impact: free. S3 for a small state file plus a few noncurrent versions sits well under any free-tier ceiling. AES256 SSE-S3 (no KMS). S3 native locking (no DynamoDB). IAM role + inline policies + OIDC provider are all free.

Verification

- tofu validate in both root and bootstrap -> green
- pre-commit run --all-files from a cold .terraform/ -> green
- Checkov CKV_TF_1 satisfied by pinning the template module source to commit c85894b3667cc753a3d5ac07b50e9a7be9302331 (v0.1.0 tag's resolved commit), not the tag name
- Commit GPG-signed
- No identities in source tree

Assisted-by: Claude <noreply@anthropic.com>
Summary
`terragrunt plan` against this stack was failing with S3 403 because state was pointing at the `terraform-proxmox-state-useast2-<account>` bucket (cross-stack) and the day-to-day `terraform` IAM user lacks `s3:ListBucket` on that bucket. Rather than widen the existing user's S3 grants across stacks, follow the canonical `dryvist/terraform-aws-template` pattern: one bucket per stack, one scoped IAM role per stack, operator assumes the role with MFA via `aws-vault`.

What lands

- `bootstrap/`: thin module call to `dryvist/terraform-aws-template` pinned to commit `c85894b` (the resolved SHA for `v0.1.0`, satisfying Checkov CKV_TF_1, which forbids tag-pinned module sources). `project = "github"` yields:
  - S3 bucket `tfstate-github-<account>` (AES256 SSE-S3, versioning, public-access-block, TLS-only bucket policy, 90-day noncurrent-version expiry, S3 native locking via `use_lockfile = true`, no DynamoDB).
  - IAM role `tf-github` with combined trust: GitHub Actions OIDC (`repo:<github_org>/<github_repo>` on push to `main` and on `pull_request`) plus MFA-gated AssumeRole from the operator IAM user.
  - GitHub Actions OIDC provider (account-wide singleton; if it already exists, `tofu import` once per the README and re-apply).

  `github_org`, `github_repo`, and `operator_user_arns` are required vars with no defaults; the operator supplies them via gitignored `terraform.tfvars`. Source tree stays identity-free.
- `bootstrap/README.md`: full operator walkthrough covering prerequisites, apply, state migration into the new bucket at `_bootstrap/terraform.tfstate`, `~/.aws/config` profile addition shape, verification, the OIDC-already-exists import path, and the don't-touch rule for ongoing changes.
- Root `terragrunt.hcl`: switched bucket from `terraform-proxmox-state-useast2-<acct>` to `tfstate-github-<acct>`, key from `terraform-github/terraform.tfstate` to `github/terraform.tfstate` (matches the template's `state_key_prefix` formula), dropped `dynamodb_table`. No state to migrate, since no apply has ever run against this stack.
- Root `versions.tf`: bumped `required_version` to `>= 1.10` for `use_lockfile` support.
- `AGENTS.md`: new "State backend" section with the bucket-per-stack rationale, the operator identity flow (operator IAM user → MFA AssumeRole → STS → terragrunt), the `aws-vault` profile naming convention (profile name == role name == `tf-<project>`), future CI via OIDC, and the rule "never run this stack with elevated bootstrap creds, only with `tf-github` STS".
- Root `README.md`: replaces the manual `tofu init -backend-config=...` snippet with `aws-vault exec tf-github -- terragrunt …` and points to `bootstrap/README.md` for first-time setup.
- `.pre-commit-config.yaml`: `terraform_validate` gains `--hook-config=--retry-once-with-cleanup=true`, the documented `antonbabenko/pre-commit-terraform` escape hatch for module sources pinned to commit SHAs. Without it the hook fails on the first call after any module-SHA bump (the prior `.terraform/modules/` install doesn't match the new SHA, validate fails, and the hook auto-inits but doesn't retry validate). With it the hook auto-cleans and re-inits before retrying validate, then passes cleanly.

What this does NOT do

- Run `tofu apply` from this PR. The operator runs the one-time bootstrap apply with the elevated `iam-user` creds after merge (per `bootstrap/README.md`), then switches to `tf-github` via aws-vault for everything else.
- Add `.github/workflows/terragrunt.yml`. The role IS OIDC-ready for `repo:<github_org>/<github_repo>` on push to `main` and on `pull_request`, but CI wiring is a future PR.

Cost impact
Free. S3 for a < 1 KB state file plus a few noncurrent versions sits well under any free-tier ceiling. AES256 SSE-S3 is free (no KMS per-key or per-API-call charges). S3 native locking is free (no DynamoDB provisioned). IAM role + inline policies + OIDC provider are free. Lifecycle expiry, public-access block, TLS-only bucket policy: all free.
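For readers who want a picture of the `bootstrap/` module call described under "What lands", here is a minimal sketch. It is not copied from the PR: the file name, the `module` label, the HTTPS `git::` source form (which assumes the module sits at the repository root), and the variable types are all assumptions. Only the input names, `project = "github"`, and the pinned commit come from the description.

```hcl
# bootstrap/main.tf (hypothetical file name; a sketch, not the merged code)

# Required inputs with no defaults, so identities stay out of the source tree
# and are supplied through a gitignored terraform.tfvars.
variable "github_org" {
  type = string
}

variable "github_repo" {
  type = string
}

variable "operator_user_arns" {
  type = list(string) # assumed type; the PR only names the variable
}

module "bootstrap" { # module label is an assumption
  # Pinned to the full commit SHA that v0.1.0 resolves to, rather than the
  # tag name, which is what Checkov CKV_TF_1 checks for.
  source = "git::https://github.com/dryvist/terraform-aws-template.git?ref=c85894b3667cc753a3d5ac07b50e9a7be9302331"

  project            = "github"
  github_org         = var.github_org
  github_repo        = var.github_repo
  operator_user_arns = var.operator_user_arns
}
```

Because none of the three variables has a default, a plan or apply without `terraform.tfvars` prompts for (or fails on) the missing values instead of silently using a baked-in identity.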
Verification
- `tofu init -backend=false && tofu validate` in both root and `bootstrap/`: green
- `pre-commit run --all-files` from a cold `.terraform/`: green (the `--retry-once-with-cleanup` config makes terraform_validate recover cleanly from the first-call module-SHA mismatch)
- Operator adds the `tf-github` profile to `~/.aws/config` and verifies `aws-vault exec tf-github -- aws sts get-caller-identity` returns the assumed-role identity
- Operator runs `aws-vault exec tf-github -- terragrunt plan` and reviews the full plan (the original goal; should show 2 imports + 1 add for the rulesets in PR #4, "feat(rulesets): reverse-engineer org branch protection + add review gate, conventional commits")
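The `terragrunt plan` check runs against the new per-stack backend. As a rough illustration of what the root `terragrunt.hcl` backend might look like after this change, the sketch below uses the bucket name, state key, and `use_lockfile` setting from the description. The region, the `generate` block, and the use of `get_aws_account_id()` to fill in `<acct>` are assumptions, and it presumes a Terragrunt release that passes `use_lockfile` through to the S3 backend.

```hcl
# terragrunt.hcl (sketch under the assumptions stated above, not the merged file)
remote_state {
  backend = "s3"

  # Assumed: let Terragrunt write the backend block into the working directory.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    # One bucket per stack; the account-ID lookup is an assumption.
    bucket = "tfstate-github-${get_aws_account_id()}"
    key    = "github/terraform.tfstate"
    region = "us-east-2" # assumed; the PR does not state the new bucket's region

    # S3 native locking. There is intentionally no dynamodb_table entry.
    use_lockfile = true
  }
}
```

S3 native locking via `use_lockfile` is the reason `versions.tf` requires `>= 1.10`.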