Proper handling of DynamoDB replicas and GSIs with autoscaling #25331

ghost · 2022-06-14T16:32:48Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Currently, it's impossible to use Terraform to cleanly create a new DynamoDB table with replicas or GSIs unless its billing mode is PAY_PER_REQUEST. The reason is that:

Replicas and GSIs require autoscaling, so they can't be created unless the table they belong to already has autoscaling enabled
To enable autoscaling, the table must be created first, then the autoscaling resources created separately

This leads to a circular dependency: it's impossible to create a table with autoscaling and replicas/GSIs all in one go using proper Terraform. There are two workarounds, both bad.
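For contrast, here is a minimal sketch of the one case that does work in a single apply: an on-demand table. The table name, the `ClientID` attribute, and the replica region `us-east-1` are illustrative assumptions, and the provider is assumed to be configured for a different region than the replica.

```hcl
# Sketch only: PAY_PER_REQUEST needs no capacity settings and no autoscaling
# resources, so the replica and the GSI can be declared inline on the table.
resource "aws_dynamodb_table" "on_demand" {
  name         = "ExampleOnDemandTable" # hypothetical name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "WidgetID"

  # Streams with new and old images are required for replicas.
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  attribute {
    name = "WidgetID"
    type = "S"
  }

  # Attributes used as GSI keys must be declared on the table.
  attribute {
    name = "ClientID"
    type = "S"
  }

  global_secondary_index {
    name            = "IndexOnClientID"
    hash_key        = "ClientID"
    projection_type = "ALL"
  }

  replica {
    region_name = "us-east-1" # assumed replica region
  }
}
```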

Using two `apply`s

This workaround is described in a comment on an earlier issue on the same subject.

First apply: Create a PROVISIONED table with no replicas or GSIs, plus create autoscaling resources applying to the table (one each of aws_appautoscaling_target and aws_appautoscaling_policy for reads, and one each for writes)
Second apply: Update the table to add replicas and GSIs

The advantage of this approach is that everything you create is in the Terraform state, so subsequent plan and apply operations will use and return correct information. The disadvantage is that it requires running multiple `apply`s and patching code in between, which is unpleasant at the best of times and clearly impossible in a CI/CD scenario.
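As a rough sketch of the code patch between the two applies, the table definition for the second apply might look like the following. The capacity values, the `ClientID` index, and the replica region `us-east-1` are illustrative assumptions. The autoscaling resources from the first apply are assumed to stay in the config unchanged, and the GSI would presumably need autoscaling resources of its own.

```hcl
# Second apply: the same table as in the first apply, with the replica and
# GSI blocks patched in once the autoscaling resources already exist.
resource "aws_dynamodb_table" "table" {
  lifecycle {
    # Keep Terraform from resetting capacity that autoscaling has changed.
    ignore_changes = [read_capacity, write_capacity]
  }

  name           = "ExampleTable"
  hash_key       = "WidgetID"
  billing_mode   = "PROVISIONED"
  read_capacity  = 100 # illustrative starting values
  write_capacity = 50

  # Streams with new and old images are required for replicas.
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  attribute {
    name = "WidgetID"
    type = "S"
  }

  # --- Everything below is added only for the second apply ---

  attribute {
    name = "ClientID"
    type = "S"
  }

  global_secondary_index {
    name            = "IndexOnClientID"
    hash_key        = "ClientID"
    projection_type = "ALL"
    read_capacity   = 100
    write_capacity  = 50
  }

  replica {
    region_name = "us-east-1" # assumed replica region
  }
}
```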

Using a `null_resource` with a local provisioner

You create the table with autoscaling but no replicas. You then have a null_resource which depends_on the table and the autoscaling resources. This null_resource uses the AWS CLI in a local provisioner to create the replica tables and GSIs.

This approach has the advantage of only requiring one apply and no ad hoc code patching, so it can be used in CI/CD. However, the replica tables and GSIs you create are effectively not in IaC - they won't be modified by subsequent changes to config, nor will they be torn down when you run a destroy. Plus, any Terraform code changes that would affect them won't be shown in a plan output. This approach is therefore also very suboptimal.
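A minimal sketch of this workaround, showing the replica only, could look like this. The resource names match the example config later in this issue, the replica region `us-east-1` is an assumption, and the AWS CLI is assumed to be installed and authenticated wherever Terraform runs. The table is also assumed to have streams enabled. Creating a GSI this way would need a separate `update-table` call.

```hcl
# Sketch only: the replica is created outside Terraform's state, which is
# exactly the drawback described above.
resource "null_resource" "create_replica" {
  # Run only after the table and its autoscaling resources exist.
  depends_on = [
    aws_dynamodb_table.table,
    aws_appautoscaling_policy.table_read_policy,
    aws_appautoscaling_policy.table_write_policy,
  ]

  provisioner "local-exec" {
    # Adds a replica to the existing table (global tables v2019.11.21).
    command = <<-EOT
      aws dynamodb update-table \
        --table-name ${aws_dynamodb_table.table.name} \
        --replica-updates '[{"Create": {"RegionName": "us-east-1"}}]'
    EOT
  }
}
```

Because the provisioner runs only when the `null_resource` is created, later config changes never reach the replica, and `terraform destroy` leaves it behind.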

New or Affected Resource(s)

Existing:

aws_dynamodb_table

Potential new:

aws_dynamodb_replica_table
aws_dynamodb_global_secondary_index

Potential Terraform Configuration

The workaround shown above relies on the fact that you can add replicas to an existing table without recreating it. My suggestion, then, would be to add a new resource to the provider which does this.

I've added an example config here for how the new resource would work.

## Config ##

locals {
  autoscaling_config = {
    min_read_capacity                = 100
    max_read_capacity                = 1000
    read_capacity_scaling_threshold  = 70
    min_write_capacity               = 50
    max_write_capacity               = 50
    write_capacity_scaling_threshold = 70
  }
  region = "eu-west-1"
}

## Initial table ##

resource "aws_dynamodb_table" "table" {
  lifecycle {
    # Necessary when using autoscaling otherwise TF tries to reset the scaling
    ignore_changes = [read_capacity, write_capacity]
  }

  name = "ExampleTable"

  hash_key  = "WidgetID"

  attribute {
    name = "WidgetID"
    type = "S"
  }

  billing_mode    = "PROVISIONED"
  read_capacity   = local.autoscaling_config.min_read_capacity
  write_capacity  = local.autoscaling_config.min_write_capacity
}

## Autoscaling ##

resource "aws_appautoscaling_target" "table_read_target" {
  max_capacity       = local.autoscaling_config.max_read_capacity
  min_capacity       = local.autoscaling_config.min_read_capacity
  resource_id        = "table/${aws_dynamodb_table.table.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "table_read_policy" {
  name               = "DynamoRCU:${aws_appautoscaling_target.table_read_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.table_read_target.resource_id
  scalable_dimension = aws_appautoscaling_target.table_read_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.table_read_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value = local.autoscaling_config.read_capacity_scaling_threshold
  }
}

resource "aws_appautoscaling_target" "table_write_target" {
  max_capacity       = local.autoscaling_config.max_write_capacity
  min_capacity       = local.autoscaling_config.min_write_capacity
  resource_id        = "table/${aws_dynamodb_table.table.name}"
  scalable_dimension = "dynamodb:table:WriteCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "table_write_policy" {
  name               = "DynamoWCU:${aws_appautoscaling_target.table_write_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.table_write_target.resource_id
  scalable_dimension = aws_appautoscaling_target.table_write_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.table_write_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
    target_value = local.autoscaling_config.write_capacity_scaling_threshold
  }
}

#######################
## The new resources ##
#######################

resource "aws_dynamodb_replica_table" "example_replica" {
  # We use depends_on and reference back to the original table ARN to ensure
  # that this resource will be created only after the original table, plus its
  # auto-scaling rules, have already been created
  depends_on = [
    aws_appautoscaling_target.table_write_target,
    aws_appautoscaling_policy.table_write_policy,
  ]
  original_table_arn = aws_dynamodb_table.table.arn

  region = local.region
}

resource "aws_dynamodb_global_secondary_index" "example_gsi" {
  # We use depends_on and reference back to the original table ARN to ensure
  # that this resource will be created only after the original table, plus its
  # auto-scaling rules, have already been created
  depends_on = [
    aws_appautoscaling_target.table_write_target,
    aws_appautoscaling_policy.table_write_policy,
  ]
  table_arn = aws_dynamodb_table.table.arn

  name               = "IndexOnClientID"
  hash_key           = "ClientID"
  attribute {
    name = "ClientID"
    type = "S"
  }
  read_capacity      = local.autoscaling_config.min_read_capacity
  write_capacity     = local.autoscaling_config.min_write_capacity
}

Using new resources here would allow for multiple API calls to take place under the hood, circumventing the need for two applys.

To be clear - this is just one suggested solution to this problem. I've got no attachment to any particular solution but this seemed like a simple way to break the architectural deadlock around this.

References

DynamoDB with Global table v2019.11.21 and PROVISIONED and autscaling generates Table Capacity and/or GSI capacityValidationException #13097

danquack · 2022-06-17T02:19:53Z

Is this similar to #671 #17096?

ghost · 2022-06-20T16:45:53Z

It is! Thanks for pointing me that way. The symptom is different but the solution may well end up being the same. I had a read through that thread and your PR to add a separate GSI resource (#22513). I can understand @ewbankkit's reasoning in rejecting the PR. At the same time, we're currently in a situation where Terraform fundamentally isn't really usable for managing a provisioned DynamoDB table in the context of a modern CD-based workflow. We use TF for all our infra at the moment, but we're looking into Serverless Framework this week, in part because we don't want to standardise on TF for full serverless services because of this bug. (Obviously Serverless may also have the same issue because of how the DynamoDB API is structured - we'll have to find out!) Regardless, this is a real problem for our team and I'm sure it is for others too.

github-actions · 2024-06-09T17:41:44Z

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

ghost added the enhancement Requests to existing resources that expand the functionality or scope. label Jun 14, 2022

github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/appautoscaling Issues and PRs that pertain to the appautoscaling service. service/dynamodb Issues and PRs that pertain to the dynamodb service. labels Jun 14, 2022

justinretzolk removed the needs-triage Waiting for first response or review from a maintainer. label Jun 14, 2022

tb00-cloud mentioned this issue Nov 11, 2023

[Bug]: aws_dynamodb_table with replica and autoscaling creates unmanaged cloudwatch alarms #34361

Closed

github-actions bot added the stale Old or inactive issues managed by automation, if no further action taken these will get closed. label Jun 9, 2024
