Proper handling of DynamoDB replicas and GSIs with autoscaling #25331

Open
ghost opened this issue Jun 14, 2022 · 3 comments
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/appautoscaling Issues and PRs that pertain to the appautoscaling service. service/dynamodb Issues and PRs that pertain to the dynamodb service. stale Old or inactive issues managed by automation, if no further action taken these will get closed.

Comments

ghost commented Jun 14, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Currently, it's impossible to use Terraform to cleanly create a new DynamoDB table with replicas or GSIs unless its billing mode is PAY_PER_REQUEST. The reason is that:

  • Replicas and GSIs require autoscaling, so they can't be created unless the table already has autoscaling enabled
  • To enable autoscaling, the table must be created first, and the autoscaling resources then created separately

This leads to a circular dependency: it's impossible to create a table with autoscaling and replicas/GSIs in one go using Terraform alone. There are two workarounds, both bad.

Using two applys

This workaround is described in a comment on an earlier issue on the same subject, here.

  1. First apply: Create a PROVISIONED table with no replicas or GSIs, plus create autoscaling resources applying to the table (one each of aws_appautoscaling_target and aws_appautoscaling_policy for reads, and one each for writes)
  2. Second apply: Update the table to add replicas and GSIs

The advantage of this approach is that everything you create is in the terraform state so subsequent plan and apply operations will use and return correct information. The disadvantage is that it requires running multiple applys and patching code in between, which is unpleasant at the best of times and clearly impossible in a CI/CD scenario.
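
As a rough sketch of the second apply, the table definition from the first apply is patched to add a `replica` block and a `global_secondary_index` block. The replica region (`eu-west-2`), the index name and key (`IndexOnClientID`, `ClientID`) and the capacity values are placeholders chosen for illustration. The stream settings are included on the assumption that replicas need DynamoDB Streams enabled on the table.

```hcl
# Second apply: the same table as in the first apply, patched to add a
# replica and a GSI. Values marked "placeholder" are illustrative only.
resource "aws_dynamodb_table" "table" {
  lifecycle {
    # Keep Terraform from resetting capacity that autoscaling has changed
    ignore_changes = [read_capacity, write_capacity]
  }

  name     = "ExampleTable"
  hash_key = "WidgetID"

  billing_mode   = "PROVISIONED"
  read_capacity  = 100 # placeholder
  write_capacity = 50  # placeholder

  attribute {
    name = "WidgetID"
    type = "S"
  }

  # Added in the second apply: key attribute for the new index
  attribute {
    name = "ClientID" # placeholder
    type = "S"
  }

  # Added in the second apply: assumed prerequisite for replicas
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  # Added in the second apply
  global_secondary_index {
    name            = "IndexOnClientID" # placeholder
    hash_key        = "ClientID"
    projection_type = "ALL"
    read_capacity   = 100
    write_capacity  = 50
  }

  # Added in the second apply
  replica {
    region_name = "eu-west-2" # placeholder
  }
}
```

The autoscaling targets and policies from the first apply stay unchanged; only the table resource is edited between the two runs.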

Using a null_resource with a local provisioner

You create the table with autoscaling but no replicas. You then have a null_resource which depends_on the table and the autoscaling resources. This null_resource uses the AWS CLI in a local provisioner to create the replica tables and GSIs.

This approach has the advantage of requiring only one apply and no ad hoc code patching, so it can be used in CI/CD. However, the replica tables and GSIs you create are effectively not in IaC: they won't be modified by subsequent changes to config, nor will they be torn down when you run a destroy. Plus, any Terraform code changes that would affect them won't be shown in a plan output. This approach is therefore also very suboptimal.
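
A minimal sketch of this workaround for the replica case follows. It assumes the table and autoscaling resources are named as in the example configuration in this issue, that the AWS CLI is installed and configured for the table's region wherever Terraform runs, and that the `null` provider is available. The names `replica_region`, `replica_updates` and `null_resource.replica` are made up for illustration.

```hcl
locals {
  replica_region = "eu-west-2" # placeholder region for the replica

  # JSON payload passed to `aws dynamodb update-table --replica-updates`
  replica_updates = jsonencode([
    { Create = { RegionName = local.replica_region } }
  ])
}

resource "null_resource" "replica" {
  # Wait for the table's autoscaling to exist before adding the replica
  depends_on = [
    aws_appautoscaling_policy.table_read_policy,
    aws_appautoscaling_policy.table_write_policy,
  ]

  # Re-run the provisioner only if the table ARN changes
  triggers = {
    table_arn = aws_dynamodb_table.table.arn
  }

  provisioner "local-exec" {
    command = "aws dynamodb update-table --table-name ${aws_dynamodb_table.table.name} --replica-updates '${local.replica_updates}'"
  }
}
```

Terraform only tracks the `null_resource` itself here, not the replica it creates, which is exactly the drawback described above.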

New or Affected Resource(s)

Existing:

  • aws_dynamodb_table

Potential new:

  • aws_dynamodb_replica_table
  • aws_dynamodb_global_secondary_index

Potential Terraform Configuration

The workaround shown above relies on the fact that you can add replicas to an existing table without recreating it. My suggestion, then, would be to add a new resource to the provider which does this.

I've added an example config here for how the new resource would work.

## Config ##

locals {
  autoscaling_config = {
    min_read_capacity                = 100
    max_read_capacity                = 1000
    read_capacity_scaling_threshold  = 70
    min_write_capacity               = 50
    max_write_capacity               = 50
    write_capacity_scaling_threshold = 70
  }
  region = "eu-west-1"
}

## Initial table ##

resource "aws_dynamodb_table" "table" {
  lifecycle {
    # Necessary when using autoscaling otherwise TF tries to reset the scaling
    ignore_changes = [read_capacity, write_capacity]
  }

  name = "ExampleTable"

  hash_key = "WidgetID"

  attribute {
    name = "WidgetID"
    type = "S"
  }

  billing_mode    = "PROVISIONED"
  read_capacity   = local.autoscaling_config.min_read_capacity
  write_capacity  = local.autoscaling_config.min_write_capacity
}

## Autoscaling ##

resource "aws_appautoscaling_target" "table_read_target" {
  max_capacity       = local.autoscaling_config.max_read_capacity
  min_capacity       = local.autoscaling_config.min_read_capacity
  resource_id        = "table/${aws_dynamodb_table.table.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "table_read_policy" {
  name               = "DynamoRCU:${aws_appautoscaling_target.table_read_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.table_read_target.resource_id
  scalable_dimension = aws_appautoscaling_target.table_read_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.table_read_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value = local.autoscaling_config.read_capacity_scaling_threshold
  }
}

resource "aws_appautoscaling_target" "table_write_target" {
  max_capacity       = local.autoscaling_config.max_write_capacity
  min_capacity       = local.autoscaling_config.min_write_capacity
  resource_id        = "table/${aws_dynamodb_table.table.name}"
  scalable_dimension = "dynamodb:table:WriteCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "table_write_policy" {
  name               = "DynamoWCU:${aws_appautoscaling_target.table_write_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.table_write_target.resource_id
  scalable_dimension = aws_appautoscaling_target.table_write_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.table_write_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
    target_value = local.autoscaling_config.write_capacity_scaling_threshold
  }
}

#######################
## The new resources ##
#######################

resource "aws_dynamodb_replica_table" "example_replica" {
  # We use depends_on and reference back to the original table ARN to ensure
  # that this resource will be created only after the original table, plus its
  # auto-scaling rules, have already been created
  depends_on = [
    aws_appautoscaling_target.table_write_target,
    aws_appautoscaling_policy.table_write_policy,
  ]
  original_table_arn = aws_dynamodb_table.table.arn

  region = local.region
}

resource "aws_dynamodb_global_secondary_index" "example_gsi" {
  # We use depends_on and reference back to the original table ARN to ensure
  # that this resource will be created only after the original table, plus its
  # auto-scaling rules, have already been created
  depends_on = [
    aws_appautoscaling_target.table_write_target,
    aws_appautoscaling_policy.table_write_policy,
  ]
  table_arn = aws_dynamodb_table.table.arn

  name               = "IndexOnClientID"
  hash_key           = "ClientID"
  attribute {
    name = "ClientID"
    type = "S"
  }
  read_capacity      = local.autoscaling_config.min_read_capacity
  write_capacity     = local.autoscaling_config.min_write_capacity
}

Using new resources here would allow for multiple API calls to take place under the hood, circumventing the need for two applys.
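
If the new index also needed its own autoscaling, it could hang off the separate GSI resource in the same way. Application Auto Scaling addresses an index with a `table/<table>/index/<index>` resource ID and the `dynamodb:index:*` scalable dimensions. The sketch below covers write capacity only and assumes that the hypothetical `aws_dynamodb_global_secondary_index.example_gsi` resource above exists and exposes a `name` attribute; the `gsi_write_target` and `gsi_write_policy` names are likewise made up.

```hcl
resource "aws_appautoscaling_target" "gsi_write_target" {
  max_capacity = local.autoscaling_config.max_write_capacity
  min_capacity = local.autoscaling_config.min_write_capacity

  # Referencing the (hypothetical) GSI resource makes Terraform create the
  # index before its scaling target
  resource_id        = "table/${aws_dynamodb_table.table.name}/index/${aws_dynamodb_global_secondary_index.example_gsi.name}"
  scalable_dimension = "dynamodb:index:WriteCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "gsi_write_policy" {
  name               = "DynamoWCU:${aws_appautoscaling_target.gsi_write_target.resource_id}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.gsi_write_target.resource_id
  scalable_dimension = aws_appautoscaling_target.gsi_write_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.gsi_write_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
    target_value = local.autoscaling_config.write_capacity_scaling_threshold
  }
}
```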

To be clear - this is just one suggested solution to this problem. I've got no attachment to any particular solution but this seemed like a simple way to break the architectural deadlock around this.

References

danquack (Contributor) commented Jun 17, 2022

Is this similar to #671 #17096?

ghost (Author) commented Jun 20, 2022

It is! Thanks for pointing me that way. The symptom is different but the solution may well end up being the same. I had a read through that thread and your PR to add a separate GSI resource (#22513), and I can understand @ewbankkit's reasoning in rejecting the PR.

At the same time, we're currently in a situation where Terraform fundamentally isn't usable for managing a provisioned DynamoDB table in a modern CD-based workflow. We use TF for all our infra at the moment, but we're looking into Serverless Framework this week, in part because this bug makes us reluctant to standardise on TF for full serverless services. (Obviously Serverless may also have the same issue because of how the DynamoDB API is structured. We'll have to find out!) Regardless, this is a real problem for our team and I'm sure it is for others too.

github-actions bot commented Jun 9, 2024

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!
