
MSK EBS issue on enabling storage autoscaling #20327

Open
devops-asService opened this issue Jul 27, 2021 · 9 comments
Labels
question A question about existing functionality; most questions are re-routed to discuss.hashicorp.com. service/kafka Issues and PRs that pertain to the kafka service.

Comments

@devops-asService

Hi,
I have enabled EBS autoscaling for my Kafka cluster.
resource "aws_msk_cluster" "kafka" { ..................... broker_node_group_info { instance_type = <dummy> **ebs_volume_size = 1** ............... security_groups = [aws_security_group.kafka_private_sg.id] } ................ }
The initial EBS volume size was 1 GB. Later, the volume was expanded because autoscaling is enabled. The next time I apply some changes (after the resize has happened), Terraform tries to reduce the EBS volume back to the 1 GB that was set initially and fails with the following error message:
Error: error updating MSK Cluster (arn:aws:kafka:us-east-2:798036251187:cluster/Edhperf-msk-cluster/f33753ee-8a35-48dd-abb1-b0262001800a-4) broker storage: BadRequestException: To update storage, you must increase it by at least 10 GiB.
Can somebody help me with this?
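
For context, broker-storage autoscaling on MSK is typically wired up through Application Auto Scaling rather than the aws_msk_cluster resource itself. A minimal sketch of that setup (the resource names, capacity limits, and the 60% target below are illustrative, not taken from this issue):

resource "aws_appautoscaling_target" "msk_storage" {
  # Lets Application Auto Scaling grow broker EBS volumes up to max_capacity (GiB)
  min_capacity       = 1
  max_capacity       = 250
  resource_id        = aws_msk_cluster.kafka.arn
  scalable_dimension = "kafka:broker-storage:VolumeSize"
  service_namespace  = "kafka"
}

resource "aws_appautoscaling_policy" "msk_storage" {
  name               = "msk-broker-storage-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.msk_storage.resource_id
  scalable_dimension = aws_appautoscaling_target.msk_storage.scalable_dimension
  service_namespace  = aws_appautoscaling_target.msk_storage.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "KafkaBrokerStorageUtilization"
    }
    # Expand storage once broker disk utilization crosses ~60%
    target_value = 60
  }
}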

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/kafka Issues and PRs that pertain to the kafka service. labels Jul 27, 2021
@anGie44 anGie44 added the question A question about existing functionality; most questions are re-routed to discuss.hashicorp.com. label Jul 27, 2021
@HackerTheMonkey

HackerTheMonkey commented Aug 10, 2021

I've encountered the same issue. Try updating your TF config so that the initial EBS volume size matches what it currently is in AWS. In my case TF was trying to reduce it back to the initial size, which is no longer accurate after a couple of auto-scaling events.

Note that the above is just a workaround to keep things moving. IMO this should not really be necessary; the initial size should be treated as what it really is, i.e. initial.
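
As a concrete example of that workaround (the 110 GiB value here is hypothetical; use whatever the volumes have actually grown to):

  broker_node_group_info {
    # Hypothetical: match the current, auto-scaled volume size so the plan shows no diff
    ebs_volume_size = 110
  }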

@HackerTheMonkey

HackerTheMonkey commented Aug 16, 2021

The following worked well for us:

  broker_node_group_info {
    ebs_volume_size = local.ebs_volume_init_size[var.environment]   
  }
  
ignore_changes = [
      broker_node_group_info.0.ebs_volume_size
    ]

A bit of an unexpected syntax, but it solved the issue, and TF apply won't attempt to reset the volume size after it has changed via either auto- or manual scaling!

@breathingdust breathingdust removed the needs-triage Waiting for first response or review from a maintainer. label Aug 27, 2021
@VladMasarik

@HackerTheMonkey's hack worked for me, although I had to use a different syntax as well:

  broker_node_group_info {
    ebs_volume_size = local.ebs_volume_init_size[var.environment]   
  }

  lifecycle {
    ignore_changes = [
      broker_node_group_info.0.ebs_volume_size
    ]
  }

@llnformer

Is there any timeline for adding support for EBS autoscaling in the aws_msk_cluster resource?

@pascal-hofmann

With the latest provider version the correct syntax is:

  broker_node_group_info {
    storage_info {
      ebs_storage_info {
        volume_size = …
      }
    }
  }

  lifecycle {
    ignore_changes = [
      broker_node_group_info[0].storage_info[0].ebs_storage_info[0].volume_size
    ]
  }

@yermulnik

This is not directly related, though it is about ebs_volume_size and storage_info.ebs_storage_info.volume_size: I'm trying to figure out whether this is a bug in the provider or something local to me. Since I switched from ebs_volume_size to storage_info.ebs_storage_info.volume_size, I've found that Terraform doesn't pick up changes to the value of storage_info.ebs_storage_info.volume_size, so to increase cluster storage I now need to do it manually. Does anyone else experience a similar issue? Is this expected behavior, or do I need to file an issue for the AWS Provider? Thanks.

@jhovell

jhovell commented Jun 3, 2023

@yermulnik do you have autoscaling enabled and the

lifecycle {
    ignore_changes = [
      broker_node_group_info[0].storage_info[0].ebs_storage_info[0].volume_size
    ]
  }

... included in your template? If so, it sounds like "expected behavior" given this issue is still open.

@yermulnik

@jhovell Yeah, I forgot to update this thread, sorry. It was indeed the lifecycle block ignoring changes to the volume size 🤦🏻

@patrickherrera

patrickherrera commented Mar 5, 2024

@pascal-hofmann's comment above fixed it for me and successfully excluded external changes from the plan. However, as I did not specify anything for provisioned_throughput within the ebs_storage_info block, Terraform tried to pass nulls:

          ~ storage_info {
              ~ ebs_storage_info {
                    # (1 unchanged attribute hidden)
                  - provisioned_throughput {
                      - enabled           = false -> null
                      - volume_throughput = 0 -> null
                    }
                }
            }

Which AWS didn't like:

│ Error: updating MSK Cluster (arn:aws:kafka:ap-southeast-2:123456:cluster/tf-module-msk-kafka-cluster/fca403e4-d8b4-4a30-980d-c9d0a4dad3e8-2) broker storage: operation error Kafka: UpdateBrokerStorage, https response error StatusCode: 400, RequestID: 0b9fa85f-96e6-4405-a2fc-45471b619a9c, BadRequestException: The request does not include any updates to the EBS volumes of the cluster. Verify the request, then try again.

Explicitly setting enabled to false (the default) meant that it matched the real deployment and no changes were made:

      ebs_storage_info {
        volume_size = var.initial_volume_size_gib

        provisioned_throughput {
          enabled = false
        }
      }

Ignoring changes to the entire ebs_storage_info block can also work if you haven't changed anything else:

  lifecycle {
    ignore_changes = [
      broker_node_group_info[0].storage_info[0].ebs_storage_info
    ]
  }

EDIT: Forget it. I thought I had tested this thoroughly, but I'm still having issues. I'm getting the exact same error as this: #26031 (comment), although my plan now thinks it needs to make a change, whereas what I tested above produced no plan change at all. Now I get:

      ~ broker_node_group_info {
            # (4 unchanged attributes hidden)
          ~ storage_info {
              ~ ebs_storage_info {
                    # (1 unchanged attribute hidden)
                  + provisioned_throughput {
                      + enabled = false
                    }
                }
            }
            # (1 unchanged block hidden)
        }

and the same error about no change being made. Not sure why that should be an issue anyway; the request should be idempotent and simply take no action if nothing needs to change.
