Invalid ParameterGroup takes 45 minutes to fail #1014

pplu · 2021-12-21T10:17:49Z

Name of the resource

AWS::RDS::DBClusterParameterGroup

Resource Name

No response

Issue Description

When creating an invalid ParameterGroup, CloudFormation will take 45 minutes to report back an error:

Timestamp	LogicalId	Status	StatusReason
2021-12-20 23:30:07 UTC+0100	CustomersClusterParamGroup	CREATE_FAILED	An internal error has occurred. Please try your query again at a later time. (Service: AmazonRDS; Status Code: 500; Error Code: InternalFailure; Request ID: REDACTED; Proxy: null)
2021-12-20 22:46:09 UTC+0100	CustomersClusterParamGroup	CREATE_IN_PROGRESS	Resource creation initiated

Expected Behavior

CloudFormation should fail immediately if the ParameterGroup is invalid, with an error that clearly states the ParameterGroup is invalid.

Observed Behavior

It looks like CloudFormation is retrying due to the RDS API returning HTTP 500s when setting the secure_auth parameter. I'm not sure if this affects more parameters and if non-cluster parameter groups are also affected.
I've been bitten by this bug multiple times, and have had to wait a ton of time just to get the stack rolled back, when really the rollback could have happened almost immediately.

Also, the error message in CloudFormation encourages you to take the wrong path (resubmitting the template), which adds to the frustration of the long wait. Narrowing the failure down to the parameter that causes it is quite hard work.
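One way to narrow a failure down to a single parameter is to apply the parameters one at a time and record which ones are rejected. The following is only a minimal sketch of that idea: `find_failing_parameters`, `try_apply`, and `fake_apply` are hypothetical names, `character_set_server` is just an illustrative second parameter, and the stand-in function simulates the `secure_auth` failure instead of calling the real RDS API.

```python
from typing import Callable, Dict, List


def find_failing_parameters(
    parameters: Dict[str, str],
    try_apply: Callable[[str, str], None],
) -> List[str]:
    """Apply each parameter on its own and return the names that were rejected."""
    failing = []
    for name, value in parameters.items():
        try:
            # try_apply is supplied by the caller (hypothetical); it should
            # raise if the service rejects this single parameter.
            try_apply(name, value)
        except Exception as exc:
            print(f"{name}={value} rejected: {exc}")
            failing.append(name)
    return failing


def fake_apply(name: str, value: str) -> None:
    """Offline stand-in for a real API call (assumption, not the RDS API)."""
    if name == "secure_auth":
        raise RuntimeError("InternalFailure")


if __name__ == "__main__":
    candidates = {"character_set_server": "utf8mb4", "secure_auth": "1"}
    print(find_failing_parameters(candidates, fake_apply))
```

In practice `try_apply` would wrap whatever call you use against a scratch parameter group, such as the CLI command shown further down.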

This might be an undesired interaction between CloudFormation and the RDS API (it feels kind of strange that the RDS API returns a 500 error and tells you to retry later).

On the CLI, you can simulate this with:

aws rds --region eu-west-1 modify-db-cluster-parameter-group --db-cluster-parameter-group-name test --parameters ParameterName=secure_auth,ParameterValue=1,ApplyMethod=pending-reboot

An error occurred (InternalFailure) when calling the ModifyDBClusterParameterGroup operation (reached max retries: 4): An internal error has occurred. Please try your query again at a later time.
[ { "ApplyMethod": "pending-reboot", "Description": "Blocks connections from all accounts that have passwords stored in the old (pre-4.1) format.", "DataType": "boolean", "IsModifiable": true, "AllowedValues": "1", "SupportedEngineModes": [ "provisioned" ], "Source": "engine-default", "ParameterValue": "1", "ParameterName": "secure_auth", "ApplyType": "dynamic" } ]

Test Cases

AWSTemplateFormatVersion: "2010-09-09"
Description: 'ParamError'
Resources:
  CustomersClusterParamGroup:
    Type: AWS::RDS::DBClusterParameterGroup
    Properties:
      Description: 'Invalid Parameter Group handling'
      Family: aurora-mysql5.7
      Parameters: 
        secure_auth: 1

This template is enough to trigger the error.

Other Details

No response


osdrv · 2022-09-21T16:18:38Z

Thanks very much for this report @pplu. I know it's been a while. We are looking into this issue.

osdrv · 2023-02-28T17:44:36Z

The issue has been fixed. It took so long to fail because the RDS API was returning an internal error, which makes CFN retry the call a few times before giving up. The server-side issue has been addressed.
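To illustrate why a 500 response delays the failure while a 4xx response would not, here is a minimal sketch of a retry loop that treats 5xx errors as retriable and everything else as terminal. It is not CloudFormation's actual implementation: `ApiError`, `Outcome`, `call_with_retries`, the attempt limit, and the backoff delays are all assumptions chosen for illustration, and the wait is accumulated rather than slept.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


@dataclass
class Outcome:
    attempts: int
    waited_seconds: float
    error: Optional[ApiError]


def call_with_retries(
    operation: Callable[[], None],
    max_attempts: int = 5,      # assumed limit, not CloudFormation's real value
    base_delay: float = 60.0,   # assumed first delay in seconds
) -> Outcome:
    """Retry on 5xx with exponential backoff; give up immediately on other errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    waited = 0.0
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return Outcome(attempt, waited, None)
        except ApiError as err:
            retriable = err.status_code >= 500
            if not retriable or attempt == max_attempts:
                return Outcome(attempt, waited, err)
            # Simulated backoff: add up the delay instead of sleeping.
            waited += base_delay * 2 ** (attempt - 1)
    raise AssertionError("unreachable")


def always_500() -> None:
    raise ApiError(500, "InternalFailure")


def always_400() -> None:
    raise ApiError(400, "InvalidParameterValue")


if __name__ == "__main__":
    print(call_with_retries(always_500))  # retried until the attempt limit
    print(call_with_retries(always_400))  # fails on the first attempt
```

Under these assumed settings the 500 case only fails after every retry is exhausted, while the 400 case fails on the first call. That is why a validation-style error from the service leads to a much faster rollback than an internal error does.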

pplu added the bug label Dec 21, 2021

cfn-github-issues-bot added this to Researching in coverage-roadmap Dec 21, 2021

cfn-github-issues-bot moved this from Researching to We're working on it in coverage-roadmap Dec 24, 2021

pplu mentioned this issue Feb 25, 2022

Invalid ParameterGroup takes 45 minutes to roll back aws-cloudformation/aws-cloudformation-resource-providers-rds#124

Closed

cfn-github-issues-bot moved this from We're working on it to Researching in coverage-roadmap Oct 7, 2022

cfn-github-issues-bot moved this from Researching to We're working on it in coverage-roadmap Feb 21, 2023

cfn-github-issues-bot moved this from We're working on it to Coming Soon in coverage-roadmap Feb 21, 2023

cfn-github-issues-bot closed this as completed Feb 28, 2023

cfn-github-issues-bot moved this from Coming Soon to Shipped in coverage-roadmap Feb 28, 2023
