New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a workaround for CloudFormation update issues #162
Conversation
A bug in how CloudFormation processes updates for LBs can result in complete misconfiguration. The AWS::ElasticLoadBalancingV2::Listener resource must have exactly one certificate in the CertificateArn property, which is then treated as the default certificate. All the rest are managed in a separate resource of type AWS::ElasticLoadBalancingV2::ListenerCertificate. Unfortunately, the update is not performed atomically. CloudFormation updates the Listener first, and only after it's done it creates a new ListenerCertificate resource and deletes the old one. However, if the previously default certificate swaps positions with one from the ListenerCertificate list, the resulting race conditions in AWS will result in a completely broken configuration. One of the certificates will be completely missing from the LB and you might or might not get update errors on the CF stack. Implement the workaround suggested by AWS support, which is to change the name of the ListenerCertificate resource instead of updating an existing one. To make the resource name predictable, certificate ARNs are now sorted and their hash is used in the resource name. A nice side effect is that the default certificate will always be a wildcard one instead of an arbitrary one. Additionally, the first certificate is now present in both Listener and ListenerCertificate resources. This doesn't seem to break anything, but will stop CloudFormation from deleting the default certificate if it stops being the default one. Signed-off-by: Alexey Ermakov <alexey.ermakov@zalando.de>
👍 |
1 similar comment
👍 |
What if you have a second wildcard certifcate created that is lexicographically sorted before the first one? |
@szuecs the certificates were already configured in an arbitrary order (depending on hashmap ordering), so nobody should've depended on it. Making a wildcard certificate the default one wasn't really a goal, just a nice side-effect. |
@aermakov-zalando @szuecs and this should not really break as I understand it right? Because the "main" certificate will be stored in the |
@mikkeloscar @szuecs If we want to implement some logic to determine the best default certificate we can do it later, but this change shouldn't break anything. |
This fixes a couple of bugs related to exceeding the maximum number of certificates per ALB. It limits the max size to 24 instead of 25. This is done because we need to duplicate the default certificate to work around a CloudFormation bug (#162) and therefore need one extra space for this limiting the maximum unique certificates per ALB to 24 instead of the AWS limit of 25. It also fixes a bug in `AddIngress` which could potentially add some of the certificates for a single ingress to a stack and the rest to another stack resulting in an undesired state. Thirdly it adds rollback complete states to `IsComplete()` to automatically attempt to update stacks that are in a rollback complete state. Lastly it removes some of the getter methods from structs and instead exports the fields that are needed in other packages. This was done to make it easier to use the structs in tests and because we are not writing Java. Fix #176, #175 Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net>
This fixes a couple of bugs related to exceeding the maximum number of certificates per ALB. It limits the max size to 24 instead of 25. This is done because we need to duplicate the default certificate to work around a CloudFormation bug (#162) and therefore need one extra space for this limiting the maximum unique certificates per ALB to 24 instead of the AWS limit of 25. It also fixes a bug in `AddIngress` which could potentially add some of the certificates for a single ingress to a stack and the rest to another stack resulting in an undesired state. Thirdly it adds rollback complete states to `IsComplete()` to automatically attempt to update stacks that are in a rollback complete state. Lastly it removes some of the getter methods from structs and instead exports the fields that are needed in other packages. This was done to make it easier to use the structs in tests and because we are not writing Java. Fix #176, #175 Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net>
This fixes a couple of bugs related to exceeding the maximum number of certificates per ALB. It limits the max size to 24 instead of 25. This is done because we need to duplicate the default certificate to work around a CloudFormation bug (#162) and therefore need one extra space for this limiting the maximum unique certificates per ALB to 24 instead of the AWS limit of 25. It also fixes a bug in `AddIngress` which could potentially add some of the certificates for a single ingress to a stack and the rest to another stack resulting in an undesired state. Thirdly it adds rollback complete states to `IsComplete()` to automatically attempt to update stacks that are in a rollback complete state. Fix #176, #175 Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net>
This fixes a couple of bugs related to exceeding the maximum number of certificates per ALB. It limits the max size to 24 instead of 25. This is done because we need to duplicate the default certificate to work around a CloudFormation bug (#162) and therefore need one extra space for this limiting the maximum unique certificates per ALB to 24 instead of the AWS limit of 25. It also fixes a bug in `AddIngress` which could potentially add some of the certificates for a single ingress to a stack and the rest to another stack resulting in an undesired state. Thirdly it adds rollback complete states to `IsComplete()` to automatically attempt to update stacks that are in a rollback complete state. Fix #176, #175 Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net>
* Remove getter and setters for easier testing Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net> * Fix handling limit of certificates per ALB This fixes a couple of bugs related to exceeding the maximum number of certificates per ALB. It limits the max size to 24 instead of 25. This is done because we need to duplicate the default certificate to work around a CloudFormation bug (#162) and therefore need one extra space for this limiting the maximum unique certificates per ALB to 24 instead of the AWS limit of 25. It also fixes a bug in `AddIngress` which could potentially add some of the certificates for a single ingress to a stack and the rest to another stack resulting in an undesired state. Thirdly it adds rollback complete states to `IsComplete()` to automatically attempt to update stacks that are in a rollback complete state. Fix #176, #175 Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net> * Vendor with dep Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net> * Check for nil stack Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net> * Sync load balancer state with stack state Signed-off-by: Mikkel Oscar Lyderik Larsen <m@moscar.net>
A bug in how CloudFormation processes updates for LBs can result in complete misconfiguration. The
AWS::ElasticLoadBalancingV2::Listener
resource must have exactly one certificate in the CertificateArn property, which is then treated as the default certificate. All the rest are managed in a separate resource of typeAWS::ElasticLoadBalancingV2::ListenerCertificate
. Unfortunately, the update is not performed atomically. CloudFormation updates theListener
first, and only after it's done it creates a newListenerCertificate
resource and deletes the old one.However, if the previously default certificate swaps positions with one from the
ListenerCertificate
list, the resulting race conditions in AWS will result in a completely broken configuration. One of the certificates will be completely missing from the LB and you might or might not get update errors on the CF stack.Implement the workaround suggested by AWS support, which is to change the name of the
ListenerCertificate
resource instead of updating an existing one. To make the resource name predictable, certificate ARNs are now sorted and their hash is used in the resource name. A nice side effect is that the default certificate will always be a wildcard one instead of an arbitrary one.Additionally, the first certificate is now present in both
Listener
andListenerCertificate
resources. This doesn't seem to break anything, but will stop CloudFormation from deleting the default certificate if it stops being the default one.