-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent possible duplicate ECS services. #884
Conversation
Ok, after testing locally, I discovered that the ClientToken is linked to the service name, which means adding the ClientToken had no effect, since we were generating a random, non-deterministic postfix and appending that to the service name when creating a service. To fix this, I changed it so that the postfix is generated deterministically based on the properties of the ECS service that would require a replacement, if they were to change. Functionally, this works well and guarantees that duplicate ECS services don't get created, but I'm not a fan of the length and format of the postfix (base64 encoded sha1): If anyone has any suggestions on how to shorten this, but keep it sufficiently unique and avoid collisions, I'm all ears. Maybe truncating the sha would be fine? |
Maybe it's fine the way it is. My concerns are mostly cosmetic =. It does push things closer to the limit of the size of the service name (255) but you still have 228 characters, which is a pretty massive app + process name. |
a70ba39
to
323ca08
Compare
Ok, really happy with this now. The postfix is now the base62 encoded value of the Sum64 of the hash function, which means the maximum size of the string is 11 characters, and it's alphanumeric.
|
@@ -0,0 +1,29 @@ | |||
// Package base62 implements conversion to and from base62. Useful for url shorteners. | |||
package base62 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was stripped from an internal project.
Role *string | ||
TaskDefinition *string | ||
TaskDefinition *string `hash:"ignore"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty nifty. Fields that can be updated, we just add this tag.
// is deterministic based on the non updateable fields of the | ||
// service, otherwise ClientToken has no effect on idempotency. | ||
func postfix(p *ECSServiceProperties) (string, error) { | ||
h, err := p.Hash() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to think of, it would be sufficient to just hash the request id instead of the properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, hashing the request id would be better. If we hash the props, you could potentially get into a scenario where a service gets re-created with the same name as a service that was previously deleted. We've seen a lot of weird behavior when that happens (complaining about the load balancer not existing, saying it's in a draining state, etc), so best to avoid it.
Ok, simplified this a bunch so we just generate a hash of the request id, then use that as a postfix for the service name. Think this is good to review now. |
@@ -292,7 +300,8 @@ func TestRequiresReplacement(t *testing.T) { | |||
}, | |||
} | |||
|
|||
for _, tt := range tests { | |||
for i, tt := range tests { | |||
t.Log(i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover?
👍 |
"Type": "AWS::SQS::Queue" | ||
"Type": "AWS::SQS::Queue", | ||
"Properties": { | ||
"VisibilityTimeout": "1800" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems really long - you handle de-duping requests here as well, right? Otherwise we could see things being re-submitted 30 minutes later after a crash or something. In the case where de-duping is handled, what's the idea behind making this so long? Would 5 minutes be sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point. Probably the best thing to do would be to leave the default, and then use the ChangeMessageVisibility api call to increase it when needed.
I'll revert the current change for now, since the de-duping is sufficient to fix the current issue.
After #878 I discovered an issue where duplicate ECS services got created from the same CloudFormation update. This does 2 things to prevent that from happening: