retries for grabbing ec2 metadata #381

Merged
merged 5 commits into from
Oct 3, 2016


davidk-zenefits
Contributor

When starting several agents in quick succession, some of the agents fail to populate the proper EC2 metadata for tags. I suspect it has something to do with grabbing IAM credentials and throttling on the AWS side. Retries with some random delay should fix the problem.

@keithpitt
Member

Hey @davidk-zenefits - this is a good change! Thanks so much for the PR!

We have our own retry function that you can use here:

err = retry.Do(func(s *retry.Stats) error {

If you're happy to update the code to use that (and if you want to extend retry to add some randomness to it, that's cool too), I'd happily merge this in and push out a stable release :)

@davidk-zenefits
Contributor Author

Cool. I'll try.

@davidk-zenefits
Contributor Author

I think I got all the errors. At least running go build main.go no longer gives any errors. Let me know if there are more changes that are necessary @keithpitt.

@keithpitt keithpitt merged commit 3ac4fd5 into buildkite:master Oct 3, 2016
@keithpitt
Member

Thanks @davidk-zenefits! I just merged these changes into master. I made a few tweaks: added some extra logging and moved your sleep stuff directly into our retry gear - since it's a really good idea!

Which version of the agent are you currently running? I'll make sure to get a new release out in the next few days.

@davidk-zenefits
Contributor Author

Fantastic. Thanks a lot. According to the agent overview page, we are running User Agent: buildkite-agent/2.2 (linux; amd64). We've already worked around the issue on our end by adding a sleep between each agent start call to avoid the AWS throttling, so the metadata is populated correctly now.
