Init should retry failed agent downloads #83

Closed
nmeyerhans opened this issue Feb 17, 2017 · 8 comments

@nmeyerhans
Contributor

Currently, if ecs-init is unable to download the ecs-agent md5 checksum file or the agent image itself, it exits. To make ecs-init robust against transient network or S3 issues, it should retry failed download attempts.

@grayaii

grayaii commented Feb 28, 2017

Or better yet, have a way to NOT download the ECS agent from S3 (just look at what happened today: S3 went down, and "ecs start" failed because it could not download from S3). If we can burn the agent into the AMI, there should be no reason for the box to access S3 at all.

@samuelkarp
Contributor

@grayaii When you're using the ECS-optimized AMI, the agent is already cached and does not need to be downloaded from S3. However, if you update and install a new version of ecs-init, that new version will not have the agent cached and will need to download it.

@grayaii

grayaii commented Feb 28, 2017

@samuelkarp Thanks for the quick reply! We are not using an ECS-optimized AMI; we are using our own AMIs. We burn the yum install of ecs-init into the AMI.

So basically to burn it on the AMI, we just need to run "start ecs", and THEN burn the AMI?
If so, that would be awesome! Thanks!

@samuelkarp
Contributor

@grayaii Yes, you can cause ecs-init to cache the agent by running start ecs.

@endofcake
Contributor

I am seeing this issue after upgrading to the latest version of ecs-init (ecs-init-1.15.2-2.amzn1.x86_64).

Steps I performed:

  1. Launch a new instance off the latest ECS-optimised AMI.
  2. Configure proxy (specific to our environment) and register with ECS cluster. Everything is fine.
  3. Run yum update to update ecs-init.
  4. ECS agent is no longer able to start:
2017-11-21T21:50:44Z [INFO] pre-start
2017-11-21T21:50:44Z [INFO] Downloading Amazon EC2 Container Service Agent
2017-11-21T21:50:44Z [DEBUG] Downloading published md5sum from https://s3.amazonaws.com/amazon-ecs-agent/ecs-agent-v1.15.2.tar.md5
2017-11-21T21:50:49Z [ERROR] could not download Amazon EC2 Container Serivce Agent: Get https://s3.amazonaws.com/amazon-ecs-agent/ecs-agent-v1.15.2.tar.md5: dial tcp 54.231.120.226:443: i/o timeout
2017-11-21T21:50:49Z [INFO] post-stop

@samuelkarp
Contributor

> Configure proxy (specific to our environment) and register with ECS cluster. Everything is fine.

Did you also configure the proxy for ecs-init? You can follow these instructions to set proxy configuration in /etc/init/ecs.override.

@endofcake
Contributor

@samuelkarp, thanks, this was indeed the issue here!

@fierlion
Member

Closing. From the comments, I believe the issue has been resolved.
