fix: force symlink override if exists #970

Merged 1 commit on Dec 2, 2021

Conversation

chefsale
Contributor

@chefsale chefsale commented Dec 1, 2021

We at NEAR use a custom AMI. Recently, updating the stack started failing because the buildkite-agent binary symlink could not be created:

Dec  1 18:08:22 ip-10-0-2-221 user-data: ln: failed to create symbolic link ‘/usr/bin/buildkite-agent’: File exists
Dec  1 18:08:22 ip-10-0-2-221 user-data: ++ on_error 113
Dec  1 18:08:22 ip-10-0-2-221 user-data: ++ local exitCode=1
Dec  1 18:08:22 ip-10-0-2-221 user-data: ++ local errorLine=113
Dec  1 18:08:22 ip-10-0-2-221 user-data: +++ curl -X PUT -H 'X-aws-ec2-metadata-token-ttl-seconds: 60' --fail --silent --show-error --location http://169.254.169.254/latest/api/token
Dec  1 18:08:22 ip-10-0-2-221 user-data: ++ local token=AQAAAOpIVEeW-ZgjyDYWEQLzrBRHKV7UxYgIi3Rm0BON3WXY1661Aw==
Dec  1 18:08:22 ip-10-0-2-221 user-data: ++ [[ 1 != 0 ]]
Dec  1 18:08:22 ip-10-0-2-221 user-data: +++ curl -H 'X-aws-ec2-metadata-token: AQAAAOpIVEeW-ZgjyDYWEQLzrBRHKV7UxYgIi3Rm0BON3WXY1661Aw==' --fail --silent --show-error --location http://169.254.169.254/latest/meta-data/instance-id

This wasn't the case before, so I assume something in the error handling has changed. I manually changed this line of code on our AMI and redeployed the stack, and it works now. Seems like the image we used was created from an already existing/running buildkite instance and it happened to have the symlink generated, causing the rollout to fail and auto-revert.

@keithduncan
Contributor

Seems like the image we used was created from an already existing/running buildkite instance and it happened to have the symlink generated, causing the rollout to fail and auto-revert.

That would cause the failure you’ve seen here if the link already exists. We can definitely add the -f flag to handle this better.

However, I would caution that it’s highly likely your agent token and tags have been incorporated into your AMI. Subsequent executions of the Elastic CI Stack’s UserData install script won’t overwrite these. If the Buildkite agent token is revoked, instances booted from this AMI won’t boot their agent successfully.
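
For illustration, here is a minimal sketch of the difference the -f flag makes when the link is already present. It is not the actual install script: the scratch directory and the file names `buildkite-agent-stable` and `buildkite-agent` inside it are assumptions made up for this example, so nothing under /usr/bin is touched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical paths in a scratch directory (assumptions for this sketch;
# the real script creates the link at /usr/bin/buildkite-agent).
workdir="$(mktemp -d)"
target="${workdir}/buildkite-agent-stable"
link="${workdir}/buildkite-agent"

# Stand-in for the agent binary.
printf '#!/bin/sh\necho agent\n' > "${target}"
chmod +x "${target}"

# First boot: the link does not exist yet, so plain `ln -s` succeeds.
ln -s "${target}" "${link}"

# Re-run on an image that already contains the link: plain `ln -s`
# fails with "File exists", which is the error shown in the log above.
if ln -s "${target}" "${link}" 2>/dev/null; then
  plain_rerun="ok"
else
  plain_rerun="failed"
fi

# With -f, ln removes the existing link and recreates it, so the
# step can be repeated safely.
ln -sf "${target}" "${link}"

echo "plain rerun: ${plain_rerun}"
echo "link points to: $(readlink "${link}")"
```

Because the install script appears to run with an error trap (the `on_error` call in the log), a single failing `ln` is enough to abort the whole boot, which is why making this step repeatable matters.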

@keithduncan keithduncan merged commit be20b16 into buildkite:master Dec 2, 2021