Package authentication failure during AMI start-up causing Cassandra install to fail #66
Thanks for the report @noelmcnulty! Could you email me your full ami.log from one of these failed nodes, as well as the tree output? On launch we do a hard reset (https://github.com/riptano/ComboAMI/blob/2.5/ds0_updater.py#L16-17) to ensure the AMI has the appropriate repo keys (https://github.com/riptano/ComboAMI/tree/2.5/repo_keys). So let's first check whether the AMIs are misconfigured or competing with AWS cleaning scripts, and then we'll check whether something is different with the server shortly after midnight (GMT). Thanks again!
Hi Joaquin, Noel has finished for the day, but I can send you the logs; I'll email them to you now. Unfortunately the instance has been torn down, so I cannot get you the tree output, but we'll monitor over the weekend and try to get that info for you should it occur again. Thanks, Lyndsey
Okay, that works. Just send over the tree output the next time you spot this issue, please. I'll look over the logs today. Thanks again!
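Since failed instances keep getting torn down before the artifacts can be grabbed, a small helper run on the node right after a failure could preserve them. This is only a sketch: the `collect_diag` name and the example paths are my own, not part of the AMI.

```shell
# Bundle diagnostic files into a timestamped tarball so they survive an
# instance teardown. Typical usage on a failed node might be:
#   tree ~/datastax_ami > tree-output.txt
#   collect_diag tree-output.txt ~/datastax_ami/ami.log
collect_diag() {
  local out="ami-diag-$(date -u +%Y%m%dT%H%M%SZ).tar.gz"
  tar czf "$out" "$@" && echo "$out"
}
```

The tarball name embeds a UTC timestamp, which also helps correlate failures with the midnight-GMT timing mentioned in the report.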
The logs you sent look clean and appear to have imported the key correctly. I've created a ticket in our private repo to investigate this issue. If possible, do let us know the frequency and times of occurrence. Thanks again!
Hi Joaquin, I'm afraid there were more random failures over the weekend, but this time they seem related to the devices.
This has occurred several times over the weekend but is intermittent. Any help would be much appreciated!
This issue does seem unrelated. You may want to verify that the devices were actually attached during the AMI's launch. If you see this again, try searching the system for the extra devices. If they do turn up when you look for them, we may be hitting a race condition where EC2 attaches the devices with a delay.
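If the devices really are attached late, a short polling loop on the node would confirm the race before the installer gives up. A minimal sketch, assuming a POSIX shell; the function name and the example device path are my own, not taken from the AMI:

```shell
# Poll for a block device to appear, for up to tries * delay seconds.
# Returns 0 as soon as the device exists, 1 if it never shows up.
wait_for_device() {
  local dev="$1" tries="${2:-12}" delay="${3:-5}"
  local i
  for i in $(seq 1 "$tries"); do
    [ -b "$dev" ] && return 0
    sleep "$delay"
  done
  return 1
}

# Example: wait up to a minute for the first EBS volume. The device name
# is a guess; it varies by instance type and virtualization mode.
# wait_for_device /dev/xvdb 12 5 || echo "device never appeared" >&2
```

If the loop succeeds after a few iterations, that is strong evidence of delayed attachment rather than a missing device.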
Closing this since we're unable to reproduce with the current info. Re-open if more data becomes available. |
@mlococo @joaquincasares This very same error has occurred for me 4 times in a row on the DataStax Auto-Clustering AMI 2.6.1-1404-hvm (ami-0c26747b). I'm running 2 m3.large instances, which produce exactly the log mentioned above. The same log appears on both nodes and results in nothing being set up or installed. This is also in the eu-west-1 (Ireland) region. AMIs currently failing on eu-west-1:
ami-7f33cd08 worked fine yesterday. Error:
FWIW, I get the stuck machine too. I retried the command and got the following output.
So I tried adding --force-yes.
It works, except that I hit a configuration conflict; still, I think it would do as a workaround. To get the exact description of the issue, I ran the command without the -y or --force-yes options and got this:
There is clearly an issue around authentication, and fixing that would be the proper solution IMHO. That said, adding --force-yes would have avoided the failure, so maybe it is something you want to consider? What could be a workaround that lets us keep using the AMI when this kind of thing occurs? Hope this helps!
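One caution on the workaround above: `--force-yes` tells apt to proceed even when package signatures cannot be verified, so it papers over the authentication failure rather than fixing it. If the failure is transient (mirror or key propagation hiccups), a retry wrapper keeps the authentication check intact. A hedged sketch; the `retry` helper and the example package name are my own, not part of the AMI:

```shell
# Run a command up to $tries times, sleeping $delay seconds between
# attempts; returns failure only if every attempt fails.
retry() {
  local tries="$1" delay="$2"
  shift 2
  local i
  for i in $(seq 1 "$tries"); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Example usage on a stuck node (package name is a guess):
# retry 3 30 sudo apt-get update
# retry 3 30 sudo apt-get install -y cassandra
```

If the install still fails after several spaced-out attempts, the problem is persistent (e.g. a missing or expired repo key) and retrying will not help.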
This is a similar symptom, but likely a different underlying issue, since the original problem was intermittent and the current one is consistent. Leaving this closed; let's keep discussion of the new issue in #88.
We've seen occasional CI EC2 deployments fail in the EU-West (Ireland) AWS region.
The contents of ~/datastax_ami/ami.log suggest apt authentication failures when installing the DataStax/Cassandra packages:
~/datastax_ami/ami.log:
We're using the ami-8932ccfe AMI and supplying the following user data parameters:
This is not easily repeatable, and we only seem to see it during deployments that kick off shortly after midnight (GMT), but this timing may well be a coincidence.
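When triaging a node, grepping ami.log for apt's usual authentication markers narrows the failure down quickly. A small sketch; the `scan_log` name is mine, and the search strings are typical apt wording rather than text copied from these logs:

```shell
# Scan an ami.log-style file for lines that look like apt
# authentication failures (case-insensitive, with line numbers).
scan_log() {
  grep -Ein 'WARNING.*authenticat|NO_PUBKEY|GPG error' "$1"
}
```

A hit on `NO_PUBKEY` or `GPG error` points at a missing or expired repository key on the node, whereas a clean scan suggests the failure happened elsewhere in the install.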