Package authentication failure during AMI start-up causing Cassandra install to fail #66

Closed
noelmcnulty opened this issue Feb 13, 2015 · 10 comments

@noelmcnulty

We've seen occasional CI EC2 deployments fail in the EU-West (Ireland) AWS region.

The contents of ~/datastax_ami/ami.log suggest apt authentication failures when installing the DataStax/Cassandra packages:

~/datastax_ami/ami.log:

...
[EXEC] 02/12/15-16:29:19 sudo apt-get install -y python-cql datastax-agent cassandra=2.0.11 dsc20=2.0.11-1:
Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
  python-thrift-basic
The following NEW packages will be installed:
  cassandra datastax-agent dsc20 python-cql python-thrift-basic
0 upgraded, 5 newly installed, 0 to remove and 122 not upgraded.
Need to get 40.6 MB of archives.
After this operation, 46.8 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  cassandra datastax-agent dsc20 python-thrift-basic python-cql

[ERROR] 02/12/15-16:29:19 sudo service cassandra stop:
cassandra: unrecognized service
...

We're using the ami-8932ccfe AMI and supplying the following user data parameters:

--clustername myTestCluster --totalnodes 1 --version community --release 2.0.11

This is not easily repeatable, and we only seem to see it during deployments that kick off shortly after midnight (GMT), but this timing may well be a coincidence.

@joaquincasares
Contributor

Thanks for the report @noelmcnulty !

Could you email me your full ami.log from one of these failed nodes as well as the tree output of ~/datastax_ami?

On launch we do a hard reset:

https://github.com/riptano/ComboAMI/blob/2.5/ds0_updater.py#L16-17

to ensure the AMI has the appropriate repo keys:

https://github.com/riptano/ComboAMI/tree/2.5/repo_keys

So let's first try to see if the AMIs are misconfigured or competing with AWS cleaning scripts and then we'll check if something's different with the server shortly after midnight (GMT).

Thanks again!

@lyndseyparsons

Hi Joaquin,

Noel has finished for the day, but I can send on the logs; I'll email them to you now. Unfortunately the instance has been torn down, so I cannot get you the tree output, but we'll monitor over the weekend and try to get the info for you should it occur again.

Thanks,

Lyndsey

@joaquincasares
Contributor

Okay that works. Just send over the tree output the next time you spot this issue please.

I'll look over the logs today.

Thanks again!

@joaquincasares
Contributor

The logs you sent look clean and seem to have imported the key correctly. I've created a ticket in our private repo to investigate this issue. Do let us know how often and at what times this occurs, if possible.

Thanks again!

@lyndseyparsons

Hi Joaquin,

More random failures over the weekend, I'm afraid, but this time they seem related to the devices.

[INFO] address.yaml configured.
[EXEC] 02/16/15-01:13:31 sudo chmod 777 /etc/fstab
[EXEC] 02/16/15-01:13:31 sudo chmod 644 /etc/fstab
[INFO] Unformatted devices: []
[INFO] Clear "invalid flag 0x0000 of partition table 4" by issuing a write, then running fdisk on the device...
[ERROR] Exception seen in ds1_launcher.py:
Traceback (most recent call last):
  File "/home/ubuntu/datastax_ami/ds1_launcher.py", line 22, in initial_configurations
    ds2_configure.run()
  File "/home/ubuntu/datastax_ami/ds2_configure.py", line 1153, in run
  File "/home/ubuntu/datastax_ami/ds2_configure.py", line 1010, in prepare_for_raid
  File "/home/ubuntu/datastax_ami/ds2_configure.py", line 956, in format_xfs
IndexError: list index out of range

This has occurred several times over the weekend but is intermittent. Any help would be much appreciated!

@joaquincasares
Contributor

This issue does seem unrelated. You may want to ensure that the devices were added during the AMI's launch. If you see this issue again, try searching the system for these extra devices. If they end up appearing when you look for them, we may be hitting a race condition with EC2 adding the devices in a delayed fashion.
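
One way to test the race-condition theory would be to poll for the devices for a short while before formatting. A minimal sketch, assuming the extra volumes appear as device nodes that a glob can match; the function name `wait_for_devices`, the glob patterns, and the timeout values are hypothetical and not taken from ds2_configure.py:

```python
import glob
import time


def wait_for_devices(patterns, expected=1, timeout=60.0, interval=2.0,
                     lister=glob.glob, sleep=time.sleep, clock=time.time):
    """Poll until at least `expected` device paths match, or time out.

    Returns the sorted list of matching paths. The list may be shorter
    than `expected` (even empty) if the timeout is reached, so callers
    must check its length before indexing into it.
    """
    deadline = clock() + timeout
    while True:
        # Collect matches across all patterns, de-duplicated and ordered.
        found = sorted({path for pattern in patterns
                        for path in lister(pattern)})
        if len(found) >= expected or clock() >= deadline:
            return found
        sleep(interval)
```

A caller might use something like `wait_for_devices(["/dev/xvd[b-z]"], expected=1)` (pattern assumed) and skip the RAID/XFS step when the result is empty. The log above shows `Unformatted devices: []` right before the failure, so an empty list is at least a plausible trigger for the IndexError, although the actual cause inside format_xfs is not confirmed here.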

@mlococo
Contributor

mlococo commented May 6, 2015

Closing this since we're unable to reproduce with the current info. Re-open if more data becomes available.

@mlococo mlococo closed this as completed May 6, 2015
@kareblak

@mlococo @joaquincasares This very same error has occurred for me 4 times in a row on the DataStax Auto-Clustering AMI 2.6.1-1404-hvm (ami-0c26747b) image. I'm running 2 m3.large instances, and both nodes produce the same log as reported above, with the result that nothing is set up or installed. This is also in the eu-west-1 (Ireland) region.

Currently failing images on eu-west-1:

  • ami-7f33cd08 (2.5.1)
  • ami-e0207297 (2.6.1, 12.04)
  • ami-0c26747b (2.6.1, 14.04)

ami-7f33cd08 worked fine yesterday.

Error:

The following NEW packages will be installed:
  cassandra datastax-agent dsc22 python-cql python-thrift-basic
0 upgraded, 5 newly installed, 0 to remove and 2 not upgraded.
Need to get 47.3 MB of archives.
After this operation, 58.9 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  cassandra datastax-agent dsc22 python-thrift-basic python-cql

[ERROR] 08/14/15-11:38:27 sudo service cassandra stop:
cassandra: unrecognized service

[EXEC] 08/14/15-11:38:27 sudo rm -rf /var/lib/cassandra
[EXEC] 08/14/15-11:38:27 sudo rm -rf /var/log/cassandra
[EXEC] 08/14/15-11:38:27 sudo mkdir -p /var/lib/cassandra
[EXEC] 08/14/15-11:38:27 sudo mkdir -p /var/log/cassandra
[ERROR] 08/14/15-11:38:27 sudo chown -R cassandra:cassandra /var/lib/cassandra:
chown: invalid user: `cassandra:cassandra'

[ERROR] 08/14/15-11:38:28 sudo chown -R cassandra:cassandra /var/log/cassandra:
chown: invalid user: `cassandra:cassandra'

[EXEC] 08/14/15-11:38:28 sudo mv /etc/security/limits.d/cassandra.conf.bak /etc/security/limits.d/cassandra.conf
[INFO] Installing OpsCenter...
[EXEC] 08/14/15-11:38:28 sudo apt-get install -y opscenter libssl0.9.8:
Reading package lists...
Building dependency tree...
Reading state information...
The following package was automatically installed and is no longer required:
  grub-pc-bin
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  libssl0.9.8 opscenter
0 upgraded, 2 newly installed, 0 to remove and 2 not upgraded.
Need to get 77.5 MB of archives.
After this operation, 103 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  opscenter

[ERROR] 08/14/15-11:38:28 sudo service opscenterd stop:
opscenterd: unrecognized service

[INFO] Reflector loop...
[INFO] 08/14/15-11:38:28 Reflector: Received 1 of 1 responses from: [u'172.31.21.194']
[INFO] Seed list: set([u'172.31.21.194'])
[INFO] OpsCenter: 172.31.21.194
[INFO] Options: Namespace(analyticsnodes=0, base64postscript=None, bootstrap=False, cfsreplication=None, clustername='shiplog-cassandra', customreservation=None, email=None, hadoop=False, heapsize=None, multiregion=False, opscenter=None, opscenterinterface=None, opscenterip=None, opscenteronly=False, opscenterssl=False, password='wo2FoHE8f0gUYqQh', raidonly=False, realtimenodes=2, reflector=None, release=None, rpcbinding=False, searchnodes=0, seed_indexes=[0, 2, 2], seeds=None, totalnodes=2, username='kaare@shiplog.no', version='community', vnodes=False)
[ERROR] Exception seen in ds1_launcher.py:
Traceback (most recent call last):
  File "/home/ubuntu/datastax_ami/ds1_launcher.py", line 22, in initial_configurations
    ds2_configure.run()
  File "/home/ubuntu/datastax_ami/ds2_configure.py", line 1178, in run
  File "/home/ubuntu/datastax_ami/ds2_configure.py", line 577, in construct_yaml
IOError: [Errno 2] No such file or directory: '/etc/cassandra/cassandra.yaml'

@arodrime

FWIW, I logged in to the stuck machine and re-ran the command: sudo apt-get install -y python-cql datastax-agent cassandra=2.0.16 dsc20=2.0.16-1

It produced the following output:

ubuntu@ip-10-0-102-235:~$ sudo apt-get install -y python-cql datastax-agent cassandra=2.0.16 dsc20=2.0.16-1
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  grub-pc-bin
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  python-thrift-basic
The following NEW packages will be installed:
  cassandra datastax-agent dsc20 python-cql python-thrift-basic
0 upgraded, 5 newly installed, 0 to remove and 142 not upgraded.
Need to get 37.5 MB of archives.
After this operation, 43.3 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  cassandra datastax-agent dsc20 python-thrift-basic python-cql
E: There are problems and -y was used without --force-yes

So I tried adding --force-yes:

ubuntu@ip-10-0-102-235:~$ sudo apt-get install --force-yes -y python-cql datastax-agent cassandra=2.0.16 dsc20=2.0.16-1
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  grub-pc-bin
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  python-thrift-basic
The following NEW packages will be installed:
  cassandra datastax-agent dsc20 python-cql python-thrift-basic
0 upgraded, 5 newly installed, 0 to remove and 142 not upgraded.
Need to get 37.5 MB of archives.
After this operation, 43.3 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  cassandra datastax-agent dsc20 python-thrift-basic python-cql
Get:1 http://debian.datastax.com/community/ stable/main cassandra all 2.0.16 [14.5 MB]
Get:2 http://debian.datastax.com/community/ stable/main datastax-agent all 5.2.0 [22.8 MB]
Get:3 http://debian.datastax.com/community/ stable/main dsc20 all 2.0.16-1 [1,308 B]
Get:4 http://debian.datastax.com/community/ stable/main python-thrift-basic all 0.8.0-1~ds+1 [70.6 kB]
Get:5 http://debian.datastax.com/community/ stable/main python-cql all 1.4.0-1 [59.2 kB]
Fetched 37.5 MB in 16s (2,320 kB/s)
Selecting previously unselected package cassandra.
(Reading database ... 87806 files and directories currently installed.)
Unpacking cassandra (from .../cassandra_2.0.16_all.deb) ...
Selecting previously unselected package datastax-agent.
Unpacking datastax-agent (from .../datastax-agent_5.2.0_all.deb) ...
Selecting previously unselected package dsc20.
Unpacking dsc20 (from .../dsc20_2.0.16-1_all.deb) ...
Selecting previously unselected package python-thrift-basic.
Unpacking python-thrift-basic (from .../python-thrift-basic_0.8.0-1~ds+1_all.deb) ...
Selecting previously unselected package python-cql.
Unpacking python-cql (from .../python-cql_1.4.0-1_all.deb) ...
Processing triggers for ureadahead ...
Setting up cassandra (2.0.16) ...

Configuration file `/etc/security/limits.d/cassandra.conf'
 ==> File on system created by you or by a script.
 ==> File also in package provided by package maintainer.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** cassandra.conf (Y/I/N/O/D/Z) [default=N] ?

It works, except that I hit a configuration file conflict, but I think this would do as a workaround.

To get the exact description of the issue, I ran the command without the -y or --force-yes options and got this:

WARNING: The following packages cannot be authenticated!
  cassandra datastax-agent dsc20 python-thrift-basic python-cql
Install these packages without verification [y/N]?

There is clearly an issue around authentication, and that would be the proper thing to fix, IMHO. Still, adding --force-yes would have avoided the failure, so maybe it is something you want to consider? What workaround would let us keep using the AMI when this kind of thing occurs?
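
As an alternative to forcing the install, the launcher could detect the warning and stop early instead of carrying on to the later `service cassandra stop` and `chown` errors. A minimal sketch, assuming the apt output has the same shape as in the logs above (warning line followed by indented package names); the function name `unauthenticated_packages` is hypothetical and is not part of the AMI scripts:

```python
AUTH_WARNING = "WARNING: The following packages cannot be authenticated!"


def unauthenticated_packages(apt_output):
    """Return the package names listed under apt's authentication warning.

    Returns an empty list when the warning is absent from the output.
    """
    lines = apt_output.splitlines()
    packages = []
    for index, line in enumerate(lines):
        if line.strip() == AUTH_WARNING:
            # apt prints the package names on the indented lines that follow.
            for follow in lines[index + 1:]:
                if not follow.startswith(" "):
                    break
                packages.extend(follow.split())
            break
    return packages
```

If the returned list is non-empty, a caller could log the names, refresh the repo keys, run `apt-get update`, and retry, rather than passing --force-yes, which installs the packages without verifying their signatures.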

Hope this helps.

@mlococo
Contributor

mlococo commented Aug 14, 2015

This is a similar symptom, but likely a different underlying issue since this was intermittent and the current issue is consistent. Leaving closed and let's keep discussion of the new issue in #88.
