This repository has been archived by the owner on Jan 8, 2019. It is now read-only.

MySql Restart

ekund edited this page Jun 28, 2016 · 7 revisions

Users of the chef-bach Hadoop test cluster often need to bring the cluster nodes down, because the nodes are typically VMs on a personal laptop and the laptop has to be rebooted from time to time. When the hypervisor host (the laptop) is rebooted and the cluster nodes are brought back up, most Hadoop services start automatically when each node VM boots. The mysql service does not: because it is set up for HA, each instance expects at least one other instance to be running already before it can start successfully. Users therefore need to bring up one instance of the mysql service manually, using the following steps.

The node on which you start the mysql service first should be the one on which the mysql service was stopped last.

  • Stop mysql on all nodes
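  • Find the node on which mysql was stopped last by running the following command on each node and comparing the results; the node reporting the most recent timestamp is the one that was stopped last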
sudo find /var/lib/mysql -type f -a -name 'ib*' -printf '%T+ %p\n' | sort -r | head -1
  • Log on to the node on which mysql was stopped last.
  • Run sudo service mysql bootstrap-pxc to start mysql in bootstrap mode
  • Log on to the bcpc-bootstrap node and change to the chef-bcpc directory
  • Run cluster-assign-roles.sh for each of the other head nodes, where the mysql service was not started manually
./cluster-assign-roles.sh Test-Laptop hadoop bcpc-vmX
  • Once the chef client has run successfully on the other head nodes, go back to the head node where mysql was started in bootstrap mode and stop it by running sudo service mysql stop.
  • From the bootstrap host, re-chef the head node where mysql was just stopped manually:
./cluster-assign-roles.sh Test-Laptop hadoop bcpc-vmY
  • Verify that mysql is running correctly:
    • Across the cluster, by opening the Graphite URL https://10.0.100.5:8888 and looking under: servers->hostname->haproxy->mysql-galera
    • On each machine, by running curl -I http://$(hostname):3307. The response should be:
    HTTP/1.1 200 OK
    Content-Type: text/plain
    Connection: close
    Content-Length: 40
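
The two manual checks in the steps above (choosing the node that was stopped last, and confirming the HTTP check on port 3307) can be sketched as small shell helpers. This is only an illustration, not part of chef-bach: the function names pick_last_stopped and is_http_ok are made up here, the node names in the usage comments are placeholders, and direct ssh access to each node is an assumption.

```shell
#!/bin/sh
# Sketch only: helper functions for the manual checks in the steps above.
# Neither function contacts a node by itself; both just read stdin.

# pick_last_stopped (hypothetical helper): reads lines of the form
#   "<node> <timestamp> <path>"
# i.e. a node name followed by that node's find output, and prints the
# node with the newest timestamp. find's %T+ format
# (YYYY-MM-DD+HH:MM:SS...) sorts chronologically as plain text.
pick_last_stopped() {
    LC_ALL=C sort -t' ' -k2,2r | head -n 1 | cut -d' ' -f1
}

# is_http_ok (hypothetical helper): reads the output of "curl -I" on
# stdin and succeeds only if the first line is a status line with code 200.
is_http_ok() {
    IFS= read -r status_line || :
    # HTTP header lines end in CRLF; drop the carriage return.
    status_line=$(printf '%s' "$status_line" | tr -d '\r')
    case "$status_line" in
        HTTP/*" 200"|HTTP/*" 200 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (node names are placeholders; ssh access is assumed):
#   for node in bcpc-vm1 bcpc-vm2 bcpc-vm3; do
#       ts=$(ssh "$node" "sudo find /var/lib/mysql -type f -name 'ib*' -printf '%T+ %p\n' | sort -r | head -1")
#       [ -n "$ts" ] && printf '%s %s\n' "$node" "$ts"
#   done | pick_last_stopped
#
#   curl -sI "http://$(hostname):3307" | is_http_ok && echo "mysql check OK"
```

Keeping the helpers free of ssh and curl calls means they can be tried on saved output before being pointed at a real cluster.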
    



Some documentation on the various quorum states and how to recover from them:
https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/