Create and execute an installation plan
This stage probes your servers to discover their drives and proposes an installation plan that you can then edit and execute.
This assumes that your environment variables are set correctly and that fleet is up and running. You can double-check this by running
$ bin/qs-fleetctl.sh list-machines
You should see all of your machines listed.
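If machines are missing from the listing, the cause is often a missing environment variable rather than a fleet problem. The sketch below is one way to fail early; FLEETCTL_TUNNEL is fleetctl's standard SSH-tunnel variable, but the exact variables your deployment needs are an assumption, so adjust accordingly:

```shell
# Fail with a message if the named environment variable is unset or empty.
require_var() {
  eval "val=\${$1:-}"
  if [ -z "$val" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Example pre-flight check (FLEETCTL_TUNNEL is assumed; substitute whatever
# variables your setup actually requires):
# require_var FLEETCTL_TUNNEL && bin/qs-fleetctl.sh list-machines
```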
Next, run
$ bin/qs-update-images.sh
This pulls the Docker images onto the remote machines and may take a while to complete.
To generate a new plan, run
$ bin/qs-generate-config.sh
For our setup, this prints:
$ bin/qs-generate-config.sh
checking server liveness
- node0 is alive
- node1 is alive
- node2 is alive
scanning for disks
sudo: unable to resolve host ip-172-30-2-116
- found node0::xvda
- found node0::xvdb
- found node0::xvdc
- found node0::xvdd
sudo: unable to resolve host ip-172-30-2-117
- found node1::xvda
- found node1::xvdb
- found node1::xvdc
- found node1::xvdd
sudo: unable to resolve host ip-172-30-2-118
- found node2::xvda
- found node2::xvdb
- found node2::xvdc
- found node2::xvdd
You may have seen several of the "sudo: unable to resolve host" warnings during this and previous steps. This is a harmless quirk in how Amazon configures EC2 hostnames and will not cause any problems. Warnings that say "find: '/dev/disk/by-id': No such file or directory" are also harmless.
You should now find a new file in the tools' root directory named "qs-config.sh". This is your installation plan. You should edit this file, taking care to note a few things:
- At present, all hard drives are suggested for OSD formatting, and your OS / partition is going to be on one of them. You will want to comment out or delete those lines. If you set up the servers using EC2, these drives will be called /dev/xvda.
- The drive model numbers, serial numbers and sizes are noted where possible (on EC2 there are no model or serial numbers). This can help you distinguish between drives.
- If you are installing on custom servers instead of EC2 and are setting up a production cluster, you may want to replace the /dev/ path with a udev persistent block device name such as /dev/disk/by-path. This prevents loss of service in the event that your HBAs enumerate nondeterministically or a drive failure leads to disks being renamed on reboot.
- If you do not require parts of the smartgridstore synchrophasor stack, simply comment out the GEN_ lines for each component you do not need.
- If you have a nonstandard Ceph configuration (fewer than three machines), comment out the CREATE_CEPH_POOL and FORMAT_BTRDB lines, as you will have to perform those steps manually later.
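To make the edits above concrete, here is a purely illustrative sketch of an edited section of qs-config.sh. The GEN_OSD line format shown is hypothetical (the document only tells us the file contains GEN_-prefixed lines plus CREATE_CEPH_POOL and FORMAT_BTRDB); apply the same kind of edits to whatever lines your generated file actually contains:

```shell
# HYPOTHETICAL excerpt of qs-config.sh -- the real generated file's line
# format may differ; edit the lines it actually contains the same way.

# OS drive on each node: never format as an OSD, so comment it out.
# GEN_OSD node0 /dev/xvda

# Data drives left enabled. On custom hardware, prefer persistent udev
# names (e.g. under /dev/disk/by-path) over /dev/sdX so a reboot or HBA
# re-enumeration cannot silently rename a device out from under Ceph.
GEN_OSD node0 /dev/xvdb
GEN_OSD node0 /dev/xvdc

# Fewer than three machines? Comment these out and do those steps manually:
# CREATE_CEPH_POOL
# FORMAT_BTRDB
```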
Once you have verified that the installation plan is appropriate, remove the indicated line at the top of the file.
Now you can run
$ bin/qs-execute-config.sh
If your installation plan includes GEN_SSL_CERT, you will need to accept the SSH key and enter your email address for the generated SSL certificate. After that step, the installation process is automated and you can leave it and go have a cup of coffee. If you have many drives configured as Ceph OSDs, you may want to go for lunch instead, as formatting the drives can take some time.
Once it is complete, you can verify that everything is working with
$ bin/qs-fleetctl.sh list-units
For our configuration this lists
UNIT MACHINE ACTIVE SUB
btrdb-node0.service 7acf664b.../172.30.2.116 active running
ceph-mon-node0.service 7acf664b.../172.30.2.116 active running
ceph-mon-node1.service 6123c2e2.../172.30.2.117 active running
ceph-mon-node2.service 2cd92697.../172.30.2.118 active running
ceph-osd-node0-00.service 7acf664b.../172.30.2.116 active running
ceph-osd-node0-01.service 7acf664b.../172.30.2.116 active running
ceph-osd-node0-02.service 7acf664b.../172.30.2.116 active running
ceph-osd-node1-03.service 6123c2e2.../172.30.2.117 active running
ceph-osd-node1-04.service 6123c2e2.../172.30.2.117 active running
ceph-osd-node1-05.service 6123c2e2.../172.30.2.117 active running
ceph-osd-node2-06.service 2cd92697.../172.30.2.118 active running
ceph-osd-node2-07.service 2cd92697.../172.30.2.118 active running
ceph-osd-node2-08.service 2cd92697.../172.30.2.118 active running
mongo-node0.service 7acf664b.../172.30.2.116 active running
plotter-metadata-node0.service 7acf664b.../172.30.2.116 active running
plotter-node0.service 7acf664b.../172.30.2.116 active running
receiver-node0.service 7acf664b.../172.30.2.116 active running
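With this many units it is easy to miss a failed one in the listing. A small sketch of a filter that prints only units whose SUB column is not "running", assuming the four-column layout shown above:

```shell
# Print units whose SUB state (4th column) is not "running", skipping the
# header row. Empty output means every listed unit is healthy.
not_running() {
  awk 'NR > 1 && $4 != "running"'
}

# Usage, piping through the wrapper from the earlier steps:
# bin/qs-fleetctl.sh list-units | not_running
```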
You can also browse to your plotter to see that it is up and running with a valid SSL certificate.
At this time, there are no accounts, and all data is public. This will be fixed in the next step.
In the next step you will see how to manage this newly created cluster: adding uPMUs, creating users and verifying cluster health.