…disks. -1s == -1M and works better.
…n the partition we create.
…th components that can be installed on the admin node.
…ices Switch to using filesystem UUIDs for mounting and mountpoint naming. This makes the order in which disks were enumerated irrelevant, and enables us to ignore disks that we did not configure (along with printing a warning during the chef-client run).
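A minimal sketch of what mounting by UUID might look like in a Chef recipe; the device, mountpoint path, and filesystem type here are assumptions, not the barclamp's actual values:

    # Mount by filesystem UUID so enumeration order does not matter, and name
    # the mountpoint after the UUID. Device and paths are illustrative only.
    uuid = `blkid -s UUID -o value /dev/sdb1`.strip
    directory "/mnt/hdfs/#{uuid}"
    mount "/mnt/hdfs/#{uuid}" do
      device uuid
      device_type :uuid
      fstype "ext4"
      action [:mount, :enable]
    end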
Avoid sleeping if we don't have to when testing to see which nodes have which roles. Make sure we are looking at the "right" line when checking for Hadoop service liveness with ps aux.
Annoyingly enough, the nodes might not be fully indexed by the time we try to find out which roles are on which machines, so we have to spin on the node search until everything is indexed. Equally annoying, the Hadoop service machinery tends to report that the pidfile exists but the service is dead even when the service is in fact alive and well. Switch to grepping the output of ps to check for service liveness instead.
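A minimal sketch of both workarounds, assuming a Chef recipe context and hypothetical role and process names:

    # Spin on the Chef search until the namenode role is actually indexed.
    namenodes = []
    while namenodes.empty?
      namenodes = search(:node, "roles:hadoop_masternamenode") || []
      sleep 5 if namenodes.empty?
    end

    # Check liveness by grepping ps output instead of trusting the pidfile.
    # The bracket trick keeps the grep process itself out of the match.
    def service_alive?(name)
      system("ps aux | grep -q '[#{name[0, 1]}]#{name[1..-1]}'")
    end
    service_alive?("namenode")   # => true if a namenode process is running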
* The disk recipes were not properly segregating the hdfs and mapreduce data on the data drives on the slaves, and they were not creating mapred directories on each drive. configure-disks was modified to pass correct paths for dfs_data_dir and mapred_local_dir to the slavenode recipe for each drive it configured.
* The disk handling loop in configure-disks was modified to use an array instead of a hash, and the array is sorted before we do anything. This helps keep the order in which we perform operations on the drives predictable (see the sketch below), although it is not "perfect" behaviour.
* Removed lost+found handling from the slavenode recipe, since we no longer store hdfs data right at the root of the drives.
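A hypothetical sketch of that per-drive loop; the attribute and path names are assumptions, but it shows the shape of building per-drive dfs_data_dir and mapred_local_dir paths for the slavenode recipe:

    # Sort the configured drives, then build per-drive hdfs and mapred paths
    # for the slavenode recipe to consume. Attribute names are illustrative.
    dfs_dirs, mapred_dirs = [], []
    node[:hadoop][:devices].sort.each do |mount|   # e.g. "/mnt/hdfs/<uuid>"
      dfs_dirs    << ::File.join(mount, "hdfs", "data")
      mapred_dirs << ::File.join(mount, "mapred", "local")
    end
    node[:hadoop][:core][:dfs_data_dir]     = dfs_dirs
    node[:hadoop][:core][:mapred_local_dir] = mapred_dirs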
…ges.
* We no longer have a disk_configured flag. Instead, configure-disks is idempotent. This allows the cluster to more or less seamlessly handle disks coming and going (for replacement or whatever) without having to change anything in Crowbar.
* configure-disks no longer uses a custom-compiled parted. The rewritten recipe just tells parted to create a single GPT partition that spans the entire drive if there is no first partition (see the sketch after this list). This takes advantage of the newly added phase that nukes drives before hardware install. We also nuke the first 65K of any newly-created partitions to ensure that filesystem checks are not confused by any legacy data that happens to be there.
* configure-disks has parted align the start of the partition to 1MB into the drive. This makes sure the resulting filesystem is optimally aligned to minimize RMW cycles for virtually every drive type out there.
* configure-disks takes care to make sure the kernel knows about any and all partitioning changes. We partprobe each drive before and after messing with the partitions to ensure that there is no confusion between the partition tables on the disk and what the kernel sees. We also minimize reliance on device files in /dev, because it can take udev a second or two to catch up with the kernel when things change.
* Use blkid instead of tune2fs for filesystem presence testing. This makes it easier to port the code to handle different filesystems.
* Use fork/exec instead of threads when formatting the drives. This gets the same parallelization that threads do, but at the proper level of granularity on a Unix system, and it is slightly simpler and more robust.
* Move responsibility for creating the dfs_data_dir array into configure_disks.
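A simplified sketch of the per-disk flow described above; the device name, partition test, and filesystem type are assumptions, and the real recipe avoids leaning on /dev paths the way this sketch does:

    # Idempotent per-disk flow: partprobe, partition if needed, zero the start
    # of a new partition, test for a filesystem with blkid, format in a child.
    disk = "/dev/sdb"                              # illustrative device only
    system("partprobe #{disk}")                    # sync kernel view first
    unless system("parted -s #{disk} print | grep -q '^ 1'")
      system("parted -s #{disk} mklabel gpt mkpart primary 1MiB 100%")
      system("partprobe #{disk}")                  # sync kernel view again
      system("dd if=/dev/zero of=#{disk}1 bs=1024 count=64")  # nuke first 65K
    end
    unless system("blkid #{disk}1")                # any filesystem already there?
      fork { exec("mkfs.ext4", "#{disk}1") }       # format in a child process
    end
    Process.waitall                                # wait for all formats to finish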
* It is OK to be missing your slavenodes when the cluster is coming up. In Hadoop, slavenodes are ephemeral, and we should not mark the config as invalid just because we don't have them. Besides, this eliminates an unneeded round of chef-client calls.
* Do not bounce the namenode and jobtracker services every time the slaves file changes. The services have built-in logic to handle slavenodes coming and going and they watch the slaves file for changes -- the most we may have to do is ask them to rescan, but even that is not needed most of the time.
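If a rescan ever is needed, it could be wired up as something like the following hedged sketch, using the Hadoop 1.x-era admin command to have HDFS reread its node lists rather than restarting the daemon; the resource name and notification wiring are assumptions:

    # Ask the running namenode to rescan rather than restarting it.
    execute "refresh-hdfs-nodes" do
      command "hadoop dfsadmin -refreshNodes"
      action :nothing          # only run when notified by a file change
    end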
This smoketest will:
* Deploy Hadoop. We change dfs.replication to 1, and we do not reserve any space on the data drives.
* Verify that all the required services are running on the nodes. If they are not, it will run chef-client and sleep to get the cluster to synchronize. We force the runs because we do not want to wait the 15-30 minutes it may usually take to get the cluster fully running.
* Make sure we have enough time for all the chef-client runs and sleeps by giving ourselves 900 seconds to finish the test.
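A hypothetical sketch of the smoketest's convergence loop, assuming a service_alive? helper like the one sketched earlier and illustrative service names:

    # Keep forcing chef-client runs until the required services show up,
    # within the 900 second budget mentioned above.
    REQUIRED_SERVICES = %w[namenode jobtracker datanode tasktracker]
    deadline = Time.now + 900
    until REQUIRED_SERVICES.all? { |svc| service_alive?(svc) }
      raise "cluster did not converge in time" if Time.now > deadline
      system("chef-client")    # force a run instead of waiting for the interval
      sleep 30                 # illustrative pause between attempts
    end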