Whirr Cloudera Manager Plugin

The Whirr-CM plugin provides the ability to bootstrap, provision, initialise and then manage the full lifecycle of a Cloudera Manager (CM) CDH cluster.


This plugin has dependencies on Whirr and CM as per the pom, but remains backwards compatible with:

  • Whirr-0.9+
  • CM5+ (CM API 3+)
  • CDH4+

The plugin has been tested extensively on RHEL and Debian derivatives and ships with integration tests targeting:

  • CentOS 6.5
  • Ubuntu LTS 12.04

Installing and configuring Whirr

Run the following commands from your local machine or from an edge node with access to your infrastructure provider's resources. The latter is often preferable, since it provides a client edge node for using your cluster and caters for:

  • network brownouts
  • minimal latency
  • private host/IP bindings

Set your infrastructure provider's credentials:

For Amazon EC2:
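
The credential exports themselves are not reproduced here. As a minimal sketch for confirming that they are in place before running Whirr, the following assumes the variable names AWS_ACCESS_KEY and AWS_SECRET_KEY used by the integration test commands further down; the helper name check_aws_env is hypothetical and not part of Whirr:

```shell
# Fail early if the EC2 credentials are not exported.
# Assumption: AWS_ACCESS_KEY and AWS_SECRET_KEY are the variable names in use
# (taken from the integration test commands in this README).
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY AWS_SECRET_KEY; do
    # Indirect lookup of the variable whose name is held in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: warn (without aborting the shell) when a credential is absent
check_aws_env || echo "export the credentials before running whirr" >&2
```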


Install Whirr:

Install the CDH repositories, eg for CDH5:

sudo rpm -ivh

then install the Whirr package and create some environment variables, eg for RHEL/CentOS:

yum install whirr
export WHIRR_HOME=/usr/lib/whirr

Create a password-less SSH keypair for Whirr to use:

ssh-keygen -t rsa -P '' -f ~/.ssh/whirr

Install Whirr-CM (optional)

As of CDH4.2, Whirr ships with the Whirr-CM plugin. If you would like to replace it, follow the instructions below. Take note of any files copied by the 'mvn dependency:copy-dependencies' command: they may indicate that older, now stale versions have been superseded and should be deleted from $WHIRR_HOME/lib.

git clone
cd whirr-cm
mvn clean install -DskipTests
mvn dependency:copy-dependencies -DincludeScope=runtime -DoverWriteIfNewer=false -DoutputDirectory=$WHIRR_HOME/lib
rm -rf $WHIRR_HOME/lib/whirr-cdh-* $WHIRR_HOME/lib/whirr-cm-*
cp -rvf target/whirr-cm-*.jar $WHIRR_HOME/lib
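
One way to spot stale libraries after the copy is to look for artifacts present in the lib directory under more than one version. This is a rough sketch only: it assumes jars follow the usual Maven naming of artifact-version.jar with the version starting with a digit, and the helper name find_duplicate_jars is hypothetical.

```shell
# List artifact names that occur more than once in a lib directory.
# Assumption: files are named <artifact>-<version>.jar and the version
# starts with a digit (the common Maven convention).
find_duplicate_jars() {
  lib_dir=$1
  for jar in "$lib_dir"/*.jar; do
    # Skip the unexpanded pattern when the directory has no jars
    [ -e "$jar" ] || continue
    # Drop the directory, then everything from "-<digit>" to ".jar"
    basename "$jar" | sed 's/-[0-9].*\.jar$//'
  done | sort | uniq -d
}

# Example: inspect the Whirr lib directory (prints nothing if no duplicates)
find_duplicate_jars "${WHIRR_HOME:-/usr/lib/whirr}/lib"
```

Any name printed is a candidate for manual review; the sketch does not decide which version to keep.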

Launch a Cloudera Manager managed CDH Cluster

A sample Whirr-CM EC2 config is available, which should be copied locally to your Whirr client host:


If you would like to upload a CM License as part of the installation (Cloudera can provide this if you do not have one), place the license in a file "cm-license.txt" (or with a file name of your choosing specified via the Whirr '' parameter) on the Whirr classpath (eg in $WHIRR_HOME/conf), eg

mv -v eval_acme_20120925_cloudera_enterprise_license.txt $WHIRR_HOME/conf/cm-license.txt

As specified in the example file, the following command will start a cluster of 7 nodes: 1 CM server, 3 master nodes and 3 slave nodes. To change the cluster topology, edit the file.

whirr launch-cluster --config
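
In Whirr, the topology is expressed through the 'whirr.instance-templates' property as a comma-separated list of "count roles" entries. As a sanity check before launching, the total node count can be derived from that value. The sketch below is illustrative: 'example-master-role' is a hypothetical placeholder rather than a real Whirr-CM role, and the helper name count_nodes is an assumption.

```shell
# Sum the instance counts in a whirr.instance-templates value.
count_nodes() {
  # One template per line, then add up the leading count of each
  echo "$1" | tr ',' '\n' | awk '{ total += $1 } END { print total + 0 }'
}

# Hypothetical value mirroring the 1 + 3 + 3 layout described above;
# "example-master-role" stands in for whatever master roles your file uses.
templates="1 cm-server,3 example-master-role,3 cm-cdh-datanode"
count_nodes "$templates"   # prints 7
```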

Whirr will report progress to the console as it runs and will exit once complete.

During the various phases of execution, the Whirr-CM plugin will report the CM Web Console URL, e.g. pre-provision:

Whirr Handler -----------------------------------------------------------------
Whirr Handler [CMClusterProvision] 
Whirr Handler -----------------------------------------------------------------
Whirr Handler 
Whirr Handler [CMClusterProvision] follow live at

and post-provision:

Whirr Handler [CMClusterProvision] CM AGENTS
Whirr Handler [CMClusterProvision]   ssh -o StrictHostKeyChecking=no -i /root/.ssh/whirr whirr@
Whirr Handler [CMClusterProvision]   ssh -o StrictHostKeyChecking=no -i /root/.ssh/whirr whirr@
Whirr Handler [CMClusterProvision]   ssh -o StrictHostKeyChecking=no -i /root/.ssh/whirr whirr@
Whirr Handler [CMClusterProvision]   ssh -o StrictHostKeyChecking=no -i /root/.ssh/whirr whirr@
Whirr Handler [CMClusterProvision] CM SERVER
Whirr Handler [CMClusterProvision]
Whirr Handler [CMClusterProvision]   ssh -o StrictHostKeyChecking=no -i /root/.ssh/whirr whirr@

You can log in to the CM Web Console (or hosts) at any stage and observe proceedings via the asynchronous, real-time UI.

The default admin user credentials are:

Username: admin 
Password: admin 

Manage the CDH cluster with CM

The Whirr property '', as set in, determines whether the Whirr CM plugin provisions, initialises and starts a new CDH cluster (true) or merely provisions the CM Server and Agents to allow manual CDH cluster management through the CM Web Console (false).

Other Whirr properties can be used to affect the cluster provision process; refer to the example for more details.

You can have Whirr report the currently running cluster nodes at any time:

whirr list-cluster --config

or query the Whirr CM plugin services:

whirr list-services --config

As well as supporting all the standard Whirr commands, the Whirr-CM plugin provides and/or augments the following commands:

  • init-cluster
  • download-config
  • create-services
  • start-services
  • restart-services
  • stop-services
  • destroy-services
  • launch-cluster
  • clean-cluster
  • destroy-cluster

Where appropriate, these commands can be filtered by role via the '--roles' command line switch, bearing in mind that 'cm-server' is a mandatory role (all operations require it) and all lifecycle commands operate on the parent of the role to ensure consistency. For example, to issue an HDFS service start:

whirr start-services --roles cm-server,cm-cdh-datanode --config
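
Because 'cm-server' must always be present in the role list, a small wrapper can add it when it is missing. This is a sketch under assumptions: the helper name with_cm_server and the WHIRR_CONFIG variable are hypothetical, not part of Whirr.

```shell
# Return a --roles value that is guaranteed to contain cm-server.
with_cm_server() {
  case ",$1," in
    *,cm-server,*) echo "$1" ;;            # already present, leave untouched
    ,,)            echo "cm-server" ;;     # empty input
    *)             echo "cm-server,$1" ;;  # prepend the mandatory role
  esac
}

with_cm_server cm-cdh-datanode   # prints cm-server,cm-cdh-datanode

# Possible usage (WHIRR_CONFIG is a hypothetical variable holding your config path):
# whirr start-services --roles "$(with_cm_server cm-cdh-datanode)" --config "$WHIRR_CONFIG"
```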

A custom CM cluster name can be provided to most of the commands via the '--cm-cluster-name' switch. For example, to clean the cluster named "My Cluster" from CM:

whirr clean-cluster --cm-cluster-name "My Cluster" --config

Full command documentation is available as part of Whirr, for example:

whirr help clean-cluster

Use the CDH cluster

The host you have run Whirr from can be used as a CDH client, as long as it can see the cluster nodes (host, forward/reverse DNS) and has the same version of CDH installed (although not necessarily via the same means, eg parcels/RPMs/.debs etc).

The 'whirr.client-cidrs' property can be used to ensure the client's IP ranges are accepted through the cloud provider's security controls and host firewalls, see

To download the CDH cluster client config to the Whirr cluster working directory ($HOME/.whirr/whirr):

whirr download-config --config

This will then allow the CDH clients to be executed, for example to list files in HDFS root directory:

hadoop --config $HOME/.whirr/whirr fs -ls /
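
Before pointing the CDH clients at the working directory, it can help to confirm that the download actually produced something. The following is a minimal sketch that only checks the directory exists and is non-empty; it makes no assumption about which files the plugin writes, and the helper name has_client_config is hypothetical.

```shell
# Succeed only if the client config directory exists and contains files.
has_client_config() {
  conf_dir=${1:-$HOME/.whirr/whirr}
  [ -d "$conf_dir" ] && [ -n "$(ls -A "$conf_dir" 2>/dev/null)" ]
}

if has_client_config; then
  echo "client config present in $HOME/.whirr/whirr"
else
  echo "no client config found, run download-config first" >&2
fi
```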

Alternatively, you can interact with the cluster via a CM gateway node from within your cluster.

Shutdown the cluster

Finally, when you want to shut down the cluster, run the following command. Note that all data and state stored on the cluster will be lost.

whirr destroy-cluster --config

Troubleshooting a failed Cluster launch

During launch, errors are printed to the console indicating the root cause; Whirr then halts and cleans up any cluster resources that were provisioned. In some rare cases (eg node disk space exhaustion or a network partition) Whirr can block for an extended time without an error message. In that case, investigate the issue by looking at the client-side whirr.log in the current directory, the cloud web console, the nodes via SSH, or Cloudera Manager, depending on which stage Whirr has reached.

If the host bootstrap has failed, investigate the image you are using: manually create an instance from it and attempt to connect via SSH as the whirr.bootstrap-user (ec2-user). If this is successful, attempt to connect to the nodes provisioned by Whirr, using the hosts listed in the cloud web console, the whirr.cluster-user (whirr) and the whirr.private-key-file (~/.ssh/whirr). If the connection succeeds, it is likely that Whirr is still running and/or blocking, and the scripts and logs Whirr stages to /tmp on each node should help to determine the root cause:

ssh -o StrictHostKeyChecking=no -i ~/.ssh/whirr "ls -la /tmp/bootstrap* /tmp/configure*"
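
When several nodes need checking, the same inspection can be generated per host. This sketch only prints the commands (a dry run) so they can be reviewed before being executed; the helper name print_log_checks and the example addresses are hypothetical, while the 'whirr' user and key path follow the defaults described above.

```shell
# Print one log-inspection command per host, without running anything.
print_log_checks() {
  for host in "$@"; do
    echo "ssh -o StrictHostKeyChecking=no -i ~/.ssh/whirr whirr@$host 'ls -la /tmp/bootstrap* /tmp/configure*'"
  done
}

# Example with placeholder addresses; substitute the hosts from your cloud web console
print_log_checks 203.0.113.10 203.0.113.11
```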

If you destroy the instances Whirr has created outside of Whirr (ie through a cloud web console) you will have to clean up the Whirr instance store before re-running:

rm -rf ~/.whirr/mycluster
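
Since the command above deletes recursively, a guarded variant may be safer in scripts. The following is a sketch, assuming the instance store lives at ~/.whirr/<cluster name> as in the example; the helper name clean_instance_store is hypothetical.

```shell
# Remove the Whirr instance store for one cluster, refusing unsafe names.
clean_instance_store() {
  name=$1
  case "$name" in
    ''|*/*|.|..)
      # An empty name or a path fragment could delete far more than intended
      echo "refusing to remove: '$name'" >&2
      return 1
      ;;
  esac
  store="$HOME/.whirr/$name"
  if [ -d "$store" ]; then
    rm -rf -- "$store"
  fi
}

# Possible usage:
# clean_instance_store mycluster
```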

Unit and Integration Tests

This project includes a full suite of unit tests, launched via:

mvn clean test

Integration tests that run against Amazon EC2 are also available, and can be run as follows:

mvn clean verify -DskipUTs -DskipITs=false -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY

Be aware that these tests take some time to execute, so it is advisable to use individual test cases for iterative testing, eg:

mvn clean test -Dtest=CmServerCommandIT#testCreateServices -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY
mvn clean test -Dtest=CmServerCommandIT#testStartServices -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY
mvn clean test -Dtest=CmServerCommandIT#testServiceLifecycle -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY

You can optionally specify a target platform, CM, CM API and CDH version via the following system properties (defaulting to the parameters defined in the example configuration and/or the latest versions available):

  • whirr.test.platform (centos | ubuntu)

For example:

mvn clean test -Dtest=CmServerCommandIT#testCreateServices -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY -Dwhirr.test.platform=centos
mvn clean test -Dtest=CmServerCommandIT#testCreateServices -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY

Note that '' specifies only the major CDH version; the minor and incremental versions default to the latest available from the list of parcel repos defined by ''.

The CmServerSmokeSuiteIT uses a matrix of platforms and component versions to test all supported permutations; it ignores the command line switches:

mvn clean test -Dtest=CmServerSmokeSuiteIT -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY

The integration test framework sets up and tears down a cluster automatically as necessary, the latter conditional on the 'whirr.test.platform.destroy' system property. If you would like to launch and persist a cluster between integration tests, for iterative testing without the cluster bootstrap costs, a cluster can be launched as follows (the platform and version system properties above are also supported):

mvn exec:java -Dexec.mainClass="\$ClusterBoostrap" -Dexec.classpathScope="test" -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY -Dlog4j.configuration=file:./target/test-classes/ -Dwhirr.test.platform=centos

and then destroyed via:

mvn exec:java -Dexec.mainClass="\$ClusterDestroy" -Dexec.classpathScope="test" -Dwhirr.test.identity=$AWS_ACCESS_KEY -Dwhirr.test.credential=$AWS_SECRET_KEY -Dlog4j.configuration=file:./target/test-classes/

As a convenience (especially for running within an IDE), the integration tests source the example configuration as system and Whirr properties prior to execution, removing the need to specify these properties in less convenient forms (eg command line switches, maven properties, IDE properties etc).
