Zubair Saiyed edited this page Sep 7, 2016 · 2 revisions

CDH5

This document is authorized for use within Insight Data Engineering and not for re-distribution.

Note: this document is outdated and may not work exactly as written. For current instructions, follow the official documentation listed here:

http://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_install_path_a.html

Install Cloudera v5

Do the following on the Master node (see previous steps to launch master node EC2 instance):

$ wget http://archive-primary.cloudera.com/cm5/installer/5.1.2/cloudera-manager-installer.bin
$ chmod +x cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin

You’ll see the following screen:

Click Next, and you’ll see the following screen:

Click “Next” again to see the following screen, then move to “Yes” and accept the license:

Click “Next” on the following screen:

Keep selecting “Next” and “Yes” until you are presented with the progress bar screen, and let it finish (it will take several minutes):

Once finished, you’ll see this screen:

As it says, point your web browser to the master node's public address on port 7180. In my case, it’s:

http://ec2-54-183-182-182.us-west-1.compute.amazonaws.com:7180
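The Cloudera Manager web server can take a little while to come up after the installer exits. As a rough sketch, the helpers below build the URL and poll it with `curl` until something answers. The function names `cm_url` and `wait_for_cm` and the default retry counts are made up for illustration; only port 7180 comes from the installer's message.

```shell
# Hypothetical helper: build the Cloudera Manager URL for a master hostname.
cm_url() {
  printf 'http://%s:7180\n' "$1"
}

# Hypothetical helper: poll the URL until it answers or the attempts run out.
# Arguments: hostname, attempts (default 30), seconds between attempts (default 10).
wait_for_cm() {
  cm_target="$(cm_url "$1")"
  cm_attempts="${2:-30}"
  cm_delay="${3:-10}"
  cm_try=1
  while [ "$cm_try" -le "$cm_attempts" ]; do
    # curl exits 0 as soon as the server returns any HTTP response.
    if curl --silent --output /dev/null --max-time 5 "$cm_target"; then
      echo "Cloudera Manager is answering at $cm_target"
      return 0
    fi
    sleep "$cm_delay"
    cm_try=$((cm_try + 1))
  done
  echo "No response from $cm_target after $cm_attempts attempts" >&2
  return 1
}
```

For example, `wait_for_cm ec2-54-183-182-182.us-west-1.compute.amazonaws.com` would keep trying for about five minutes with the defaults. If it never succeeds, check that the EC2 security group allows inbound traffic on port 7180.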

The following screen should show up:

Use username: admin password: admin

And click “Login”

Choose “Cloudera Express” and click “Continue”

Choose “Launch Classic Wizard”

Click “Continue” in the next screen.

After the next screen, you should be presented with the following screen:

Enter the public IP addresses of the master node and the 3 slave nodes that you launched earlier as EC2 instances. These IP addresses are shown in the EC2 dashboard console when the instances are selected, the same place you found the master's public IP for ssh. See the following screenshot:

Enter the IP addresses like so (remove the “i-xxxxx” id beforehand) and click “Search”:
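If you copied the addresses out of the console together with their instance ids, a small filter can strip everything except the IPv4 addresses before you paste them into the wizard. This is only a sketch: the function name `extract_ips` is made up, and the exact shape of the copied lines is an assumption, which is why the filter matches the addresses themselves rather than the text around them.

```shell
# Hypothetical helper: print only the IPv4 addresses found on stdin, one per line.
# It ignores whatever surrounds each address, such as an "i-xxxxx" instance id.
extract_ips() {
  grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

# Example (the input format here is assumed, not taken from the console):
#   printf 'i-xxxxx: 54.183.182.182\n' | extract_ips
```

The pattern is deliberately loose and does not check that each number is at most 255, so treat the output as a convenience and still compare it against the console.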

You should get a screen very similar to the following once the instances are resolved:

Click “Continue” and you should be presented with the following page:

Leave all the defaults the same and click “Continue” again to reach the following screen:

Make sure you click to install the Oracle JDK. I did not enable Java Unlimited Strength Encryption Policy Files, but did not have a strong reason not to. I also didn't choose Single User Mode, but this may be fine as well. Click “Continue” on this screen to be presented with:

Make sure you choose “Another user” and use “ubuntu” as the username. Also, select “All hosts accept the same private key” and choose the “insight-cloudera.pem” private key. Click “Continue” and say “continue anyway” if asked whether the key is passwordless.
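Since the wizard logs in to every host over SSH as “ubuntu” with this key, it can save a failed install to confirm that login works first. The sketch below is one possible pre-check run from the machine that holds the key; `check_hosts` is a made-up name, and disabling host key checking is an assumption that only makes sense for freshly launched, throwaway instances.

```shell
# Hypothetical pre-check: try a non-interactive SSH login to every host.
# Usage: check_hosts /path/to/insight-cloudera.pem HOST [HOST...]
check_hosts() {
  check_key="$1"
  shift
  check_failed=0
  for check_host in "$@"; do
    # BatchMode makes ssh fail instead of prompting for input.
    # ConnectTimeout bounds the wait for unreachable hosts.
    # StrictHostKeyChecking=no skips the first-connection prompt (assumption:
    # these are new instances you launched yourself).
    if ssh -i "$check_key" \
        -o BatchMode=yes \
        -o ConnectTimeout=5 \
        -o StrictHostKeyChecking=no \
        "ubuntu@$check_host" true; then
      echo "OK $check_host"
    else
      echo "FAILED $check_host"
      check_failed=1
    fi
  done
  return "$check_failed"
}
```

Note that ssh refuses private keys that other users can read, so run `chmod 400 insight-cloudera.pem` first if the key was freshly downloaded.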

The following screen should then appear. Make sure that the installation succeeds on all instances (all green):

Once all bars are green, click “Continue” to be presented with the following screen:

This step will take several minutes to complete. Once all bars turn green, click “Continue” to be presented with the following screen:

Click “Finish” to be presented with the following screen:

Select “Custom Services” and install any services you might need. For the Hadoop and MapReduce portion of your project, you are safe installing:

HBase, HDFS, Hive, Hue, Spark, Sqoop 2, and YARN. Later, if you decide you need them, you can add Oozie, Solr or ZooKeeper. Oozie is a workflow scheduler for Hadoop jobs, and Solr is an index and search service.

Click “Continue” to be presented with:

On this screen, make sure the “Master” node or “NameNode” or “Gateway” node, etc… are all pointed to the instance that we chose as m1.large. The ip shown in these boxes is the private IP:

If the IPs shown in CDH installer are not the m1.large instance for the master/namenode/gateways, then change it so that it points to the m1.large instance. If everything looks ok, click “Continue” to see the following screen:
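To find out which private IP belongs to the m1.large instance, you can ask the EC2 instance metadata service from a shell on each node. The sketch below assumes the plain (token-less) metadata endpoint is enabled on the instance; the function names `ec2_meta` and `describe_instance` are made up for illustration.

```shell
# Hypothetical helper: fetch one value from the EC2 instance metadata service.
# Assumes token-less (IMDSv1) access is allowed on this instance.
ec2_meta() {
  curl --silent --max-time 2 "http://169.254.169.254/latest/meta-data/$1"
}

# Print "<instance type> <private IP>" for the node this runs on.
describe_instance() {
  printf '%s %s\n' "$(ec2_meta instance-type)" "$(ec2_meta local-ipv4)"
}
```

Run `describe_instance` on each node; the one that reports `m1.large` gives the private IP that should appear in the master, NameNode and gateway boxes.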

Note down the usernames and passwords, and click “Continue” to see:

Click “Continue” again:

You should now get the following screen:

Once everything has finished installing, you are good to go! You can log onto your master node and start issuing hadoop commands. Enjoy!
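As a first command to try, here is a minimal smoke test, assuming HDFS is up and that `/tmp` in HDFS is writable by your user (usually the case, but not guaranteed). The function name `hdfs_smoke_test` and the default path `/tmp/cdh-smoke-test` are made up for illustration.

```shell
# Hypothetical smoke test: write a small file into HDFS and read it back.
# Usage: hdfs_smoke_test [HDFS_DIR]   (default: /tmp/cdh-smoke-test)
hdfs_smoke_test() {
  smoke_dir="${1:-/tmp/cdh-smoke-test}"
  smoke_file="$(mktemp)" || return 1
  echo "hello hdfs" > "$smoke_file"

  # Create the directory, upload the file (overwriting any old copy), print it.
  hadoop fs -mkdir -p "$smoke_dir" &&
    hadoop fs -put -f "$smoke_file" "$smoke_dir/hello.txt" &&
    hadoop fs -cat "$smoke_dir/hello.txt"
  smoke_status=$?

  rm -f "$smoke_file"
  return "$smoke_status"
}
```

If it works, the last command prints the file's contents back from HDFS. You can clean up afterwards with `hadoop fs -rm -r /tmp/cdh-smoke-test`.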
