Deploy Hadoop Cluster

Jichao edited this page Jun 18, 2017 · 17 revisions

Knowledge

Java Version:

Machine Template

  • JAVA_HOME: /hadoop_env/jdk
  • HADOOP_HOME: /hadoop_env/hadoop
  • NameNode store: /hadoop_env/namenode/store
  • DataNode allow list: /hadoop_env/namenode/conf/datanode-allow.list (one IP address per line?)
  • DataNode store: /hadoop_env/datanode/store
  • ResourceManager NodeManager include list: /hadoop_env/resourcemanager/include_nodemanagers.list (one IP address per line?)
  • NodeManager local dir: /hadoop_env/nodemanager/local_dir_1
  • NodeManager log dir: /hadoop_env/nodemanager/log_dir_1
  • MR history server tmp dir: /hadoop_env/mr-history/tmp
  • MR history server done dir: /hadoop_env/mr-history/done
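
As a rough sketch of how the template paths above might end up in Hadoop's site files, the Python below maps each path to a configuration property and renders a `<configuration>` document. The property names are standard Hadoop keys, but pairing them with these particular paths is an assumption, not something this page states. The NameNode path is written as `namenode/store`, assuming that is the intended spelling. `render_site_xml` is a hypothetical helper, not part of Hadoop.

```python
from xml.sax.saxutils import escape

ENV = "/hadoop_env"

# Assumed mapping: template path -> hdfs-site.xml property.
HDFS_SITE = {
    "dfs.namenode.name.dir": f"{ENV}/namenode/store",
    "dfs.hosts": f"{ENV}/namenode/conf/datanode-allow.list",
    "dfs.datanode.data.dir": f"{ENV}/datanode/store",
}

# Assumed mapping: template path -> yarn-site.xml property.
YARN_SITE = {
    "yarn.resourcemanager.nodes.include-path":
        f"{ENV}/resourcemanager/include_nodemanagers.list",
    "yarn.nodemanager.local-dirs": f"{ENV}/nodemanager/local_dir_1",
    "yarn.nodemanager.log-dirs": f"{ENV}/nodemanager/log_dir_1",
}

# Assumed mapping: template path -> mapred-site.xml property.
# Note: Hadoop normally resolves these two against the default
# filesystem (HDFS), so treating them as local paths is a guess.
MAPRED_SITE = {
    "mapreduce.jobhistory.intermediate-done-dir": f"{ENV}/mr-history/tmp",
    "mapreduce.jobhistory.done-dir": f"{ENV}/mr-history/done",
}


def render_site_xml(props):
    """Render a dict of property names and values as Hadoop site XML."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_site_xml(HDFS_SITE))
```

Calling `render_site_xml(YARN_SITE)` or `render_site_xml(MAPRED_SITE)` gives the matching fragments for the other two files.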

Topology

  • NameNode: 1
  • DataNode: 3
  • ResourceManager: 1
  • NodeManager: ?
  • WebAppProxy: ?
  • MapReduce JobHistory Server: ?
  • Typically one machine in the cluster is designated as the NameNode and another machine as the ResourceManager, exclusively. These are the masters. Other services (such as Web App Proxy Server and MapReduce Job History server) are usually run either on dedicated hardware or on shared infrastructure, depending upon the load. The rest of the machines in the cluster act as both DataNode and NodeManager. These are the workers.
  • 1 machine for NameNode
  • 1 machine for ResourceManager
  • 1 machine for Web App Proxy Server
  • 1 machine for MapReduce Job History Server
  • All remaining machines run both DataNode and NodeManager (two roles per machine).
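
The layout above can be sketched as a small role table. The hostnames below (`nn1`, `rm1`, `proxy1`, `jhs1`, `worker1`..`worker3`) are hypothetical placeholders, and both helper functions are illustrative only. The worker count of three follows the "DataNode: 3" entry in the Topology list.

```python
# Hypothetical hostnames; replace with the real machines.
MASTERS = {
    "NameNode": "nn1",
    "ResourceManager": "rm1",
    "WebAppProxy": "proxy1",
    "JobHistoryServer": "jhs1",
}
WORKERS = ["worker1", "worker2", "worker3"]


def roles_by_host(masters, workers):
    """Return {hostname: [roles]} for the whole cluster."""
    roles = {}
    for role, host in masters.items():
        roles.setdefault(host, []).append(role)
    # Every worker carries two roles, as described above.
    for host in workers:
        roles.setdefault(host, []).extend(["DataNode", "NodeManager"])
    return roles


def workers_file(workers):
    """Render a one-host-per-line list of worker machines."""
    return "\n".join(workers) + "\n"


if __name__ == "__main__":
    for host, roles in roles_by_host(MASTERS, WORKERS).items():
        print(f"{host}: {', '.join(roles)}")
    print(workers_file(WORKERS), end="")
```

The output of `workers_file` uses the one-host-per-line layout of Hadoop's `etc/hadoop/slaves` file (named `etc/hadoop/workers` in Hadoop 3). Whether the allow and include lists in the Machine Template should use the same hostnames or IP addresses is still the open question noted there.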

Reference
