# [HDFS HA Basics](http://hadoop.apache.org/docs/r2.8.4/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html)

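- If the NameNode fails, the entire HDFS cluster becomes unusable.
- The fix is to run two NameNodes:
    - one serves clients (active)
    - one waits in reserve (standby)
    - the pair shares the same metadata and is addressed through one logical name, the nameservice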

### Key points of HDFS HA


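- Keep the file metadata held in memory by the two NameNodes in sync
    - On startup, a NameNode loads the image file (fsimage)
- Synchronize the record of changes (the edit log)
- Keep the log files safe
    - Store the edit log in a distributed way (an approach proposed by Cloudera)
        - Run 2n+1 JournalNodes, so that the replicas keep the log safe
    - Use ZooKeeper for monitoring
        - It monitors both NameNodes; when one fails, failover happens automatically.
        - A failure of one NameNode then does not affect the whole cluster.
        - ZooKeeper is fairly demanding about clock synchronization between nodes.
- How a client knows which NameNode to contact
    - Through a failover proxy
    - Fencing isolates the old active NameNode after a failover
        - The method used here is sshfence
        - sshfence needs passwordless SSH login between the two NameNodes
- Which NameNode is active
    - An election coordinated through ZooKeeper picks the active NameNode.
    - ZooKeeper then keeps monitoring; if a problem occurs, failover is automatic.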
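
The 2n+1 sizing rule in the list above can be checked with a little arithmetic. This is only an illustrative sketch: the function names `quorum_size` and `tolerated_failures` are made up for this example, and it assumes the usual majority rule, where a write counts once more than half of the nodes have acknowledged it.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Hadoop.

# Smallest majority among $1 nodes.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# How many of $1 nodes may fail while a majority is still reachable.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 3 5 7; do
    echo "nodes=$n quorum=$(quorum_size "$n") tolerated=$(tolerated_failures "$n")"
done
```

With three nodes, as in the configuration in these notes, one node can be lost. Adding a fourth node would not raise that number, which is why odd sizes are the usual choice.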

### Configure hadoop-env.sh

```shell
export JAVA_HOME=/usr/local/jdk1.8.0_171/
```

### Configure core-site.xml

```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://htfeng</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
<!--tmp data-->
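    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/htfeng/hahadoopdata/tmp/</value>
    </property>
<!-- ZooKeeper ensemble for automatic NameNode failover (ZKFC); assumed to be the same ensemble as yarn.resourcemanager.zk-address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>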

</configuration>
```

### Configure hdfs-site.xml

```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/htfeng/hahadoopdata/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/htfeng/hahadoopdata/dfs/data</value>
    </property>

    <property>
      <name>dfs.nameservices</name>
      <value>htfeng</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.htfeng</name>
      <value>nn1,nn2</value>
    </property>

    <property>
      <name>dfs.namenode.rpc-address.htfeng.nn1</name>
      <value>hadoop1:9000</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.htfeng.nn2</name>
      <value>hadoop2:9000</value>
    </property>

    <property>
      <name>dfs.namenode.http-address.htfeng.nn1</name>
      <value>hadoop1:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.htfeng.nn2</name>
      <value>hadoop2:50070</value>
    </property>

    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/htfeng</value>
    </property>

    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/home/htfeng/hahadoopdata/journal/data</value>
    </property>

    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>dfs.client.failover.proxy.provider.htfeng</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>

    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
    </property>


    <property>
      <name>dfs.ha.fencing.ssh.connect-timeout</name>
      <value>30000</value>
    </property>
</configuration>
```
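
With this configuration in place, the first start of an HA cluster usually follows a fixed order. The script below is a hedged sketch of that order, not a tested deployment script. It assumes the host roles from the configuration above (nn1 on hadoop1, nn2 on hadoop2, JournalNodes on hadoop1 to hadoop3), that ZooKeeper is already running, that the Hadoop `bin` and `sbin` directories are on the `PATH` of every host, and that the machine running it has passwordless SSH to all of them. The helper names `run` and `ha_first_start` are hypothetical. By default it only prints the plan.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each step; set DRY_RUN=0 to run it over ssh.
DRY_RUN="${DRY_RUN:-1}"

# run <host> <command...>: print or execute a command for one host.
run() {
    host="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "[$host] $*"
    else
        ssh "$host" "$*"
    fi
}

ha_first_start() {
    # 1. JournalNodes must be up before the NameNode is formatted.
    for jn in hadoop1 hadoop2 hadoop3; do
        run "$jn" hadoop-daemon.sh start journalnode
    done
    # 2. Format and start the first NameNode (nn1).
    run hadoop1 hdfs namenode -format
    run hadoop1 hadoop-daemon.sh start namenode
    # 3. Copy nn1's metadata to the second NameNode (nn2).
    run hadoop2 hdfs namenode -bootstrapStandby
    # 4. Create the failover znode in ZooKeeper (run once).
    run hadoop1 hdfs zkfc -formatZK
    # 5. Start the remaining HDFS daemons.
    run hadoop1 start-dfs.sh
}

ha_first_start
```

Formatting is a one-time step that erases existing NameNode metadata, so the dry-run default is deliberate.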

### Configure slaves


```
hadoop1
hadoop2
hadoop3
```

### Configure yarn-site.xml

```xml
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>htfengyarn</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>hadoop1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>hadoop2</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>hadoop1:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>hadoop2:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
</configuration>

```
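
Once both ResourceManagers are running, it helps to confirm which one is active. `yarn rmadmin -getServiceState <rm-id>` prints `active` or `standby` for one ResourceManager, and `hdfs haadmin -getServiceState <nn-id>` does the same for a NameNode. The wrapper below is a small sketch around those commands; `find_active` is a hypothetical helper name, and the ids (`rm1`, `rm2`, `nn1`, `nn2`) are the ones from the configuration in these notes.

```shell
#!/bin/sh
# find_active <query command> <id>...
# Prints the first id for which the query command reports "active".
# Returns non-zero if none of the ids is active.
find_active() {
    query="$1"; shift
    for id in "$@"; do
        # $query is left unquoted on purpose so that
        # "yarn rmadmin -getServiceState" splits into separate words.
        state="$($query "$id" 2>/dev/null)"
        if [ "$state" = "active" ]; then
            echo "$id"
            return 0
        fi
    done
    return 1
}

# Usage on a running cluster:
#   find_active "yarn rmadmin -getServiceState" rm1 rm2
#   find_active "hdfs haadmin -getServiceState" nn1 nn2
```

Stopping the active ResourceManager and running the check again is a simple way to watch failover happen.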

### Configure mapred-site.xml



```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```