In [1]:
%load_ext dockermagic

# HDFS

## HDFS - Web Interface

- Master node
    - NameNode: http://localhost:9870
    - Secondary NameNode: http://localhost:9868
- Worker nodes
    - hadoop1
        - DataNode: http://localhost:9864
    - hadoop2
        - DataNode: http://localhost:9865
    - hadoop3
        - DataNode: http://localhost:9866

## HDFS - CLI

In [2]:
%%dockerexec hadoop

source /opt/envvars.sh
hdfs help

Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]

  OPTIONS is none or any of:

tput: No value for $TERM and no -T specified
--buildpaths                       attempt to add class files from build tree
--config dir                       Hadoop config directory
--daemon (start|status|stop)       operate on a daemon
--debug                            turn on shell script debug mode
--help                             usage information
--hostnames list[,of,host,names]   hosts to use in worker mode
--hosts filename                   list of hosts to use in worker mode
--loglevel level                   set the log4j level for this command
--workers                          turn on worker mode

  SUBCOMMAND is one of:


    Admin Commands:

tput: No value for $TERM and no -T specified
cacheadmin           configure the HDFS cache
crypto               configure HDFS encryption zones
debug                run a Debug Admin to execute HDFS debug commands
dfsadmin             run a DFS admi

## Administration Commands

- https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html

### Verify HDFS cluster status

In [3]:
%%dockerexec hadoop

source /opt/envvars.sh

# print topology
hdfs dfsadmin -printTopology

printf "\n%40s\n\n" |tr " " "="

hdfs dfsadmin -report

Rack: /default-rack
   172.19.0.2:9866 (hadoop2.docker_hadoopnet) In Service
   172.19.0.4:9866 (hadoop1.docker_hadoopnet) In Service
   172.19.0.3:9866 (hadoop3.docker_hadoopnet) In Service



Configured Capacity: 100908072960 (93.98 GB)
Present Capacity: 43334434816 (40.36 GB)
DFS Remaining: 43334361088 (40.36 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 
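The figures in parentheses can be reproduced from the raw byte counts. The sketch below assumes the report uses binary units (1 KB = 1024 bytes), which agrees with the numbers shown above. The helper name `to_binary_units` is hypothetical and not part of Hadoop, and its two-decimal formatting is only an approximation of the report's own formatting.

```python
def to_binary_units(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KB = 1024 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Byte counts copied from the report above.
print(to_binary_units(100908072960))  # Configured Capacity
print(to_binary_units(43334434816))   # Present Capacity
```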

### Replication factor

#### Run randomtext application

In [4]:
%%dockerexec hadoop

source /opt/envvars.sh

# deletes randomtext folder if it exists
hdfs dfs -test -d /user/hadoop/randomtext && hdfs dfs -rm -r /user/hadoop/randomtext

cd /opt/hadoop/share/hadoop/mapreduce

hadoop jar ./hadoop-mapreduce-examples-$HADOOP_VERSION.jar randomtextwriter \
  -D mapreduce.randomtextwriter.totalbytes=157286400 \
  -D mapreduce.randomtextwriter.bytespermap=52428800 \
  -D mapreduce.output.fileoutputformat.compress=false \
  -outFormat org.apache.hadoop.mapreduce.lib.output.TextOutputFormat \
  randomtext

2023-12-04 10:24:13,338 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at hadoop/172.19.0.5:8032
2023-12-04 10:24:13,521 INFO client.AHSProxy: Connecting to Application History server at hadoop/172.19.0.5:10200
Running 3 maps.
Job started: Mon Dec 04 10:24:14 BRT 2023
2023-12-04 10:24:14,014 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at hadoop/172.19.0.5:8032
2023-12-04 10:24:14,014 INFO client.AHSProxy: Connecting to Application History server at hadoop/172.19.0.5:10200
2023-12-04 10:24:14,177 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1701696043891_0001
2023-12-04 10:24:15,204 INFO mapreduce.JobSubmitter: number of splits:3
2023-12-04 10:24:15,356 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1701696043891_0001
2023-12-04 10:24:15,356 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-12-04 10:24:15,496 INFO conf.Configurati
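The `Running 3 maps.` line is consistent with the two size parameters passed to the job. A quick check of the arithmetic, assuming the number of maps is the total output size divided by the bytes written per map (the division is exact for these values):

```python
MIB = 1024 * 1024

# Values passed with -D in the cell above.
total_bytes = 157286400    # mapreduce.randomtextwriter.totalbytes
bytes_per_map = 52428800   # mapreduce.randomtextwriter.bytespermap

num_maps, remainder = divmod(total_bytes, bytes_per_map)

print(f"total output: {total_bytes // MIB} MiB")
print(f"per map:      {bytes_per_map // MIB} MiB")
print(f"maps:         {num_maps} (remainder {remainder} bytes)")
```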

#### List folder block location

In [5]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs fsck /user/hadoop/randomtext -files -blocks -locations

Connecting to namenode via http://hadoop:9870/fsck?ugi=hadoop&files=1&blocks=1&locations=1&path=%2Fuser%2Fhadoop%2Frandomtext
FSCK started by hadoop (auth:SIMPLE) from /172.19.0.5 for path /user/hadoop/randomtext at Mon Dec 04 10:24:45 BRT 2023

/user/hadoop/randomtext <dir>
/user/hadoop/randomtext/_SUCCESS 0 bytes, replicated: replication=2, 0 block(s):  OK

/user/hadoop/randomtext/part-m-00000 52589372 bytes, replicated: replication=2, 2 block(s):  OK
0. BP-1060887769-172.19.0.5-1701696025450:blk_1073741833_1009 len=33554432 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]
1. BP-1060887769-172.19.0.5-1701696025450:blk_1073741835_1011 len=19034940 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]

/user/hadoop/ra
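The block lines in the fsck output are long, so it can help to extract only the block ID, length and replica locations. This is a minimal sketch based on the line format shown above; the regular expressions and the function name `parse_block_line` are assumptions for illustration, not a Hadoop API, and may need adjusting for other Hadoop versions.

```python
import re

# Matches the numbered block lines printed by "hdfs fsck ... -files -blocks -locations".
BLOCK_RE = re.compile(r"^\d+\.\s+\S+:(blk_\d+)_\d+\s+len=(\d+)\s+Live_repl=(\d+)")
# Captures the IP address of each DataNode that holds a replica.
NODE_RE = re.compile(r"DatanodeInfoWithStorage\[([\d.]+):\d+,")


def parse_block_line(line):
    """Return block details for a block line, or None for any other line."""
    match = BLOCK_RE.match(line.strip())
    if match is None:
        return None
    return {
        "block": match.group(1),
        "length": int(match.group(2)),
        "live_replicas": int(match.group(3)),
        "datanodes": NODE_RE.findall(line),
    }


# Block lines of part-m-00000, copied from the output above.
SAMPLE = [
    "0. BP-1060887769-172.19.0.5-1701696025450:blk_1073741833_1009 len=33554432 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]",
    "1. BP-1060887769-172.19.0.5-1701696025450:blk_1073741835_1011 len=19034940 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]",
]

blocks = [parse_block_line(line) for line in SAMPLE]
for block in blocks:
    print(block["block"], block["length"], block["datanodes"])

# The block lengths add up to the file size reported by fsck (52589372 bytes).
print(sum(block["length"] for block in blocks))
```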

#### Change replication factor of all files in directory to 3

In [6]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfs -setrep 3 /user/hadoop/randomtext

Replication 3 set: /user/hadoop/randomtext/_SUCCESS
Replication 3 set: /user/hadoop/randomtext/part-m-00000
Replication 3 set: /user/hadoop/randomtext/part-m-00001
Replication 3 set: /user/hadoop/randomtext/part-m-00002


#### List folder block location

In [7]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs fsck /user/hadoop/randomtext -files -blocks -locations

Connecting to namenode via http://hadoop:9870/fsck?ugi=hadoop&files=1&blocks=1&locations=1&path=%2Fuser%2Fhadoop%2Frandomtext
FSCK started by hadoop (auth:SIMPLE) from /172.19.0.5 for path /user/hadoop/randomtext at Mon Dec 04 10:25:39 BRT 2023

/user/hadoop/randomtext <dir>
/user/hadoop/randomtext/_SUCCESS 0 bytes, replicated: replication=3, 0 block(s):  OK

/user/hadoop/randomtext/part-m-00000 52589372 bytes, replicated: replication=3, 2 block(s):  OK
0. BP-1060887769-172.19.0.5-1701696025450:blk_1073741833_1009 len=33554432 Live_repl=3  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK], DatanodeInfoWithStorage[172.19.0.2:9866,DS-dd925961-0131-4e0a-acdb-9557d8dd9e52,DISK]]
1. BP-1060887769-172.19.0.5-1701696025450:blk_1073741835_1011 len=19034940 Live_repl=3  [DatanodeInfoWithStorage[172.19.0.3:9866,DS-9f58d647-656a-43c3-a8cf-2db68edaa6cf,DISK], DatanodeInfoWith

#### Change replication factor back to 2

In [8]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfs -setrep 2 /user/hadoop/randomtext

Replication 2 set: /user/hadoop/randomtext/_SUCCESS
Replication 2 set: /user/hadoop/randomtext/part-m-00000
Replication 2 set: /user/hadoop/randomtext/part-m-00001
Replication 2 set: /user/hadoop/randomtext/part-m-00002


#### List folder block location

In [9]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs fsck /user/hadoop/randomtext -files -blocks -locations

Connecting to namenode via http://hadoop:9870/fsck?ugi=hadoop&files=1&blocks=1&locations=1&path=%2Fuser%2Fhadoop%2Frandomtext
FSCK started by hadoop (auth:SIMPLE) from /172.19.0.5 for path /user/hadoop/randomtext at Mon Dec 04 10:26:27 BRT 2023

/user/hadoop/randomtext <dir>
/user/hadoop/randomtext/_SUCCESS 0 bytes, replicated: replication=2, 0 block(s):  OK

/user/hadoop/randomtext/part-m-00000 52589372 bytes, replicated: replication=2, 2 block(s):  OK
0. BP-1060887769-172.19.0.5-1701696025450:blk_1073741833_1009 len=33554432 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.2:9866,DS-dd925961-0131-4e0a-acdb-9557d8dd9e52,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]
1. BP-1060887769-172.19.0.5-1701696025450:blk_1073741835_1011 len=19034940 Live_repl=2  [DatanodeInfoWithStorage[172.19.0.2:9866,DS-dd925961-0131-4e0a-acdb-9557d8dd9e52,DISK], DatanodeInfoWithStorage[172.19.0.4:9866,DS-1e865565-6bed-481f-ae3b-d9f778884cf4,DISK]]

/user/hadoop/ra

### Decommission nodes

- Nodes to decommission are listed in the file referenced by dfs.hosts.exclude in hdfs-site.xml

In [10]:
%%dockerexec hadoop

source /opt/envvars.sh

# Decommission hadoop1 by listing it in the exclude file
cat > /opt/hadoop/etc/hadoop/dfs.exclude << EOF
hadoop1
EOF

hdfs dfsadmin -refreshNodes

Refresh nodes successful


- Check the decommissioning progress in the **NameNode** web UI: http://localhost:9870

#### Report HDFS status

In [12]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfsadmin -report

Configured Capacity: 67272048640 (62.65 GB)
Present Capacity: 28248800032 (26.31 GB)
DFS Remaining: 27929780224 (26.01 GB)
DFS Used: 319019808 (304.24 MB)
DFS Used%: 1.13%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 33636024320 (31.33 GB)
DFS Used: 159509904 (152.12 MB)
Non DFS Used: 17776865904 (16.56 GB)
DFS Remaining: 13964890112 (13.01 GB)
DFS Used%: 0.47%
DFS Remaining%: 41.52%
Configured Cache 
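To see the state of each node without reading the whole report, the per-node sections can be reduced to a hostname-to-status mapping. This is a minimal sketch that relies only on the `Hostname:` and `Decommission Status :` lines shown above; the function name `datanode_statuses` is hypothetical. The sample below contains only the hadoop2 section visible in the output; a decommissioned node is expected to report a status other than `Normal`.

```python
def datanode_statuses(report: str) -> dict:
    """Map each DataNode hostname to its decommission status."""
    statuses = {}
    hostname = None
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            hostname = value
        elif key == "Decommission Status" and hostname is not None:
            statuses[hostname] = value
            hostname = None
    return statuses


# DataNode section copied from the report above.
SAMPLE_REPORT = """\
Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 33636024320 (31.33 GB)
"""

print(datanode_statuses(SAMPLE_REPORT))
```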

#### Recommission all nodes

In [13]:
%%dockerexec hadoop

source /opt/envvars.sh

# Empty the exclude file so that no node is excluded
cat > /opt/hadoop/etc/hadoop/dfs.exclude << EOF
EOF

hdfs dfsadmin -refreshNodes

Refresh nodes successful


#### Report HDFS status

In [14]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfsadmin -report

Configured Capacity: 100908072960 (93.98 GB)
Present Capacity: 42691582752 (39.76 GB)
DFS Remaining: 42372538368 (39.46 GB)
DFS Used: 319044384 (304.26 MB)
DFS Used%: 0.75%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 33636024320 (31.33 GB)
DFS Used: 106197705 (101.28 MB)
Non DFS Used: 17670888759 (16.46 GB)
DFS Remaining: 14124179456 (13.15 GB)
DFS Used%: 0.32%
DFS Remaining%: 41.99%
Configured Cache

### Handling datanode failures

- timeouts defined in hdfs-site.xml
    - dfs.namenode.heartbeat.recheck-interval = 10000 ms (10 seconds)
    - dfs.heartbeat.interval = 3 seconds
- the NameNode marks a DataNode as dead after timeout = 2 x recheck-interval + 10 x heartbeat.interval
    - timeout = 2 x 10 s + 10 x 3 s = 50 seconds
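The timeout formula above can be written as a small calculation. One detail is easy to miss: the recheck interval is configured in milliseconds, while the heartbeat interval is in seconds. The function name `dead_node_timeout_seconds` is hypothetical and used only for illustration.

```python
def dead_node_timeout_seconds(recheck_interval_ms: int, heartbeat_interval_s: int) -> float:
    """Timeout = 2 x recheck-interval + 10 x heartbeat.interval, in seconds."""
    # Convert the heartbeat interval to milliseconds before adding.
    timeout_ms = 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s
    return timeout_ms / 1000


# Values from hdfs-site.xml listed above.
print(dead_node_timeout_seconds(10000, 3))  # 50.0
```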

In [15]:
%%dockerexec hadoop

source /opt/envvars.sh

# get dfs.namenode.heartbeat.recheck-interval
hdfs getconf -confKey dfs.namenode.heartbeat.recheck-interval

# get dfs.heartbeat.interval
hdfs getconf -confKey dfs.heartbeat.interval

10000
3


#### Simulate node fault

In [16]:
%%dockerexec hadoop

source /opt/envvars.sh

ssh hadoop1 'kill -9 $(cat /tmp/hadoop-hadoop-datanode.pid)'



- After the timeout, check the node status in the **NameNode** web UI: http://localhost:9870

In [17]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfsadmin -report

Configured Capacity: 67272048640 (62.65 GB)
Present Capacity: 28494179061 (26.54 GB)
DFS Remaining: 28247654400 (26.31 GB)
DFS Used: 246524661 (235.10 MB)
DFS Used%: 0.87%
Replicated Blocks:
	Under replicated blocks: 2
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 2
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 33636024320 (31.33 GB)
DFS Used: 106197705 (101.28 MB)
Non DFS Used: 17671220535 (16.46 GB)
DFS Remaining: 14123847680 (13.15 GB)
DFS Used%: 0.32%
DFS Remaining%: 41.99%
Configured Cache 

#### Restart datanode

In [18]:
%%dockerexec hadoop

source /opt/envvars.sh

ssh hadoop1 /opt/hadoop/bin/hdfs --daemon start datanode



#### Refresh nodes

In [19]:
%%dockerexec hadoop

source /opt/envvars.sh

hdfs dfsadmin -refreshNodes
hdfs dfsadmin -report

Refresh nodes successful
Configured Capacity: 100908072960 (93.98 GB)
Present Capacity: 42689764752 (39.76 GB)
DFS Remaining: 42298081280 (39.39 GB)
DFS Used: 391683472 (373.54 MB)
DFS Used%: 0.92%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.19.0.2:9866 (hadoop2.docker_hadoopnet)
Hostname: hadoop2
Decommission Status : Normal
Configured Capacity: 33636024320 (31.33 GB)
DFS Used: 159576064 (152.18 MB)
Non DFS Used: 17618132992 (16.41 GB)
DFS Remaining: 14123556864 (13.15 GB)
DFS Used%: 0.47%
DFS Remaining%