HDFS-17307. docker-compose.yaml sets namenode directory wrong causing datanode failures on restart #6387
Restarting existing services using the docker-compose.yaml causes the datanode to crash after a few seconds.
How to reproduce:
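The original reproduction steps are not captured above; the following is a minimal sketch of the restart scenario, assuming the docker-compose.yaml shipped with the Hadoop distribution, the Docker Compose v2 CLI, and a service named `datanode` (these names and commands are assumptions, not taken from this report):

```bash
docker compose up -d            # first start: the namenode formats and the datanode registers fine
docker compose stop             # stop the existing containers without removing them
docker compose start            # restart the same services ("restarting existing services")
docker compose logs datanode    # the datanode exits after a few seconds
```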
The log produced by the datanode suggests the issue is a mismatch between the clusterIDs of the namenode and the datanode:
After some troubleshooting I found that the namenode is not reusing the clusterID of the previous run because it cannot find it in the directory set by ENSURE_NAMENODE_DIR=/tmp/hadoop-root/dfs/name. This is due to a change of the default user of the namenode, which is now "hadoop", so the namenode is actually writing this information to /tmp/hadoop-hadoop/dfs/name.
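The change this points to is updating the directory in docker-compose.yaml to the path the "hadoop" user actually writes to. A minimal sketch, assuming the variable is set in the namenode service's `environment` section (the exact service name and layout are assumptions; the authoritative change is the PR diff itself):

```yaml
services:
  namenode:
    environment:
      # The namenode now runs as the "hadoop" user, so its metadata lives under
      # /tmp/hadoop-hadoop, not the old root-based /tmp/hadoop-root path.
      - ENSURE_NAMENODE_DIR=/tmp/hadoop-hadoop/dfs/name
```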
See https://issues.apache.org/jira/browse/HDFS-17307