HDFS-17926. Automatically create home directory for users #8514
Open
magnuma3 wants to merge 1 commit into
Description of PR
HDFS-17926
Currently, HDFS does not automatically create a user's home directory (e.g., /user/<user>). This requires administrators to manually create home directories, which adds operational overhead and can cause failures for user-facing tools (e.g., MapReduce job submission, Hive, Spark) that assume the home directory exists.
This JIRA tracks the development of automatic home directory creation so that when a user's home directory does not yet exist, HDFS creates it automatically with appropriate ownership (<username>:<supergroup>) and permissions (drwx------).
Motivation:
Behavior
When dfs.namenode.auto.create.user.home.enabled=true, the NameNode (NN) intercepts the following RPCs and creates the caller's home directory under /user/ if it does not yet exist:
hdfs dfs -ls and similar commands issue getFileInfo first, so a user's home directory is created the first time they touch the cluster.
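The create-on-first-touch behavior described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code in this patch: the class `AutoHomeDirSketch`, its `ensureHomeDir` and `getFileInfo` methods, and the `DirEntry` type are hypothetical names, and a plain `HashMap` stands in for the NN namespace. Only the `/user/<user>` layout, the `<username>:<supergroup>` ownership, and the `drwx------` mode come from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the behavior; not the actual NameNode code. */
class AutoHomeDirSketch {

    /** Minimal stand-in for a directory's owner, group, and mode string. */
    static final class DirEntry {
        final String owner;
        final String group;
        final String mode;

        DirEntry(String owner, String group, String mode) {
            this.owner = owner;
            this.group = group;
            this.mode = mode;
        }

        @Override
        public String toString() {
            return mode + " " + owner + ":" + group;
        }
    }

    // Models dfs.namenode.auto.create.user.home.enabled.
    private final boolean enabled;
    // Assumed group assigned to newly created home directories.
    private final String superGroup;
    // Stand-in for the namespace: path -> directory entry.
    private final Map<String, DirEntry> namespace = new HashMap<>();

    AutoHomeDirSketch(boolean enabled, String superGroup) {
        this.enabled = enabled;
        this.superGroup = superGroup;
    }

    /** Creates /user/<shortUser> if the feature is on and the path is missing. */
    void ensureHomeDir(String shortUser) {
        if (!enabled) {
            return;
        }
        String home = "/user/" + shortUser;
        namespace.putIfAbsent(home, new DirEntry(shortUser, superGroup, "drwx------"));
    }

    /** Stand-in for an intercepted RPC: ensure the caller's home, then look up the path. */
    DirEntry getFileInfo(String caller, String path) {
        ensureHomeDir(caller);
        return namespace.get(path); // null when the path does not exist
    }

    public static void main(String[] args) {
        AutoHomeDirSketch nn = new AutoHomeDirSketch(true, "supergroup");
        System.out.println(nn.getFileInfo("alice", "/user/alice"));
    }
}
```

In this model the first call from `alice` creates her directory as a side effect, and with the flag off the lookup simply returns `null`, which mirrors the pre-patch behavior.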
The directory is created with:
Creation is performed by the NN superuser (not the requesting user).
Results (success and most failures) are cached by short username so that
subsequent RPCs from the same user incur only a HashMap lookup (~0.001 ms)
instead of an NN getFileInfo round trip (~0.1 ms).
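A per-user result cache of this kind could look roughly like the following. It is a minimal sketch, not the implementation in this patch: `HomeDirResultCacheSketch` and `slowCheckAndCreate` are hypothetical names, `ConcurrentHashMap` is an assumption made here for thread safety (the description only says "HashMap"), and for simplicity every result is cached, whereas the description says only most failures are.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical sketch of caching home-directory results by short username. */
class HomeDirResultCacheSketch {

    // short username -> outcome of the check-and-create step (true = success).
    private final Map<String, Boolean> resultByUser = new ConcurrentHashMap<>();

    // Stand-in for the expensive path: a getFileInfo lookup plus mkdir if needed.
    private final Predicate<String> slowCheckAndCreate;

    HomeDirResultCacheSketch(Predicate<String> slowCheckAndCreate) {
        this.slowCheckAndCreate = slowCheckAndCreate;
    }

    /** Runs the slow path at most once per user; later calls are a map lookup. */
    boolean ensureHomeDir(String shortUser) {
        return resultByUser.computeIfAbsent(shortUser, slowCheckAndCreate::test);
    }

    public static void main(String[] args) {
        AtomicInteger slowCalls = new AtomicInteger();
        HomeDirResultCacheSketch cache = new HomeDirResultCacheSketch(user -> {
            slowCalls.incrementAndGet();
            return true; // pretend the directory exists or was created
        });
        cache.ensureHomeDir("alice");
        cache.ensureHomeDir("alice"); // served from the cache
        System.out.println("slow-path calls: " + slowCalls.get());
    }
}
```

Caching failures as well as successes keeps a user whose directory cannot be created from triggering the slow path on every RPC. A real implementation would also need a policy for which failures are worth retrying.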
How was this patch tested?
This feature was originally developed and added to an internal fork of Apache Hadoop 3.1.2, and has been running in production for over a year.
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?
AI Tooling
If an AI tool was used:
where is the name of the AI tool used.
https://www.apache.org/legal/generative-tooling.html