HDFS-17926. Automatically create home directory for users by magnuma3 · Pull Request #8514 · apache/hadoop

magnuma3 · 2026-05-26T01:05:21Z

Description of PR

Currently, HDFS does not automatically create a user's home directory (e.g., /user/<user>). This requires administrators to manually create home directories, which adds operational overhead and can cause failures for user-facing tools (e.g., MapReduce job submission, Hive, Spark) that assume the home directory exists.

This JIRA tracks the development of automatic home directory creation so that when a user's home directory does not yet exist, HDFS creates it automatically with appropriate ownership (<username>:<supergroup>) and permissions (drwx------).

Motivation:

Reduces administrative burden for large multi-tenant clusters
Prevents job failures caused by missing home directories
Aligns with the behavior expected by higher-level ecosystem tools

Behavior

When dfs.namenode.auto.create.user.home.enabled=true, the NameNode (NN) intercepts
the following RPCs and creates the caller's home directory, /user/<user>, if it does
not yet exist:

create, mkdirs, getListing, getFileInfo, getLocatedFileInfo

hdfs dfs -ls and similar commands issue getFileInfo first, so a user's
home directory is created the first time they touch the cluster.
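
As a rough illustration of the interception rule above, the sketch below decides whether an incoming RPC should trigger the home directory check and, if so, which path to ensure. The class and method names are hypothetical and are not taken from the patch; only the RPC names and the /user prefix come from this description.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the interception decision; not the patch's actual code. */
final class HomeDirInterceptSketch {
    // RPC names as listed in the PR description.
    static final Set<String> INTERCEPTED_RPCS = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "create", "mkdirs", "getListing", "getFileInfo", "getLocatedFileInfo")));

    /**
     * Returns the home directory that should be ensured for this call,
     * or an empty Optional when the feature is off or the RPC is not intercepted.
     * Assumption: the home directory is always "/user/" + short username.
     */
    static Optional<String> homeDirToEnsure(
            boolean featureEnabled, String rpcName, String shortUserName) {
        if (!featureEnabled || !INTERCEPTED_RPCS.contains(rpcName)) {
            return Optional.empty();
        }
        return Optional.of("/user/" + shortUserName);
    }
}
```

For example, homeDirToEnsure(true, "getFileInfo", "alice") yields /user/alice, while the same call with the feature disabled, or with an RPC outside the list, yields an empty Optional.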

The directory is created with:

owner: caller's short username
group: caller's primary group (or a configured group if set)
permission: configured octal (default 0700)
quota: matched against group/user rules from the new quota config
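
The attribute rules above (except quota, whose configuration is not described here) can be sketched as follows. This is a minimal illustration under stated assumptions: the class, method, and parameter names are hypothetical, and a null or empty configured value is assumed to mean "not set".

```java
/** Hypothetical sketch of resolving owner, group, and permission for a new home directory. */
final class HomeDirAttributesSketch {
    // Default taken from the PR description (0700).
    static final String DEFAULT_PERMISSION_OCTAL = "700";

    final String owner;
    final String group;
    final int mode;

    private HomeDirAttributesSketch(String owner, String group, int mode) {
        this.owner = owner;
        this.group = group;
        this.mode = mode;
    }

    /** Assumption: a null or empty configured value falls back to the default. */
    static HomeDirAttributesSketch resolve(String shortUserName, String primaryGroup,
            String configuredGroup, String configuredOctal) {
        String group = isUnset(configuredGroup) ? primaryGroup : configuredGroup;
        String octal = isUnset(configuredOctal) ? DEFAULT_PERMISSION_OCTAL : configuredOctal;
        int mode = Integer.parseInt(octal, 8); // parse as base-8
        if (mode < 0 || mode > 0777) {
            throw new IllegalArgumentException("permission out of range: " + octal);
        }
        return new HomeDirAttributesSketch(shortUserName, group, mode);
    }

    private static boolean isUnset(String value) {
        return value == null || value.trim().isEmpty();
    }

    /** Renders the mode in ls style for a directory, e.g. 0700 as drwx------. */
    String symbolic() {
        String rwx = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder("d");
        for (int i = 0; i < 9; i++) {
            int bit = 1 << (8 - i); // highest bit is owner-read
            sb.append((mode & bit) != 0 ? rwx.charAt(i) : '-');
        }
        return sb.toString();
    }
}
```

With no overrides, resolve("alice", "staff", null, null) gives owner alice, group staff, and a symbolic form of drwx------, matching the default described above.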

Creation is performed as the NN superuser, not as the requesting user.

Results (success and most failures) are cached by short username so that
subsequent RPCs from the same user incur only a HashMap lookup (~0.001 ms)
instead of an NN getFileInfo round trip (~0.1 ms).
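
The caching behavior described above might look roughly like the sketch below. All names are hypothetical. Two details are assumptions rather than facts from this description: a ConcurrentHashMap is used because RPC handlers run concurrently, and the FAILED_RETRYABLE value stands in for whichever failures the patch chooses not to cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical sketch of a per-user result cache; not the patch's actual code. */
final class HomeDirResultCacheSketch {
    /** Assumed outcome categories; the real patch may classify results differently. */
    enum Result { CREATED_OR_EXISTS, FAILED_CACHEABLE, FAILED_RETRYABLE }

    private final ConcurrentMap<String, Result> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive path (existence check plus creation).
    private final Function<String, Result> ensureHomeDir;

    HomeDirResultCacheSketch(Function<String, Result> ensureHomeDir) {
        this.ensureHomeDir = ensureHomeDir;
    }

    /** Returns the cached result if present; otherwise runs the expensive path once. */
    Result check(String shortUserName) {
        Result cached = cache.get(shortUserName);
        if (cached != null) {
            return cached; // fast path: map lookup only
        }
        Result fresh = ensureHomeDir.apply(shortUserName);
        if (fresh != Result.FAILED_RETRYABLE) {
            // Cache success and cacheable failures; retryable failures are tried again later.
            cache.putIfAbsent(shortUserName, fresh);
        }
        return fresh;
    }
}
```

In this sketch two concurrent first calls for the same user can both reach the expensive path; that is tolerable only if the underlying creation is idempotent, which is an assumption here.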

How was this patch tested?

This feature was originally developed and added to an internal fork of Apache Hadoop 3.1.2, and has been running in production for over a year.

For code changes:

Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

AI Tooling

If an AI tool was used:

The PR includes the phrase "Contains content generated by <tool>"
where <tool> is the name of the AI tool used.
My use of AI contributions follows the ASF legal policy
https://www.apache.org/legal/generative-tooling.html

HDFS-17926. Automatically create home directory for users

dfb4496

github-actions Bot added HDFS trunk labels May 26, 2026

magnuma3 wants to merge 1 commit into apache:trunk from magnuma3:auto-create-home-directory
