[SPARK-56966][SPARK-56967][CORE] Auto-create event log and history se…#56028

Closed
sharma-0311 wants to merge 2 commits into apache:master from sharma-0311:SPARK-56966-SPARK-56967-auto-create-event-log-dirs

@sharma-0311
What changes were proposed in this pull request?

Auto-create the event log directory (spark.eventLog.dir) and the
history server log directory (spark.history.fs.logDirectory) if they
do not exist at startup, instead of failing with a FileNotFoundException.

FsHistoryProvider (SPARK-56966): In startPolling(), the
FileNotFoundException catch block now calls FileSystem.mkdirs with
LOG_FOLDER_PERMISSIONS before giving up. If creation succeeds the
directory is treated as valid; if it fails, the original warning-and-skip
behavior is preserved.
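
The fallback described above can be sketched roughly as follows. This is not the patch itself: it uses `java.nio.file` as a stand-in for Hadoop's `FileSystem`, the class and method names (`HistoryLogDirSketch`, `ensureLogDir`, `warn`) are hypothetical, the not-a-directory branch is a simplification, and permissions (`LOG_FOLDER_PERMISSIONS` in the PR) are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Illustrative only: java.nio.file plays the role of Hadoop's FileSystem.
final class HistoryLogDirSketch {

    /**
     * Returns true if logDir is usable as a log directory, creating it when absent.
     * Returns false, after printing a warning, when it cannot be used.
     */
    static boolean ensureLogDir(Path logDir) {
        try {
            // Status lookup; throws NoSuchFileException when the path is absent.
            BasicFileAttributes attrs =
                Files.readAttributes(logDir, BasicFileAttributes.class);
            if (!attrs.isDirectory()) {
                warn(logDir + " exists but is not a directory");
                return false;
            }
            return true;
        } catch (NoSuchFileException missing) {
            // New step: try to create the directory before giving up.
            try {
                Files.createDirectories(logDir);
                return true;
            } catch (IOException createFailed) {
                // Creation failed: keep the warn-and-skip outcome.
                warn("could not create " + logDir + ": " + createFailed);
                return false;
            }
        } catch (IOException other) {
            warn("could not check " + logDir + ": " + other);
            return false;
        }
    }

    private static void warn(String msg) {
        System.err.println("WARN " + msg);
    }
}
```

A caller would continue with polling only when `ensureLogDir` returns true, which mirrors the "treated as valid" outcome described above.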

EventLogFileWriters (SPARK-56967): requireLogBaseDirAsDirectory()
now checks fileSystem.exists first and calls FileSystem.mkdirs if the
path is absent. This works for local FS, HDFS, and S3-compatible paths.
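
A minimal sketch of the exists-then-create check, again with `java.nio.file` standing in for Hadoop's `FileSystem`. The class name `EventLogBaseDirSketch` and the exception messages are assumptions made for illustration, not the text of the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: java.nio.file plays the role of Hadoop's FileSystem.
final class EventLogBaseDirSketch {

    /** Creates logBaseDir if it is absent, then requires that it is a directory. */
    static void requireLogBaseDirAsDirectory(Path logBaseDir) {
        try {
            if (!Files.exists(logBaseDir)) {
                // Also creates missing parents; does not fail if the directory
                // already exists by the time this call runs.
                Files.createDirectories(logBaseDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(
                "Failed to create log directory " + logBaseDir, e);
        }
        // The directory requirement still applies to paths that already existed.
        if (!Files.isDirectory(logBaseDir)) {
            throw new IllegalArgumentException(
                "Log directory " + logBaseDir + " is not a directory.");
        }
    }
}
```

Because the existence check and the creation are two separate calls, another process could create the path in between; in this sketch that is harmless, since creating an already existing directory is a no-op.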

Why are the changes needed?

Users frequently see startup failures because the log directory has not
been pre-created, even when the path and permissions are otherwise correct.
Auto-creating it reduces operational friction with no behavioral regression
for directories that already exist.

Does this PR introduce any user-facing change?

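Yes. Previously, a missing directory caused EventLogFileWriters to throw a
FileNotFoundException and FsHistoryProvider to log a warning and skip it.
Both code paths now silently create the directory instead. The behavior
when a directory already exists is unchanged.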

How was this patch tested?

  • Unit tests in FsHistoryProviderSuite covering the new mkdirs path
  • Unit tests in EventLogFileWritersSuite covering auto-creation
  • Manual test with a non-existent local path and an S3 path

Closes #

Resolves https://issues.apache.org/jira/browse/SPARK-56966
Resolves https://issues.apache.org/jira/browse/SPARK-56967

…rver log directories if they do not exist

- FsHistoryProvider: when spark.history.fs.logDirectory does not exist,
  attempt to create it automatically via FileSystem.mkdirs instead of
  immediately failing. Falls back to a warning if creation fails.

- EventLogFileWriters: in requireLogBaseDirAsDirectory(), check for
  directory existence before getFileStatus and auto-create via
  FileSystem.mkdirs if missing. This prevents FileNotFoundException
  when spark.eventLog.dir (including S3 paths) has not been pre-created.