[SPARK-47008][CORE] Added Hadoops fileSystems hasPathCapability check to avoid FileNotFoundException(s) when using S3 Express One Zone Storage by leovegas · Pull Request #48497 · apache/spark

leovegas · 2024-10-16T14:13:58Z

What changes were proposed in this pull request?

Added a check for inconsistent directory listings, using Hadoop's fs.hasPathCapability(path, "fs.capability.directory.listing.inconsistent"), to the following method:
org.apache.spark.util.Utils#fetchHcfsFile
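
As a rough illustration of the check described above (not the code from this patch), a minimal sketch could look like the following. It assumes hadoop-common is on the classpath in a version that provides FileSystem#hasPathCapability. The object name ListingCapabilitySketch and the helper listTolerantly are hypothetical, and the choice to return an empty result on a missing directory is an assumption based on the commit messages in this PR.

```scala
import java.io.FileNotFoundException
import org.apache.hadoop.fs.{FileSystem, Path}

object ListingCapabilitySketch {
  // Capability name as quoted in the PR description.
  val InconsistentListing = "fs.capability.directory.listing.inconsistent"

  /** True if the file system declares that directory listings under `path` may be inconsistent. */
  def listingMayBeInconsistent(fs: FileSystem, path: Path): Boolean =
    fs.hasPathCapability(path, InconsistentListing)

  /**
   * Hypothetical helper: list the children of `dir`. A FileNotFoundException is
   * swallowed only when the file system reports the capability above; on any
   * other file system it propagates unchanged.
   */
  def listTolerantly(fs: FileSystem, dir: Path): Seq[Path] =
    try {
      fs.listStatus(dir).toSeq.map(_.getPath)
    } catch {
      case _: FileNotFoundException if listingMayBeInconsistent(fs, dir) =>
        // Assumed behaviour: treat the missing directory as empty.
        Seq.empty
    }
}
```

The guard on the catch clause keeps the existing behaviour for file systems with consistent listings, so only stores that declare the capability take the tolerant path.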

In org.apache.spark.deploy.SparkHadoopUtil#listLeafStatuses, the tree-walk logic is replaced by Hadoop's fs.listFiles method.
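
To make the difference concrete, here is a hedged sketch contrasting a manual tree walk with a single recursive fs.listFiles call. It is an illustration of the general pattern, not Spark's actual listLeafStatuses code; the object and method names are hypothetical, and hadoop-common is assumed to be on the classpath. Note that the commit list of this PR includes a later "Rollback using listFiles" commit, so this shows the approach described here rather than the final state of the branch.

```scala
import scala.collection.mutable.ArrayBuffer
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

object ListLeafSketch {
  /** Manual tree walk: one listStatus call per directory, recursing into subdirectories. */
  def treeWalk(fs: FileSystem, base: Path): Seq[FileStatus] = {
    def recurse(status: FileStatus): Seq[FileStatus] = {
      val (dirs, leaves) = fs.listStatus(status.getPath).toSeq.partition(_.isDirectory)
      leaves ++ dirs.flatMap(recurse)
    }
    val baseStatus = fs.getFileStatus(base)
    if (baseStatus.isDirectory) recurse(baseStatus) else Seq(baseStatus)
  }

  /** Same result set via listFiles(path, recursive = true), which returns files only. */
  def viaListFiles(fs: FileSystem, base: Path): Seq[FileStatus] = {
    val it = fs.listFiles(base, true)
    val out = ArrayBuffer.empty[FileStatus]
    // RemoteIterator is not a Scala Iterator, so drain it explicitly.
    while (it.hasNext) out += it.next()
    out.toSeq
  }
}
```

Delegating to listFiles lets the file system implementation decide how to enumerate the tree, which matters for object stores where a per-directory walk can observe directories that no longer exist.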

Why are the changes needed?

To let Spark work with S3 Express One Zone storage without throwing FileNotFoundException(s).
Details are in the SPARK-47008 Jira ticket.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests were added to:

test("SPARK-47008: ...) core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
test("SPARK-47008: ...) core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala

Was this patch authored or co-authored using generative AI tooling?

No

HyukjinKwon · 2024-10-17T01:34:49Z

Can you fill in the PR description, please?

github-actions · 2025-01-26T00:24:25Z

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

Leonid Timofeev and others added 22 commits May 17, 2024 08:56

5bbc431 [SPARK-47008][CORE] Added fileSystems hasPathCapability check to avoid FileNotFoundException(s) when using S3 Express One Zone Storage.
e483697 SPARK-47008: Added hasPathCapability check to fetchHcfsFile in Utils.scala Fixed import style.
74b055b SPARK-47008: fetchHcfsFile won't throw FNFE if fsHasPathCapability is true.
33493e9 SPARK-47008: Fixed import in Utils.scala
485a595 SPARK-47008: Made fsHasPathCapability variable lazy
005ec3a SPARK-47008: Added "Ignoring missing directory logs."
5b600fa SPARK-47008: Added unit test for fetchHcfsFile.
0e85635 SPARK-47008: Fixed imports in UtilsSuite.
69c173d Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check
3d09e38 Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check
86306cb Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check
afcfe22 SPARK-47008: Rolled back unnecessary formatting.
54f77d7 Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check
9f776f2 SPARK-47008: Used Hadoop's fs.listFiles in SparkHadoopUtil#listLeafStatuses and added test.
99213d7 SPARK-47008: Fixed scalastyle.
40f9168 SPARK-47008: Rollback using listFiles.
dbdb927 Merge remote-tracking branch 'origin/feature/SPARK-47008-add-hasPathCapability-check' into feature/SPARK-47008-add-hasPathCapability-check # Conflicts: # core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
dba1c23 Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check
bc078b1 SPARK-47008: Reverted changes in OrcFileOperator and OrcUtils
805791e Merge remote-tracking branch 'origin/feature/SPARK-47008-add-hasPathCapability-check' into feature/SPARK-47008-add-hasPathCapability-check
37c4124 SPARK-47008: Reverted tests for SparkHadoopUtilSuite
1587346 Merge branch 'apache:master' into feature/SPARK-47008-add-hasPathCapability-check

github-actions bot added the CORE label Oct 16, 2024

leovegas changed the title ~~Feature/spark 47008 add has path capability check~~ [SPARK-47008][CORE] Added Hadoops fileSystems hasPathCapability check to avoid FileNotFoundException(s) when using S3 Express One Zone Storage Oct 16, 2024

github-actions bot added the Stale label Jan 26, 2025

github-actions bot closed this Jan 27, 2025

leovegas wants to merge 22 commits into apache:master from leovegas:feature/SPARK-47008-add-hasPathCapability-check
