
[SPARK-31675][CORE] Fix rename and delete files with different filesystem #36070

Closed
wants to merge 2 commits

Conversation


@CHENXCHEN CHENXCHEN commented Apr 5, 2022

What changes were proposed in this pull request?

When we use a partitioned table, if the filesystem of a partition location is different from the filesystem of the table location, we get an exception such as java.lang.IllegalArgumentException: Wrong FS: s3a://path/to/spark3_snap/dt=2020-09-10, expected: hdfs://cluster, because HadoopMapReduceCommitProtocol uses the filesystem of the table location to operate on the files.
For example, the following SQL causes the above exception:

CREATE TABLE tmp.`spark3_snap` (`id` string) PARTITIONED BY (`dt` string)
STORED AS ORC LOCATION 'hdfs://path/to/spark3_snap';

-- The filesystem of the partition location differs from the filesystem of
-- the table location: one is S3A, the other is HDFS
ALTER TABLE tmp.spark3_snap ADD PARTITION (dt='2020-09-10')
LOCATION 's3a://path/to/spark3_snap/dt=2020-09-10';

-- This throws: "java.lang.IllegalArgumentException: Wrong FS: s3a://path/to/spark3_snap/dt=2020-09-10, expected: hdfs://cluster"
INSERT OVERWRITE TABLE tmp.spark3_snap PARTITION (dt)
SELECT '10' id, '2020-09-09' dt
UNION
SELECT '20' id, '2020-09-10' dt;

See SPARK-31675 for details.
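
A minimal sketch of the idea behind the fix, reusing the needCopy name from the diff quoted in the review below (the URI comparison is an illustrative assumption, not necessarily the exact check in the patch):

import org.apache.hadoop.fs.{FileSystem, Path}

// A rename only works within a single file system; when the source and
// destination resolve to different file systems (e.g. hdfs:// vs s3a://),
// the files have to be copied instead.
def needCopy(srcPath: Path, dstPath: Path,
    srcFs: FileSystem, dstFs: FileSystem): Boolean = {
  srcFs.getUri != dstFs.getUri
}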

Why are the changes needed?

Without this change, we cannot operate on partitions whose location is on a different filesystem from the table location.

Does this PR introduce any user-facing change?

Yes. Before this PR, an exception was thrown when the user operated on a partition location whose filesystem differed from the filesystem of the table location. After this PR, such partitions are handled as needed.

How was this patch tested?

Manual testing; I am not sure how to write a unit test in Spark to verify this patch.

@github-actions github-actions bot added the CORE label Apr 5, 2022
@CHENXCHEN CHENXCHEN changed the title [SPARK-32838][CORE] Fix rename and delete files with different filesystem [SPARK-31675][CORE] Fix rename and delete files with different filesystem Apr 5, 2022
@CHENXCHEN
Author

cc @cloud-fan could you help take a look when you have time? Thanks.

@AmplabJenkins

Can one of the admins verify this patch?

@cloud-fan
Contributor

Cross-file-system table writing sounds like a big feature to me. Currently Spark fails to write, so this feature is not supported yet, but I'm wondering what the best way to do it would be, e.g. shall we put the staging dir on the same file system as the target path? (A sketch of that option follows below.)

cc @AngersZhuuuu @yaooqinn
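
A minimal sketch of that option, assuming a hypothetical helper (the helper name and the .spark-staging- prefix are illustrative, not the committed design):

import org.apache.hadoop.fs.Path

// Hypothetical helper: derive the staging dir from the target path itself,
// so the final rename from staging to target never crosses file systems.
def stagingDir(targetPath: Path, jobId: String): Path =
  new Path(targetPath, s".spark-staging-$jobId")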

@CHENXCHEN
Author

CHENXCHEN commented Apr 7, 2022

The staging dir is generated based on the table location.
If we require that the generated partition locations be placed under the table location, we need to:

  1. Find the partition locations that differ from the table location, change their locations, and then delete the old locations in InsertIntoHadoopFsRelationCommand.scala
  2. Pass the partition locations that differ from the table location into the committer as new task files in FileFormatDataWriter.scala

Hive's approach is to keep the path of the partition location and move files across file systems (see Hive.java).
Would it be better if our behavior was consistent with Hive's? A sketch of that approach follows.
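
A minimal sketch of Hive-style move semantics, simplified from the behavior described above (moveFile and the URI check are illustrative, not Hive's exact code):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileUtil, Path}

def moveFile(src: Path, dst: Path, conf: Configuration): Boolean = {
  val srcFs = src.getFileSystem(conf)
  val dstFs = dst.getFileSystem(conf)
  if (srcFs.getUri == dstFs.getUri) {
    // Same file system: a cheap metadata-only rename is enough.
    srcFs.rename(src, dst)
  } else {
    // Different file systems: copy the bytes, then delete the source.
    FileUtil.copy(srcFs, src, dstFs, dst,
      true /* deleteSource */, true /* overwrite */, conf)
  }
}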

@AngersZhuuuu
Contributor

The current DS insert only supports writing to a staging dir for dynamic partition overwrite; this PR's case seems to use a Hive serde (since Hive serdes support configuring a staging dir on a different file system), and Spark's commit protocol does not support a staging dir on a different file system.

Adding support for different file systems requires considering a lot; you can check the points mentioned in #33828.

I am also writing a new built-in commit protocol in #36056; it behaves like Hive and makes all overwrites use a staging dir on the same file system as the target path.

@CHENXCHEN
Author

Do we need a wrapper file system in Spark to handle all file operations, including cross-file-system ones?
That sounds like a big change...

@CHENXCHEN
Author

The current DS insert only supports writing to a staging dir for dynamic partition overwrite; this PR's case seems to use a Hive serde (since Hive serdes support configuring a staging dir on a different file system), and Spark's commit protocol does not support a staging dir on a different file system.

Adding support for different file systems requires considering a lot; you can check the points mentioned in #33828.

I am also writing a new built-in commit protocol in #36056; it behaves like Hive and makes all overwrites use a staging dir on the same file system as the target path.

Yes, this PR's case uses a Hive serde (Hive serdes support configuring a staging dir on a different file system).

With #33828 and #36056, if we have multiple filesystems that differ from the staging dir's filesystem, we still get exceptions.

Contributor

@steveloughran steveloughran left a comment


You really should not be using the classic FileOutputCommitter against S3; as well as the performance being awful, it lacks correctness and resilience against failure during task commit. Problems with file rename here are essentially second order.

Which committer are you using, and what filesystems?

The code does look good for cross-EZ copies in HDFS.

val dstFs = dstPath.getFileSystem(hadoopConf)
// Copy files across different file systems (diff excerpt; call completed for readability)
if (needCopy(srcPath, dstPath, srcFs, dstFs)) {
  if (!FileUtil.copy(srcFs, srcFs.listStatus(srcPath).map(_.getPath), dstFs, dstPath,
      true /* deleteSource */, true /* overwrite */, hadoopConf)) {
    throw new IOException(s"Failed to copy $srcPath to $dstPath")
  }
}
Contributor


You may want to think about parallelizing the copy, as each file now takes time proportional to data.length / (download_bandwidth + upload_bandwidth); see the sketch after this comment.

It's a shame copy() returns false sometimes; it looks like that only happens if mkdirs() on the dest or delete(src) fails.
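
A minimal sketch of such a parallel copy over a fixed thread pool (the pool size, and the srcFs/dstFs/srcPath/dstPath/hadoopConf names carried over from the diff above, are assumptions for illustration):

import java.util.concurrent.Executors
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future}
import org.apache.hadoop.fs.{FileUtil, Path}

implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

// Launch one copy per source file instead of copying serially.
val copies = srcFs.listStatus(srcPath).map(_.getPath).toSeq.map { file =>
  Future {
    FileUtil.copy(srcFs, file, dstFs, new Path(dstPath, file.getName),
      false /* deleteSource */, true /* overwrite */, hadoopConf)
  }
}
// Fail the commit if any individual copy reported failure.
val allCopied = Await.result(Future.sequence(copies), Duration.Inf).forall(identity)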

Contributor


Can I highlight something I've noticed here: the copy() command stops when src read() returns -1, without doing any checks to validate the file length. Not great.
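
A minimal sketch of the missing validation, assuming the copy loop has both file statuses at hand (the surrounding names are illustrative):

import java.io.IOException

// After copying, compare lengths so a stream that ended early (read() == -1
// before the whole file arrived) is caught rather than silently accepted.
val srcLen = srcFs.getFileStatus(srcPath).getLen
val dstLen = dstFs.getFileStatus(dstPath).getLen
if (srcLen != dstLen) {
  throw new IOException(
    s"Incomplete copy of $srcPath: expected $srcLen bytes, found $dstLen")
}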

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Nov 20, 2022
@github-actions github-actions bot closed this Nov 21, 2022