
[SPARK-36571][SQL] Add an SQLOverwriteHadoopMapReduceCommitProtocol to support all SQL overwrite write data to staging dir #36056

Conversation

@AngersZhuuuu (Contributor) commented Apr 4, 2022

What changes were proposed in this pull request?

The current data source insert SQL commit protocol has the following problems:
Case a: when job A and job B both write data into partitioned table TBL with different static partitions, they conflict because they use the same temp location ${table_location}/_temporary/0/....; when job A finishes, it cleans up this temp location, which removes job B's temp data and causes job B to fail to write its data.
Case b: with the current dynamic partition insert, if a job is killed while writing data, it leaves data behind in the staging dir under the table path.
Case c: if we use a dynamic partition insert into a new table with a huge number of partitions, we have to move the partition paths one by one. In this case we could simply rename the staging dir to the table path, which is much faster. But to do this, the staging dir must be customizable and must not be placed under the table location.

In this PR, we add a new built-in SQL commit protocol, **SQLOverwriteHadoopMapReduceCommitProtocol**.
It defines a new staging path parallel to the target table path:
`new Path(new Path(targetTablePath).getParent, s".${new Path(targetTablePath).getName}-spark-staging-" + jobId)`
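
For illustration only, a minimal sketch of how such a staging path can be built with Hadoop's `Path` API; the variable values below are placeholders, not code from this PR:

```scala
import java.util.UUID
import org.apache.hadoop.fs.Path

// Hypothetical inputs: the target table location and a job id.
val targetTablePath = "/warehouse/db.db/tbl"
val jobId = UUID.randomUUID().toString

// The staging dir sits next to (not under) the table directory,
// e.g. /warehouse/db.db/.tbl-spark-staging-<jobId>.
val target = new Path(targetTablePath)
val stagingDir = new Path(target.getParent, s".${target.getName}-spark-staging-$jobId")
```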
We only change the behavior of overwrite mode. In overwrite mode:

For a non-partition insert:

1. Spark won't delete the target table location before computing.
2. Before job commit, the data has been computed and stored in the staging dir.
3. Before calling the output committer's `commitJob`, we delete the target table location.
4. After calling the output committer's `commitJob`, we rename the staging dir to the target table location.
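
A minimal sketch of this commit sequence, assuming a Hadoop `FileSystem` handle is available at job-commit time; the method and parameter names are placeholders rather than the actual implementation in this PR:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.mapreduce.{JobContext, OutputCommitter}

// Hypothetical sketch of the non-partition overwrite commit sequence.
def commitNonPartitionOverwrite(
    fs: FileSystem,
    committer: OutputCommitter,
    jobContext: JobContext,
    stagingDir: Path,
    targetTablePath: Path): Unit = {
  // The target table location is deleted only now, after all data has
  // already been written under the staging dir.
  fs.delete(targetTablePath, true)
  // The output committer finalizes the job output under the staging dir.
  committer.commitJob(jobContext)
  // Finally, rename the staging dir onto the target table location.
  fs.rename(stagingDir, targetTablePath)
}
```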

For an all-static-partition insert:

1. Spark won't delete the matching partition before computing.
2. Before job commit, the data has been computed and stored in the staging dir.
3. Before calling the output committer's `commitJob`, we delete the matching partition.
4. During `commitJob`, data for partitions with a custom partition path is written to that custom partition path.
5. After calling the output committer's `commitJob`: if a custom partition path is specified, the result data has already been written to it during `commitJob`; otherwise Spark renames the staging dir to the target partition location.
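
The all-static-partition case follows the same pattern. A minimal sketch under the same assumptions as above, where `matchingPartitionPath` and `customPartitionPath` are hypothetical names for the resolved partition location and the optional user-specified location:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.mapreduce.{JobContext, OutputCommitter}

// Hypothetical sketch of the all-static-partition overwrite commit sequence.
def commitStaticPartitionOverwrite(
    fs: FileSystem,
    committer: OutputCommitter,
    jobContext: JobContext,
    stagingDir: Path,
    matchingPartitionPath: Path,
    customPartitionPath: Option[Path]): Unit = {
  // Delete the matching partition only after the data has been staged.
  fs.delete(matchingPartitionPath, true)
  // commitJob writes data for a custom partition path directly to that path.
  committer.commitJob(jobContext)
  // Without a custom partition path, the staged data still needs to be
  // renamed to the target partition location.
  if (customPartitionPath.isEmpty) {
    fs.rename(stagingDir, matchingPartitionPath)
  }
}
```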

For a dynamic partition insert: the behavior is the same as SQLHadoopMapReduceCommitProtocol.

For a dynamic partition overwrite in static mode:

1. Spark won't delete the matching partition before computing.
2. Before job commit, the data has been computed and stored in the staging dir.
3. Before calling the output committer's `commitJob`, we delete the matching partition.
4. During `commitJob`, data for partitions with a custom partition path is written to that custom partition path.
5. After calling the output committer's `commitJob`, we rename the normal partition paths under the staging dir to their target locations.
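
For context, this case corresponds to a dynamic-partition INSERT OVERWRITE while `spark.sql.sources.partitionOverwriteMode` is left at its default `static` value; the table and view names below are illustrative:

```scala
// Assumes an active SparkSession (`spark`, `sql`), a partitioned table
// t(c1, p1, p2), and a temp view `temp` providing the input data.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "static")
sql("INSERT OVERWRITE TABLE t PARTITION (p1, p2) SELECT c1, p1, p2 FROM temp")
```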

Why are the changes needed?

Provide a new built-in SQL commit protocol that can handle the problems mentioned in the PR description.

Does this PR introduce any user-facing change?

Users can set the SQL commit protocol to `org.apache.spark.sql.execution.datasources.SQLOverwriteHadoopMapReduceCommitProtocol` to use a commit protocol with a staging dir.
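
For example, assuming the existing `spark.sql.sources.commitProtocolClass` configuration key is what selects the protocol (as for the current built-in protocols), the setting would look roughly like this:

```scala
// Hypothetical usage; this config key selects the file commit protocol
// used for data source writes.
spark.conf.set(
  "spark.sql.sources.commitProtocolClass",
  "org.apache.spark.sql.execution.datasources.SQLOverwriteHadoopMapReduceCommitProtocol")
```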

How was this patch tested?

Added UT

@AngersZhuuuu (Contributor, Author):

Ping @cloud-fan. In order to support all overwrite cases, Spark can't delete the matching partitions before computing. But we also support custom partition paths, which are handled in the commit protocol, and Spark must delete the matching partitions before the job commit. However:

  1. On the committer side, we don't know the job's details (e.g. whether it is an overwrite, or an overwrite of a non-partitioned table).
  2. Also, the committer side doesn't know how to delete the matching partitions.
  3. Custom partition path overwrite is handled in the commit protocol's `commitJob`.

So the best processing order is:

  1. Do not delete the matching partitions.
  2. Compute the data and write it to the temp path.
  3. Delete the matching partitions.
  4. Call `commitJob`; in this step, custom partition paths are written and the data is committed to the staging dir.
  5. Rename the staging dir to the target location.

@AngersZhuuuu changed the title (dropping the [WIP] prefix) to [SPARK-36571][SQL] Add an SQLOverwriteHadoopMapReduceCommitProtocol to support all SQL overwrite write data to staging dir on Apr 4, 2022
@AngersZhuuuu (Contributor, Author):

Gentle ping @cloud-fan Could you take a look?

@dongjoon-hyun (Member):

Could you rebase this PR once more? I'll review this PR, @AngersZhuuuu .

@dongjoon-hyun (Member) left a comment:

It seems that there are no concurrent query tests in this PR, even though concurrency is mentioned as the main problem in the PR description. Did I understand correctly?

@AngersZhuuuu (Contributor, Author):

> Could you rebase this PR once more? I'll review this PR, @AngersZhuuuu.

Yea!

@AngersZhuuuu (Contributor, Author):

> It seems that there are no concurrent query tests in this PR, even though concurrency is mentioned as the main problem in the PR description. Did I understand correctly?

It's hard to make two concurrent tests commit files at exactly the same time.

@AngersZhuuuu (Contributor, Author):

> Could you rebase this PR once more? I'll review this PR, @AngersZhuuuu.

Conflict resolved

.toDF("c1", "p1").repartition(1)
df.createOrReplaceTempView("temp")
sql("INSERT OVERWRITE TABLE t SELECT * FROM temp")
checkAnswer(sql("SELECT * FROM t"), df)
@AngersZhuuuu (Contributor, Author) commented on this diff:

Before the rename to the target output path, the staging dir contains:

/Users/yi.zhu/Documents/project/Angerszhuuuu/spark/sql/core/spark-warehouse/org.apache.spark.sql.execution.datasources.SQLOverwriteHadoopMapReduceCommitProtocolSuite/.t-spark-staging-2f9f8d83-118a-45f4-86d3-d3b3f68e03fa/
├── _SUCCESS
└── part-00000-2f9f8d83-118a-45f4-86d3-d3b3f68e03fa-c000.snappy.parquet

val df = Seq(1, 2, 3).toDF("c1")
df.createOrReplaceTempView("temp")
sql("INSERT OVERWRITE TABLE t PARTITION (p1 = 1, p2 = 1) SELECT * FROM temp")
checkAnswer(sql("SELECT c1 FROM t WHERE p1 = 1 AND p2 = 1"), df)
@AngersZhuuuu (Contributor, Author) commented on this diff:

Before the rename to the partition's target path, the staging dir contains:

/Users/yi.zhu/Documents/project/Angerszhuuuu/spark/sql/core/spark-warehouse/org.apache.spark.sql.execution.datasources.SQLOverwriteHadoopMapReduceCommitProtocolSuite/.t-spark-staging-4800632c-fd44-4dd4-964c-be1aca8a939d/
├── _SUCCESS
└── p1=1
    └── p2=1
        ├── part-00000-4800632c-fd44-4dd4-964c-be1aca8a939d.c000.snappy.parquet
        └── part-00001-4800632c-fd44-4dd4-964c-be1aca8a939d.c000.snappy.parquet

2 directories, 3 files

df.createOrReplaceTempView("temp")
sql("INSERT OVERWRITE TABLE t SELECT * FROM temp")
checkAnswer(sql("SELECT * FROM t"), df)
checkAnswer(sql("SELECT c1 FROM t WHERE p1 = 1 AND p2 = 1"), Row(1) :: Nil)
@AngersZhuuuu (Contributor, Author) commented on this diff:

Before the move to the target path, the files in the staging dir are:

/Users/yi.zhu/Documents/project/Angerszhuuuu/spark/sql/core/spark-warehouse/org.apache.spark.sql.execution.datasources.SQLOverwriteHadoopMapReduceCommitProtocolSuite/.t-spark-staging-669ea1c6-5eaa-4ae7-a670-0261cbde3318/
├── _SUCCESS
├── p1=1
│   └── p2=1
│       └── part-00000-669ea1c6-5eaa-4ae7-a670-0261cbde3318.c000.snappy.parquet
├── p1=2
│   └── p2=2
│       └── part-00001-669ea1c6-5eaa-4ae7-a670-0261cbde3318.c000.snappy.parquet
└── p1=3
    └── p2=3
        └── part-00001-669ea1c6-5eaa-4ae7-a670-0261cbde3318.c000.snappy.parquet

@dongjoon-hyun (Member):

Thank you, @AngersZhuuuu .

@dongjoon-hyun (Member) left a comment:

Hi, @wangyum and @c21 . WDYT about this PR?

github-actions bot commented Dec 4, 2022

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions bot added the Stale label on Dec 4, 2022
github-actions bot closed this on Dec 5, 2022