[SPARK-35106] Avoid failing rename in HadoopMapReduceCommitProtocol with dynamic partition overwrite #32207
Conversation
Test build #756210704 for PR 32207 at commit

cc @mridulm and @cloud-fan

Kubernetes integration test starting

Kubernetes integration test status failure

Test build #137490 has finished for PR 32207 at commit
I took a look at the failing test: it does seem legitimate, but it led me to be even more confused about this functionality. It fails at this last step:

I ran through this code in a debugger and it appears that the unit test is relying on the behavior of

This means that the unit test works properly on a local FS, but fails when run against HDFS. I verified this by executing the unit test code (slightly modified) in a Spark Shell instance:

scala> val scheme = "file"
scala> :paste
// Entering paste mode (ctrl-D to finish)
val basepath = s"$scheme:/tmp/ekrogentest/base"
val path1 = s"$scheme:/tmp/ekrogentest/1"
val path2 = s"$scheme:/tmp/ekrogentest/2"
// refresh everything
sql("DROP TABLE IF EXISTS t")
val fs = new Path(basepath).getFileSystem(sc.hadoopConfiguration)
fs.delete(new Path(basepath).getParent, true)
Seq(basepath, path1, path2).foreach(p => fs.mkdirs(new Path(p)))
sql(
s"""
|create table t(i int, part1 int, part2 int) using parquet
|partitioned by (part1, part2) location '$basepath'
""".stripMargin)
//val path1 = Utils.createTempDir()
sql(s"alter table t add partition(part1=1, part2=1) location '$path1'")
sql(s"insert into t partition(part1=1, part2=1) select 1")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(1, 1, 1))
sql("insert overwrite table t partition(part1=1, part2=1) select 2")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1))
sql("insert overwrite table t partition(part1=2, part2) select 2, 2")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1) :: Row(2, 2, 2) :: Nil)
//val path2 = Utils.createTempDir()
sql(s"alter table t add partition(part1=1, part2=2) location '$path2'")
sql("insert overwrite table t partition(part1=1, part2=2) select 3")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1) :: Row(2, 2, 2) :: Row(3, 1, 2) :: Nil)
sql("insert overwrite table t partition(part1=1, part2) select 4, 1")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(4, 1, 1) :: Row(2, 2, 2) :: Row(3, 1, 2) :: Nil)
// Exiting paste mode, now interpreting.
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 1| 1| 1|
+---+-----+-----+
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 2| 1| 1|
+---+-----+-----+
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 2| 2| 2|
| 2| 1| 1|
+---+-----+-----+
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 3| 1| 2|
| 2| 2| 2|
| 2| 1| 1|
+---+-----+-----+
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 3| 1| 2|
| 2| 2| 2|
| 4| 1| 1|
+---+-----+-----+

Works fine when using the local file system. However, when I rerun the same steps using HDFS:

scala> val scheme = "hdfs"
scheme: String = hdfs
scala> :paste
// Entering paste mode (ctrl-D to finish)
val basepath = s"$scheme:/tmp/ekrogentest/base"
val path1 = s"$scheme:/tmp/ekrogentest/1"
val path2 = s"$scheme:/tmp/ekrogentest/2"
sql("DROP TABLE IF EXISTS t")
val fs = new Path(basepath).getFileSystem(sc.hadoopConfiguration)
fs.delete(new Path(basepath).getParent, true)
Seq(basepath, path1, path2).foreach(p => fs.mkdirs(new Path(p)))
sql(
s"""
|create table t(i int, part1 int, part2 int) using parquet
|partitioned by (part1, part2) location '$basepath'
""".stripMargin)
//val path1 = Utils.createTempDir()
sql(s"alter table t add partition(part1=1, part2=1) location '$path1'")
sql(s"insert into t partition(part1=1, part2=1) select 1")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(1, 1, 1))
sql("insert overwrite table t partition(part1=1, part2=1) select 2")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1))
sql("insert overwrite table t partition(part1=2, part2) select 2, 2")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1) :: Row(2, 2, 2) :: Nil)
//val path2 = Utils.createTempDir()
sql(s"alter table t add partition(part1=1, part2=2) location '$path2'")
sql("insert overwrite table t partition(part1=1, part2=2) select 3")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(2, 1, 1) :: Row(2, 2, 2) :: Row(3, 1, 2) :: Nil)
sql("insert overwrite table t partition(part1=1, part2) select 4, 1")
sql("SELECT * FROM t").show()
//checkAnswer(spark.table("t"), Row(4, 1, 1) :: Row(2, 2, 2) :: Row(3, 1, 2) :: Nil)
// Exiting paste mode, now interpreting.
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 1| 1| 1|
+---+-----+-----+
21/04/16 22:43:38 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/1 was not found. Was it deleted very recently?
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
+---+-----+-----+
21/04/16 22:43:39 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/1 was not found. Was it deleted very recently?
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 2| 2| 2|
+---+-----+-----+
21/04/16 22:43:39 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/2 was not found. Was it deleted very recently?
21/04/16 22:43:39 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/1 was not found. Was it deleted very recently?
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 2| 2| 2|
+---+-----+-----+
21/04/16 22:43:39 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/2 was not found. Was it deleted very recently?
21/04/16 22:43:40 WARN HadoopFSUtils: The directory hdfs://.../tmp/ekrogentest/1 was not found. Was it deleted very recently?
+---+-----+-----+
| i|part1|part2|
+---+-----+-----+
| 2| 2| 2|
+---+-----+-----+
basepath: String = hdfs:/tmp/ekrogentest/base
path1: String = hdfs:/tmp/ekrogentest/1
path2: String = hdfs:/tmp/ekrogentest/2
fs: org.apache.hadoop.fs.FileSystem = DFS[DFSClient[...]]

Now everything is broken. @cloud-fan, it looks like you added this; what is the expected behavior here? I can't tell if I'm missing something.
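For anyone trying to reproduce this, a minimal probe of the behavior this seems to hinge on, reusing fs and scheme from the snippet above (this assumes the relevant call is FileSystem#rename into a directory whose parent is missing; the probe paths are made up for illustration):

import org.apache.hadoop.fs.Path

// Create a source file, then make sure the destination's parent directory does not exist
// before attempting the rename.
val probeSrc = new Path(s"$scheme:/tmp/ekrogentest/probe/src")
val probeDst = new Path(s"$scheme:/tmp/ekrogentest/probe-missing/dst")
fs.mkdirs(probeSrc.getParent)
fs.create(probeSrc).close()
fs.delete(probeDst.getParent, true)  // ensure the destination's parent is gone
// On HDFS this is expected to return false rather than throw; the local FS may still
// succeed via its copy fallback, which would explain the divergence described above.
println(s"rename returned: ${fs.rename(probeSrc, probeDst)}")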
@xkrogen the test does seem to be legitimate, but this is likely a long-standing bug. Can you look into it and see if we can support it? If not, we need to throw a clear error instead of relying on
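One possible shape of that clear error, sketched with illustrative names (the elided reference above is presumably the ignored fs.rename return value; this is not an actual patch):

import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

// Fail loudly when a commit-time rename does not succeed, instead of silently ignoring the result.
def renameOrFail(fs: FileSystem, src: Path, dst: Path): Unit = {
  if (!fs.rename(src, dst)) {
    throw new IOException(s"Failed to rename $src to $dst during job commit")
  }
}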
I don't quite understand the commit sequence under the various modes, so I can't provide any quick input on whether this is easily fixable. I already spent more time on this issue than I was expecting and it's pretty far outside of my normal scope, so I can't devote more time currently, but if I do find some spare cycles in the future, I will try to circle back here. Thanks for your input so far!
Or we can ignore the test first and move forward. It's a long-standing bug and not caused by this patch.
Hello, about “we should only run Block 2 in the dynamicPartitionOverwrite == false case”: Block 2 is actually meant for custom partition paths (i.e. absolute partitions), in both the dynamic and the static partition overwrite cases. That's probably why InsertSuite.test("SPARK-20236: dynamic partition overwrite with custom partition path") failed with the changes. The fix could be to re-create the parent directories when required. We created another PR for this JIRA.
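A rough sketch of that "re-create the parent directories when required" direction, with names mirroring commitJob (illustrative only; not necessarily how the follow-up PR implements it):

import org.apache.hadoop.fs.{FileSystem, Path}

// Before renaming a staged file to its custom/absolute partition location, re-create the
// destination's parent in case the dynamic-overwrite cleanup removed it.
def moveAbsPathFiles(fs: FileSystem, filesToMove: Map[String, String]): Unit = {
  for ((src, dst) <- filesToMove) {
    val dstPath = new Path(dst)
    if (!fs.exists(dstPath.getParent)) {
      fs.mkdirs(dstPath.getParent)
    }
    fs.rename(new Path(src), dstPath)
  }
}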
@YuzhouSun Can you help to take over this PR?
Created #32530. @cloud-fan Could you review it? Thanks!
Closing in favor of #32530 |
What changes were proposed in this pull request?
Clean up code in HadoopMapReduceCommitProtocol#commitJob to avoid renames that will always fail (usually silently).

Why are the changes needed?

The renames in this block will always fail under dynamicPartitionOverwrite == true:

spark/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
Lines 191 to 218 in 0494dc9

We have the following sequence of events:
1. Delete the parent directories of filesToMove.values
2. Rename filesToMove.keys to filesToMove.values

All renames in the for-loop will always fail, since all parent directories of filesToMove.values were just deleted. Under a normal HDFS scenario, the contract of fs.rename is to return false in such a failure scenario, as opposed to throwing an exception. This allows dynamic partition overwrite to work, albeit with a bunch of failed renames in the middle. Really, we should only run this for-loop of renames in the dynamicPartitionOverwrite == false case, and consolidate the two if-blocks for the true case.
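To make that sequence of events concrete, here is a simplified paraphrase of the flow described above (not the literal commitJob source; parameter names mirror the description):

import org.apache.hadoop.fs.{FileSystem, Path}

// Why every rename fails when dynamicPartitionOverwrite is true in the flow described above:
// the parents of the rename destinations are deleted immediately before the rename loop runs.
def commitFlowSketch(
    fs: FileSystem,
    filesToMove: Map[String, String],
    dynamicPartitionOverwrite: Boolean): Unit = {
  if (dynamicPartitionOverwrite) {
    // 1. Delete the parent directories of filesToMove.values.
    filesToMove.values.map(dst => new Path(dst).getParent).toSet
      .foreach(parent => fs.delete(parent, true))
  }
  // 2. Rename filesToMove.keys to filesToMove.values. If the branch above ran, every
  //    destination parent is now gone, so on HDFS each of these renames returns false.
  for ((src, dst) <- filesToMove) {
    fs.rename(new Path(src), new Path(dst))
  }
}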
Does this PR introduce any user-facing change?

In almost all cases, no. However, if you happen to use a FileSystem implementation which throws an exception on this kind of fs.rename case, dynamicPartitionOverwrite will be unusable prior to this PR, and start working after this PR.

How was this patch tested?
Did not add/modify tests. Didn't see test cases for this file. Open to suggestions on where/how to add such tests.