[SPARK-26180][CORE][TEST] Reuse withTempDir function to the SparkCore test case #23151
Conversation
Test build #99302 has finished for PR 23151 at commit
Force-pushed d29d4ec to 6402d2d.
Test build #99303 has finished for PR 23151 at commit
retest this please
Test build #99315 has finished for PR 23151 at commit
Good catch
 * @todo Probably this method should be moved to a more general place
 */
protected def withCreateTempDir(f: File => Unit): Unit = {
Is there any reason not to use withTempDir as the function name, like other modules do?
trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with PlanTest

If SparkFunSuite and SQLTestUtilsBase both use the same name withTempDir, could that cause a naming conflict? Thanks.
If we have this function in SparkFunSuite, why do we need to define it again in SQLTestUtils?
Currently, withTempDir and withCreateTempDir are somewhat different.
withTempDir
protected def withTempDir(f: File => Unit): Unit = {
val dir = Utils.createTempDir().getCanonicalFile
try f(dir) finally {
// wait for all tasks to finish before deleting files
waitForTasksToFinish()
Utils.deleteRecursively(dir)
}
}
protected def waitForTasksToFinish(): Unit = {
eventually(timeout(10.seconds)) {
assert(spark.sparkContext.statusTracker
.getExecutorInfos.map(_.numRunningTasks()).sum == 0)
}
}
withCreateTempDir
protected def withCreateTempDir(f: File => Unit): Unit = {
val dir = Utils.createTempDir()
try f(dir) finally {
Utils.deleteRecursively(dir)
}
}
thanks.
Shall we also do waitForTasksToFinish in withCreateTempDir?
I'm not sure we need assert(spark.sparkContext.statusTracker.getExecutorInfos.map(_.numRunningTasks()).sum == 0). After all, protected def spark: SparkSession is defined in SQLTestData. Unless we construct one, waitForTasksToFinish would look like:
protected def waitForTasksToFinish(): Unit = {
eventually(timeout(10.seconds)) {
}
}
OK, how about in SQLTestUtils we override withTempDir with this extra logic?
Sorry, maybe I didn't make it clear. withTempDir is defined in SQLTestUtilsBase, not in SQLTestUtils. If we want to override withTempDir, we would first have to move it from SQLTestUtilsBase to SQLTestUtils and then override it, but that way other functions get involved as well.
I'm not talking about details like which class to override, just the idea. Why wouldn't an override work?
Yes, it seems like we should be able to use an override. The subclass that needs to inject an additional method call in the block can call the super method with a lambda that calls the user-supplied block, then this other method. It's probably worth whatever surgery is needed to make this clean and reduce duplication. We already have a lot of "create temp thing" methods all over.
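The override-calls-super idea described here can be sketched as follows. This is a standalone illustration, not the actual Spark code: FunSuiteLike and SqlTestLike are hypothetical stand-ins for SparkFunSuite and SQLTestUtilsBase, and waitForTasksToFinish is stubbed out.

```scala
import java.io.File
import java.nio.file.Files

trait FunSuiteLike {
  // Base version: owns creation and deletion of the temp dir.
  protected def withTempDir(f: File => Unit): Unit = {
    val dir = Files.createTempDirectory("spark-test").toFile
    try f(dir) finally dir.delete()
  }
}

trait SqlTestLike extends FunSuiteLike {
  protected def waitForTasksToFinish(): Unit = { /* poll the status tracker */ }

  // The subclass wraps the user-supplied block: run f, then wait,
  // while the super method still handles creation and cleanup.
  // No temp-dir logic is duplicated.
  protected override def withTempDir(f: File => Unit): Unit = {
    super.withTempDir { dir =>
      f(dir)
      waitForTasksToFinish()
    }
  }
}
```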
 * @todo Probably this method should be moved to a more general place
 */
protected def withCreateTempDir(f: File => Unit): Unit = {
  val dir = Utils.createTempDir()
Is it better to call .getCanonicalFile, too?
I'm not sure we need to call .getCanonicalFile again; it feels a little redundant. Reviewing Utils.createTempDir() --> createDirectory --> dir.getCanonicalFile, .getCanonicalFile has already been called. Please correct me if I misunderstand. Thanks.
Yes, it shouldn't be necessary here.
 * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
 * returns.
 *
 * @todo Probably this method should be moved to a more general place
I think this is the most general place we can find...
Force-pushed 6402d2d to d7482a3.
Test build #99421 has finished for PR 23151 at commit
Force-pushed d7482a3 to 073f5f7.
Test build #99436 has finished for PR 23151 at commit
Force-pushed 073f5f7 to 0537eb6.
Test build #99440 has finished for PR 23151 at commit
Force-pushed 0537eb6 to 819dc6a.
Test build #99441 has finished for PR 23151 at commit
Force-pushed 819dc6a to c387cc7.
Test build #99442 has finished for PR 23151 at commit
I like the cleanup. I think there are even more places where createTempDir() is called and could use this new pattern. If you have time, are you willing to search out some more? We don't need to get all of them, but maybe there are some more easy cases out there.
val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
val fileSystem = Utils.getHadoopFileSystem("/",
  SparkHadoopUtil.get.newConfiguration(new SparkConf()))
try {
  withTempDir { testDir =>
    testDir.deleteOnExit()
Although I think this is redundant for temp dirs, you can put this in the Utils.createTempDir method and take it out in places like this.
deleteOnExit is redundant for directories created by Utils.createTempDir; that method already tracks directories to clean up on exit. (Also, deleteOnExit does not work for non-empty directories.)
 */
protected override def withTempDir(f: File => Unit): Unit = {
  val dir = Utils.createTempDir().getCanonicalFile
  try f(dir) finally {
Why not call the super method with a function that calls f, then waitForTasksToFinish()?
Force-pushed c387cc7 to 482c4f4.
/**
 * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
 * returns.
 *
nit: unnecessary blank line
 *
 */
protected override def withTempDir(f: File => Unit): Unit = {
  super.withTempDir { dir =>
yea this is what I expect, thanks for doing it!
// All the resources should still be remote paths, so that YARN client will not upload again.
conf.get("spark.yarn.dist.jars") should be (tmpJarPath)
I wouldn't change those spaces alone, though. Let's leave them as they were.
conf.get(PY_FILES.key) should be (s"s3a://${pyFile.getAbsolutePath}")
conf.get("spark.submit.pyFiles") should (startWith("/"))
conf.get(PY_FILES.key) should be(s"s3a://${pyFile.getAbsolutePath}")
Ditto. Technically it would be better to use assert and avoid the infix notation, but I think we don't have to do it here.
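For context, a sketch of the two styles being contrasted. The value and expected strings are made up, and the Matchers import path shown here varies by ScalaTest version.

```scala
// Illustrative only; not code from this PR.
import org.scalatest.matchers.should.Matchers._

val path = "s3a://bucket/file.py"

// ScalaTest infix matcher notation:
path should be ("s3a://bucket/file.py")
path should startWith ("s3a")

// Plain assert, the style preferred above:
assert(path == "s3a://bucket/file.py")
assert(path.startsWith("s3a"))
```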
Looks good. PR description should also be updated (e.g., inheritance, same name, etc.)
Test build #99489 has finished for PR 23151 at commit
Force-pushed 482c4f4 to beccd74.
Test build #99496 has finished for PR 23151 at commit
retest this please
Test build #99505 has finished for PR 23151 at commit
I think the comments to date have been incorporated... this is looking good.
@@ -245,8 +245,7 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
}

test("cannot call addFile with different paths that have the same filename") {
Hi, @heary-cao. This PR is very useful. While touching this file, could you replace more occurrences of val dir = Utils.createTempDir() with the new withTempDir in this file? For example, the above test case at line 235.
OK, I have updated it. Please review it. Thanks.
Force-pushed beccd74 to de0ef11.
Test build #99532 has finished for PR 23151 at commit
Force-pushed de0ef11 to 6d2f2d3.
Test build #99533 has finished for PR 23151 at commit
Merged to master.
… test case

## What changes were proposed in this pull request?

Currently, the common `withTempDir` function is used in Spark SQL test cases, to handle `val dir = Utils.createTempDir()` and `Utils.deleteRecursively(dir)`. Unfortunately, the `withTempDir` function cannot be used in Spark Core test cases. This PR shares the `withTempDir` function between Spark SQL and Spark Core to clean up Spark Core test cases. Thanks.

## How was this patch tested?

N/A

Closes apache#23151 from heary-cao/withCreateTempDir.

Authored-by: caoxuewen <cao.xuewen@zte.com.cn>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?

Currently, the common withTempDir function is used in Spark SQL test cases, to handle val dir = Utils.createTempDir() and Utils.deleteRecursively(dir). Unfortunately, the withTempDir function cannot be used in Spark Core test cases. This PR shares the withTempDir function between Spark SQL and Spark Core to clean up Spark Core test cases. Thanks.

How was this patch tested?

N/A