[SPARK-21349][CORE] Make TASK_SIZE_TO_WARN_KB configurable by spark.task.size.warnThreshold #18573

Closed
dongjoon-hyun wants to merge 1 commit into apache:master from dongjoon-hyun:SPARK-21349

@dongjoon-hyun
Member

dongjoon-hyun commented Jul 8, 2017

What changes were proposed in this pull request?

Although this only affects a warning message, this PR turns the hard-coded TASK_SIZE_TO_WARN_KB into a regular Spark configuration, spark.task.size.warnThreshold, for advanced users.

Case 1

scala> val data = (1 to (24*365*3)).map(i => (i, s"$i", i % 2 == 0)).toDF("col1", "part1", "part2")
data: org.apache.spark.sql.DataFrame = [col1: int, part1: string ... 1 more field]

scala> data.write.format("parquet").partitionBy("part1", "part2").mode("overwrite").saveAsTable("t")
17/08/29 21:08:04 WARN TaskSetManager: Stage 0 contains a task of very large size (190 KB). The maximum recommended task size is 100 KB.
17/08/29 21:09:34 WARN TaskSetManager: Stage 2 contains a task of very large size (233 KB). The maximum recommended task size is 100 KB.

scala> spark.version
res1: String = 2.3.0-SNAPSHOT

According to the log, this run produced 123 warnings. The warning is useful, but it would be better if the threshold were configurable.
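
For illustration, the check behind this warning can be sketched as below. This is a minimal, self-contained sketch and not Spark's TaskSetManager code: `TaskSizeWarning`, `check`, and its parameters are hypothetical names, and only the message format is copied from the log above.

```scala
// Hypothetical sketch of a size-threshold warning; not Spark's actual implementation.
object TaskSizeWarning {
  /** Returns a warning message when the serialized task exceeds thresholdKb, else None. */
  def check(stageId: Int, serializedBytes: Long, thresholdKb: Int): Option[String] = {
    val sizeKb = serializedBytes / 1024
    if (serializedBytes > thresholdKb.toLong * 1024) {
      Some(
        s"Stage $stageId contains a task of very large size ($sizeKb KB). " +
          s"The maximum recommended task size is $thresholdKb KB.")
    } else {
      None
    }
  }
}
```

With the threshold passed in as a parameter, a 190 KB task triggers the message at a 100 KB threshold and stays silent at, say, 256 KB, which is the behavior a configurable threshold would give advanced users.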

How was this patch tested?

Passes Jenkins with a newly added test case.

Contributor

Are you trying to make this configuration an internal one, or do you want to expose it to users? If the latter, I think we should add it to the docs.

Also, it would be better to make this configuration a ConfigEntry and add it to o.a.s.internal.config.

Member Author

Sure. Thank you for the review, @jerryshao!

@SparkQA

SparkQA commented Jul 9, 2017

Test build #79393 has finished for PR 18573 at commit 776d51a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 9, 2017

Test build #79398 has finished for PR 18573 at commit f90b2ff.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Retest this please

Contributor

Should add checkValue here.

Contributor

Why not use conf.get(TASK_SIZE_THRESHOLD_TO_WARN)?
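
The review suggestions in this thread (a typed entry instead of a raw string key, a `checkValue`-style validation, and a typed `conf.get(TASK_SIZE_THRESHOLD_TO_WARN)` lookup) could look roughly like the sketch below. Spark's real ConfigEntry machinery is internal to Spark and is not reproduced here, so `IntEntry`, `Conf`, and `Config` are hypothetical stand-ins. The key name comes from the PR title, and the default of 100 KB is an assumption taken from the limit printed in the warning message in the description.

```scala
// Hypothetical stand-in for a typed config entry; NOT Spark's ConfigEntry/ConfigBuilder API.
final case class IntEntry(
    key: String,
    default: Int,
    isValid: Int => Boolean, // plays the role of a checkValue-style validator
    errorMsg: String)

final class Conf(settings: Map[String, String]) {
  /** Typed lookup: parse the setting if present, else use the default, then validate. */
  def get(entry: IntEntry): Int = {
    val value = settings.get(entry.key).map(_.trim.toInt).getOrElse(entry.default)
    require(entry.isValid(value), s"'$value' in ${entry.key} is invalid. ${entry.errorMsg}")
    value
  }
}

object Config {
  val TASK_SIZE_THRESHOLD_TO_WARN: IntEntry = IntEntry(
    key = "spark.task.size.warnThreshold",
    default = 100, // KB; assumed from the limit shown in the warning message
    isValid = _ > 0,
    errorMsg = "The threshold must be a positive number of KB.")
}
```

Keeping the key, default, and validation in one entry means callers cannot mistype the key or forget the default, and an invalid value such as 0 fails at lookup with an `IllegalArgumentException` instead of silently changing the warning behavior.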

@dongjoon-hyun
Member Author

Thank you for the review, @jiangxb1987. I'll update soon.

@SparkQA

SparkQA commented Jul 9, 2017

Test build #79403 has finished for PR 18573 at commit f90b2ff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 9, 2017

Test build #79414 has finished for PR 18573 at commit 2a45d90.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

dongjoon-hyun commented Jul 9, 2017

Hi, @kayousterhout.
Could you review this PR? You introduced this warning in [SPARK-2185] "Emit warning when task size exceeds a threshold" in Spark 1.1.0.

@dongjoon-hyun
Member Author

Rebased onto master.

@SparkQA

SparkQA commented Aug 30, 2017

Test build #81249 has finished for PR 18573 at commit 6ab5b5b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

dongjoon-hyun deleted the SPARK-21349 branch September 1, 2017 17:54

4 participants