Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-21568][CORE] ConsoleProgressBar should only be enabled in shells #19061

Closed
wants to merge 4 commits into from
Closed

[SPARK-21568][CORE] ConsoleProgressBar should only be enabled in shells #19061

wants to merge 4 commits into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Aug 26, 2017

What changes were proposed in this pull request?

This PR disables console progress bar feature in non-shell environment by overriding the configuration.

How was this patch tested?

Manual. Run the following examples with and without spark.ui.showConsoleProgress in order to see progress bar on master branch and this PR.

Scala Shell

spark.range(1000000000).map(_ + 1).count

PySpark

spark.range(10000000).rdd.map(lambda x: len(x)).count()

Spark Submit

from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder.getOrCreate()
    spark.range(2000000).rdd.map(lambda row: len(row)).count()
    spark.stop()

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin .
Could you review this PR?

@SparkQA
Copy link

SparkQA commented Aug 26, 2017

Test build #81156 has finished for PR 19061 at commit 62645d2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 27, 2017

Test build #81159 has finished for PR 19061 at commit 1db4bb1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin .
Could you review this when you have sometime? I'm wondering if this is implemented correctly in a way you expected. Please let me know if there is something to do more. Thanks!

@jerryshao
Copy link
Contributor

@dongjoon-hyun I'm just thinking if other repl-like projects may actually require this, you changes here make them fail to leverage this feature. Did you see any issue with this feature on in non-shell apps?

Also I think you should do the check in SparkConf because user can still set this with SparkConf in run-time.

@dongjoon-hyun
Copy link
Member Author

Thank you for review, @jerryshao .
@vanzin reported the issue in non-shell apps in the JIRA. Please refer JIRA.

Copy link
Contributor

@vanzin vanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only covers spark-submit, and not applications that just embed a SparkContext and are not shells.

That would require changing the logic where this is initialized in SparkContext, although I'm not sure what would be the best way to handle that case...

@@ -537,6 +537,11 @@ object SparkSubmit extends CommandLineUtils {
}
}

// SPARK-21568 ConsoleProgressBar should be enabled only in shells
Copy link
Contributor

@vanzin vanzin Aug 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of comments like this. It's just repeating what the code is doing.

Instead, can you change the comment to mention that it's ok to do this because the user configuration is loaded later in this method and will override this value if set?

@@ -804,7 +804,7 @@ Apart from these, the following properties are also available, and may be useful
<td>
Show the progress bar in the console. The progress bar shows the progress of stages
that run for longer than 500ms. If multiple stages run at the same time, multiple
progress bars will be displayed on the same line.
progress bars will be displayed on the same line. This is applied only in shells.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will still be enabled for normal apps if explicitly set in the configuration, so your update is not really correct.

@dongjoon-hyun
Copy link
Member Author

Thank you for reiview, @vanzin . Yep. @jerryshao also advised me the missing point. I'll try to make it more general.

@dongjoon-hyun
Copy link
Member Author

BTW, @vanzin . I was waiting your comment because @jerryshao is worried about 3rd party relations.

other repl-like projects may actually require this

Is it okay we change this behavior like this?

@vanzin
Copy link
Contributor

vanzin commented Aug 29, 2017

Is it okay we change this behavior like this?

It's not optimal, but at the same time, the existing behavior wasn't really correctly advertised anyway. If such a thing as a non-Spark repl-like application exists, it wouldn't be getting the progress bar by default, for example, because its default log level is "INFO" in Spark, something that disables the progress bar.

@dongjoon-hyun
Copy link
Member Author

I see. Thank you for the guide!

@jerryshao
Copy link
Contributor

If such a thing as a non-Spark repl-like application exists, it wouldn't be getting the progress bar by default, for example, because its default log level is "INFO" in Spark, something that disables the progress bar.

That makes sense!

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-21568][CORE] ConsoleProgressBar should only be enabled in shells [SPARK-21568][CORE][WIP] ConsoleProgressBar should only be enabled in shells Oct 2, 2017
@SparkQA
Copy link

SparkQA commented Oct 3, 2017

Test build #82401 has finished for PR 19061 at commit a465619.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 3, 2017

Test build #82415 has finished for PR 19061 at commit 6f6141e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-21568][CORE][WIP] ConsoleProgressBar should only be enabled in shells [SPARK-21568][CORE] ConsoleProgressBar should only be enabled in shells Oct 3, 2017
@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin and @jerryshao .
Could you review this again?

@@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {
_statusTracker = new SparkStatusTracker(this)

_progressBar =
if (_conf.getBoolean("spark.ui.showConsoleProgress", true) && !log.isInfoEnabled) {
if (_conf.getBoolean("spark.ui.showConsoleProgress", false) && !log.isInfoEnabled) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, the default value is false. However, SparkSubmit will inject 'true' if spark.ui.showConsoleProgress is not defined in our Shell environments.

The original issue was the default behavior of this option. This PR solves that. If you want to stop users from using this option in order to enable ConsoleProgressBar, we also are able to change this config completely into some internal one, spark.internal.ui.showConsoleProgress in this SparkContext class. In that case, spark.ui.showConsoleProgress will converted into spark.internal.ui.showConsoleProgress in SparkSubmit.

@@ -598,6 +598,15 @@ object SparkSubmit extends CommandLineUtils with Logging {
}
}

// In case of shells, spark.ui.showConsoleProgress can be true by default or by user.
if (isShell(args.primaryResource)) {
if (!sparkConf.contains("spark.ui.showConsoleProgress")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a config constant now you're using this in more places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @vanzin . I'll create a new config for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ur, @vanzin . It seems there is no pre-defined config for spark.ui.XXX.
Is it a convention for Apache Spark? Which class do you prefer to add a new config constant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be added to the config package object like all other constants in core.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Then, I'll create a deploy/config.scala and put it there. The followings are the existing config.scala files, but it seems to be irrelevant to deploy/SparkSubmit.scala.

$ find . -name config.scala
./core/src/main/scala/org/apache/spark/deploy/history/config.scala
./resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/config.scala
./resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala

sysProps("spark.ui.showConsoleProgress") = "true"
}
} else {
sysProps("spark.ui.showConsoleProgress") = "false"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll prevent user's mistakes by overriding user's spark.ui.showConsoleProgress=true with SparkSubmit CLI. In case of SparkSubmit, only shells will accept user's spark.ui.showConsoleProgress.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, yes. It can be removed because SparkContext also accepts user's configuration. In that case, you're right. This is not needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I prefer blocking it from here. Please comment me back if this should be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need to prevent users from forcing this configuration if they want to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I see. Thanks!

@@ -203,6 +203,10 @@ package object config {
private[spark] val HISTORY_UI_MAX_APPS =
ConfigBuilder("spark.history.ui.maxApplications").intConf.createWithDefault(Integer.MAX_VALUE)

private[spark] val UI_SHOW_CONSOLE_PROGRESS = ConfigBuilder("spark.ui.showConsoleProgress")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While updating the PR, I noticed that internal/config/package.scala looks more natural for this.

Like spark.ui.retainedTasks, I added here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add document to this config using .doc(...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@SparkQA
Copy link

SparkQA commented Oct 4, 2017

Test build #82441 has finished for PR 19061 at commit e2158ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Could you review this ConsoleProgressBar PR again, @vanzin ?

@@ -203,6 +203,10 @@ package object config {
private[spark] val HISTORY_UI_MAX_APPS =
ConfigBuilder("spark.history.ui.maxApplications").intConf.createWithDefault(Integer.MAX_VALUE)

private[spark] val UI_SHOW_CONSOLE_PROGRESS = ConfigBuilder("spark.ui.showConsoleProgress")
.booleanConf
.createOptional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createWithDefault(false)

@@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {
_statusTracker = new SparkStatusTracker(this)

_progressBar =
if (_conf.getBoolean("spark.ui.showConsoleProgress", true) && !log.isInfoEnabled) {
if (_conf.getBoolean(UI_SHOW_CONSOLE_PROGRESS.key, false) && !log.isInfoEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_conf.get(UI_SHOW_CONSOLE_PROGRESS)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Do you mean this?

if (_conf.get(UI_SHOW_CONSOLE_PROGRESS)) {
...
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind.

@@ -598,6 +598,11 @@ object SparkSubmit extends CommandLineUtils with Logging {
}
}

// In case of shells, spark.ui.showConsoleProgress can be true by default or by user.
if (isShell(args.primaryResource) && !sparkConf.contains(UI_SHOW_CONSOLE_PROGRESS.key)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!sparkConf.contains(UI_SHOW_CONSOLE_PROGRESS)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, this?

if (isShell(args.primaryResource) && !sparkConf.contains(UI_SHOW_CONSOLE_PROGRESS)) {
...
}

val clArgs1 = Seq("--class", "org.apache.spark.repl.Main", "spark-shell")
val appArgs1 = new SparkSubmitArguments(clArgs1)
val (_, _, sysProps1, _) = prepareSubmitEnvironment(appArgs1)
sysProps1("spark.ui.showConsoleProgress") should be ("true")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UI_SHOW_CONSOLE_PROGRESS.key

val clArgs2 = Seq("--class", "org.SomeClass", "thejar.jar")
val appArgs2 = new SparkSubmitArguments(clArgs2)
val (_, _, sysProps2, _) = prepareSubmitEnvironment(appArgs2)
sysProps2.keys should not contain "spark.ui.showConsoleProgress"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UI_SHOW_CONSOLE_PROGRESS.key

@dongjoon-hyun
Copy link
Member Author

The PR is updated. Thank you for review, @vanzin !

@dongjoon-hyun
Copy link
Member Author

It's done. Thank you for review, @jiangxb1987 .

@SparkQA
Copy link

SparkQA commented Oct 6, 2017

Test build #82488 has finished for PR 19061 at commit 6e59cf1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 6, 2017

Test build #82491 has finished for PR 19061 at commit ff4b8e4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin and @jerryshao .
Could you review this again when you have a chance? Thank you!

@dongjoon-hyun
Copy link
Member Author

Retest this please.

@SparkQA
Copy link

SparkQA commented Oct 8, 2017

Test build #82545 has finished for PR 19061 at commit ff4b8e4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @vanzin and @jerryshao .
Could you review this ConsoleProgressBar issue when you have some time?

@vanzin
Copy link
Contributor

vanzin commented Oct 9, 2017

Merging to master.

@asfgit asfgit closed this in f31e114 Oct 9, 2017
@dongjoon-hyun
Copy link
Member Author

Thank you for review and merging, @vanzin !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants