
[SPARK-26457] Show hadoop configurations in HistoryServer environment tab #23486

Closed · wants to merge 11 commits into from

Conversation

@deshanxiao (Contributor) commented Jan 7, 2019

What changes were proposed in this pull request?

I know that YARN already provides all the Hadoop configurations, but it may still be worthwhile for the HistoryServer to unify all configurations in one place. That makes it more convenient to debug problems.

How was this patch tested?

[screenshot]

@planga82 (Contributor) commented Jan 7, 2019

Could you attach a screenshot so we can see what the new properties look like? Thanks!

@@ -442,9 +443,13 @@ object SparkEnv extends Logging {
val addedJarsAndFiles = (addedJars ++ addedFiles).map((_, "Added By User"))
val classPaths = (addedJarsAndFiles ++ classPathEntries).sorted

// Add Hadoop properties; ignore configs already included in Spark
val hadoopProperties = hadoopConf.asScala.filter(entry => !conf.contains("spark.hadoop." + entry.getKey)).
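
The line above is cut off by the diff view; presumably it continues by mapping the remaining entries to key/value pairs and sorting them. A hedged sketch of the whole expression as it stood at this point in the review (the exact continuation is not shown in the hunk):

import scala.collection.JavaConverters._

// Keep Hadoop entries that Spark does not already carry under a
// "spark.hadoop." prefix, then sort the pairs for display.
val hadoopProperties = hadoopConf.asScala
  .filter(entry => !conf.contains("spark.hadoop." + entry.getKey))
  .map(entry => (entry.getKey, entry.getValue))
  .toSeq
  .sorted
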
Contributor

Why do you exclude the "spark.hadoop" configs? Could it be that they are already shown in the Spark properties?

@deshanxiao (Contributor Author) replied Jan 8, 2019

Yes, these properties are already shown above, so I removed them from the Hadoop properties.

@deshanxiao (Contributor Author)

@planga82 Thank you for your reply! Sure! Here it is!

[screenshot]

@planga82 (Contributor) commented Jan 8, 2019

What happens if there are no Hadoop options? I was thinking about a standalone or Mesos scenario.

@deshanxiao (Contributor Author)

@planga82
For a YARN application it is convenient because we no longer need to look at Hadoop configurations in two places. I once hit a case where users set "fs.impl.disable.cache"=true, which caused an out-of-memory error when they listed a very large number of HDFS files. That setting does not show up in the Spark properties if it is only set in the Hadoop configuration files.
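
To illustrate the point (a hedged sketch, not code from this PR): a value set only in the Hadoop config files is visible through the Hadoop Configuration but never through SparkConf, so the Spark properties table cannot show it.

import org.apache.spark.SparkConf
import org.apache.spark.deploy.SparkHadoopUtil

val conf = new SparkConf()
val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)

// Set only in core-site.xml: visible here...
val fromHadoop = Option(hadoopConf.get("fs.impl.disable.cache"))
// ...but absent from SparkConf unless passed explicitly as spark.hadoop.*.
val fromSpark = conf.getOption("spark.hadoop.fs.impl.disable.cache")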

@planga82 (Contributor) commented Jan 9, 2019

@deshanxiao
It looks good to me. You need a committer to authorize testing.

@@ -442,9 +445,13 @@ object SparkEnv extends Logging {
val addedJarsAndFiles = (addedJars ++ addedFiles).map((_, "Added By User"))
val classPaths = (addedJarsAndFiles ++ classPathEntries).sorted

// Add Hadoop properties; ignore configs already included in Spark
val hadoopProperties = hadoopConf.asScala.filter(entry => !conf.contains("spark.hadoop." + entry.getKey)).
Member

Instead of conf.contains, use startsWith?

Member

I actually don't think we should filter them. If I am trying to debug why a config is not getting picked up, it's useful to see all of them listed here.
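
Dropping the filter would leave something like the following (a hedged sketch of the unfiltered listing, not necessarily the exact merged code):

// List every Hadoop entry, sorted, without excluding keys that Spark
// also carries under the "spark.hadoop." prefix.
val hadoopProperties = hadoopConf.asScala
  .map(entry => (entry.getKey, entry.getValue))
  .toSeq
  .sorted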

Contributor Author

@felixcheung Yes, I agree with you. I will remove it.

@@ -352,6 +352,7 @@ class VersionInfo private[spark](
class ApplicationEnvironmentInfo private[spark] (
val runtime: RuntimeInfo,
val sparkProperties: Seq[(String, String)],
val hadoopProperties: Seq[(String, String)],
Member

I tried running the SHS with previous event logs; it seems that the API change causes a JSON parse exception.

Contributor Author

Yes, you are right. Thank you! I have fixed it.

@deshanxiao (Contributor Author)

@srowen @shahidki31 Could you give me some suggestions?

@srowen (Member) left a comment

I think this is OK. My only concern is that it makes the env page bigger and heavier. Are these sections collapsed by default, or could they be?

@deshanxiao (Contributor Author)

@srowen Thank you! These sections can be collapsed, but they are not by default. Maybe the default style should stay in line with the other properties.

@HyukjinKwon (Member)

ok to test

@HyukjinKwon (Member)

+1 for Sean's opinion. I think it's okay too, but I'm a bit worried the Hadoop configurations might look overwhelming on that page.

@SparkQA commented Jan 15, 2019

Test build #101221 has finished for PR 23486 at commit b275970.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -352,6 +352,7 @@ class VersionInfo private[spark](
class ApplicationEnvironmentInfo private[spark] (
val runtime: RuntimeInfo,
val sparkProperties: Seq[(String, String)],
val hadoopProperties: Seq[(String, String)],
Member

Looks like it breaks MiMa, but the constructor is private. Let's exclude it in MiMa. See the Jenkins test messages.

Contributor Author

Thank you! I will try it.

@deshanxiao (Contributor Author)

Retest please.

@SparkQA commented Jan 15, 2019

Test build #101226 has finished for PR 23486 at commit 209ba5f.

  • This patch fails due to an unknown error code, -9.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Jan 15, 2019

@deshanxiao I wonder if it's easy enough to make this, or even the other sections besides Spark properties, collapsed by default in the HTML here too.

@@ -220,6 +220,9 @@ object MimaExcludes {
// [SPARK-26139] Implement shuffle write metrics in SQL
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.ShuffleDependency.this"),

// [SPARK-26457] Show hadoop configurations in HistoryServer environment tab
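
The hunk is cut off before the rule itself; based on the MiMa message quoted later in this thread, the line added under that comment is presumably:

ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ApplicationEnvironmentInfo.this"),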
Member

I wonder if this changes any user-facing API then? If it's something people might use anywhere, I'd keep the constructor.

Contributor Author

OK, maybe treating hadoopConf as an optional parameter would be better.
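
The "optional parameter" idea might look roughly like this (a hedged sketch; the field names beyond those shown in the hunk are assumed from the v1 API):

class ApplicationEnvironmentInfo private[spark] (
    val runtime: RuntimeInfo,
    val sparkProperties: Seq[(String, String)],
    val hadoopProperties: Seq[(String, String)],
    val systemProperties: Seq[(String, String)],
    val classpathEntries: Seq[(String, String)]) {

  // Secondary constructor matching the old signature, so pre-existing
  // callers keep compiling; Hadoop properties default to empty.
  private[spark] def this(
      runtime: RuntimeInfo,
      sparkProperties: Seq[(String, String)],
      systemProperties: Seq[(String, String)],
      classpathEntries: Seq[(String, String)]) =
    this(runtime, sparkProperties, Nil, systemProperties, classpathEntries)
}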

@deshanxiao (Contributor Author)

@srowen Yes, it doesn't look difficult. I will make the other properties collapsed by default.

@SparkQA commented Jan 16, 2019

Test build #101295 has finished for PR 23486 at commit b275970.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 16, 2019

Test build #101293 has finished for PR 23486 at commit ae78595.

  • This patch fails MiMa tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 16, 2019

Test build #101294 has finished for PR 23486 at commit fd0b26a.

  • This patch fails MiMa tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 16, 2019

Test build #101296 has finished for PR 23486 at commit 2fff471.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@deshanxiao (Contributor Author)

I have tried making these properties collapsed by default:

Hadoop Properties
System Properties
Classpath Entries

It looks like:

[screenshot]

@SparkQA commented Jan 16, 2019

Test build #101298 has finished for PR 23486 at commit cbcd90c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 16, 2019

Test build #101299 has finished for PR 23486 at commit 7bcd656.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

You'll have to resolve the MiMa warning:

[error]  * method this(org.apache.spark.status.api.v1.RuntimeInfo,scala.collection.Seq,scala.collection.Seq,scala.collection.Seq)Unit in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ApplicationEnvironmentInfo.this")

<a>System Properties</a>
</h4>
</span>
<div class="aggregated-systemProperties collapsible-table">
<div class="aggregated-systemProperties collapsible-table collapsed">
Member

Nice, so only the Spark properties and system info are expanded by default? That sounds good.

Contributor Author

Yes, but collapseTablePageLoad in web.js may remember the action once a user has expanded a section.

@SparkQA commented Jan 17, 2019

Test build #101339 has finished for PR 23486 at commit b9b434a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@deshanxiao (Contributor Author)

Retest please.

@SparkQA commented Jan 17, 2019

Test build #101340 has finished for PR 23486 at commit 1d921f7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 17, 2019

Test build #101341 has finished for PR 23486 at commit 1b9d325.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Jan 17, 2019

Merged to master

@srowen srowen closed this in 650b879 Jan 17, 2019
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
… tab

## What changes were proposed in this pull request?

I know that YARN already provides all the Hadoop configurations, but it may still be worthwhile for the HistoryServer to unify all configurations in one place. That makes it more convenient to debug problems.

## How was this patch tested?

![image](https://user-images.githubusercontent.com/42019462/50808610-4d742900-133a-11e9-868c-2976e856ed9a.png)

Closes apache#23486 from deshanxiao/spark-26457.

Lead-authored-by: xiaodeshan <xiaodeshan@xiaomi.com>
Co-authored-by: deshanxiao <42019462+deshanxiao@users.noreply.github.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>