
[SPARK-27464][CORE] Added Constant instead of referring string literal used from many places #24368

Closed
wants to merge 3 commits

Conversation

shivusondur (Contributor)

What changes were proposed in this pull request?

Added a constant instead of referring to the same string literal "spark.buffer.pageSize" from many places.

How was this patch tested?

Ran the corresponding unit test cases manually.

@HyukjinKwon (Member)

ok to test

@@ -1303,4 +1303,9 @@ package object config {
.doc("Staging directory used while submitting applications.")
.stringConf
.createOptional

private[spark] val BUFFER_PAGESIZE = ConfigBuilder("spark.buffer.pageSize")
.bytesConf(ByteUnit.BYTE)
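The idea behind the diff can be sketched as follows. `ConfigEntry` here is a hypothetical stand-in for the result of Spark's internal `ConfigBuilder` (which is private to the `config` package object), so this is an illustrative shape rather than the real API:

```scala
// A minimal sketch (ConfigEntry is a stand-in, not Spark's real internal
// ConfigBuilder API): the key literal lives in exactly one named constant,
// and every call site references the constant instead of the raw string.
final case class ConfigEntry(key: String)

object ConfigSketch {
  val BUFFER_PAGESIZE: ConfigEntry = ConfigEntry("spark.buffer.pageSize")
}
```

Centralizing the literal means a typo in the key can only happen in one place, and an IDE can find every usage of the setting.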
Member

can you add a doc?

Contributor Author

@HyukjinKwon
Thanks for your time.
I have now added the doc and corrected the scalastyle issue.

Member

I was going to say this needs a default, but it looks like the default in non-test code is not a single value, and different tests want different defaults.

@SparkQA

SparkQA commented Apr 14, 2019

Test build #104570 has finished for PR 24368 at commit bc815de.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


@@ -255,7 +255,7 @@ private[spark] abstract class MemoryManager(
}
val size = ByteArrayMethods.nextPowerOf2(maxTungstenMemory / cores / safetyFactor)
val default = math.min(maxPageSize, math.max(minPageSize, size))
conf.getSizeAsBytes("spark.buffer.pageSize", default)
conf.getSizeAsBytes(BUFFER_PAGESIZE.key, default)
Member

If this is a byteConf, can this just use conf.get()?

Contributor Author

@srowen
Updated the code to use conf.get.
Initially I thought the same, but the SparkConf class has no method that supplies a default value here. Now I check whether the value is present and, if it is not, initialize it with the default.
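The workaround described above can be sketched like this (a plain `Map` stands in for `SparkConf`, and the names are illustrative, not Spark's actual internals): read the entry by its constant key and fall back to the default that `MemoryManager` computes from the available memory and core count.

```scala
// Hypothetical sketch of "check the value, fall back to a computed default".
// The real code reads a SparkConf; a Map[String, Long] stands in for it here.
final case class ConfigEntry(key: String)

val BUFFER_PAGESIZE: ConfigEntry = ConfigEntry("spark.buffer.pageSize")

def pageSize(conf: Map[String, Long], computedDefault: Long): Long =
  conf.getOrElse(BUFFER_PAGESIZE.key, computedDefault) // user setting wins
```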

@SparkQA

SparkQA commented Apr 14, 2019

Test build #104571 has finished for PR 24368 at commit eba8417.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 14, 2019

Test build #104577 has finished for PR 24368 at commit 8cf7d81.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivusondur shivusondur changed the title [MINOR][CORE] Added Constant instead of referring string literal used from many places [SPARK-27464][CORE] Added Constant instead of referring string literal used from many places Apr 15, 2019
@attilapiros (Contributor) left a comment

Small question; otherwise LGTM.

@srowen (Member)

srowen commented Apr 16, 2019

Merged to master

@srowen srowen closed this in 88d9de2 Apr 16, 2019
@shivusondur (Contributor, Author)

@srowen
Thanks for merging.

dongjoon-hyun pushed a commit that referenced this pull request May 5, 2023
### What changes were proposed in this pull request?

This PR fixes a regression introduced by #36885, which broke JsonProtocol's ability to handle missing fields in the exception field. Old event logs missing a `Stack Trace` will raise an NPE.
As a result, the SHS misinterprets failed jobs/SQLs as `Active/Incomplete`.

This PR solves the problem by checking the JsonNode for null. If it is null, an empty array of `StackTraceElement`s is used instead.
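The null-check described above can be sketched as follows. `Option` stands in for Jackson's nullable `JsonNode` and `TraceElem` is an illustrative type, so this shows the shape of the fix rather than the actual `JsonProtocol` code:

```scala
// Hypothetical sketch: a missing "Stack Trace" field yields an empty array
// instead of dereferencing a null node (the cause of the NPE).
final case class TraceElem(cls: String, method: String, file: String, line: Int)

def stackTraceFromJson(node: Option[Seq[TraceElem]]): Array[StackTraceElement] =
  node match {
    case Some(elems) =>
      elems.map(e => new StackTraceElement(e.cls, e.method, e.file, e.line)).toArray
    case None =>
      Array.empty[StackTraceElement] // missing field => empty trace, no NPE
  }
```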

### Why are the changes needed?

Fixes a case that prevents the history server from identifying failed jobs when the stack trace was not set.

Example eventlog

```
{
   "Event":"SparkListenerJobEnd",
   "Job ID":31,
   "Completion Time":1616171909785,
   "Job Result":{
      "Result":"JobFailed",
      "Exception":{
         "Message":"Job aborted"
      }
   }
}
```

**Original behavior:**

The job is marked as `incomplete`

Error from the SHS logs:

```
23/05/01 21:57:16 INFO FsHistoryProvider: Parsing file:/tmp/nds_q86_fail_test to re-build UI...
23/05/01 21:57:17 ERROR ReplayListenerBus: Exception parsing Spark event log: file:/tmp/nds_q86_fail_test
java.lang.NullPointerException
    at org.apache.spark.util.JsonProtocol$JsonNodeImplicits.extractElements(JsonProtocol.scala:1589)
    at org.apache.spark.util.JsonProtocol$.stackTraceFromJson(JsonProtocol.scala:1558)
    at org.apache.spark.util.JsonProtocol$.exceptionFromJson(JsonProtocol.scala:1569)
    at org.apache.spark.util.JsonProtocol$.jobResultFromJson(JsonProtocol.scala:1423)
    at org.apache.spark.util.JsonProtocol$.jobEndFromJson(JsonProtocol.scala:967)
    at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:878)
    at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:865)
....
23/05/01 21:57:17 ERROR ReplayListenerBus: Malformed line #24368: {"Event":"SparkListenerJobEnd","Job ID":31,"Completion Time":1616171909785,"Job Result":{"Result":"JobFailed","Exception":
{"Message":"Job aborted"}
}}
```

**After the fix:**

Job 31 is marked as `failedJob`

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Added new unit test in JsonProtocolSuite.

Closes #41050 from amahussein/aspark-43340-b.

Authored-by: Ahmed Hussein <ahussein@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request May 5, 2023
LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request May 10, 2023
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
GladwinLee pushed a commit to lyft/spark that referenced this pull request Oct 10, 2023
catalinii pushed a commit to lyft/spark that referenced this pull request Oct 10, 2023