[SPARK-27464][CORE] Added Constant instead of referring string literal used from many places #24368
Conversation
ok to test
```
@@ -1303,4 +1303,9 @@ package object config {
    .doc("Staging directory used while submitting applications.")
    .stringConf
    .createOptional

  private[spark] val BUFFER_PAGESIZE = ConfigBuilder("spark.buffer.pageSize")
    .bytesConf(ByteUnit.BYTE)
```
can you add a doc?
@HyukjinKwon
Thanks for your time. I've now added the doc and corrected the scalastyle issue.
I was going to say this needs a default, but it looks like the default in non-test code is not a single value, and different tests want different defaults.
Test build #104570 has finished for PR 24368 at commit
> corrected the scalastyle issue
```
@@ -255,7 +255,7 @@ private[spark] abstract class MemoryManager(
    }
    val size = ByteArrayMethods.nextPowerOf2(maxTungstenMemory / cores / safetyFactor)
    val default = math.min(maxPageSize, math.max(minPageSize, size))
-   conf.getSizeAsBytes("spark.buffer.pageSize", default)
+   conf.getSizeAsBytes(BUFFER_PAGESIZE.key, default)
```
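The default page-size computation in this hunk can be modeled with a short sketch. This is an illustrative Python model, not Spark's actual Scala code; the safety factor of 16 and the 1 MiB / 64 MiB clamping bounds are assumptions about `MemoryManager`'s constants.

```python
def next_power_of_2(n: int) -> int:
    # smallest power of two >= n, for n >= 1
    return 1 << (n - 1).bit_length()

def default_page_size(max_tungsten_memory: int, cores: int,
                      safety_factor: int = 16,
                      min_page: int = 1 << 20,      # assumed 1 MiB floor
                      max_page: int = 64 << 20) -> int:  # assumed 64 MiB cap
    # Mirror of the clamping above: take the per-core share of Tungsten
    # memory, divide by a safety factor, round up to a power of two,
    # then clamp into [min_page, max_page].
    size = next_power_of_2(max_tungsten_memory // cores // safety_factor)
    return min(max_page, max(min_page, size))
```

With these assumed constants, a large heap saturates at the cap while a small heap bottoms out at the floor, which matches the review remark that the non-test default is not a single fixed value.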
If this is a bytesConf, can this just use `conf.get()`?
@srowen
Updated the code to use conf.get.
Initially I thought the same, but there is no method in the SparkConf class that takes a default value.
Now I check the value and, if it is not present, initialize it with the computed default.
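The pattern described here can be sketched in miniature. This is a hedged toy model: the dict-based `conf` and the key constant are illustrative stand-ins, not Spark's actual `SparkConf`/`ConfigEntry` API.

```python
# The config entry is declared without a built-in default, so the
# caller must detect absence and substitute a default computed at
# runtime (as MemoryManager does with its page-size calculation).
BUFFER_PAGESIZE_KEY = "spark.buffer.pageSize"

def effective_page_size(conf: dict, computed_default: int) -> int:
    value = conf.get(BUFFER_PAGESIZE_KEY)
    # fall back only when the user did not set the key explicitly
    return value if value is not None else computed_default
```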
Test build #104571 has finished for PR 24368 at commit
Test build #104577 has finished for PR 24368 at commit
Small question, otherwise LGTM.
Merged to master
@srowen
### What changes were proposed in this pull request?

This PR fixes a regression introduced by #36885, which broke JsonProtocol's ability to handle missing fields in the exception field. Old event logs missing a `Stack Trace` will raise an NPE, and as a result the SHS misinterprets failed jobs/SQLs as `Active/Incomplete`. This PR solves the problem by checking the JsonNode for null; if it is null, an empty array of `StackTraceElement`s is used.

### Why are the changes needed?

Fixes a case that prevents the history server from identifying failed jobs when the stack trace was not set. Example event log:

```
{
  "Event":"SparkListenerJobEnd",
  "Job ID":31,
  "Completion Time":1616171909785,
  "Job Result":{
    "Result":"JobFailed",
    "Exception":{
      "Message":"Job aborted"
    }
  }
}
```

**Original behavior:** The job is marked as `incomplete`. Error from the SHS logs:

```
23/05/01 21:57:16 INFO FsHistoryProvider: Parsing file:/tmp/nds_q86_fail_test to re-build UI...
23/05/01 21:57:17 ERROR ReplayListenerBus: Exception parsing Spark event log: file:/tmp/nds_q86_fail_test
java.lang.NullPointerException
    at org.apache.spark.util.JsonProtocol$JsonNodeImplicits.extractElements(JsonProtocol.scala:1589)
    at org.apache.spark.util.JsonProtocol$.stackTraceFromJson(JsonProtocol.scala:1558)
    at org.apache.spark.util.JsonProtocol$.exceptionFromJson(JsonProtocol.scala:1569)
    at org.apache.spark.util.JsonProtocol$.jobResultFromJson(JsonProtocol.scala:1423)
    at org.apache.spark.util.JsonProtocol$.jobEndFromJson(JsonProtocol.scala:967)
    at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:878)
    at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:865)
    ....
23/05/01 21:57:17 ERROR ReplayListenerBus: Malformed line #24368: {"Event":"SparkListenerJobEnd","Job ID":31,"Completion Time":1616171909785,"Job Result":{"Result":"JobFailed","Exception": {"Message":"Job aborted"} }}
```

**After the fix:** Job 31 is marked as `failedJob`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a new unit test in JsonProtocolSuite.

Closes #41050 from amahussein/aspark-43340-b.

Authored-by: Ahmed Hussein <ahussein@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Added a constant instead of referring to the same string literal "spark.buffer.pageSize" in many places.
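The idea behind the change can be illustrated with a minimal sketch. This is Python for illustration only, not the actual Scala `ConfigBuilder` code; the function names are hypothetical.

```python
# The key is defined once; previously the raw string literal was
# repeated at every call site.
PAGE_SIZE_KEY = "spark.buffer.pageSize"

# before: each call site repeats the literal, so a typo in any one
# of them silently reads the wrong (absent) key
def size_before(conf: dict) -> int:
    return conf.get("spark.buffer.pageSize", 0)

# after: all call sites share the constant, so a renamed or mistyped
# key fails in exactly one place
def size_after(conf: dict) -> int:
    return conf.get(PAGE_SIZE_KEY, 0)
```

Centralizing the key also lets tooling find every usage of the setting by referencing the constant rather than grepping for the string.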
How was this patch tested?
Ran the corresponding unit test cases manually.