Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-11213] Display Beam Metrics in Spark History Server for Classic Runner #14409

Merged
merged 1 commit into from Apr 2, 2021

Conversation

iemejia
Copy link
Member

@iemejia iemejia commented Apr 2, 2021

R: @ibzib
CC: @tszerszen (for awareness)

@iemejia iemejia requested a review from ibzib April 2, 2021 12:20
@iemejia iemejia force-pushed the BEAM-11212-spark-metrics-display branch from 8059182 to 26ee103 Compare April 2, 2021 12:28
@iemejia iemejia changed the title [BEAM-11212] Display Beam Metrics in Spark History Server for Classic Runner too [BEAM-11213] Display Beam Metrics in Spark History Server for Classic Runner too Apr 2, 2021
@@ -34,18 +34,6 @@
*/
public interface SparkPipelineOptions extends SparkCommonPipelineOptions {

@Description("Set it to true if event logs should be saved to Spark History Server directory")
Copy link
Member Author

@iemejia iemejia Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically we avoid if possible to introduce variables that are already defined as Spark Configurations, so these variables should come from the Spark configuration to avoid duplicated non synchronized configuration changes and confusion for end users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. 👍

final JavaSparkContext jsc, SparkPipelineOptions pipelineOptions, long startTime) {
EventLoggingListener eventLoggingListener = null;
try {
if (jsc.getConf().getBoolean("spark.eventLog.enabled", false)) {
Copy link
Member Author

@iemejia iemejia Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the official Spark way to configure what the eventLogEnabled and sparkHistoryDir options were doing. So better to use these.

… Runner too

It also removes the SparkPipelineOptions related to events logging
because those are already configured and used by Spark config.
@iemejia iemejia force-pushed the BEAM-11212-spark-metrics-display branch from 26ee103 to 9e0b378 Compare April 2, 2021 14:20
@iemejia
Copy link
Member Author

iemejia commented Apr 2, 2021

Run Spark ValidatesRunner

@iemejia
Copy link
Member Author

iemejia commented Apr 2, 2021

Run Spark ValidatesRunner Java 11

@iemejia
Copy link
Member Author

iemejia commented Apr 2, 2021

Run Java Spark PortableValidatesRunner Batch

Copy link
Contributor

@ibzib ibzib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for extending and improving this.

@@ -34,18 +34,6 @@
*/
public interface SparkPipelineOptions extends SparkCommonPipelineOptions {

@Description("Set it to true if event logs should be saved to Spark History Server directory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. 👍

@@ -97,7 +102,7 @@
private static final Logger LOG = LoggerFactory.getLogger(SparkRunner.class);

/** Options used in this pipeline runner. */
private final SparkPipelineOptions mOptions;
private final SparkPipelineOptions pipelineOptions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 😄

metricsPusher.start();

if (eventLoggingListener != null && jsc != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious how jsc could be null here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't it would have definitely failed before but I am double checking just to be sure.

@iemejia iemejia merged commit 1e60f38 into apache:master Apr 2, 2021
@iemejia
Copy link
Member Author

iemejia commented Apr 2, 2021

Thanks @ibzib !

@iemejia iemejia changed the title [BEAM-11213] Display Beam Metrics in Spark History Server for Classic Runner too [BEAM-11213] Display Beam Metrics in Spark History Server for Classic Runner Apr 2, 2021
@iemejia iemejia deleted the BEAM-11212-spark-metrics-display branch April 2, 2021 19:34
iemejia added a commit to iemejia/beam that referenced this pull request Apr 2, 2021
… Metrics in Spark History Server for Classic Runner

It also removes the SparkPipelineOptions related to events logging
because those are already configured and used by Spark config.
kennknowles added a commit that referenced this pull request Apr 7, 2021
… 2.29.0: Display Beam Metrics in Spark History Server for Classic Runner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants