[BEAM-11213] Display Beam Metrics in Spark History Server #13743

Merged
merged 28 commits into from Feb 3, 2021

@ghost commented Jan 13, 2021

Creates an event log in the Spark History Server format, by default in the /tmp/spark-events directory, so that the Spark History Server can read it.


@ghost changed the title from "[BEAM-11213] Beam metrics should be displayed in Spark UI" to "[BEAM-11213] Display Beam Metrics in Spark History Server" on Jan 13, 2021

@tysonjh (Contributor) commented Jan 13, 2021

R: @ibzib

@ibzib (Contributor) left a comment

Thanks @tszerszen! A couple questions:

  • Does this PR affect the regular Spark UI at all, or does it only show metrics in the history server?
  • These changes only apply to the portable Spark runner. What about the "classic"/non-portable version?

runners/spark/job-server/build.gradle (outdated, resolved)
@@ -34,6 +34,12 @@
*/
public interface SparkPipelineOptions extends SparkCommonPipelineOptions {

@Description("The directory to save Spark History Server logs")
@Default.String("/tmp/spark-events/")
Contributor

Should we set spark.eventLog.dir in the Spark conf? Or does that not matter?

Author

It doesn't matter. For consistency, though, I think it would be good to configure it that way.
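To make the suggestion above concrete, here is a minimal sketch of how the history directory option could be mirrored into Spark configuration properties. The property names `spark.eventLog.enabled` and `spark.eventLog.dir` are standard Spark settings; the class `EventLogConf` and the method `eventLogProperties` are hypothetical names used only for illustration and are not part of this PR. A plain map is used so the sketch runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: maps the pipeline option values
// to the Spark properties that control event logging.
final class EventLogConf {

    // historyDir is assumed to come from the new pipeline option
    // (default "/tmp/spark-events/" in the diff above).
    static Map<String, String> eventLogProperties(String historyDir, boolean enabled) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.eventLog.enabled", Boolean.toString(enabled));
        if (enabled) {
            // Only set the directory when event logging is actually on.
            props.put("spark.eventLog.dir", historyDir);
        }
        return props;
    }

    public static void main(String[] args) {
        eventLogProperties("/tmp/spark-events/", true)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied with `SparkConf.set(key, value)`. The History Server itself reads its directory from `spark.history.fs.logDirectory`, so the two locations would need to match for the logs to show up.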

@ghost (Author) commented Jan 24, 2021

Run Java PreCommit

@ghost (Author) commented Jan 24, 2021

R: @ibzib

runners/spark/job-server/build.gradle (outdated, resolved)
}
}));
eventLoggingListener.onApplicationEnd(
new SparkListenerApplicationEnd(Instant.now().getMillis()));
Contributor

I'm pretty sure pipeline end time (and also start time for that matter) is itself a metric. To keep things consistent, it'd be better to use that metric here instead of Instant.now().

@ghost (Author), Jan 27, 2021

When I printed the results of the renderAll method, I didn't find such metrics for the whole pipeline, only for its parts. Maybe not all metrics appear in renderAll, or should I filter for them specifically?

Contributor

I'm not sure, it's not a blocker for this PR though. Thanks for checking.
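One way to reconcile the two positions in this thread is to prefer a metric-based end time when one is available and fall back to the wall clock otherwise. The sketch below is only an illustration under assumptions: the metric name `pipeline_end_time_millis`, the class `PipelineEndTime`, and the flat `Map<String, Long>` view of rendered metrics are all hypothetical, and the thread does not confirm that such a pipeline-level metric exists.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of the PR.
final class PipelineEndTime {

    // Assumed metric name; the author reports finding no such metric in renderAll.
    static final String END_TIME_METRIC = "pipeline_end_time_millis";

    // Returns the metric value if present, otherwise the supplied wall-clock time.
    static long resolveEndTimeMillis(Map<String, Long> renderedMetrics, long nowMillis) {
        return Optional.ofNullable(renderedMetrics.get(END_TIME_METRIC)).orElse(nowMillis);
    }
}
```

The result would be passed where the PR currently passes `Instant.now().getMillis()`, so the behavior is unchanged whenever the metric is absent.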

@ghost (Author) commented Jan 27, 2021

R: @ibzib

@ibzib (Contributor) left a comment

Thanks @tszerszen, this is looking a lot better. I have a few more comments.

@iemejia (Member) commented Jan 27, 2021

I have not checked the current status of the classic runners. Are the Beam metrics shown correctly there? If not, do you think we can reuse this work there too, @tszerszen?

@ghost (Author) commented Jan 28, 2021

@iemejia I think this work can be reused there, since it calls Spark's native EventLoggingListener.

@iemejia (Member) commented Jan 28, 2021

Great to know, @tszerszen, thanks. Sounds like a nice follow-up issue to create and fix, in case you feel motivated after this one is merged.

@ghost (Author) commented Jan 30, 2021

R: @ibzib

@ibzib (Contributor) left a comment

Thanks Tomasz, I think this will be good to go after addressing a couple things:

  1. Remove the new options from the Spark job server configuration, and all related code. (getSparkHistoryDir and getEventLogEnabled should only be pipeline options.)
  2. Fix the SparkListenerExecutorAdded logic.

@ibzib (Contributor) left a comment

Some minor cleanup and then we can merge this.

@ghost (Author) commented Feb 3, 2021

Run Java PreCommit

@ghost (Author) commented Feb 3, 2021

Run Java PreCommit

@ghost (Author) commented Feb 3, 2021

R: @ibzib

@ibzib (Contributor) left a comment

LGTM, thank you!

@ibzib (Contributor) commented Feb 3, 2021

Run Java PreCommit

@ghost (Author) commented Feb 3, 2021

Run Java PreCommit

@ibzib (Contributor) commented Feb 3, 2021

Java test failure seems to be a flake: BEAM-11746

@ibzib (Contributor) commented Feb 3, 2021

Run Java PreCommit

@ibzib merged commit 654ad2b into apache:master on Feb 3, 2021