New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-8133] Publishing results of Nexmark tests to InfluxDB #11956
Conversation
Run Seed Job |
8644628
to
7f2be43
Compare
7f2be43
to
fab9cff
Compare
Run Direct Runner Nexmark Tests |
Run Seed Job |
Run Direct Runner Nexmark Tests |
fab9cff
to
5fedf7e
Compare
Run Seed Job |
Run Direct Runner Nexmark Tests |
5fedf7e
to
e89172c
Compare
Run Seed Job |
Run Direct Runner Nexmark Tests |
1 similar comment
Run Direct Runner Nexmark Tests |
e89172c
to
1586c75
Compare
Run Seed Job |
Run Direct Runner Nexmark Tests |
R: @iemejia Could you take a look? |
cc: @tysonjh |
New results will be displayed in Grafana once this pull request is merged. The results of tests executed by phase triggering are written to different measurement (or table). |
results.forEach( | ||
map -> | ||
metricBuilder | ||
.append(map.get("measurement")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it would make sense to extract the appends of keys and values to a method, but I can't find a nice and clean way of doing it. @kamilwu do you have any thoughts about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a method getKV
that would return a String in this format: "key=value", e.g. "runner=DataflowRunner"? This would reduce the number of appends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, makes sense, thanks!
1586c75
to
92d802f
Compare
Run Direct Runner Nexmark Tests |
92d802f
to
116ad4a
Compare
Run Seed Job |
Run Direct Runner Nexmark Tests |
Run Java AvroIO Performance Test HDFS |
116ad4a
to
b52dc6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great so far. I let some questions, just minor stuff. it was not clear to me if we are conserving the same schema for compatibility reasons with BigQuery because the goal is to support both for some time, if it is not the case we should probably get rid of the BigQuery bits.
@@ -24,6 +24,7 @@ metadata: | |||
data: | |||
init-script.iql: | | |||
CREATE RETENTION POLICY "a_year" ON "beam_test_metrics" DURATION 52w REPLICATION 1 DEFAULT | |||
CREATE RETENTION POLICY "forever" ON "beam_test_metrics" DURATION INF REPLICATION 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<3
final ImmutableMap<String, String> schema = | ||
ImmutableMap.<String, String>builder() | ||
.put("timestamp", "timestamp") | ||
.put("runtimeSec", "float") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the goal is to improve the existing use case can we make this an integer and use ms instead to make it more precise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I think we need to change wording to runtimeMs
as well, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good idea
ImmutableMap.<String, String>builder() | ||
.put("timestamp", "timestamp") | ||
.put("runtimeSec", "float") | ||
.put("eventsPerSec", "float") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we use this one? it looks with runtimeMs + numResults this is not needed anymore or we can deduce it if someone cares.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll remove it from implementation but not here (we want to preserve compatibility with BQ), I'll change it in influx publisher, thanks for the info!
savePerfsToBigQuery( | ||
BigQueryResultsPublisher.create(options.getBigQueryDataset(), schema), | ||
options, | ||
actual, | ||
start); | ||
} | ||
|
||
if (options.getExportSummaryToInfluxDB()) { | ||
final long timestamp = start.getMillis() / 1000; // seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought timestamps in Influxe were in ms well probably we don't need that level of precision for the start timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default precision is nanoseconds. In case of nexmark results we changed it and use seconds instead
return new HttpPost(
settings.host + "/write?db=" + settings.database + "&" + retentionPolicy + "&precision=s");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've thought that we don't really need milliseconds. Even seconds are probably more than enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you guys are right seconds is ok for the execution timestamp and ms is good for the benchmark run time
@@ -42,8 +42,8 @@ | |||
PROCESSING_TIME_WINDOWS(12), // Query "12" | |||
|
|||
// Other non-numbered queries | |||
BOUNDED_SIDE_INPUT_JOIN, | |||
SESSION_SIDE_INPUT_JOIN; | |||
BOUNDED_SIDE_INPUT_JOIN(13), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -19,14 +19,19 @@ | |||
|
|||
import static java.nio.charset.StandardCharsets.UTF_8; | |||
import static java.util.Objects.requireNonNull; | |||
import static java.util.stream.Collectors.joining; | |||
import static org.apache.beam.repackaged.core.org.apache.commons.lang3.StringUtils.isBlank; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not depend on repackaged commons-lang3 this will probably be removed in the future so better add the explicit commons-lang3 import and corresponding classes.
|
||
final StringBuilder metricBuilder = new StringBuilder(); | ||
results.stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The original code with the strings looks uglier but somehow is easier to understand in a single read (so easier to maintain), the new one requires a lot of methods and jumping back and forth in code for not much. Can we go back to the older approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm...sure we can.
91d0081
to
12f2cbe
Compare
Run Seed Job |
Run Direct Runner Nexmark Tests |
Run Java AvroIO Performance Test HDFS |
Run Dataflow Runner Nexmark Tests |
|
||
import java.io.IOException; | ||
import java.util.Collection; | ||
import java.util.Map; | ||
import org.apache.beam.repackaged.core.org.apache.commons.lang3.StringUtils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argh this commons import escaped here too. Can you use the non repackaged version please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah..I would swear I changed it...thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM module the fix on the commons.lang3 import.
Thanks !
12f2cbe
to
abfb1dd
Compare
Run Seed Job |
Run Dataflow Runner Nexmark Tests |
Run Beam Metrics deployment |
* changed unit from seconds to milliseconds * renamed `runtimeSec` field to `runtimeMs`
abfb1dd
to
e89324f
Compare
Run Seed Job |
Run Dataflow Runner Nexmark Tests |
Run JavaPortabilityApiJava11 PreCommit |
Run JavaPortabilityApi PreCommit |
Thanks @iemejia! |
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.