[BEAM-8133] Publishing results of Nexmark tests to InfluxDB #11956

kamilwu · 2020-06-09T12:34:06Z

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

Choose reviewer(s) and mention them in a comment (R: @username).
Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
Update CHANGES.md with noteworthy changes.
If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

kamilwu · 2020-06-10T13:52:20Z

Run Seed Job

kamilwu · 2020-06-10T14:13:23Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-10T14:49:50Z

Run Seed Job

kamilwu · 2020-06-10T15:08:09Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-10T15:24:20Z

Run Seed Job

kamilwu · 2020-06-10T15:42:38Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-11T17:26:25Z

Run Seed Job

kamilwu · 2020-06-11T17:43:06Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-11T17:48:41Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-15T13:07:14Z

Run Seed Job

kamilwu · 2020-06-15T13:48:08Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-15T14:46:35Z

R: @iemejia Could you take a look?

kamilwu · 2020-06-15T14:46:46Z

cc: @tysonjh

kamilwu · 2020-06-15T15:59:46Z

New results will be displayed in Grafana once this pull request is merged. Results of tests started by a trigger phrase are written to a different measurement (InfluxDB's equivalent of a table).

mwalenia · 2020-06-17T07:48:04Z

...ing/test-utils/src/main/java/org/apache/beam/sdk/testutils/publishing/InfluxDBPublisher.java

+    results.forEach(
+        map ->
+            metricBuilder
+                .append(map.get("measurement"))


I'm wondering if it would make sense to extract the appends of keys and values to a method, but I can't find a nice and clean way of doing it. @kamilwu do you have any thoughts about it?

How about adding a method getKV that would return a String in this format: "key=value", e.g. "runner=DataflowRunner"? This would reduce the number of appends.

@pawelpasterz

Yeah, makes sense, thanks!
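A minimal sketch of the `getKV` idea suggested above, for illustration only. The class name `KvFormatter` and the `joinKV` helper are hypothetical and are not taken from this pull request; only the "key=value" output format comes from the comment.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper; not the code that was merged in this pull request. */
final class KvFormatter {
  private KvFormatter() {}

  /** Returns a single pair in "key=value" form, e.g. "runner=DataflowRunner". */
  static String getKV(String key, Object value) {
    return key + "=" + value;
  }

  /** Joins all entries as comma-separated "key=value" pairs, in the map's iteration order. */
  static <V> String joinKV(Map<String, V> entries) {
    return entries.entrySet().stream()
        .map(e -> getKV(e.getKey(), e.getValue()))
        .collect(Collectors.joining(","));
  }
}
```

With a helper like this, each pair costs one `append` instead of three (key, `=`, value).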

kamilwu · 2020-06-18T13:53:23Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-19T09:10:26Z

Run Seed Job

kamilwu · 2020-06-19T09:17:01Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-19T12:29:44Z

Run Java AvroIO Performance Test HDFS

iemejia

Looks great so far. I left some questions, just minor stuff. It was not clear to me whether we are keeping the same schema for compatibility with BigQuery because the goal is to support both for some time. If that is not the case, we should probably get rid of the BigQuery bits.

iemejia · 2020-06-24T09:08:42Z

.test-infra/metrics/kubernetes/beam-influxdb.yaml

@@ -24,6 +24,7 @@ metadata:
 data:
  init-script.iql: |
    CREATE RETENTION POLICY "a_year" ON "beam_test_metrics" DURATION 52w REPLICATION 1 DEFAULT
+    CREATE RETENTION POLICY "forever" ON "beam_test_metrics" DURATION INF REPLICATION 1


iemejia · 2020-06-24T09:17:22Z

sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java

+      final ImmutableMap<String, String> schema =
+          ImmutableMap.<String, String>builder()
+              .put("timestamp", "timestamp")
+              .put("runtimeSec", "float")


Since the goal is to improve the existing use case, can we make this an integer and use ms instead to make it more precise?

Sure, but I think we need to change wording to runtimeMs as well, WDYT?

Yes good idea
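As a rough illustration of the unit change discussed here, the sketch below stores a run time as an integer number of milliseconds rather than a float number of seconds. The class and method names are hypothetical; only the `runtimeSec` to `runtimeMs` rename comes from the thread.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper showing the seconds-to-milliseconds change; not code from the PR. */
final class RuntimeUnits {
  private RuntimeUnits() {}

  /** Converts a float run time in seconds to a whole number of milliseconds. */
  static long toRuntimeMs(double runtimeSec) {
    return Math.round(runtimeSec * 1000.0);
  }

  /** Computes elapsed milliseconds from two System.nanoTime() readings. */
  static long elapsedMs(long startNanos, long endNanos) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
  }
}
```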

iemejia · 2020-06-24T09:18:10Z

sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java

+          ImmutableMap.<String, String>builder()
+              .put("timestamp", "timestamp")
+              .put("runtimeSec", "float")
+              .put("eventsPerSec", "float")


Do we use this one? It looks like, with runtimeMs + numResults, this is not needed anymore, or we can derive it if someone cares.

Sure, I'll remove it from the implementation but not here (we want to preserve compatibility with BQ). I'll change it in the Influx publisher. Thanks for the info!
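To illustrate why a stored rate field can be redundant, here is a small sketch that derives a per-second rate from a count and a run time in milliseconds. Which counter Nexmark would actually divide (number of events or number of results) is an assumption left open here, and the class name is hypothetical.

```java
/** Hypothetical helper: derives a per-second rate instead of storing it. */
final class DerivedRate {
  private DerivedRate() {}

  /**
   * Returns count per second for a run that took runtimeMs milliseconds.
   * Returns 0.0 when the run time is not positive, to avoid dividing by zero.
   */
  static double perSecond(long count, long runtimeMs) {
    if (runtimeMs <= 0) {
      return 0.0;
    }
    return count * 1000.0 / runtimeMs;
  }
}
```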

iemejia · 2020-06-24T09:21:36Z

sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/Main.java

        savePerfsToBigQuery(
            BigQueryResultsPublisher.create(options.getBigQueryDataset(), schema),
            options,
            actual,
            start);
      }
+
+      if (options.getExportSummaryToInfluxDB()) {
+        final long timestamp = start.getMillis() / 1000; // seconds


Oh, I thought timestamps in InfluxDB were in ms. Well, we probably don't need that level of precision for the start timestamp.

The default precision is nanoseconds. In the case of Nexmark results, we changed it and use seconds instead:

return new HttpPost(settings.host + "/write?db=" + settings.database + "&" + retentionPolicy + "&precision=s");

We thought that we don't really need milliseconds. Even seconds are probably more than enough.

Yes, you guys are right: seconds is OK for the execution timestamp, and ms is good for the benchmark run time.
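The precision choice can be sketched as follows. This is an illustration under assumptions, not the publisher's actual code: the class name is hypothetical, the retention policy is assumed to be passed through the InfluxDB 1.x `rp` query parameter, and no URL encoding is applied.

```java
/** Hypothetical sketch of a write URL that declares second-precision timestamps. */
final class InfluxWriteUrl {
  private InfluxWriteUrl() {}

  /** Truncates an epoch timestamp in milliseconds to whole seconds. */
  static long toEpochSeconds(long epochMillis) {
    return epochMillis / 1000;
  }

  /**
   * Builds a write endpoint URL with precision=s, so that timestamps in the
   * request body are interpreted as seconds rather than the default nanoseconds.
   * Assumes the arguments need no URL encoding.
   */
  static String writeUrl(String host, String database, String retentionPolicy) {
    return host + "/write?db=" + database + "&rp=" + retentionPolicy + "&precision=s";
  }
}
```

The timestamp sent in the body has to use the same unit as the `precision` parameter; sending milliseconds with `precision=s` would place the points far in the future.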

iemejia · 2020-06-24T09:23:33Z

sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/NexmarkQueryName.java

@@ -42,8 +42,8 @@
  PROCESSING_TIME_WINDOWS(12), // Query "12"

  // Other non-numbered queries
-  BOUNDED_SIDE_INPUT_JOIN,
-  SESSION_SIDE_INPUT_JOIN;
+  BOUNDED_SIDE_INPUT_JOIN(13),


iemejia · 2020-06-24T09:25:06Z

...ing/test-utils/src/main/java/org/apache/beam/sdk/testutils/publishing/InfluxDBPublisher.java

@@ -19,14 +19,19 @@

 import static java.nio.charset.StandardCharsets.UTF_8;
 import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.joining;
+import static org.apache.beam.repackaged.core.org.apache.commons.lang3.StringUtils.isBlank;


Do not depend on the repackaged commons-lang3; it will probably be removed in the future, so it is better to use the explicit commons-lang3 import and the corresponding classes.

iemejia · 2020-06-24T12:47:55Z

...ing/test-utils/src/main/java/org/apache/beam/sdk/testutils/publishing/InfluxDBPublisher.java


-    final StringBuilder metricBuilder = new StringBuilder();
-    results.stream()


nit: The original code with the strings looks uglier, but somehow it is easier to understand in a single read (so easier to maintain). The new one requires a lot of methods and jumping back and forth in the code for not much gain. Can we go back to the older approach?

Hm...sure we can.
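For context, a single-pass builder in the spirit of the "older approach" might look like the sketch below. It is not the code from `InfluxDBPublisher`: the class name is hypothetical, and it assumes each result map carries `measurement` and `timestamp` keys, with every other entry written as a numeric field in InfluxDB line protocol.

```java
import java.util.Collection;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of building InfluxDB line protocol in one readable pass. */
final class LineProtocolSketch {
  private LineProtocolSketch() {}

  /** Emits one "measurement field=value,... timestamp" line per result map. */
  static String toLines(Collection<Map<String, Object>> results) {
    final StringBuilder metricBuilder = new StringBuilder();
    results.forEach(
        map -> {
          // Everything except the two reserved keys is treated as a numeric field.
          String fields =
              map.entrySet().stream()
                  .filter(e -> !e.getKey().equals("measurement"))
                  .filter(e -> !e.getKey().equals("timestamp"))
                  .map(e -> e.getKey() + "=" + e.getValue())
                  .collect(Collectors.joining(","));
          metricBuilder
              .append(map.get("measurement"))
              .append(' ')
              .append(fields)
              .append(' ')
              .append(map.get("timestamp"))
              .append('\n');
        });
    return metricBuilder.toString();
  }
}
```

String-valued fields would need quoting and tags would need escaping in real line protocol; the sketch leaves both out to keep the single-read shape visible.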

kamilwu · 2020-06-25T11:18:17Z

Run Seed Job

kamilwu · 2020-06-25T11:25:29Z

Run Direct Runner Nexmark Tests

kamilwu · 2020-06-25T11:25:48Z

Run Java AvroIO Performance Test HDFS

kamilwu · 2020-06-25T11:26:06Z

Run Dataflow Runner Nexmark Tests

iemejia · 2020-06-25T13:10:59Z

...ing/test-utils/src/main/java/org/apache/beam/sdk/testutils/publishing/InfluxDBPublisher.java


 import java.io.IOException;
 import java.util.Collection;
+import java.util.Map;
+import org.apache.beam.repackaged.core.org.apache.commons.lang3.StringUtils;


Argh, this commons import escaped here too. Can you use the non-repackaged version, please?

ah..I would swear I changed it...thanks!

iemejia

LGTM modulo the fix on the commons.lang3 import.
Thanks !

kamilwu · 2020-06-25T14:53:48Z

Run Seed Job

kamilwu · 2020-06-25T15:00:06Z

Run Dataflow Runner Nexmark Tests

kamilwu · 2020-06-25T15:02:35Z

Run Beam Metrics deployment

kamilwu · 2020-06-25T15:39:22Z

Run Seed Job

kamilwu · 2020-06-25T15:46:03Z

Run Dataflow Runner Nexmark Tests

kamilwu · 2020-06-25T15:47:08Z

Run JavaPortabilityApiJava11 PreCommit

kamilwu · 2020-06-25T15:47:16Z

Run JavaPortabilityApi PreCommit

kamilwu · 2020-06-25T16:32:41Z

Thanks @iemejia!

probot-autolabeler bot added infra java labels Jun 9, 2020

kamilwu mentioned this pull request Jun 10, 2020

[BEAM-8134] Grafana dashboards for Nexmark tests #11909

Merged

4 tasks

kamilwu force-pushed the nexmark-influxdb-publisher branch from 8644628 to 7f2be43 Compare June 10, 2020 14:09

probot-autolabeler bot added the build label Jun 10, 2020

kamilwu force-pushed the nexmark-influxdb-publisher branch from 7f2be43 to fab9cff Compare June 10, 2020 14:12

kamilwu force-pushed the nexmark-influxdb-publisher branch from fab9cff to 5fedf7e Compare June 10, 2020 15:23

kamilwu force-pushed the nexmark-influxdb-publisher branch from 5fedf7e to e89172c Compare June 11, 2020 17:26

kamilwu force-pushed the nexmark-influxdb-publisher branch from e89172c to 1586c75 Compare June 15, 2020 11:08

mwalenia reviewed Jun 17, 2020

kamilwu force-pushed the nexmark-influxdb-publisher branch from 1586c75 to 92d802f Compare June 18, 2020 13:45

kamilwu force-pushed the nexmark-influxdb-publisher branch from 92d802f to 116ad4a Compare June 19, 2020 09:09

kamilwu force-pushed the nexmark-influxdb-publisher branch from 116ad4a to b52dc6d Compare June 19, 2020 12:59

iemejia reviewed Jun 24, 2020

kamilwu added 2 commits June 25, 2020 12:27

[BEAM-8133] Apply InfluxDB pipeline options in Nexmark tests

c1a17b7

[BEAM-8133] Infinite retention policy for InfluxDB

f58fb61

kamilwu force-pushed the nexmark-influxdb-publisher branch 2 times, most recently from 91d0081 to 12f2cbe Compare June 25, 2020 11:17

iemejia reviewed Jun 25, 2020

iemejia approved these changes Jun 25, 2020

[BEAM-8133] Update publish logic

49d5e87

kamilwu force-pushed the nexmark-influxdb-publisher branch from 12f2cbe to abfb1dd Compare June 25, 2020 14:53

[BEAM-8133] Update dashboards to reflect model changes

e89324f

* changed unit from seconds to milliseconds
* renamed `runtimeSec` field to `runtimeMs`

kamilwu force-pushed the nexmark-influxdb-publisher branch from abfb1dd to e89324f Compare June 25, 2020 15:38

kamilwu merged commit 3f1db41 into apache:master Jun 25, 2020

kamilwu deleted the nexmark-influxdb-publisher branch June 25, 2020 16:33

