[BEAM-1866] Plumb user metrics through Fn API. by robertwb · Pull Request #4344 · apache/beam

robertwb · 2018-01-04T21:15:33Z

The SDK worker is now periodically queried for progress
and user metrics.

robertwb · 2018-01-04T21:17:48Z

R: @pabloem

pabloem

Thanks Robert. This is very cool. I left a couple nits.

I don't know the code in fn_api_runner.py well, but all metrics changes look good.
Approving.

pabloem · 2018-01-04T22:39:09Z

sdks/python/apache_beam/runners/worker/bundle_processor.py

+            for transform_id, op in self.ops.items()},
+        user=sum(
+            [op.metrics_container.to_runner_api() for op in self.ops.values()],
+            []))


Nit: Seems like you have an extra empty list here?

This empty list is the second argument to sum (which defaults to 0, but that doesn't work for lists).
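For readers unfamiliar with the idiom, here is a minimal sketch of why the second argument is needed. The nested list below is a hypothetical stand-in for the per-operation results of `to_runner_api()`; it is not Beam's actual data.

```python
# Hypothetical stand-ins for the list each operation's
# metrics_container.to_runner_api() might return (assumed, not real Beam data).
per_op_metrics = [["counter_a"], [], ["counter_b", "distribution_c"]]

# Without a start value, sum() begins from the integer 0,
# and 0 + ["counter_a"] raises a TypeError.
error = None
try:
    sum(per_op_metrics)
except TypeError as exc:
    error = exc

# With [] as the start value, sum() concatenates the lists in order.
flattened = sum(per_op_metrics, [])
```

The start value also covers the case where there are no operations at all: `sum([], [])` yields an empty list rather than `0`.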

pabloem · 2018-01-04T22:42:33Z

sdks/python/apache_beam/runners/worker/operations.py

+    self.step_name = operation_name
+    self.metrics_container = MetricsContainer(self.step_name)
+    self.scoped_metrics_container = ScopedMetricsContainer(
+        self.metrics_container)


Nice. This bothered me for a while : )
Can you then remove the setting of these variables in operations.py:407, and operations.py:615-617?

Unfortunately I can't remove them from operations.py as the step and stage names can't be unified in that case (with the legacy harness). But this code will all be going away soon.

pabloem

I'm not sure if the comment was recorded:
I left a couple nits, and I can't speak for fn_api_runner.py. All else looks good.
Thanks!

robertwb · 2018-01-04T23:00:52Z

Thanks. The fn_api_runner stuff is mostly refactoring.

lukecwik · 2018-01-17T19:24:10Z