Skip to content

[BEAM-1866] Plumb user metrics through Fn API.#4344

Merged
robertwb merged 1 commit intoapache:masterfrom
robertwb:user-metrics
Jan 5, 2018
Merged

[BEAM-1866] Plumb user metrics through Fn API.#4344
robertwb merged 1 commit intoapache:masterfrom
robertwb:user-metrics

Conversation

@robertwb
Copy link
Contributor

@robertwb robertwb commented Jan 4, 2018

The SDK worker is now periodically querried for progress
and user metrics.

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

The SDK worker is now periodically querried for progress
and user metrics.
@robertwb
Copy link
Contributor Author

robertwb commented Jan 4, 2018

R: @pabloem

Copy link
Member

@pabloem pabloem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Robert. This is very cool. I left a couple nits.

I don't know the code in fn_api_runner.py well, but all metrics changes look good.
Approving.

for transform_id, op in self.ops.items()},
user=sum(
[op.metrics_container.to_runner_api() for op in self.ops.values()],
[]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Seems like you have an extra empty list here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This empty list is the second argument to sum (which defaults to 0, but that doesn't work for lists).

self.step_name = operation_name
self.metrics_container = MetricsContainer(self.step_name)
self.scoped_metrics_container = ScopedMetricsContainer(
self.metrics_container)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. This bothered me for a while : )
Can you then remove the setting of these variables in operations.py:407, and operations.py:615-617?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I can't remove them from operations.py as the step and stage names can't be unified in that case (with the legacy harness). But this code will all be going away soon.

Copy link
Member

@pabloem pabloem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the comment was recorded:
I left a couple nits, and I can't speak for fn_api_runner.py. All else looks good.
Thanks!

@robertwb
Copy link
Contributor Author

robertwb commented Jan 4, 2018

Thanks. The fn_api_runner stuff is mostly refactoring.

@robertwb robertwb merged commit d2690fa into apache:master Jan 5, 2018
// A key for identifying a metric at the most granular level.
message MetricKey {
// The step, if any, this metric is associated with.
string step = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptransform_id?

string step = 1;

// (Required): The namespace of this metric.
string namespace = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this?

string namespace = 2;

// (Required): The name of this metric.
string name = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this is the user supplied name.


map<string, PTransform> ptransforms = 1;
map<string, User> user = 2;
repeated User user = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could keep the map here from ptransform id to User and have User have a repeated field allowing to store multiple metrics.

}

// Data associated with a counter metric.
message CounterData {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only just thought of it while trying to find the JIRA associated with Fn API metrics, but I don't think user-defined metrics should necessarily be conflated with progress reporting. For some contexts, a runner may just support attempted metrics via a proxy over dropwizard. It might be nice to be able to stand this up as an independent service that runners can share.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what your trying to get at here with an independent service?

If you are saying that the Metrics type should be defined outside of beam_fn_api.proto in some place that is shared between job management and job execution then I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that the message sent from SDK harness to runner should be over an independent channel from progress reporting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context is that there has been an ask to be able to publish metrics to a metrics pusher that is either independent of the runner (can only support attempted metrics) or supported by a runner. In either case both APIs can share this metrics type or the metrics pusher API can define a new API which is relevant for its use case.

@kennknowles
Copy link
Member

By the way, it would actually be nice to have the proto changes in a parent commit for history. I expect they will be a bit more stable. And if you merged that earlier it would make it a little smoother for other language SDKs to get started, vs patching this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants