[data] standardize physical operator runtime metrics #40173

Merged
merged 23 commits into from
Oct 11, 2023

@raulchen raulchen commented Oct 6, 2023

Why are these changes needed?

Standardize metrics recording for physical operators, and introduce a new OpRuntimeMetrics class to decouple metrics from individual operator implementations.

Not implemented in this PR:

  • Task and object store metrics for non-map operators. There are currently 4 groups of metrics: inputs, outputs, tasks, and object store. The first 2 are supported for all operators; the last 2 are supported only for map operators for now.
  • Integration with DatasetStats.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Hao Chen <chenh1024@gmail.com>

raulchen commented Oct 8, 2023

This PR is ready for review.

        continue
    value = getattr(self, f.name)
    result.append((f.name, value))
result.extend(self._extra_metrics.items())
Contributor

nit: should we check for private fields in _extra_metrics as well?

Contributor Author

I think it should be okay. The metrics in _extra_metrics are already meant to be exposed.
The private fields in this class are meant to track internal state, which is why we don't expose them. I'll make the comment clearer.

Contributor Author

@raulchen raulchen Oct 10, 2023

I changed the ones that don't need to be exposed into internal attributes (instead of dataclass "fields"). This should be clearer.
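For readers following this thread, here is a minimal sketch of the pattern being discussed: public metrics are dataclass fields, internal bookkeeping lives in plain attributes that are not fields, and extra metrics are exported as-is. The class name `SketchMetrics`, its field and attribute names, and the `as_dict` helper are illustrative assumptions, not the actual OpRuntimeMetrics code from this PR.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class SketchMetrics:
    """Hypothetical stand-in for a metrics class; not the real OpRuntimeMetrics."""

    # Public metrics: declared as dataclass fields, so they are exported.
    num_inputs_received: int = 0
    bytes_inputs_received: int = 0

    def __post_init__(self) -> None:
        # Internal state: plain attributes rather than dataclass fields,
        # so iterating over fields() never sees them.
        self._pending_task_inputs: Dict[int, int] = {}
        # Extra metrics are meant to be exposed, so they are exported unfiltered.
        self._extra_metrics: Dict[str, Any] = {}

    def as_dict(self) -> Dict[str, Any]:
        """Collect every field, then append the extra metrics."""
        result = []
        for f in fields(self):
            result.append((f.name, getattr(self, f.name)))
        result.extend(self._extra_metrics.items())
        return dict(result)
```

With this layout, no name-based filtering of private fields is needed when exporting, because internal state never appears in `fields()` in the first place.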

    self._add_input_inner(refs, input_index)

def _add_input_inner(self, refs: RefBundle, input_index: int) -> None:
    """Subclasses should override this method to implement `add_input`."""
Contributor

should we also add a note in add_input docstring, that subclasses should really implement _add_input_inner instead? those unfamiliar with the codebase may not realize this from initially looking at the add_input method

also wondering, do we need to create the _add_input_inner method necessarily? could subclasses not call its super() add_input() and also have its own implementation inside its own add_input?

Contributor Author

> should we also add a note in add_input docstring, that subclasses should really implement _add_input_inner instead? those unfamiliar with the codebase may not realize this from initially looking at the add_input method

I didn't add this because I didn't want to mix "docstring for callers" and "docstring for subclasses". But on second thought, I think this may not be a big deal. I will add it.

> also wondering, do we need to create the _add_input_inner method necessarily? could subclasses not call its super() add_input() and also have its own implementation inside its own add_input?

The issue with that approach is that it's easy for subclasses to forget to call super, and it's unclear where to call super (at the beginning or the end). Having a separate method makes the interface clearer.

Contributor

@c21 c21 Oct 11, 2023

Agreed on adding the comment. I wish Python had @private, @public, @protected ...
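The design choice in this thread can be sketched roughly as follows: the public `add_input` does the shared bookkeeping and then delegates to `_add_input_inner`, so subclasses never have to remember to call super. The classes `SketchOperator` and `BufferingOperator`, the counter attribute, and the use of `str` in place of `RefBundle` are all assumptions made for illustration; only the method names `add_input` and `_add_input_inner` come from the diff above.

```python
from typing import List


class SketchOperator:
    """Hypothetical base class; not the real physical operator class."""

    def __init__(self) -> None:
        self.num_inputs_received = 0

    def add_input(self, refs: str, input_index: int) -> None:
        """Called by the caller to feed input.

        Subclasses should override `_add_input_inner` instead of this method.
        """
        # Shared bookkeeping happens here, once, for every subclass.
        self.num_inputs_received += 1
        self._add_input_inner(refs, input_index)

    def _add_input_inner(self, refs: str, input_index: int) -> None:
        """Subclasses should override this method to implement `add_input`."""
        raise NotImplementedError


class BufferingOperator(SketchOperator):
    """Hypothetical subclass that just buffers its inputs."""

    def __init__(self) -> None:
        super().__init__()
        self.buffer: List[str] = []

    def _add_input_inner(self, refs: str, input_index: int) -> None:
        # No super() call is needed; the base class already did the bookkeeping.
        self.buffer.append(refs)
```

Because the subclass only implements the inner hook, the question of whether to call super at the beginning or the end of `add_input` never comes up.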

Contributor

@c21 c21 left a comment

LGTM

@raulchen raulchen merged commit e9ed0f1 into ray-project:master Oct 11, 2023
43 of 44 checks passed
@raulchen raulchen deleted the op-runtime-metrics branch October 11, 2023 17:09