-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] standardize physical operator runtime metrics #40173
Conversation
python/ray/data/_internal/execution/interfaces/op_runtime_metrics.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
453ed7c
to
879a088
Compare
This PR is ready for review. |
continue | ||
value = getattr(self, f.name) | ||
result.append((f.name, value)) | ||
result.extend(self._extra_metrics.items()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we check for private fields in _extra_metrics
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be okay. the metrics in _extra_metrics
are already meant to be exposed.
For those private fields in this classes, they are meant to track some internal states. That's why we don't expose them. I'll make the comment clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed those unneeded to be exposed to internal attributes (instead of data class "fields"). This should be clearer.
self._add_input_inner(refs, input_index) | ||
|
||
def _add_input_inner(self, refs: RefBundle, input_index: int) -> None: | ||
"""Subclasses should override this method to implement `add_input`.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also add a note in add_input
docstring, that subclasses should really implement _add_input_inner
instead? those unfamiliar with the codebase may not realize this from initially looking at the add_input
method
also wondering, do we need to create the _add_input_inner
method necessarily? could subclasses not call its super()
add_input()
and also have its own implementation inside its own add_input
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also add a note in add_input docstring, that subclasses should really implement _add_input_inner instead? those unfamiliar with the codebase may not realize this from initially looking at the add_input method
I didn't add this because I didn't want to mix "docstring for callers" and "docstring for subclasses". But after a second thought, I think this may not be a big deal. I will add it.
also wondering, do we need to create the _add_input_inner method necessarily? could subclasses not call its super() add_input() and also have its own implementation inside its own add_input?
The issue with this method is that it's easy for subclasses to forgot call super
, and unclear where to call super
(at the beginning or the end). Having a separate method can make the interface clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree to add comment, I hope Python has @private
, @public
, @protected
...
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Why are these changes needed?
Standardize metrics recording for physical operators. And introduce a new
OpRuntimeMetrics
class to decouple metrics with individual operator implementations.Not implemented in this PR:
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.