-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] Implement zero-copy fusion for Read op #38789
Conversation
Signed-off-by: Hao Chen <chenh1024@gmail.com> comment and repr Signed-off-by: Hao Chen <chenh1024@gmail.com> lint Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Can we add a unit test for the optimization rule?
# In this case, we can drop the BuildOutputBlocksMapTransformFn. | ||
new_transform_fns = [] | ||
|
||
for i in range(len(transform_fns)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for i in range(len(transform_fns)): | |
for i in range(1, len(transform_fns) - 1): |
Nit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional. Because the fist and last transform_fns also need to be added to the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @raulchen! Can we also add unit test for the rule?
def __call__( | ||
self, input: Iterable[MapTransformFnData], ctx: TaskContext | ||
) -> Iterable[MapTransformFnData]: | ||
return self._callable(input, ctx) | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do raise NotImplementedError
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using @abstractmethod
, no need to add raise NotImplementedError
.
Abstract classes are better than NotImplementedError, because an exception will raised when creating the object, instead of when calling the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ... seems to be recommended practice https://docs.python.org/3/library/abc.html
# Physical operators won't be shared, | ||
# so it's safe to modify the transform_fns in place. | ||
map_transformer._transform_fns = new_transform_fns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good to define a method in MapTransformer
Returns: | ||
The optimized transform_fns chain. | ||
""" | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do raise NotImplementedError
instead?
pass | ||
|
||
|
||
class ReadOpZeroCopyMapFusion(ZeroCopyMapFusionRule): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be more general for the naming? Such as EliminateBuildOutputblocks
? There's no read/write operator in physical world, and this optimization rule is really doing the work to eliminate unnecessary BuildOutputBlocksMapTransformFn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
def __call__(self, input: Iterable[Row], ctx: TaskContext) -> Iterable[Row]: | ||
yield from self._row_fn(input, ctx) | ||
|
||
def __repr__(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently not being used. I only used this when debugging, and decided to keep it as having a better repr won't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now it's used in the unit test.
Signed-off-by: Hao Chen <chenh1024@gmail.com>
def __call__( | ||
self, input: Iterable[MapTransformFnData], ctx: TaskContext | ||
) -> Iterable[MapTransformFnData]: | ||
return self._callable(input, ctx) | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ... seems to be recommended practice https://docs.python.org/3/library/abc.html
"""Base class for zero-copy map fusion rules. | ||
|
||
Subclasses implement the optimization strategies for different combinations of | ||
fused map operators, by dropping unnecessary data conversion `MapTransformFn`s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more to the docstring? What is this rule doing? When should subclasses override this rule?
pass | ||
|
||
|
||
class ReadOpZeroCopyMapFusion(ZeroCopyMapFusionRule): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Added a unit test and all comments are addressed. |
This updates includes a few fixes for image classification benchmarks: use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op #38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead. fix a bug in the benchmark related to image array dimensions avoid a copy in the map transform --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
Optimize `Read -> Map/Write` fusion. In this case, we can drop the unnecessary `BuildOutputBlocks` transform_fn. Also change `MapTransformFn` to an abstract class and enforce implementations to use subclasses. This is for optimization rules to better detecting the pattern. --------- Signed-off-by: Hao Chen <chenh1024@gmail.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
This updates includes a few fixes for image classification benchmarks: use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op ray-project#38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead. fix a bug in the benchmark related to image array dimensions avoid a copy in the map transform --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
This updates includes a few fixes for image classification benchmarks: use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op ray-project#38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead. fix a bug in the benchmark related to image array dimensions avoid a copy in the map transform --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
This updates includes a few fixes for image classification benchmarks: use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op ray-project#38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead. fix a bug in the benchmark related to image array dimensions avoid a copy in the map transform --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
Optimize `Read -> Map/Write` fusion. In this case, we can drop the unnecessary `BuildOutputBlocks` transform_fn. Also change `MapTransformFn` to an abstract class and enforce implementations to use subclasses. This is for optimization rules to better detecting the pattern. --------- Signed-off-by: Hao Chen <chenh1024@gmail.com> Signed-off-by: Victor <vctr.y.m@example.com>
This updates includes a few fixes for image classification benchmarks: use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op ray-project#38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead. fix a bug in the benchmark related to image array dimensions avoid a copy in the map transform --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
Optimize
Read -> Map/Write
fusion. In this case, we can drop the unnecessaryBuildOutputBlocks
transform_fn.Also change
MapTransformFn
to an abstract class and enforce implementations to use subclasses. This is for optimization rules to better detecting the pattern.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.