[data] Implement zero-copy fusion for Read op #38789

raulchen · 2023-08-23T18:31:49Z

Why are these changes needed?

Optimize Read -> Map/Write fusion. In this case, we can drop the unnecessary BuildOutputBlocks transform_fn.

Also change MapTransformFn to an abstract class and require implementations to subclass it. This makes it easier for optimization rules to detect the pattern.

Related issue number

Checks

I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Hao Chen <chenh1024@gmail.com>
comment and repr
Signed-off-by: Hao Chen <chenh1024@gmail.com>
lint
Signed-off-by: Hao Chen <chenh1024@gmail.com>

Signed-off-by: Hao Chen <chenh1024@gmail.com>

stephanie-wang

Looks good! Can we add a unit test for the optimization rule?

stephanie-wang · 2023-08-23T19:09:46Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+        # In this case, we can drop the BuildOutputBlocksMapTransformFn.
+        new_transform_fns = []
+
+        for i in range(len(transform_fns)):


Suggested change

for i in range(len(transform_fns)):

for i in range(1, len(transform_fns) - 1):

Nit

This is intentional, because the first and last transform_fns also need to be added to the result.
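To illustrate the point about the loop bounds, here is a minimal sketch, not the code from this PR. `FakeTransformFn`, `FakeBuildOutputBlocks`, the string type labels, and the drop condition (an interior `BuildOutputBlocks` fn whose neighbors both work on blocks) are all assumptions made for the example. It shows why the loop visits every index: only interior entries can be dropped, but the first and last entries must still be copied into the result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeTransformFn:
    """Hypothetical stand-in for a MapTransformFn; not Ray's real class."""

    name: str
    input_type: str  # assumed labels: "block", "batch", or "row"
    output_type: str


class FakeBuildOutputBlocks(FakeTransformFn):
    """Hypothetical stand-in for BuildOutputBlocksMapTransformFn."""

    def __init__(self) -> None:
        super().__init__("build_output_blocks", "block", "block")


def drop_redundant_build_output_blocks(
    transform_fns: List[FakeTransformFn],
) -> List[FakeTransformFn]:
    new_transform_fns = []
    # Visit every index so the first and last fns are still copied over.
    for i in range(len(transform_fns)):
        cur_fn = transform_fns[i]
        # Only an interior fn has both a predecessor and a successor.
        is_interior = 0 < i < len(transform_fns) - 1
        drop = (
            is_interior
            and isinstance(cur_fn, FakeBuildOutputBlocks)
            and transform_fns[i - 1].output_type == "block"
            and transform_fns[i + 1].input_type == "block"
        )
        if not drop:
            new_transform_fns.append(cur_fn)
    return new_transform_fns


chain = [
    FakeTransformFn("read", "block", "block"),
    FakeBuildOutputBlocks(),
    FakeTransformFn("write", "block", "block"),
]
optimized = drop_redundant_build_output_blocks(chain)
print([fn.name for fn in optimized])
```

With `range(1, len(transform_fns) - 1)` the same body would silently lose the `read` and `write` entries, which is the behavior the reply above rules out.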

c21

Thanks @raulchen! Can we also add a unit test for the rule?

c21 · 2023-08-23T19:15:45Z

python/ray/data/_internal/execution/operators/map_transformer.py

    def __call__(
        self, input: Iterable[MapTransformFnData], ctx: TaskContext
    ) -> Iterable[MapTransformFnData]:
-        return self._callable(input, ctx)
+        pass


can we do raise NotImplementedError instead?

When using @abstractmethod, there is no need to add raise NotImplementedError.
Abstract classes are better than NotImplementedError, because an exception will be raised when creating the object, instead of when calling the method.

got it, thanks!

Nit: `...` seems to be the recommended practice https://docs.python.org/3/library/abc.html
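The difference discussed in this thread can be shown with a small standalone sketch. The class names below are hypothetical and unrelated to Ray's actual classes. A subclass that forgets to implement the abstract method fails at instantiation time with a `TypeError`, before any method is called.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class TransformFnBase(ABC):
    """Hypothetical abstract base; the body of the abstract method is `...`."""

    @abstractmethod
    def __call__(self, data: Iterable[int]) -> List[int]:
        ...


class Incomplete(TransformFnBase):
    """Forgets to implement __call__, so it stays abstract."""


class AddOne(TransformFnBase):
    def __call__(self, data: Iterable[int]) -> List[int]:
        return [x + 1 for x in data]


try:
    Incomplete()  # fails here, at object creation
    instantiation_failed = False
except TypeError:
    instantiation_failed = True

print(instantiation_failed)
print(AddOne()([1, 2]))
```

With a plain base class that raises `NotImplementedError`, `Incomplete()` would construct successfully and the error would only surface on the first call.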

c21 · 2023-08-23T19:20:02Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+            # Physical operators won't be shared,
+            # so it's safe to modify the transform_fns in place.
+            map_transformer._transform_fns = new_transform_fns


it would be good to define a method in MapTransformer
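As a rough sketch of what this suggestion could look like (the class and the method names `get_transform_fns` and `set_transform_fns` are assumptions for illustration, not confirmed by this thread), the rule would go through accessor methods instead of assigning to the private `_transform_fns` attribute:

```python
from typing import Callable, List


class SketchMapTransformer:
    """Hypothetical stand-in for MapTransformer; method names are assumed."""

    def __init__(self, transform_fns: List[Callable]) -> None:
        self._transform_fns = list(transform_fns)

    def get_transform_fns(self) -> List[Callable]:
        """Return the current chain of transform fns."""
        return self._transform_fns

    def set_transform_fns(self, transform_fns: List[Callable]) -> None:
        """Replace the chain, e.g. after an optimization rule rewrote it."""
        self._transform_fns = list(transform_fns)


transformer = SketchMapTransformer([str.strip, str.upper, str.lower])
# An optimization rule would compute a new chain and install it via the setter.
kept = [fn for fn in transformer.get_transform_fns() if fn is not str.upper]
transformer.set_transform_fns(kept)
print(len(transformer.get_transform_fns()))
```

Keeping the mutation behind a method gives one place to add validation later without touching the rule.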

c21 · 2023-08-23T19:21:10Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+        Returns:
+            The optimized transform_fns chain.
+        """
+        pass


can we do raise NotImplementedError instead?

c21 · 2023-08-23T19:26:49Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+        pass
+
+
+class ReadOpZeroCopyMapFusion(ZeroCopyMapFusionRule):


Can we be more general for the naming? Such as EliminateBuildOutputblocks? There's no read/write operator in physical world, and this optimization rule is really doing the work to eliminate unnecessary BuildOutputBlocksMapTransformFn.

c21 · 2023-08-23T19:32:04Z

python/ray/data/_internal/execution/operators/map_transformer.py

+    def __call__(self, input: Iterable[Row], ctx: TaskContext) -> Iterable[Row]:
+        yield from self._row_fn(input, ctx)
+
+    def __repr__(self) -> str:


where is this being used?

Currently not being used. I only used this when debugging, and decided to keep it as having a better repr won't hurt.

now it's used in the unit test.
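A readable `__repr__` makes a chain of transform fns easy to compare in a test. The sketch below is illustrative only: `SketchRowFn` and the repr format are assumptions, not the format used in this PR.

```python
from typing import Any, Callable, Iterable, Iterator, Optional


class SketchRowFn:
    """Hypothetical row-based transform fn with a descriptive repr."""

    def __init__(self, row_fn: Callable[[Iterable[Any], Optional[Any]], Iterable[Any]]) -> None:
        self._row_fn = row_fn

    def __call__(self, rows: Iterable[Any], ctx: Optional[Any] = None) -> Iterator[Any]:
        yield from self._row_fn(rows, ctx)

    def __repr__(self) -> str:
        # Include the wrapped function's name so failures are easy to read.
        return f"SketchRowFn({self._row_fn.__name__})"


def double_rows(rows: Iterable[int], ctx: Optional[Any]) -> Iterator[int]:
    for row in rows:
        yield row * 2


fn = SketchRowFn(double_rows)
print(repr([fn]))
print(list(fn([1, 2])))
```

A unit test can then assert on the string form of the whole chain instead of inspecting each element by type.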

Signed-off-by: Hao Chen <chenh1024@gmail.com>

amogkam · 2023-08-23T19:46:17Z

python/ray/data/_internal/execution/operators/map_transformer.py

    def __call__(
        self, input: Iterable[MapTransformFnData], ctx: TaskContext
    ) -> Iterable[MapTransformFnData]:
-        return self._callable(input, ctx)
+        pass


Nit: `...` seems to be the recommended practice https://docs.python.org/3/library/abc.html

amogkam · 2023-08-23T19:47:27Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+    """Base class for zero-copy map fusion rules.
+
+    Subclasses implement the optimization strategies for different combinations of
+    fused map operators, by dropping unnecessary data conversion `MapTransformFn`s.


Add more to the docstring? What is this rule doing? When should subclasses override this rule?

amogkam · 2023-08-23T19:49:11Z

python/ray/data/_internal/logical/rules/zero_copy_map_fusion.py

+        pass
+
+
+class ReadOpZeroCopyMapFusion(ZeroCopyMapFusionRule):


Signed-off-by: Hao Chen <chenh1024@gmail.com>

c21

LGTM

Signed-off-by: Hao Chen <chenh1024@gmail.com>

raulchen · 2023-08-23T20:39:48Z

Added a unit test and all comments are addressed.
I used another test PR to run the release tests, but the job always gets stuck waiting for the image build, possibly because of an infra issue: https://buildkite.com/ray-project/release-tests-pr/builds/50420#018a235a-0b83-47db-a45c-9d971c5313b7
I'll use this PR to run the release tests again. If they still won't run, I think we can merge this PR first.

Signed-off-by: Hao Chen <chenh1024@gmail.com>

This update includes a few fixes for image classification benchmarks:
- use Dataset.map instead of Dataset.map_batches. [data] Implement zero-copy fusion for Read op #38789 ensures these will get fused with the Read, but map_batches also has some batch formatting overhead.
- fix a bug in the benchmark related to image array dimensions
- avoid a copy in the map transform

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>

raulchen added 2 commits August 23, 2023 11:28

Implement zero-copy fusion for read op

1ee542e

Signed-off-by: Hao Chen <chenh1024@gmail.com>
comment and repr
Signed-off-by: Hao Chen <chenh1024@gmail.com>
lint
Signed-off-by: Hao Chen <chenh1024@gmail.com>

default rule

a8d1c1a

Signed-off-by: Hao Chen <chenh1024@gmail.com>

raulchen requested review from ericl, scv119, c21, amogkam, scottjlee and bveeramani as code owners August 23, 2023 18:31

raulchen added 3 commits August 23, 2023 11:38

comment

4ed03fb

Signed-off-by: Hao Chen <chenh1024@gmail.com>

lint

998865d

Signed-off-by: Hao Chen <chenh1024@gmail.com>

unskip test

61640a4

Signed-off-by: Hao Chen <chenh1024@gmail.com>

stephanie-wang approved these changes Aug 23, 2023

c21 reviewed Aug 23, 2023

raulchen added 2 commits August 23, 2023 12:47

setter getter

b24e206

Signed-off-by: Hao Chen <chenh1024@gmail.com>

rename

679ff65

Signed-off-by: Hao Chen <chenh1024@gmail.com>

amogkam reviewed Aug 23, 2023

unit test

d8a3a68

Signed-off-by: Hao Chen <chenh1024@gmail.com>

c21 approved these changes Aug 23, 2023

refine

d56561e

Signed-off-by: Hao Chen <chenh1024@gmail.com>

comment

5bbef7d

Signed-off-by: Hao Chen <chenh1024@gmail.com>

raulchen merged commit 1419cfb into ray-project:master Aug 23, 2023
49 of 55 checks passed

raulchen deleted the read-zero-copy-fusion branch August 23, 2023 23:08

stephanie-wang mentioned this pull request Aug 25, 2023

[data][tests] Update image classification benchmarks #38902

Merged

8 tasks
