Skip to content

Optimize slicing when possible by copying bigger blocks at once#13261

Merged
yuslepukhin merged 7 commits intomainfrom
yuslepukhin/slice_max_block
Oct 18, 2022
Merged

Optimize slicing when possible by copying bigger blocks at once#13261
yuslepukhin merged 7 commits intomainfrom
yuslepukhin/slice_max_block

Conversation

@yuslepukhin
Copy link
Copy Markdown
Member

@yuslepukhin yuslepukhin commented Oct 10, 2022

Description

Currently, SliceIterator copies inner dimension size at once at best.
However, there are many slices when several inner dimensions can be copied at once.
Furthermore, even if a dimension is sliced, it may employ step 1 and, therefore, has a continuous block of inner dimensions that can be copied at once.

Motivation and Context

For example, [N, C, H, W] with slice [:, :, i:, :] and [N, C, H-i, W]. Meaning, we slice along single axis, with step = 1. Current implementation does C * (H-i) memcpy with W elements each. With this change we can do C memcpy with (H-i)*W elements each.

The optimization produces ~11% savings on certain internal models.

@yuslepukhin yuslepukhin requested review from skottmckay and snnn and removed request for snnn October 10, 2022 20:33
@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Oct 11, 2022

This pull request introduces 1 alert when merging 5e8f3aa into 25c0a66 - view on LGTM.com

new alerts:

  • 1 for Suspicious pointer scaling

@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Oct 12, 2022

This pull request introduces 1 alert when merging bb1ca40 into b2353fa - view on LGTM.com

new alerts:

  • 1 for Suspicious pointer scaling

@yuslepukhin yuslepukhin marked this pull request as ready for review October 12, 2022 17:18
@yuslepukhin yuslepukhin force-pushed the yuslepukhin/slice_max_block branch from bb1ca40 to 73f72db Compare October 12, 2022 21:55
@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Oct 12, 2022

This pull request introduces 1 alert when merging 73f72db into 6895918 - view on LGTM.com

new alerts:

  • 1 for Suspicious pointer scaling

if (dim < steps_size && steps[dim] != 1) {
break;
}
max_copyable_elements *= extents_[dim];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there code coverage of this from the existing unit tests? Not sure if you validated with those or a specific model.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Slice now takes advantage of the new interface and I have validated it with the customer model. The new code is covered.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it covered by a unit test I can look at? e.g. one that has a similar inputs to the Slice in the customer model.

Wondering if it would make sense to store a list of block sizes instead of having a single max_copying_elements_block_ value - but I don't have an example of the Slice params you were testing with to figure out if that would be of value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good example of coverage.

TEST(SliceTest, Slice5D_LargeStep) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List of copyable block sizes is stored in extents_

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the part about storing a list - that would only be useful if an axis could be repeated in axes which isn't allowed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't Slice5D_LargeStep behave the same without your change as it's slicing on axis 1 and only taking 1 element, so it's the same amount of data that flattening the last 3 dims would have picked.

If so could we add a test where the new code is a) used for multiple slices, and b) the block size would differ from what the existing code does?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test SliceTest.Slice5D_CopyAxis2LargeBlock that would copy blocks by 8, not by 4, but still does it in multiple slices because of axis 1 slicing.

@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Oct 18, 2022

This pull request introduces 1 alert when merging 93372a3 into e398241 - view on LGTM.com

new alerts:

  • 1 for Suspicious pointer scaling

Copy link
Copy Markdown
Contributor

@skottmckay skottmckay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@yuslepukhin yuslepukhin merged commit 9189ebb into main Oct 18, 2022
@yuslepukhin yuslepukhin deleted the yuslepukhin/slice_max_block branch October 18, 2022 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants