shlo_ref: Add convolution op and unit tests #66299

Draft
wants to merge 1 commit into master

Conversation

LokeshReddyOVS-MCW

  • Adds initial version of convolution op
  • Adds unit tests for convolution op

CC: @rascani @qukhan

google-ml-butler bot added the size:XL (CL Change Size: Extra Large) label on Apr 23, 2024

google-cla bot commented Apr 23, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Contributor @rascani left a comment

Some initial comments, but I still have a lot more to go through.

return std::unique(vec.begin(), vec.end()) == vec.end();
}

bool IsInRange(DimVector<int64_t>& vec, size_t N) {
Contributor

vec can be const.

Author

Removed the vec and switched to a set (you can check it in PR#67245); using const in appropriate places.

using DimVector = absl::InlinedVector<T, 6>;

bool IsUnique(DimVector<int64_t>& vec) {
std::sort(vec.begin(), vec.end());
Contributor

This will modify the vec. You may want to consider making a copy, sorting, then testing for uniqueness. You could also use a set.

Author

Changed to a set; you can check it in PR#67245.
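
For illustration, a minimal sketch of that set-based check, assuming the DimVector alias defined in this file (not necessarily the exact code that landed in PR#67245):

    #include <cstdint>
    #include <set>

    // Non-mutating uniqueness check: a std::set drops duplicates, so the
    // sizes match only when every element of vec is distinct.
    bool IsUnique(const DimVector<int64_t>& vec) {
      std::set<int64_t> seen(vec.begin(), vec.end());
      return seen.size() == vec.size();
    }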

}
// malloc is used so the storage space stays available outside the Prepare
// function's scope; its pointer is stored in a class data member so the
// memory can be deallocated in the destructor.
Contributor

I'd prefer to figure out a container for storing these rather than using malloc directly. I made a utility class TensorWithData that could pair a Tensor with the backing data in a std::vector<std::byte>, but it was originally aimed at tests. Would something like that work?

That said, do all of these need to be Tensors? The permutations & transposed seem to simply be 1d vectors.

Contributor

Alternatively, you could use a std::vector<std::byte> directly that would handle the memory for you in place of malloc calls.

Ideally, the ops should not own any memory and only keep a pointer to the external data that is provided. I understand that if you want to call another op you may need to create that op's parameters.

Author

Changed to std::vector<std::byte> (you can check it in PR#67245).
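
To illustrate, a rough sketch of pairing a tensor with a std::vector<std::byte> that owns its bytes, in the spirit of the TensorWithData utility mentioned above (the struct name, Allocate helper, and the Tensor data-pointer field are assumptions, not the actual interface):

    #include <cstddef>
    #include <vector>

    // The vector owns the bytes and frees them automatically; the Tensor only
    // views them, so the op never has to call malloc/free itself.
    struct OwnedTensor {
      Tensor tensor;                 // assumed to expose a raw data pointer
      std::vector<std::byte> data;

      void Allocate(size_t byte_size) {
        data.resize(byte_size);
        tensor.data = data.data();   // hypothetical field name
      }
    };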

const int64_t* window_strides_pointer =
op.attributes.window_strides.GetDataAs<DataType::kSI64>();

// Constraints Check
Contributor

We should perform constraint checks before any other prepare work is done.

.precision_configs = precision_configs});

Vector<StorageT> expected_data;
if (std::is_same<StorageT, BF16>::value) {
Contributor

These expected results seem to be differing to a larger degree than I would expect. Where were these generated from?

Author

We got the expected values from the StableHLO repo; the float types are now separated into different tests (you can check it in PR#67245).

ASSERT_OK(Evaluate(op, lhs, rhs, output_tensor));

constexpr double kEpsilon = 0.1;
EXPECT_THAT(output_data, Pointwise(FloatEq(), expected_data));
Contributor

Why is this using FloatEq for integer storage?

Author

Changed to Eq() for integer storage (you can check it in PR#67245).
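
For reference, a small sketch of the matcher distinction using gMock's Pointwise, with made-up test data (not taken from the PR):

    #include <vector>
    #include <gmock/gmock.h>
    #include <gtest/gtest.h>

    using ::testing::Eq;
    using ::testing::FloatEq;
    using ::testing::Pointwise;

    TEST(MatcherSketch, IntegerVsFloatStorage) {
      // Integer storage: exact element-wise equality.
      std::vector<int> int_out = {5, 10, 15};
      EXPECT_THAT(int_out, Pointwise(Eq(), std::vector<int>{5, 10, 15}));

      // Float storage: FloatEq() compares element-wise within a few ULPs.
      std::vector<float> float_out = {0.5f, 1.5f};
      EXPECT_THAT(float_out,
                  Pointwise(FloatEq(), std::vector<float>{0.5f, 1.5f}));
    }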

TYPED_TEST_SUITE(QuantizedIntConvolutionTest, QuantizedTestTypes,
TestParamNames);

TYPED_TEST(QuantizedIntConvolutionTest, PerTensorsRaiseAnError) {
Contributor

This test doesn't seem to expect an error to be raised.

Author

Changed to appropriate test names.


Vector<StorageT> expected_data =
Vector<StorageT>{5, 10, 15, 20, 25, 5, 10, 15, 20, -25};
;
Contributor

Extra semicolon

Author

removed

EXPECT_THAT(output_data, Pointwise(FloatEq(), expected_data));
}

TYPED_TEST(QuantizedIntConvolutionTest, PerAxisRaiseAnError) {
Contributor

This test doesn't seem to raise an error.

Author

changed to appropriate test name

Vector<StorageT> expected_data =
Vector<StorageT>{5, 10, 15, 20, 25, 5, 10, 15, 20, -25};
;
if (std::is_same<StorageT, I4>::value) {
Contributor

I'd suggest splitting out a separate test case for i4 where we can ensure the input and expected data are within the i4 range without having to clamp most of the values.

Author

Resolved by calling the quantize function; it now works for all int types without clamping.

using StorageT = StorageType<storage_type>;

// Transpose prepare
const int64_t* window_spacial_pointer =
Contributor

spacial -> spatial

Comment on lines +41 to +46
  for (int64_t dim : vec) {
    if (dim >= N || dim < 0) {
      return false;
    }
  }
  return true;
Contributor

Suggested change
-  for (int64_t dim : vec) {
-    if (dim >= N || dim < 0) {
-      return false;
-    }
-  }
-  return true;
+  return absl::c_all_of(vec, [N](int64_t v) { return v >= 0 && v < N; });

Author

Changed to absl::c_all_of (you can check it in PR#67245).

.rhs_contracting_dimensions = rhs_contracting_dimensions,
.precision_configs = precision_configs});

auto state = Prepare(op.dot_general_op, lhs_dot_general, rhs_dot_general,
Contributor

Unless the type is ridiculously complicated, prefer explicit types.

Author

Stopped using auto.

"stablehlo.convolution: Size of precision_config must be two.");
}
if (op.attributes.precision_configs[0] != PrecisionTypes::DEFAULT &&
op.attributes.precision_configs[1] != PrecisionTypes::DEFAULT) {
Contributor

This should probably be an ||. You could also use the same implementation for all the configs.
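
For illustration, a hedged sketch of folding both entries into a single check that treats anything other than DEFAULT as unsupported (the error text and status type are assumptions, not the PR's final wording):

    // Needs absl/algorithm/container.h and absl/status/status.h.
    if (!absl::c_all_of(
            op.attributes.precision_configs,
            [](PrecisionTypes p) { return p == PrecisionTypes::DEFAULT; })) {
      return absl::FailedPreconditionError(
          "stablehlo.convolution: Unsupported precision_config value.");
    }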

for (size_t i = 0; i < n; ++i) {
if (check_buffer[i] == 0) {
is_greater_than_zero = false;
exit;
Contributor

exit;?
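
For context, "exit;" here is an expression statement that merely names the ::exit function and discards it, so it never leaves the loop. A minimal sketch of the presumably intended early exit:

    // Stop scanning as soon as a zero entry is found.
    bool is_greater_than_zero = true;
    for (size_t i = 0; i < n; ++i) {
      if (check_buffer[i] == 0) {
        is_greater_than_zero = false;
        break;  // 'exit;' had no effect; break actually terminates the loop
      }
    }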

Author

removed

Comment on lines +728 to +774
// Padding op basic implementation in the context of Convolution
template <DataType storage_type>
void PaddingOp(ConvolutionOp& op, const Tensor& x, const Tensor& padding,
               const Tensor& lhs_dilations) {
  using StorageT = StorageType<storage_type>;
  using int64_t = StorageType<DataType::kSI64>;
  const StorageT* x_buffer = x.GetDataAs<storage_type>();
  const int64_t* lhs_dilation_buffer =
      lhs_dilations.GetDataAs<DataType::kSI64>();
  const int64_t* padding_buffer = padding.GetDataAs<DataType::kSI64>();
  StorageT* lhs_buffer = op.lhs_padded.GetDataAs<storage_type>();
  size_t j = 0;
  for (size_t i = 0; i < op.lhs_padded.NumElements(); ++i) {
    int x_spacials[x.Rank() - 2];
    size_t depth = 1;

    for (int64_t m = x.Rank() - 3; m >= 0; --m) {
      x_spacials[m] =
          (i / depth) % static_cast<size_t>(op.lhs_padded.shape().Dim(m + 2));
      depth *= static_cast<size_t>(op.lhs_padded.shape().Dim(m + 2));
    }
    bool check = true;
    for (int64_t k = x.Rank() - 3; k >= 0; --k) {
      check *= x_spacials[k] >= (padding_buffer[2 * k]) &&
               x_spacials[k] < x.shape().Dim(k + 2) +
                                   (lhs_dilation_buffer[k] - 1) *
                                       (x.shape().Dim(k + 2) - 1) +
                                   padding_buffer[2 * k];
    }

    if (check) {
      for (int64_t k = x.Rank() - 3; k >= 0; --k) {
        check *= static_cast<size_t>(lhs_dilation_buffer[k]) != 0;
      }
      if (check) {
        for (int64_t k = x.Rank() - 3; k >= 0; --k) {
          check *= static_cast<size_t>(x_spacials[k] - padding_buffer[2 * k]) %
                       static_cast<size_t>(lhs_dilation_buffer[k]) ==
                   0;
        }
        if (check) {
          lhs_buffer[i] = x_buffer[j++];
        }
      }
    }
  }
}
Contributor

StableHLO Pad already has a somewhat optimized version in TFLite that you should try to adapt: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/stablehlo_pad.cc

Author

Adapted the stablehlo_pad function (you can check it in PR#67246).

Comment on lines +41 to +52
free(lhs_permutation_data);
free(lhs_transposed_data);
free(rhs_permutation_data);
free(rhs_transposed_data);
free(output_permutation_data);
free(output_transposed_data);
free(lhs_padded_data);
free(lhs_dot_general_data);
free(rhs_dot_general_data);
free(output_dot_general_data);
free(lhs_contracting_dimensions_data);
free(rhs_contracting_dimensions_data);
Contributor

Make use of RAII.
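
As an illustration of the RAII point (hypothetical member names, not the PR's code): if each buffer is owned by a std::vector<std::byte>, the destructor above has nothing left to do and the whole free() list disappears.

    #include <cstddef>
    #include <vector>

    // Each scratch buffer owns its storage; member destructors release it,
    // so no explicit free() calls are needed anywhere in the op.
    struct ConvolutionScratch {
      std::vector<std::byte> lhs_permutation_data;
      std::vector<std::byte> lhs_transposed_data;
      std::vector<std::byte> rhs_permutation_data;
      std::vector<std::byte> rhs_transposed_data;
      // ... remaining buffers follow the same pattern ...
    };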

Comment on lines +66 to +81
Tensor window_strides;
Tensor padding;
Tensor lhs_dilation;
Tensor rhs_dilation;
Tensor window_reversal;
int64_t input_batch_dimension;
int64_t input_feature_dimension;
Tensor input_spacial_dimensions;
int64_t kernel_input_feature_dimension;
int64_t kernel_output_feature_dimension;
Tensor kernel_spacial_dimensions;
int64_t output_batch_dimension;
int64_t output_feature_dimension;
Tensor output_spacial_dimensions;
int64_t feature_group_count;
int64_t batch_group_count;
Contributor

Most of these tensors are defined in the spec as 64 bit int 1D tensors. Use an absl::Span<int64_t>.

Author

Changed the 1D tensors to spans (you can check it in PR#67245).
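
For illustration, a hedged sketch of a subset of those attributes with spans; the struct and field names mirror the excerpt above rather than the final PR#67245 code:

    #include <cstdint>
    #include "absl/types/span.h"

    // The spec's 1D si64 tensors become non-owning spans over caller data.
    struct ConvolutionAttributes {
      absl::Span<int64_t> window_strides;
      absl::Span<int64_t> padding;  // flattened [lo0, hi0, lo1, hi1, ...]
      absl::Span<int64_t> lhs_dilation;
      absl::Span<int64_t> rhs_dilation;
      absl::Span<int64_t> input_spacial_dimensions;
      int64_t input_batch_dimension;
      int64_t input_feature_dimension;
      int64_t feature_group_count;
      int64_t batch_group_count;
    };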

Tensor output_spacial_dimensions;
int64_t feature_group_count;
int64_t batch_group_count;
absl::InlinedVector<PrecisionTypes, 2> precision_configs;
Contributor

This has a fixed size in the spec. No need for the flexibility of absl::InlinedVector. Use std::array<PrecisionTypes, 2>.

Author

changed to std::array<PrecisionTypes, 2>

Comment on lines +101 to +115
void* lhs_permutation_data;
void* lhs_transposed_data;
void* rhs_permutation_data;
void* rhs_transposed_data;
void* output_permutation_data;
void* output_transposed_data;
void* lhs_padded_data;
void* lhs_dot_general_data;
void* rhs_dot_general_data;
void* output_dot_general_data;
void* lhs_contracting_dimensions_data;
void* rhs_contracting_dimensions_data;
void* lhs_dequantized_data;
void* rhs_dequantized_data;
void* output_dequantized_data;
Contributor

If these are owned by the op, make use of RAII.

gbaned added this to Assigned Reviewer in PR Queue via automation on Apr 26, 2024

Contributor @gbaned commented Apr 26, 2024

Hi @LokeshReddyOVS-MCW, this PR is in draft. Any update on this, please? Thank you!
