shlo_ref: Add convolution op and unit tests #66299
base: master
Conversation
- Adds initial version of convolution op
- Adds unit tests for convolution op
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Some initial comments, but I still have a lot more to go through.
  return std::unique(vec.begin(), vec.end()) == vec.end();
}

bool IsInRange(DimVector<int64_t>& vec, size_t N) {
vec can be const.
Removed the vec, using a set now (you can check it in PR #67245); using const in appropriate places.
using DimVector = absl::InlinedVector<T, 6>;

bool IsUnique(DimVector<int64_t>& vec) {
  std::sort(vec.begin(), vec.end());
This will modify the vec. You may want to consider making a copy, sorting, then testing for uniqueness. You could also use a set.
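For illustration, a minimal non-mutating sketch along those lines (assuming the DimVector alias shown above; this is not the code in the PR):

#include <algorithm>

bool IsUnique(const DimVector<int64_t>& vec) {
  DimVector<int64_t> sorted(vec);  // copy so the caller's vector stays untouched
  std::sort(sorted.begin(), sorted.end());
  return std::adjacent_find(sorted.begin(), sorted.end()) == sorted.end();
}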
Changed to a set; you can check it in PR #67245.
}
// malloc is used to have the storage space available outside the prepare
// function's scope; its pointer is stored in a class data member so the
// memory can be deallocated in the destructor.
I'd prefer to figure out a container for storing these rather than using malloc directly. I made a utility class TensorWithData that could pair a Tensor with the backing data in a std::vector<std::byte>, but it was originally aimed at tests. Would something like that work?
That said, do all of these need to be Tensors? The permutations & transposed seem to simply be 1d vectors.
Alternatively, you could use a std::vector<std::byte> directly that would handle the memory for you in place of malloc calls.
Ideally, the ops should not own any memory and only keep a pointer to the external data that is provided. I understand that if you want to call another op you may need to create that op's parameters.
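As a sketch of that direction (the storage member and the way the Tensor is rebound are assumptions, not the real shlo_ref API):

#include <cstddef>
#include <vector>

// The op owns the bytes via std::vector, so no malloc/free pair is needed;
// the vector releases the memory when the op is destroyed.
std::vector<std::byte> lhs_transposed_storage;

void AllocateLhsTransposed(size_t byte_size) {
  lhs_transposed_storage.resize(byte_size);
  // Hypothetical: point the Tensor at the owned bytes, e.g.
  // op.lhs_transposed.data = lhs_transposed_storage.data();
}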
Changed to std::vector<std::byte> (you can check it in PR #67245).
const int64_t* window_strides_pointer =
    op.attributes.window_strides.GetDataAs<DataType::kSI64>();

// Constraints Check
We should perform constraint checks before any other prepare work is done.
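In sketch form (CheckConstraints is a hypothetical helper name), the suggested ordering is:

// Validate first, so no buffers are allocated or transposes prepared
// for an op that will be rejected anyway.
absl::Status status = CheckConstraints(op, lhs, rhs, output);  // hypothetical
if (!status.ok()) return status;
// ...only then fetch attribute pointers and do the transpose prepare.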
    .precision_configs = precision_configs});

Vector<StorageT> expected_data;
if (std::is_same<StorageT, BF16>::value) {
These expected results seem to be differing to a larger degree than I would expect. Where were these generated from?
We got the expected values from the StableHLO repo; float types are now separated into different tests (you can check it in PR #67245).
ASSERT_OK(Evaluate(op, lhs, rhs, output_tensor));

constexpr double kEpsilon = 0.1;
EXPECT_THAT(output_data, Pointwise(FloatEq(), expected_data));
Why is this using FloatEq for integer storage?
Changed to Eq() for integer storage (you can check it in PR #67245).
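For reference, the matcher swap looks like this (a sketch using gMock's Pointwise; FloatEq/FloatNear remain appropriate for the float storage types):

// Exact element-wise comparison for integer storage types:
EXPECT_THAT(output_data, Pointwise(testing::Eq(), expected_data));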
TYPED_TEST_SUITE(QuantizedIntConvolutionTest, QuantizedTestTypes,
                 TestParamNames);

TYPED_TEST(QuantizedIntConvolutionTest, PerTensorsRaiseAnError) {
This test doesn't seem to expect an error to be raised.
Changed to appropriate test names.
Vector<StorageT> expected_data =
    Vector<StorageT>{5, 10, 15, 20, 25, 5, 10, 15, 20, -25};
;
Extra semicolon
removed
EXPECT_THAT(output_data, Pointwise(FloatEq(), expected_data));
}

TYPED_TEST(QuantizedIntConvolutionTest, PerAxisRaiseAnError) {
This test doesn't seem to raise an error.
Changed to an appropriate test name.
Vector<StorageT> expected_data =
    Vector<StorageT>{5, 10, 15, 20, 25, 5, 10, 15, 20, -25};
;
if (std::is_same<StorageT, I4>::value) {
I'd suggest splitting out a separate test case for i4 where we can ensure the input and expected data are within the i4 range without having to clamp most of the values.
Resolved by calling the quantize function; it now works for all int types without clamping.
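One plausible shape for such a helper, as a sketch of standard affine quantization (the actual quantize function in the codebase may differ):

#include <algorithm>
#include <cmath>

// q = round(x / scale) + zero_point, kept inside [q_min, q_max] so the
// generated test data always fits the storage type without manual clamping.
template <typename StorageT>
StorageT Quantize(float expressed, float scale, StorageT zero_point,
                  float q_min, float q_max) {
  const float q =
      std::round(expressed / scale) + static_cast<float>(zero_point);
  return static_cast<StorageT>(std::clamp(q, q_min, q_max));
}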
using StorageT = StorageType<storage_type>;

// Transpose prepare
const int64_t* window_spacial_pointer =
spacial -> spatial
for (int64_t dim : vec) {
  if (dim >= N || dim < 0) {
    return false;
  }
}
return true;
Suggested change:
return absl::c_all_of(
    vec, [N](int64_t v) { return v >= 0 && v < static_cast<int64_t>(N); });
Changed to absl::c_all_of (you can check it in PR #67245).
        .rhs_contracting_dimensions = rhs_contracting_dimensions,
        .precision_configs = precision_configs});

auto state = Prepare(op.dot_general_op, lhs_dot_general, rhs_dot_general,
Unless the type is ridiculously complicated, prefer explicit types.
Stopped using auto.
"stablehlo.convolution: Size of precision_config must be two."); | ||
} | ||
if (op.attributes.precision_configs[0] != PrecisionTypes::DEFAULT && | ||
op.attributes.precision_configs[1] != PrecisionTypes::DEFAULT) { |
This should probably be an ||. You could also use the same implementation for all the configs.
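A sketch of one uniform way to write that check (assuming absl's container algorithms are available here):

#include "absl/algorithm/container.h"

// Reject if any precision config is non-default, treating all entries alike.
const bool any_non_default = absl::c_any_of(
    op.attributes.precision_configs,
    [](PrecisionTypes p) { return p != PrecisionTypes::DEFAULT; });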
for (size_t i = 0; i < n; ++i) {
  if (check_buffer[i] == 0) {
    is_greater_than_zero = false;
    exit;
exit;?
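As written, exit; just names the C library function and discards its address, so it compiles but does nothing. A break is presumably what was intended:

for (size_t i = 0; i < n; ++i) {
  if (check_buffer[i] == 0) {
    is_greater_than_zero = false;
    break;  // stop scanning once a zero is found
  }
}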
removed
// Padding op basic implementation in the context of Convolution
template <DataType storage_type>
void PaddingOp(ConvolutionOp& op, const Tensor& x, const Tensor& padding,
               const Tensor& lhs_dilations) {
  using StorageT = StorageType<storage_type>;
  using int64_t = StorageType<DataType::kSI64>;
  const StorageT* x_buffer = x.GetDataAs<storage_type>();
  const int64_t* lhs_dilation_buffer =
      lhs_dilations.GetDataAs<DataType::kSI64>();
  const int64_t* padding_buffer = padding.GetDataAs<DataType::kSI64>();
  StorageT* lhs_buffer = op.lhs_padded.GetDataAs<storage_type>();
  size_t j = 0;
  for (size_t i = 0; i < op.lhs_padded.NumElements(); ++i) {
    int x_spacials[x.Rank() - 2];
    size_t depth = 1;

    for (int64_t m = x.Rank() - 3; m >= 0; --m) {
      x_spacials[m] =
          (i / depth) % static_cast<size_t>(op.lhs_padded.shape().Dim(m + 2));
      depth *= static_cast<size_t>(op.lhs_padded.shape().Dim(m + 2));
    }
    bool check = true;
    for (int64_t k = x.Rank() - 3; k >= 0; --k) {
      check *= x_spacials[k] >= (padding_buffer[2 * k]) &&
               x_spacials[k] < x.shape().Dim(k + 2) +
                                   (lhs_dilation_buffer[k] - 1) *
                                       (x.shape().Dim(k + 2) - 1) +
                                   padding_buffer[2 * k];
    }

    if (check) {
      for (int64_t k = x.Rank() - 3; k >= 0; --k) {
        check *= static_cast<size_t>(lhs_dilation_buffer[k]) != 0;
      }
      if (check) {
        for (int64_t k = x.Rank() - 3; k >= 0; --k) {
          check *= static_cast<size_t>(x_spacials[k] - padding_buffer[2 * k]) %
                       static_cast<size_t>(lhs_dilation_buffer[k]) ==
                   0;
        }
        if (check) {
          lhs_buffer[i] = x_buffer[j++];
        }
      }
    }
  }
}
StableHLO Pad already has a somewhat optimized version in TFLite that you should try to adapt: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/stablehlo_pad.cc
Adapted the stablehlo_pad function (you can check it in PR #67246).
free(lhs_permutation_data);
free(lhs_transposed_data);
free(rhs_permutation_data);
free(rhs_transposed_data);
free(output_permutation_data);
free(output_transposed_data);
free(lhs_padded_data);
free(lhs_dot_general_data);
free(rhs_dot_general_data);
free(output_dot_general_data);
free(lhs_contracting_dimensions_data);
free(rhs_contracting_dimensions_data);
Make use of RAII.
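If malloc is kept for now, one low-churn RAII sketch is a unique_ptr with a free-based deleter (the member shown is illustrative):

#include <cstdlib>
#include <memory>

struct FreeDeleter {
  void operator()(void* p) const { std::free(p); }
};
using MallocBuffer = std::unique_ptr<void, FreeDeleter>;

// e.g. MallocBuffer lhs_permutation_data{std::malloc(byte_size)};
// The destructor then frees every buffer automatically, and the explicit
// free() list above goes away.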
Tensor window_strides;
Tensor padding;
Tensor lhs_dilation;
Tensor rhs_dilation;
Tensor window_reversal;
int64_t input_batch_dimension;
int64_t input_feature_dimension;
Tensor input_spacial_dimensions;
int64_t kernel_input_feature_dimension;
int64_t kernel_output_feature_dimension;
Tensor kernel_spacial_dimensions;
int64_t output_batch_dimension;
int64_t output_feature_dimension;
Tensor output_spacial_dimensions;
int64_t feature_group_count;
int64_t batch_group_count;
Most of these tensors are defined in the spec as 64-bit int 1D tensors. Use an absl::Span<int64_t>.
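A sketch of what the attribute struct could look like with spans over caller-owned int64_t data (member names follow the original struct; the layout is illustrative, not final):

#include <cstdint>

#include "absl/types/span.h"

struct ConvolutionAttributes {  // hypothetical name
  absl::Span<const int64_t> window_strides;
  absl::Span<const int64_t> padding;
  absl::Span<const int64_t> lhs_dilation;
  absl::Span<const int64_t> rhs_dilation;
  absl::Span<const int64_t> input_spacial_dimensions;
  int64_t input_batch_dimension;  // scalar dimension indices stay int64_t
  int64_t feature_group_count;
};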
Changed 1D tensors to spans (you can check it in PR #67245).
Tensor output_spacial_dimensions;
int64_t feature_group_count;
int64_t batch_group_count;
absl::InlinedVector<PrecisionTypes, 2> precision_configs;
This has a fixed size in the spec. No need for the flexibility of absl::InlinedVector. Use std::array<PrecisionTypes, 2>.
changed to std::array<PrecisionTypes, 2>
void* lhs_permutation_data;
void* lhs_transposed_data;
void* rhs_permutation_data;
void* rhs_transposed_data;
void* output_permutation_data;
void* output_transposed_data;
void* lhs_padded_data;
void* lhs_dot_general_data;
void* rhs_dot_general_data;
void* output_dot_general_data;
void* lhs_contracting_dimensions_data;
void* rhs_contracting_dimensions_data;
void* lhs_dequantized_data;
void* rhs_dequantized_data;
void* output_dequantized_data;
If these are owned by the op, make use of RAII.
Hi @LokeshReddyOVS-MCW, this PR is in draft; any update on this, please? Thank you!
CC: @rascani @qukhan