Refactor conv to share code between reference and optimized kernels #47112
Conversation
Thanks for contributing to TensorFlow Lite Micro. To keep this process moving along, we'd like to make sure that you have completed the items on this list:
- We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.
Some of my comments are for the existing code and not just the refactor -- apologies for that.
tensorflow/lite/micro/kernels/BUILD
```
cc_library(
    name = "conv",
    srcs = [
```
We are ordering the build rules alphabetically.
Done.
```diff
@@ -178,144 +92,70 @@ TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
   data->output_zero_point = output->params.zero_point;
```
Can these also be moved inside the CalculateOpDataConv function?
Done.
tensorflow/lite/micro/kernels/conv.h
```cpp
// Index to buffer for optimizations if applicable.
int buffer_idx;
```
This is only needed for cmsis_nn, right?
Let's follow the pattern of fully_connected where we only add the extra op data members for the optimized kernels that need them:
tensorflow/tensorflow/lite/micro/kernels/cmsis_nn/fully_connected.cc
Lines 32 to 37 in 8e4ba81
```cpp
struct OpData {
  OpDataFullyConnected reference_op_data;
  // Index to buffer for optimizations if applicable.
  int buffer_idx;
};
```
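A conv analog of that pattern might look like this (a sketch only; it assumes the shared struct is named OpDataConv, as referenced elsewhere in this PR):

```cpp
// Hypothetical op data for the cmsis_nn conv kernel, mirroring the
// fully_connected pattern: wrap the shared OpDataConv and add only the
// members the optimized kernel needs.
struct OpData {
  OpDataConv reference_op_data;
  // Index to buffer for optimizations if applicable.
  int buffer_idx;
};
```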
Done.
```cpp
// Returns a ConvParams struct with all the parameters needed for a
// quantized computation.
ConvParams ConvParamsQuantized(TfLiteConvParams* params,
```
Looks like params is also an input param; make it `const TfLiteConvParams& params`?
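For reference, a const-ref version of the declaration might read as follows (a sketch; the trailing parameters are cut off in this hunk, so anything beyond params is an assumption):

```cpp
// Returns a ConvParams struct with all the parameters needed for a
// quantized computation.
ConvParams ConvParamsQuantized(const TfLiteConvParams& params,
                               const OpDataConv& data);  // second param assumed
```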
Done.
```cpp
inline PaddingType RuntimePaddingType(TfLitePadding padding) {
  switch (padding) {
    case TfLitePadding::kTfLitePaddingSame:
      return PaddingType::kSame;
    case TfLitePadding::kTfLitePaddingValid:
      return PaddingType::kValid;
    case TfLitePadding::kTfLitePaddingUnknown:
    default:
      return PaddingType::kNone;
  }
}
```
Let's move the definition to the .cc to avoid overuse of inline.
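The resulting split might look like this (a sketch based on the snippet above; conv_common.cc is the shared .cc discussed in this PR):

```cpp
// In conv.h: declaration only, no inline definition.
PaddingType RuntimePaddingType(TfLitePadding padding);

// In conv_common.cc: the definition, moved out of the header.
PaddingType RuntimePaddingType(TfLitePadding padding) {
  switch (padding) {
    case TfLitePadding::kTfLitePaddingSame:
      return PaddingType::kSame;
    case TfLitePadding::kTfLitePaddingValid:
      return PaddingType::kValid;
    case TfLitePadding::kTfLitePaddingUnknown:
    default:
      return PaddingType::kNone;
  }
}
```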
Done.
```cpp
int input_width = input->dims->data[2];
int input_height = input->dims->data[1];
int filter_width = filter->dims->data[2];
int filter_height = filter->dims->data[1];
int output_width = output->dims->data[2];
int output_height = output->dims->data[1];
```
Make all of these const?
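With the suggestion applied, the block would read (sketch):

```cpp
const int input_width = input->dims->data[2];
const int input_height = input->dims->data[1];
const int filter_width = filter->dims->data[2];
const int filter_height = filter->dims->data[1];
const int output_width = output->dims->data[2];
const int output_height = output->dims->data[1];
```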
Done.
```cpp
TF_LITE_ENSURE(context, affine_quantization);
TF_LITE_ENSURE(context, affine_quantization->scale);
TF_LITE_ENSURE(context, affine_quantization->zero_point);
```
We really should be passing in bools to TF_LITE_ENSURE.
We are simply checking for scale and zero point being not null?
Let's make them DCHECK(scale != nullptr), DCHECK(affine_quantization != nullptr), DCHECK(zero_point != nullptr)?
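Applied to this hunk, the checks might become (a sketch; it uses the TFLITE_DCHECK macro that appears elsewhere in this file rather than a bare DCHECK):

```cpp
TFLITE_DCHECK(affine_quantization != nullptr);
TFLITE_DCHECK(affine_quantization->scale != nullptr);
TFLITE_DCHECK(affine_quantization->zero_point != nullptr);
```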
Done.
```cpp
data->input_zero_point = input->params.zero_point;
data->filter_zero_point = filter->params.zero_point;
data->output_zero_point = output->params.zero_point;
```
Move these to CalculateOpDataConv?
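A sketch of where these assignments might land; the parameter list of CalculateOpDataConv is an assumption, since only the function's name appears in this thread:

```cpp
// Hypothetical excerpt from CalculateOpDataConv in conv_common.cc.
TfLiteStatus CalculateOpDataConv(TfLiteContext* context, TfLiteNode* node,
                                 const TfLiteConvParams& params,
                                 const TfLiteTensor* input,
                                 const TfLiteTensor* filter,
                                 const TfLiteTensor* output,
                                 OpDataConv* data) {
  // ... existing padding / quantization setup ...
  data->input_zero_point = input->params.zero_point;
  data->filter_zero_point = filter->params.zero_point;
  data->output_zero_point = output->params.zero_point;
  return kTfLiteOk;
}
```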
Done.
```cpp
void* InitConv(TfLiteContext* context, const char* buffer, size_t length) {
  TFLITE_DCHECK(context->AllocatePersistentBuffer != nullptr);
  return context->AllocatePersistentBuffer(context, sizeof(OpDataConv));
}
```
Remove? It is not declared in conv.h.
Done.
```diff
 TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
   TFLITE_DCHECK(node->user_data != nullptr);
   TFLITE_DCHECK(node->builtin_data != nullptr);
   auto* params = reinterpret_cast<TfLiteConvParams*>(node->builtin_data);

-  TfLiteTensor* output = GetOutput(context, node, kOutputTensor);
-  const TfLiteTensor* input = GetInput(context, node, kInputTensor);
-  const TfLiteTensor* filter = GetInput(context, node, kFilterTensor);
+  TfLiteTensor* output = GetOutput(context, node, kConvOutputTensor);
+  const TfLiteTensor* input = GetInput(context, node, kConvInputTensor);
+  const TfLiteTensor* filter = GetInput(context, node, kConvWeightsTensor);
```
This looks identical to the Prepare in the reference kernel.
You do have PrepareConv in conv_common.cc but not in conv.h. Did you mean to share this code between the reference and xtensa kernels?
As a minor nit, at least in softmax.h we are going with the naming convention of SoftmaxInit and SoftmaxPrepare.
Removed in favor of shared ConvPrepare method.
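Following the SoftmaxInit/SoftmaxPrepare naming convention mentioned above, the shared declaration in conv.h might be (a sketch):

```cpp
// Shared by the reference and optimized conv kernels, analogous to
// SoftmaxPrepare in softmax.h.
TfLiteStatus ConvPrepare(TfLiteContext* context, TfLiteNode* node);
```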
Summary:
- Move shared structs / helper functions into conv_common.cc.
- Clean up some of the existing code to directly call the reference implementations (made possible by the refactor of the helper functions).
Tested with:

```
make -j8 -f tensorflow/lite/micro/tools/make/Makefile test_kernel_conv_test
make -j8 -f tensorflow/lite/micro/tools/make/Makefile OPTIMIZED_KERNEL_DIR=cmsis_nn TARGET=stm32f4 test_kernel_conv_test
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG test_kernel_conv_test
```