Refactor conv to share code between reference and optimized kernels #47112
Conversation
Thanks for contributing to TensorFlow Lite Micro. To keep this process moving along, we'd like to make sure that you have completed the items on this list:
- We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.
Some of my comments are for the existing code and not just the refactor -- apologies for that.
tensorflow/lite/micro/kernels/BUILD
```
cc_library(
    name = "conv",
    srcs = [
```
We are ordering the build rules alphabetically.
Done.
```diff
@@ -178,144 +92,70 @@ TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
   data->output_zero_point = output->params.zero_point;
```
Can these also be moved inside the CalculateOpDataConv function?
Done.
tensorflow/lite/micro/kernels/conv.h
```cpp
// Index to buffer for optimizations if applicable.
int buffer_idx;
```
This is only needed for cmsis_nn, right?
Let's follow the pattern of fully_connected where we only add the extra op data members for the optimized kernels that need them:
tensorflow/tensorflow/lite/micro/kernels/cmsis_nn/fully_connected.cc
Lines 32 to 37 in 8e4ba81
```cpp
struct OpData {
  OpDataFullyConnected reference_op_data;
  // Index to buffer for optimizations if applicable.
  int buffer_idx;
};
```
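A conv analog of that pattern might look like this (a sketch only; it assumes the shared struct is named OpDataConv, as referenced elsewhere in this PR):

```cpp
// Hypothetical op data for the cmsis_nn conv kernel, mirroring the
// fully_connected pattern: wrap the shared OpDataConv and add only the
// members the optimized kernel needs.
struct OpData {
  OpDataConv reference_op_data;
  // Index to buffer for optimizations if applicable.
  int buffer_idx;
};
```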
Done.
```cpp
// Returns a ConvParams struct with all the parameters needed for a
// quantized computation.
ConvParams ConvParamsQuantized(TfLiteConvParams* params,
```
Looks like params is also an input param; make it `const TfLiteConvParams& params`?
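For reference, a const-ref version of the declaration might read as follows (a sketch; the trailing parameters are cut off in this hunk, so anything beyond params is an assumption):

```cpp
// Returns a ConvParams struct with all the parameters needed for a
// quantized computation.
ConvParams ConvParamsQuantized(const TfLiteConvParams& params,
                               const OpDataConv& data);  // second param assumed
```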
Done.
```cpp
inline PaddingType RuntimePaddingType(TfLitePadding padding) {
  switch (padding) {
    case TfLitePadding::kTfLitePaddingSame:
      return PaddingType::kSame;
    case TfLitePadding::kTfLitePaddingValid:
      return PaddingType::kValid;
    case TfLitePadding::kTfLitePaddingUnknown:
    default:
      return PaddingType::kNone;
  }
}
```
Let's move the definition to the .cc to avoid overuse of inline.
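The resulting split might look like this (a sketch based on the snippet above; conv_common.cc is the shared .cc discussed in this PR):

```cpp
// In conv.h: declaration only, no inline definition.
PaddingType RuntimePaddingType(TfLitePadding padding);

// In conv_common.cc: the definition, moved out of the header.
PaddingType RuntimePaddingType(TfLitePadding padding) {
  switch (padding) {
    case TfLitePadding::kTfLitePaddingSame:
      return PaddingType::kSame;
    case TfLitePadding::kTfLitePaddingValid:
      return PaddingType::kValid;
    case TfLitePadding::kTfLitePaddingUnknown:
    default:
      return PaddingType::kNone;
  }
}
```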
Done.
```cpp
int input_width = input->dims->data[2];
int input_height = input->dims->data[1];
int filter_width = filter->dims->data[2];
int filter_height = filter->dims->data[1];
int output_width = output->dims->data[2];
int output_height = output->dims->data[1];
```
Make all of these const?
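With the suggestion applied, the block would read (sketch):

```cpp
const int input_width = input->dims->data[2];
const int input_height = input->dims->data[1];
const int filter_width = filter->dims->data[2];
const int filter_height = filter->dims->data[1];
const int output_width = output->dims->data[2];
const int output_height = output->dims->data[1];
```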
Done.
```cpp
TF_LITE_ENSURE(context, affine_quantization);
TF_LITE_ENSURE(context, affine_quantization->scale);
TF_LITE_ENSURE(context, affine_quantization->zero_point);
```
We really should be passing in bools to TF_LITE_ENSURE.
We are simply checking for scale and zero point being not null?
Let's make them DCHECK(scale != nullptr), DCHECK(affine_quantization != nullptr), DCHECK(zero_point != nullptr)?
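Applied to this hunk, the checks might become (a sketch; it uses the TFLITE_DCHECK macro that appears elsewhere in this file rather than a bare DCHECK):

```cpp
TFLITE_DCHECK(affine_quantization != nullptr);
TFLITE_DCHECK(affine_quantization->scale != nullptr);
TFLITE_DCHECK(affine_quantization->zero_point != nullptr);
```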
Done.
```cpp
data->input_zero_point = input->params.zero_point;
data->filter_zero_point = filter->params.zero_point;
data->output_zero_point = output->params.zero_point;
```
Move these to CalculateOpDataConv?
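A sketch of where these assignments might land; the parameter list of CalculateOpDataConv is an assumption, since only the function's name appears in this thread:

```cpp
// Hypothetical excerpt from CalculateOpDataConv in conv_common.cc.
TfLiteStatus CalculateOpDataConv(TfLiteContext* context, TfLiteNode* node,
                                 const TfLiteConvParams& params,
                                 const TfLiteTensor* input,
                                 const TfLiteTensor* filter,
                                 const TfLiteTensor* output,
                                 OpDataConv* data) {
  // ... existing padding / quantization setup ...
  data->input_zero_point = input->params.zero_point;
  data->filter_zero_point = filter->params.zero_point;
  data->output_zero_point = output->params.zero_point;
  return kTfLiteOk;
}
```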
Done.
```cpp
void* InitConv(TfLiteContext* context, const char* buffer, size_t length) {
  TFLITE_DCHECK(context->AllocatePersistentBuffer != nullptr);
  return context->AllocatePersistentBuffer(context, sizeof(OpDataConv));
}
```
Remove? It is not declared in conv.h.
Done.
```diff
 TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
   TFLITE_DCHECK(node->user_data != nullptr);
   TFLITE_DCHECK(node->builtin_data != nullptr);
   auto* params = reinterpret_cast<TfLiteConvParams*>(node->builtin_data);

-  TfLiteTensor* output = GetOutput(context, node, kOutputTensor);
-  const TfLiteTensor* input = GetInput(context, node, kInputTensor);
-  const TfLiteTensor* filter = GetInput(context, node, kFilterTensor);
+  TfLiteTensor* output = GetOutput(context, node, kConvOutputTensor);
+  const TfLiteTensor* input = GetInput(context, node, kConvInputTensor);
+  const TfLiteTensor* filter = GetInput(context, node, kConvWeightsTensor);
```
This looks identical to the Prepare in the reference kernel.
You do have PrepareConv in conv_common.cc but not in conv.h. Did you mean to share this code between the reference and xtensa kernels?
As a minor nit, at least in softmax.h we are going with the naming convention of SoftmaxInit and SoftmaxPrepare.
Removed in favor of shared ConvPrepare method.
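Following the SoftmaxInit/SoftmaxPrepare naming convention mentioned above, the shared declaration in conv.h might be (a sketch):

```cpp
// Shared by the reference and optimized conv kernels, analogous to
// SoftmaxPrepare in softmax.h.
TfLiteStatus ConvPrepare(TfLiteContext* context, TfLiteNode* node);
```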
Summary:
- Move shared structs / helper functions into conv_common.cc.
- Clean up some of the existing code to directly call the reference implementations (made possible by the refactor of the helper functions).
Tested with:

```
make -j8 -f tensorflow/lite/micro/tools/make/Makefile test_kernel_conv_test
make -j8 -f tensorflow/lite/micro/tools/make/Makefile OPTIMIZED_KERNEL_DIR=cmsis_nn TARGET=stm32f4 test_kernel_conv_test
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG test_kernel_conv_test
```