Refactoring fully_connected to share code between reference and optimized kernels. #46242

advaitjain · 2021-01-07T00:47:35Z

Summary:

Move shared structs / helper functions into fully_connected_common.cc
Clean up some of the existing code to directly call the reference implementations (made possible by the refactor of the helper functions).

Also, this refactor addresses the sign flip in fully_connected: http://b/138810107

google-ml-butler · 2021-01-07T00:47:38Z

Thanks for contributing to TensorFlow Lite Micro.

To keep this process moving along, we'd like to make sure that you have completed the items on this list:

Read the contributing guidelines for TensorFlow Lite Micro
Created a TF Lite Micro Github issue
Linked to the issue from the PR description

We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.

…ized kernels. This change is currently for discussion and to figure out what parts of this refactor we like and what we do not. Also note that we have `#if !defined(XTENSA)` in fully_connected_common.cc because the linker appears to be failing to discard the unused symbols. We will discuss this further with the Cadence engineers but having some repro case that can be merged would likely be useful. TODO: make a github issue describing this linker behavior in more detail. Also, this refactor addresses the sign flip in fully_connected: http://b/138810107

advaitjain · 2021-01-07T00:57:11Z

tensorflow/lite/micro/kernels/fully_connected_common.cc

+      context, sizeof(OpDataFullyConnectedReference));
+}
+
+#if !defined(XTENSA)


@pnikam-cad @nyadla-sys @kpraving

These ifdefs are very undesirable but currently needed because the xtensa linker does not appear to be dropping unused symbols.

If I build the keyword_benchmark without this define the size is ~2.5 KB larger than if I keep this define. However, all the functions within this define are unused and I would expect the linker to be able to drop those symbols.

I will give more details tomorrow but wanted to flag this issue.

Update: this was user error on my part; I very likely missed a make clean.

Filed #46261 to fix the underlying cause.

advaitjain · 2021-01-07T01:01:48Z

@freddan80 @felix-johnny: we're trying to see what might work to increase code-sharing between reference and optimized kernels. Let us know if you have any suggestions.

@yair-ehrenwald: this PR is a stab at improved code sharing. Let us know if you have any suggestions.

advaitjain

@njeffrie and I talked some more (see the review comments).

advaitjain · 2021-01-07T18:55:47Z

tensorflow/lite/micro/kernels/cmsis-nn/fully_connected.cc

-      return EvalFloat(context, node, params->activation, input, filter, bias,
-                       output);
+      return EvalFloatFullyConnectedReference(context, node, params->activation,
+                                              input, filter, bias, output);


probably better to avoid the intermediate EvalFloatFullyConnectedReference altogether:

tflite::reference_ops::FullyConnected(
    ToFloatParams(activation),
    tflite::micro::GetTensorShape(input),
    tflite::micro::GetTensorData<float>(input),
    tflite::micro::GetTensorShape(filter),
    tflite::micro::GetTensorData<float>(filter),
    tflite::micro::GetTensorShape(bias),
    tflite::micro::GetTensorData<float>(bias),
    tflite::micro::GetTensorShape(output),
    tflite::micro::GetTensorData<float>(output));

advaitjain · 2021-01-07T18:57:34Z

tensorflow/lite/micro/kernels/fully_connected_common.cc

+  return kTfLiteOk;
+}
+
+TfLiteStatus EvalQuantizedInt8FullyConnectedReference(


remove in favor of directly calling the reference implementation, now that the OpData to OpParams is a single function call.
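A minimal sketch of what "OpData to OpParams is a single function call" could look like. All type and function names here (SketchOpData, SketchFullyConnectedParams, ToSketchQuantizedParams) are hypothetical stand-ins rather than the names used in the PR, and the offset sign convention is an assumption made for illustration only.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-op data computed once at Prepare time.
struct SketchOpData {
  int32_t output_multiplier;
  int output_shift;
  int32_t output_activation_min;
  int32_t output_activation_max;
  int32_t input_zero_point;
  int32_t filter_zero_point;
  int32_t output_zero_point;
};

// Hypothetical stand-in for the parameter struct a reference kernel consumes.
struct SketchFullyConnectedParams {
  int32_t input_offset;
  int32_t weights_offset;
  int32_t output_offset;
  int32_t output_multiplier;
  int output_shift;
  int32_t quantized_activation_min;
  int32_t quantized_activation_max;
};

// Single conversion point from op data to kernel params. Keeping every sign
// convention in this one function means each kernel that calls it agrees on
// them. Assumption: input and weight offsets are the negated zero points.
inline SketchFullyConnectedParams ToSketchQuantizedParams(
    const SketchOpData& data) {
  SketchFullyConnectedParams params;
  params.input_offset = -data.input_zero_point;
  params.weights_offset = -data.filter_zero_point;
  params.output_offset = data.output_zero_point;
  params.output_multiplier = data.output_multiplier;
  params.output_shift = data.output_shift;
  params.quantized_activation_min = data.output_activation_min;
  params.quantized_activation_max = data.output_activation_max;
  return params;
}
```

With a helper of this shape, an optimized kernel's fallback path can pass the converted params straight to the reference implementation instead of going through an intermediate Eval wrapper.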

advaitjain · 2021-01-07T19:05:21Z

tensorflow/lite/micro/kernels/fully_connected.h

+  }
+};
+
+extern const int kFullyConnectedInputTensor;


we'll make these:

inline int fully_Connected_input_tensor_index() { return 0; }

kept as extern const.
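For comparison, the two options discussed in this thread can be sketched side by side. The names kSketchInputTensor and SketchInputTensorIndex are hypothetical, and the header and source parts are shown in one file only to keep the sketch self-contained.

```cpp
// Option A (the one kept): declare in the shared header...
extern const int kSketchInputTensor;

// ...and define exactly once in the common .cc file. The earlier extern
// declaration gives this const external linkage.
const int kSketchInputTensor = 0;

// Option B (considered, not adopted): a header-only inline accessor that
// needs no out-of-line definition.
inline int SketchInputTensorIndex() { return 0; }
```

Both forms give every kernel the same index; the difference is whether the value lives in a single translation unit or is visible inline at each call site.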

advaitjain · 2021-01-07T19:14:04Z

tensorflow/lite/micro/kernels/fully_connected_common.cc

+}
+
+#if !defined(XTENSA)
+TfLiteStatus CalculateOpDataFullyConnectedReference(


we'll keep this function in common

advaitjain · 2021-01-07T19:15:50Z

tensorflow/lite/micro/kernels/fully_connected_common.cc

+  return kTfLiteOk;
+}
+
+TfLiteStatus EvalQuantizedFullyConnectedReference(


this will be removed.

advaitjain · 2021-01-07T19:20:21Z

tensorflow/lite/micro/kernels/fully_connected.h


 namespace tflite {

+struct OpDataFullyConnectedReference {


call this OpDataFullyConnected (drop the reference suffix).

advaitjain · 2021-01-07T23:31:01Z

Ready for review again.

freddan80 · 2021-01-11T09:31:02Z

@advaitjain I'll have a look at it today.

freddan80

Looks nice! I have a minor clean-up comment.

freddan80 · 2021-01-11T11:10:53Z

tensorflow/lite/micro/kernels/cmsis-nn/fully_connected.cc

 TfLiteStatus CalculateOpData(TfLiteContext* context,
                             TfLiteFusedActivation activation,
                             TfLiteType data_type, const TfLiteTensor* input,
                             const TfLiteTensor* filter,


CalculateOpData() can be removed if we propagate "data->buffer_idx = -1;" into Prepare().
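A rough sketch of the clean-up suggested here, assuming that -1 marks "no scratch buffer requested" and that the remaining work in CalculateOpData() is already covered by a shared helper. SketchCmsisOpData, SketchStatus, and SketchPrepare are hypothetical names, not the actual kernel code.

```cpp
// Hypothetical stand-in for the CMSIS-NN kernel's per-op data.
struct SketchCmsisOpData {
  int buffer_idx;  // Index of a scratch buffer; -1 is assumed to mean "none".
};

enum SketchStatus { kSketchOk, kSketchError };

// Prepare sets the sentinel directly, so a separate CalculateOpData()
// wrapper that existed only to do this assignment is no longer needed.
inline SketchStatus SketchPrepare(SketchCmsisOpData* data) {
  if (data == nullptr) {
    return kSketchError;
  }
  data->buffer_idx = -1;
  // A call to the shared op-data calculation would go here.
  return kSketchOk;
}
```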

google-ml-butler bot added the size:L CL Change Size: Large label Jan 7, 2021

google-cla bot added the cla: yes label Jan 7, 2021

advaitjain added the comp:micro Related to TensorFlow Lite Microcontrollers label Jan 7, 2021

advaitjain requested a review from njeffrie January 7, 2021 00:47

advaitjain force-pushed the refactor-fully_connected branch from 55870c4 to 0a469be Compare January 7, 2021 00:52

advaitjain added the kokoro:force-run Tests on submitted change label Jan 7, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 7, 2021

gbaned self-assigned this Jan 7, 2021

advaitjain force-pushed the refactor-fully_connected branch from 49ed518 to 14be7a3 Compare January 7, 2021 21:31

advaitjain added the kokoro:force-run Tests on submitted change label Jan 7, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 7, 2021

advaitjain force-pushed the refactor-fully_connected branch from 14be7a3 to fa2d61e Compare January 7, 2021 21:46

advaitjain mentioned this pull request Jan 7, 2021

different TFLM builds use the same output directory. #46261

Closed

Updates based on discussion with Nat.

bbd388b

advaitjain force-pushed the refactor-fully_connected branch from fa2d61e to bbd388b Compare January 7, 2021 23:30

advaitjain added the kokoro:force-run Tests on submitted change label Jan 7, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 7, 2021

njeffrie approved these changes Jan 8, 2021

google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jan 8, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 8, 2021

freddan80 suggested changes Jan 11, 2021

advaitjain mentioned this pull request Jan 11, 2021

layering check mismatch between internal and open-source TFLM bazel builds #46347

Closed

google-ml-butler bot removed the ready to pull PR ready for merge process label Jan 12, 2021

address review comments.

c8bc253

advaitjain force-pushed the refactor-fully_connected branch from a2a53b8 to c8bc253 Compare January 12, 2021 04:57

advaitjain added the kokoro:force-run Tests on submitted change label Jan 12, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 12, 2021

advaitjain added ready to pull PR ready for merge process kokoro:force-run Tests on submitted change and removed ready to pull PR ready for merge process labels Jan 12, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 12, 2021

copybara-service bot merged commit 88e5764 into tensorflow:master Jan 12, 2021

advaitjain deleted the refactor-fully_connected branch January 12, 2021 23:07

advaitjain mentioned this pull request Jan 19, 2021

TFLM Added optimized fully connected (float32 and int8) for CEVA-BX1 #45606

Closed
