Use xa_nnlib for depthwise_conv for Fusion F1 #47380

njeffrie · 2021-02-24T20:13:09Z

The code in this change is the subset of functionality needed for int8 svdf for Hifi4 copied from pnikam-cad/tensorflow@a737c1e/tensorflow/lite/micro/kernels/xtensa_hifi/depthwise_conv.cc

Note that the current change has not pulled in the floating point, uint8 implementation or the Hifi5 implementation.

Profiled the person_detection_benchmark with the following command:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=F1_190305_swupgrade run_person_detection_benchmark -j8
gives a latency of 9.661M ticks with this change vs 73.761M ticks without this change.

Per OP latency with this change:

WithPersonDataIterations(1) took 9661310 ticks (9661 ms)
DEPTHWISE_CONV_2D took 1156157 ticks (1156 ms).
DEPTHWISE_CONV_2D took 628525 ticks (628 ms).
CONV_2D took 987374 ticks (987 ms).
DEPTHWISE_CONV_2D took 395420 ticks (395 ms).
CONV_2D took 554630 ticks (554 ms).
DEPTHWISE_CONV_2D took 545252 ticks (545 ms).
CONV_2D took 665222 ticks (665 ms).
DEPTHWISE_CONV_2D took 172412 ticks (172 ms).
CONV_2D took 334262 ticks (334 ms).
DEPTHWISE_CONV_2D took 283280 ticks (283 ms).
CONV_2D took 444854 ticks (444 ms).
DEPTHWISE_CONV_2D took 88394 ticks (88 ms).
CONV_2D took 225302 ticks (225 ms).
DEPTHWISE_CONV_2D took 158090 ticks (158 ms).
CONV_2D took 335894 ticks (335 ms).
DEPTHWISE_CONV_2D took 158090 ticks (158 ms).
CONV_2D took 335894 ticks (335 ms).
DEPTHWISE_CONV_2D took 158090 ticks (158 ms).
CONV_2D took 335894 ticks (335 ms).
DEPTHWISE_CONV_2D took 158090 ticks (158 ms).
CONV_2D took 335894 ticks (335 ms).
DEPTHWISE_CONV_2D took 158090 ticks (158 ms).
CONV_2D took 335894 ticks (335 ms).
DEPTHWISE_CONV_2D took 59525 ticks (59 ms).
CONV_2D took 173270 ticks (173 ms).
DEPTHWISE_CONV_2D took 112424 ticks (112 ms).
CONV_2D took 283862 ticks (283 ms).
AVERAGE_POOL_2D took 75604 ticks (75 ms).
CONV_2D took 3398 ticks (3 ms).
RESHAPE took 290 ticks (0 ms).
SOFTMAX took 1933 ticks (1 ms).

Without this change:

KeywordRunNIerations(1) took 38516 ticks (38 ms)
DEPTHWISE_CONV_2D took 11961939 ticks (11961 ms).
DEPTHWISE_CONV_2D took 12296923 ticks (12296 ms).
CONV_2D took 987358 ticks (987 ms).
DEPTHWISE_CONV_2D took 6138259 ticks (6138 ms).
CONV_2D took 554614 ticks (554 ms).
DEPTHWISE_CONV_2D took 12063331 ticks (12063 ms).
CONV_2D took 665206 ticks (665 ms).
DEPTHWISE_CONV_2D took 3018615 ticks (3018 ms).
CONV_2D took 334246 ticks (334 ms).
DEPTHWISE_CONV_2D took 5837463 ticks (5837 ms).
CONV_2D took 444838 ticks (444 ms).
DEPTHWISE_CONV_2D took 1462009 ticks (1462 ms).
CONV_2D took 225286 ticks (225 ms).
DEPTHWISE_CONV_2D took 2734009 ticks (2734 ms).
CONV_2D took 335878 ticks (335 ms).
DEPTHWISE_CONV_2D took 2734009 ticks (2734 ms).
CONV_2D took 335878 ticks (335 ms).
DEPTHWISE_CONV_2D took 2734009 ticks (2734 ms).
CONV_2D took 335878 ticks (335 ms).
DEPTHWISE_CONV_2D took 2734009 ticks (2734 ms).
CONV_2D took 335878 ticks (335 ms).
DEPTHWISE_CONV_2D took 2734009 ticks (2734 ms).
CONV_2D took 335878 ticks (335 ms).
DEPTHWISE_CONV_2D took 685980 ticks (685 ms).
CONV_2D took 173254 ticks (173 ms).
DEPTHWISE_CONV_2D took 1197084 ticks (1197 ms).
CONV_2D took 283846 ticks (283 ms).
AVERAGE_POOL_2D took 75604 ticks (75 ms).
CONV_2D took 3382 ticks (3 ms).
RESHAPE took 290 ticks (0 ms).
SOFTMAX took 1933 ticks (1 ms).

Confirmed that the kernel_conv_test passes with:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=F1_190305_swupgrade test_kernel_depthwise_conv_test -j8

Progress towards http://b/177457688

google-ml-butler · 2021-02-24T20:13:12Z

Thanks for contributing to TensorFlow Lite Micro.

To keep this process moving along, we'd like to make sure that you have completed the items on this list:

Read the contributing guidelines for TensorFlow Lite Micro
Created a TF Lite Micro Github issue
Linked to the issue from the PR description

We would like to have a discussion on the Github issue first to determine the best path forward, and then proceed to the PR review.

advaitjain · 2021-02-25T18:00:52Z

tensorflow/lite/micro/kernels/xtensa/depthwise_conv.cc

+  return kTfLiteOk;
+}
+
+#if defined(Fusion_F1)


uppercase FUSION_F1

Fixed Ran both Fusion F1 build + tests to verify.

google-ml-butler bot added the size:L CL Change Size: Large label Feb 24, 2021

google-cla bot added the cla: yes label Feb 24, 2021

njeffrie requested a review from advaitjain February 24, 2021 21:03

njeffrie force-pushed the f1_depthwise branch from 75bed77 to 2eaf7d1 Compare February 24, 2021 23:48

gbaned self-assigned this Feb 25, 2021

gbaned added the comp:micro Related to TensorFlow Lite Microcontrollers label Feb 25, 2021

gbaned added this to Assigned Reviewer in PR Queue via automation Feb 25, 2021

njeffrie added 3 commits February 25, 2021 09:41

Use xa_nnlib for depthwise_conv for Fusion F1

b0e48d9

Address review comments.

2280efe

Fix hifimini errors

6a6e4e3

njeffrie force-pushed the f1_depthwise branch from 3016266 to 6a6e4e3 Compare February 25, 2021 17:42

advaitjain requested changes Feb 25, 2021

View reviewed changes

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Feb 25, 2021

Capitalization of FUSION_F1

77cee3d

njeffrie requested a review from advaitjain February 25, 2021 18:09

advaitjain approved these changes Feb 25, 2021

View reviewed changes

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Feb 25, 2021

google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 25, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 25, 2021

copybara-service bot merged commit dd854c7 into tensorflow:master Feb 25, 2021

PR Queue automation moved this from Approved by Reviewer to Merged Feb 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use xa_nnlib for depthwise_conv for Fusion F1 #47380

Use xa_nnlib for depthwise_conv for Fusion F1 #47380

njeffrie commented Feb 24, 2021

google-ml-butler bot commented Feb 24, 2021

advaitjain Feb 25, 2021

njeffrie Feb 25, 2021 •

edited

Use xa_nnlib for depthwise_conv for Fusion F1 #47380

Use xa_nnlib for depthwise_conv for Fusion F1 #47380

Conversation

njeffrie commented Feb 24, 2021

google-ml-butler bot commented Feb 24, 2021

advaitjain Feb 25, 2021

Choose a reason for hiding this comment

njeffrie Feb 25, 2021 • edited

Choose a reason for hiding this comment

njeffrie Feb 25, 2021 •

edited