Use xa_nnlib for depthwise_conv for Fusion F1 #47380
Conversation
Thanks for contributing to TensorFlow Lite Micro. To keep this process moving along, we'd like to make sure that you have completed the items on this list:
We would like to have a discussion on the GitHub issue first to determine the best path forward, and then proceed to the PR review.
Force-pushed from 75bed77 to 2eaf7d1
Force-pushed from 3016266 to 6a6e4e3
return kTfLiteOk;
}

#if defined(Fusion_F1)
uppercase FUSION_F1
Fixed. Ran both the Fusion F1 build and tests to verify.
The code in this change is the subset of functionality needed for int8 depthwise_conv for Hifi4, copied from pnikam-cad/tensorflow@a737c1e/tensorflow/lite/micro/kernels/xtensa_hifi/depthwise_conv.cc
Note that the current change has not pulled in the floating-point or uint8 implementations, or the Hifi5 implementation.
Profiled the person_detection_benchmark with the following command:
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=F1_190305_swupgrade run_person_detection_benchmark -j8
This gives a latency of 9.661M ticks with this change vs. 73.761M ticks without it.
Per-op latency with this change:
Without this change:
Confirmed that the kernel_conv_test passes with:
Progress towards http://b/177457688