
The inference speed using the old version is 5-8 times faster than the latest version #202

Open
yingeo opened this issue Mar 8, 2024 · 7 comments

Comments

@yingeo

yingeo commented Mar 8, 2024


@alexw92

alexw92 commented Mar 15, 2024

I also noticed inference time increasing significantly, which makes live detection with anything beyond a super-primitive model very sluggish. Do you remember which version the inference time started to increase in, @yingeo?

@timukasr

I looked into it and it seems the issue is in Flutter itself. I forked an older repo and tested it:

Environment

This was tested on the following environment:

  • Built on Windows 10 with Java 11.0.12
  • Run on Pixel 5 (Android 14)

I just repeatedly detected the same blank(ish) camera image. It's not too scientific (e.g. the phone might heat up and produce lower numbers), so a few ms of difference should not be taken too seriously, but the 7x or 16x difference is significant.
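For anyone who wants to reproduce this kind of measurement, the numbers below can be collected with a simple Stopwatch around the interpreter call. This is a minimal sketch, not the benchmark code from my repo; it assumes a loaded tflite_flutter Interpreter, and `input`/`output` are placeholder buffers that must match the model's tensor shapes:

```dart
import 'package:tflite_flutter/tflite_flutter.dart';

/// Rough inference benchmark: averages [runs] calls to Interpreter.run.
/// `input` and `output` are placeholders and must be shaped for the model;
/// a few warm-up runs keep delegate initialization out of the average.
void benchmark(Interpreter interpreter, Object input, Object output,
    {int warmup = 5, int runs = 50}) {
  for (var i = 0; i < warmup; i++) {
    interpreter.run(input, output);
  }
  final sw = Stopwatch()..start();
  for (var i = 0; i < runs; i++) {
    interpreter.run(input, output);
  }
  sw.stop();
  final avgMs = sw.elapsedMicroseconds / runs / 1000;
  print('avg inference: ${avgMs.toStringAsFixed(1)} ms');
}
```

Running the same function on the same device under two Flutter versions is enough to show the gap.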

Results

Average time until the numbers settle (times in ms):

Test                                      Inference   Pre-processing   Total predict   Total elapsed
tflite_flutter 0.9.5  / Flutter 3.13.9        22            17               39              51
tflite_flutter 0.9.5  / Flutter 3.16.9       359            18              378             387
tflite_flutter 0.10.4 / Flutter 3.13.9        22            15               38              50
tflite_flutter 0.10.4 / Flutter 3.16.9       351            16              368             376

Conclusion

  • No difference between tflite_flutter 0.9.5 and 0.10.4
  • Big difference between Flutter 3.13.9 and Flutter 3.16.9 - 16x slower inference time and 7.5x slower total time.

Flutter 3.19.3 seemed to be as slow as 3.16.9, so it appears the problem is not yet fixed.

My repo: https://github.com/timukasr/object_detection_flutter

Similar results can be reproduced with the tflite_flutter sample app live_object_detection_ssd_mobilenet. That app seems a bit broken - it does not display results on screen - but it does report inference time. With Flutter 3.13.9 it was 170-200 ms; with Flutter 3.19.3 it was 500-550 ms. For some reason it is slow even with the older Flutter, but even slower with the newer one.

@alexw92

alexw92 commented Mar 16, 2024

@timukasr Thanks a lot for your insights! I didn't even try different Flutter versions. I noticed an inference-time increase just from changing the tflite_flutter version, when I upgraded to 0.10.4 (from 0.10.0) I think. I will continue to investigate when I find time.

@alexw92

alexw92 commented Mar 16, 2024

I take it all back: I was not able to reproduce significant inference differences across the 0.10.x versions. The massive slowdown between different Flutter versions should still be addressed at some point.

@yingeo

yingeo commented Mar 19, 2024

It has nothing to do with the version of tflite_flutter, only with the version of Flutter. Currently I use tflite_flutter 0.10.4 with Flutter 3.7.12, and the inference time is consistent with native Android speed. If you upgrade Flutter to the latest version, the inference speed becomes terribly slow.
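For anyone who wants to guard against accidentally building with a Flutter version beyond a known-good one, the Flutter SDK can be constrained in pubspec.yaml. This is just a sketch; the constraint values here are illustrative, not a recommendation:

```yaml
# pubspec.yaml (fragment) - restricts builds to a known-good Flutter version.
# The exact version values are placeholders.
environment:
  sdk: ">=2.19.0 <3.0.0"
  flutter: "3.7.12"
```

With this in place, `flutter pub get` fails on a mismatched SDK instead of silently building with a slower Flutter.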

@yanghoonkim

@yingeo Did you compare 3.7.12 with other versions (such as 3.13.9)? I'm having a hard time after downgrading to 3.7.12 (several configs had to be changed, lol).

@yanghoonkim

yanghoonkim commented May 3, 2024

I tried tflite_flutter 0.10.4 + Flutter 3.13.9 + video classification (MoviNet) + isolateInterpreter.runForMultipleInputs, and I couldn't find any difference between 3.13.9 and the latest version of Flutter.

My conclusion is that tflite_flutter is currently not optimized for some models/operators: one inference step (on a single frame image) takes more than 3000 ms on a Samsung Galaxy S8 and more than 2000 ms on a Samsung Galaxy Z Flip 3.
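For context, the call pattern described above looks roughly like this. This is a sketch assuming tflite_flutter 0.10.x; the model path and the tensor shapes are placeholders, not the actual MoviNet setup:

```dart
import 'package:tflite_flutter/tflite_flutter.dart';

Future<void> classifyFrame(Object frame) async {
  // 'assets/movinet.tflite' is a placeholder path, not the actual model file.
  final interpreter = await Interpreter.fromAsset('assets/movinet.tflite');

  // Wrap the interpreter so inference runs off the UI thread.
  final isolateInterpreter =
      await IsolateInterpreter.create(address: interpreter.address);

  // Output buffers keyed by output tensor index; a 600-class score vector
  // is a placeholder shape, not MoviNet's real output signature.
  final outputs = <int, Object>{
    0: [List<double>.filled(600, 0.0)],
  };
  await isolateInterpreter.runForMultipleInputs([frame], outputs);
  print(outputs[0]);
}
```

Even with the isolate-backed interpreter the heavy work is the same native inference, so the per-frame cost reported above is dominated by the model/operator support, not the isolate overhead.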
