
Low Latency Deep Learning on Smartphones


In this project, we empirically evaluate the inference performance of two mobile deep learning frameworks, TensorFlow Lite and CoreML, on Android and iOS devices respectively, using various Convolutional Neural Network architectures.

Frameworks

  1. TensorFlow Lite (for Android)
  2. CoreML (for iOS)

On-board devices and API Support

  1. CPU
  2. GPU
  3. Apple Neural Engine (ANE) [CoreML, iOS]
  4. Neural Network API (NNAPI) [TF Lite, Android]
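
On Android, selecting between these back-ends in the TF Lite Java API looks roughly like the sketch below. The Interpreter, GpuDelegate, and NnApiDelegate classes are the standard org.tensorflow.lite ones; the surrounding helper method is illustrative, not this repository's exact code. On the iOS side, CoreML exposes a similar choice through the computeUnits property of MLModelConfiguration.

    import org.tensorflow.lite.Interpreter;
    import org.tensorflow.lite.gpu.GpuDelegate;
    import org.tensorflow.lite.nnapi.NnApiDelegate;

    import java.nio.MappedByteBuffer;

    class BackendSelection {
        // Hypothetical helper: build an interpreter for the chosen back-end.
        static Interpreter buildInterpreter(MappedByteBuffer modelBuffer, String backend) {
            Interpreter.Options options = new Interpreter.Options();
            switch (backend) {
                case "gpu":
                    options.addDelegate(new GpuDelegate());   // GPU delegate
                    break;
                case "nnapi":
                    options.addDelegate(new NnApiDelegate()); // NNAPI delegate
                    break;
                default:
                    options.setNumThreads(4);                 // plain CPU, no delegates
            }
            return new Interpreter(modelBuffer, options);
        }
    }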

Usage

TFLite Performance

  1. In Android Studio, use the Open an Existing Project option and select the Android-TFLite\TFLitePerformance folder.
  2. If you wish to use a custom TF Lite model, copy the .tflite file to the app\assets folder and update the path returned by the getModelPath function in the model's classifier Java file (e.g. ClassifierSqueezeNet.java) in the lib_support library (see the sketch after this list).
  3. Use the Device File Explorer to upload the test data onto the device. The default data location is /data/local/tmp/DataSet/; this can be changed by updating the following line of code at line 114 of MainActivity.java.
    File dataset_folder = new File("<dataset location on device>");
  4. Build the project using Build > Make Project.
  5. Run the application on device/emulator using Run > Run 'app'.
  6. If you wish to start profiling with application launch, use Run > Profile 'app' instead.
  7. On the device/emulator, ensure the correct model and compute device are selected.
  8. Tap the Benchmark button in the mobile application to start inference.
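
For reference, a minimal sketch of the pieces that steps 2 and 3 touch. The class and file names come from the steps above; the model file name and the wiring around it are illustrative assumptions, not the repository's exact code.

    import java.io.File;

    // Step 2: in the model's classifier class in lib_support
    // (e.g. ClassifierSqueezeNet.java), getModelPath returns the
    // .tflite file name placed under app\assets:
    @Override
    protected String getModelPath() {
        return "squeezenet.tflite"; // hypothetical file name; use your model's
    }

    // Step 3: in MainActivity.java (around line 114), point the app
    // at the dataset folder uploaded via the Device File Explorer:
    File dataset_folder = new File("/data/local/tmp/DataSet/");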

Benchmark Metrics

CPU

  • TensorFlow Lite

    These metrics were recorded using Android Studio CPU Profiler.

    [Figure: MobileNet V2 on CPU (no delegates), 4 threads]

    [Figure: MobileNet V2 on CPU (no delegates), 8 threads]

    [Figure: MobileNet V2 using GPU delegate]

    [Figure: MobileNet V2 using NNAPI delegate]

  • CoreML

    Metrics were recorded in three different scenarios.

    [Figure: iOS CPU utilization]

GPU

  • TensorFlow Lite

    These metrics were recorded using Qualcomm Snapdragon Profiler.

    [Figure: MobileNet V2 on CPU (no delegates)]

    [Figure: MobileNet V2 using GPU delegate]

    [Figure: MobileNet V2 using NNAPI delegate]

Throughput

  • CoreML

    [Figure: MobileNet V2, CPU]

Memory

  • CoreML

    [Figure: MobileNet V2, CPU]

FPS

For evaluation purposes, the frame rate of the real-time video capture is restricted to 50 FPS; the following is the FPS actually processed by the model in real time (a timing sketch follows the figure below).

  • CoreML

    [Figure: MobileNet V2, CPU]
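
As a rough illustration of how processed FPS relates to per-frame latency, a generic timing sketch follows; this is not the repository's measurement code, and runInference is a hypothetical stand-in for the model's per-frame call.

    // Generic sketch: processed FPS = frames handled / elapsed seconds.
    static double measureProcessedFps(int totalFrames) {
        long start = System.nanoTime();
        for (int i = 0; i < totalFrames; i++) {
            runInference(); // hypothetical per-frame model call
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return totalFrames / seconds; // bounded above by the 50 FPS capture rate
    }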

Impact of quantization on accuracy

  • CoreML

    [Figure: MobileNet V2, CPU]

Evaluation Results

[Figures: Size vs. Accuracy; Inference Time vs. Accuracy]

Key Points:

  • The same model can yield different accuracy on different platforms.
  • The choice of on-board compute device (CPU, GPU) is highly dependent on the mobile device.
  • Neural-network acceleration back-ends such as the ANE (CoreML, iOS) and NNAPI (TFLite, Android) are not always the best choice for inference.
  • There is a substantial trade-off between accuracy and throughput, which must be weighed against the application's requirements.

Environments

  1. Xcode
  2. Android Studio
  3. Qualcomm Snapdragon Profiler
