
Prediction using tflite_flutter takes too long (8 seconds) while same model in Kotlin predicts in 200ms?? #66

Closed
farazk86 opened this issue Jan 21, 2021 · 2 comments

Comments

@farazk86

Hi @am15h,

I'm at my wit's end and can't figure out how else to optimize my model. In Flutter, prediction takes about 8 to 9 seconds, which is very long. I thought something was wrong with the model, but when I tried the same model in Kotlin, it returned a result in under 200 ms.

I'm only timing the interpreter.run() call itself, using Stopwatch() to measure it:

timer.start();
_interpreter.run(inputIds, predictions);
print('inference done in ' + timer.elapsedMilliseconds.toString());
timer.reset();

I'm initializing the model like:

var interpreterOptions = InterpreterOptions()..threads = NUM_LITE_THREADS;
_interpreter = await Interpreter.fromAsset(
  modelFile,
  options: interpreterOptions,
);

I'm not using NNAPI, as it does not improve inference speed, and I can't use the GPU delegate, as it fails to initialize the model.

My input is of shape [1, 32] and of type int8. My output is of shape [1, 32, 50527] and of type float32.

I thought this was an error in my model, but then I ran the same model in Kotlin using:

tflite.runForMultipleInputsOutputs(arrayOf(inputIds), outputs)

and I get the same prediction in under 200 ms. The Kotlin interpreter is initialized on the CPU, just like the Flutter one:

private suspend fun loadModel(): Interpreter = withContext(Dispatchers.IO) {
    val assetFileDescriptor = getApplication<Application>().assets.openFd(MODEL_PATH)
    assetFileDescriptor.use {
        val fileChannel = FileInputStream(assetFileDescriptor.fileDescriptor).channel
        val modelBuffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, it.startOffset, it.declaredLength)

        val opts = Interpreter.Options()
        opts.setNumThreads(NUM_LITE_THREADS)
        return@use Interpreter(modelBuffer, opts)
    }
}

Is there any reason why the model performs so poorly in Flutter? What can I change to fix this? Any thoughts on this will be very helpful.

Thank you

@am15h
Owner

am15h commented Jan 22, 2021

@farazk86 Are you using multidimensional Dart lists for inputIds and predictions? If yes, can you try using TensorBuffer from tflite_flutter_helper instead? I will try to investigate the cause of such terrible performance with Dart lists.
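For context on why nested lists are costly here: the [1, 32, 50527] float32 output holds about 1.6 million elements, and copying them into nested per-element structures dominates the run time, whereas a TensorBuffer-style approach keeps the whole tensor in one contiguous direct buffer. A minimal Java sketch of that flat layout (the FlatOutput class and flatIndex helper are hypothetical illustrations, not part of tflite_flutter_helper):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FlatOutput {
    // Output tensor shape from this issue: [1, 32, 50527], float32.
    static final int BATCH = 1, SEQ = 32, VOCAB = 50527;

    // Row-major flat index into the [BATCH, SEQ, VOCAB] tensor.
    static int flatIndex(int b, int t, int v) {
        return (b * SEQ + t) * VOCAB + v;
    }

    public static void main(String[] args) {
        // One contiguous direct buffer sized for the whole tensor
        // (4 bytes per float32 element): 1 * 32 * 50527 * 4 = 6,467,456 bytes.
        ByteBuffer out = ByteBuffer
                .allocateDirect(BATCH * SEQ * VOCAB * 4)
                .order(ByteOrder.nativeOrder());

        // The interpreter can fill this buffer in a single bulk write,
        // instead of walking ~1.6 million nested list elements one by one.
        System.out.println("capacity=" + out.capacity());
    }
}
```

Reading a logit back is then a single indexed access, e.g. `out.getFloat(4 * flatIndex(0, t, v))`, with no nested-list traversal.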

@farazk86
Author

> @farazk86 Are you using multidimensional dart lists for inputIds and predictions, if yes, can you try using TensorBuffer from tflite_flutter_helper instead? I will try to investigate the cause of such terrible performance with dart lists.

Thanks for the reply. Unfortunately, using TensorBuffer did not provide any considerable speedup; it reduced inference time by only a couple of seconds.

I ended up using Flutter's MethodChannel (invokeMethod) to run inference in Java instead. This reduced the inference time to about 300 ms :)
