Hi @am15h,

I'm at my wits' end and can't figure out how else to optimize my model, but in Flutter, prediction takes about 8 to 9 seconds, which is very long. I thought something was wrong with my model, but when I tried the same model in Kotlin, it gave a result in under 200 ms.
I'm only timing the interpreter.run() call, using Stopwatch() to track it:
timer.start();
_interpreter.run(inputIds, predictions);
print('inference done in ${timer.elapsedMilliseconds} ms');
timer.reset();
I'm initializing the model like this:
var interpreterOptions = InterpreterOptions()..threads = NUM_LITE_THREADS;
_interpreter = await Interpreter.fromAsset(
  modelFile,
  options: interpreterOptions,
);
I'm not using NNAPI, as it does not improve the inference speed, and I can't use the GPU delegate because the model fails to initialize with it.
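For context, this is roughly how I tried to attach the delegates — a minimal sketch, assuming tflite_flutter's GpuDelegateV2 and useNnApiForAndroid options and a hypothetical asset name:

import 'package:tflite_flutter/tflite_flutter.dart';

Future<Interpreter> loadWithDelegate() async {
  // Sketch only: attach the Android GPU delegate (this is the path that
  // fails to initialize the model for me).
  final options = InterpreterOptions()..addDelegate(GpuDelegateV2());

  // Alternative I tried: NNAPI on Android (gave no measurable speedup).
  // final options = InterpreterOptions()..useNnApiForAndroid = true;

  return Interpreter.fromAsset('model.tflite', options: options); // hypothetical asset
}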
My input has shape [1, 32] and is of type int8. My output has shape [1, 32, 50527] and is of type float32.
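Note the output alone holds 1 × 32 × 50527 = 1,616,864 float32 values (roughly 6.5 MB), so the way the buffers are built on the Dart side may matter. This is roughly how I allocate them — a sketch using nested Dart lists (names match the timing snippet above):

// Nested Dart lists sized to the model's tensors. Converting structures
// like these to native buffers on every run() is a plausible cost.
final inputIds = [List<int>.filled(32, 0)]; // [1, 32], int8 values
final predictions = [
  List.generate(32, (_) => List<double>.filled(50527, 0.0)), // [1, 32, 50527]
];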
I thought this was an error in my model, but when I ran the same model in Kotlin, I got the same prediction in under 200 ms. The Kotlin interpreter is initialized on the CPU, just like the Flutter one.

Is there any reason why the model is performing so poorly in Flutter? What can I change to fix this? Any thoughts on this would be very helpful.

Thank you
@farazk86 Are you using multidimensional Dart lists for inputIds and predictions? If yes, can you try using TensorBuffer from tflite_flutter_helper instead? I will try to investigate the cause of such terrible performance with Dart lists.
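For reference, the suggested usage would look roughly like this — a minimal sketch, assuming tflite_flutter_helper's TensorBuffer API (which supports float32 and uint8 buffers):

import 'package:tflite_flutter/tflite_flutter.dart';
import 'package:tflite_flutter_helper/tflite_flutter_helper.dart';

// Sketch: back the output tensor with one flat native buffer instead of a
// nested Dart list, so run() can copy bytes in a single pass.
final output = TensorBuffer.createFixedSize(<int>[1, 32, 50527], TfLiteType.float32);
_interpreter.run(inputIds, output.buffer);
final logits = output.getDoubleList(); // flat list of 1 * 32 * 50527 values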
Thanks for the reply. Unfortunately, using TensorBuffer did not provide any considerable speedup; it only reduced inference time by a couple of seconds.
I ended up using Flutter's MethodChannel (invokeMethod) to do the inference in Java instead. This reduced the inference time to 300 ms :)
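The Dart side of that workaround looks roughly like this — a hypothetical sketch (the channel and method names are made up; the Java side registers a matching MethodChannel and runs the TFLite Interpreter there):

import 'package:flutter/services.dart';

// Hypothetical channel name; must match the channel registered on the Java side.
const channel = MethodChannel('app/tflite');

Future<List<double>> runInference(List<int> inputIds) async {
  // Hand the token ids to the platform side and get the logits back.
  final result = await channel.invokeMethod<List<dynamic>>(
    'runInference', // hypothetical method name
    {'inputIds': inputIds},
  );
  return result!.cast<double>();
}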