Deserializing models can take a long time on some Android devices #36
@joseph-o3h, what happens when you add the following line?

```csharp
var modelData = ...
modelData.computeTarget = MLModelData.ComputeTarget.CPUOnly; // <-- add this
var model = new MLEdgeModel(modelData);
```

I suspect the time is being spent creating an NNAPI representation of the model for execution.
I cannot use
@joseph-o3h working on #35. We're batching a few things into the next update. Regarding this issue, we've changed the API to make model creation asynchronous:

```csharp
// Fetch the model data
var modelData = await MLModelData.FromHub(tag);
// Create the model
var model = await MLEdgeModel.Create(modelData); // <-- this is offloaded to a native background worker
```

I'll update this thread once we have an ETA.
But even with asynchronous model creation it will take the same amount of time, won't it? It just won't block the main thread.
That's correct, though the time taken should be on the order of a few frames.
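Since model creation is awaitable in the new API, the delay can be hidden behind a loading state rather than stalling the main thread. A minimal Unity sketch, assuming the `MLModelData.FromHub`/`MLEdgeModel.Create` API quoted above; the `NatML` namespace, the model tag, and the loading indicator are assumptions:

```csharp
using NatML; // assumed namespace
using UnityEngine;

public class ModelLoader : MonoBehaviour {

    [SerializeField] GameObject loadingIndicator; // hypothetical loading UI

    MLEdgeModel model;

    async void Start () {
        // Show a loading state while the model is built off the main thread
        loadingIndicator.SetActive(true);
        var modelData = await MLModelData.FromHub("@user/some-model"); // hypothetical tag
        // Creation runs on a native background worker, so the frame loop stays responsive
        model = await MLEdgeModel.Create(modelData);
        loadingIndicator.SetActive(false);
    }
}
```

The NNAPI build still takes the same wall-clock time; this only keeps the app interactive while it happens.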
Is this when inference is run on the CPU? That might improve the model load/compile time, but it could also impact runtime performance. NNAPI supports caching of compiled models (https://developer.android.com/ndk/reference/group/neural-networks#aneuralnetworkscompilation_setcaching); is it possible to use it in NatML (if it is not being used already)?
Hey @joseph-o3h, Happy New Year! I've got inline responses below:

That's correct!

NatML doesn't use NNAPI caching, unfortunately. Adding support for IR caching is on the mid-to-longer-term roadmap.
We've had an engineering slowdown over the holidays, but we're picking back up now. ETA on the update with async model creation should be sometime next week. |
Okay, minor follow-up on the caching question: we're likely going to add support for this in the near term, but on iOS, macOS, and Windows first.
Hey @joseph-o3h, we've updated model creation to be async in the NatML 1.1 update. For device-specific delays in creating the model, the culprit is likely building the NNAPI representation, so you can either keep the model on the CPU (we don't advise this) or hide the delay, since the process is now async. I'm closing this issue; feel free to open a new issue if you run into something similar.
These models are taking a long time to load on a Google Pixel 6 phone, even though they are embedded into the build with `MLModelDataEmbed`. We have found by profiling that almost all of this time is spent in `NatML.CreateModel`. On a Samsung Galaxy S21 Ultra, the same models take only ~1.5 seconds to load in total.
Is there a format conversion or any expensive computation happening here? If so, is it possible to do this conversion ahead of time (such as when the app starts) without having to keep the model in memory for the entire lifetime of the app?
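The "convert ahead of time" half of this question can be sketched by warming the model up once at startup; the "without keeping it in memory" half would need compilation caching, which is discussed elsewhere in this thread. A rough sketch, assuming the async `MLEdgeModel.Create` API from the maintainer's replies; the `NatML` namespace, the tag, and the `Dispose` call are assumptions:

```csharp
using NatML; // assumed namespace
using UnityEngine;

public class ModelWarmup : MonoBehaviour {

    MLEdgeModel model;

    // Pay the expensive NNAPI build once at app start instead of at first use.
    async void Awake () {
        var modelData = await MLModelData.FromHub("@user/some-model"); // hypothetical tag
        model = await MLEdgeModel.Create(modelData);
    }

    void OnDestroy () => model?.Dispose(); // assumes the model type is disposable
}
```

Without a compilation cache, the compiled representation only survives as long as the model object, so the trade-off is startup latency versus resident memory.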