This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Deserializing models can take a long time on some Android devices #36

Closed
joseph-o3h opened this issue Dec 6, 2022 · 10 comments
Labels: critical (Issues of utmost importance), enhancement (New feature or request)

Comments

@joseph-o3h commented Dec 6, 2022

These models are taking a long time to load on a Google Pixel 6 phone, even though they are embedded into the build with MLModelDataEmbed.

We have found by profiling that almost all of this time is spent in NatML.CreateModel.

On a Samsung Galaxy S21 Ultra the same models take only ~1.5 seconds to load in total.

Is there a format conversion or any expensive computation happening here? If so, is it possible to do this conversion ahead of time (such as when the app starts) without having to keep the model in memory for the entire lifetime of the app?
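
For reference, our loading path is roughly equivalent to the sketch below, with timing around each step (modelTag is a placeholder for one of our embedded model tags):

// Rough timing sketch around the two loading steps (modelTag is a placeholder)
var watch = System.Diagnostics.Stopwatch.StartNew();
var modelData = await MLModelData.FromHub(modelTag);   // fast — the data is embedded in the build
Debug.Log($"Fetched model data in {watch.ElapsedMilliseconds} ms");
watch.Restart();
var model = new MLEdgeModel(modelData);                // almost all of the time is spent here
Debug.Log($"Created model in {watch.ElapsedMilliseconds} ms");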

@olokobayusuf added the critical (Issues of utmost importance) and enhancement (New feature or request) labels on Dec 7, 2022
@olokobayusuf (Member)

@joseph-o3h, what happens when you add the following line?

var modelData = ...
modelData.computeTarget = MLModelData.ComputeTarget.CPUOnly; // <-- add this
var model = new MLEdgeModel(modelData);

I suspect the time is being spent creating an NNAPI representation of the model for execution.

@joseph-o3h (Author) commented Dec 7, 2022

> @joseph-o3h, what happens when you add the following line?

I cannot use computeTarget right now, as it was added in 1.0.18 and we are still on 1.0.13 because of #35.

@olokobayusuf (Member) commented Dec 8, 2022

@joseph-o3h, working on #35. We're batching a few things into the next update. Regarding this issue, we've changed the API to make model creation asynchronous:

// Fetch the model data
var modelData = await MLModelData.FromHub(tag);
// Create the model
var model = await MLEdgeModel.Create(modelData); // <-- this is offloaded to a native background worker
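
In practice (a sketch, assuming a Unity MonoBehaviour and a placeholder model tag; not necessarily the final API surface), you would simply await this during startup so the work stays off the main thread:

// Sketch: create the model from a MonoBehaviour without blocking the main thread
async void Start () {
    var modelData = await MLModelData.FromHub("@author/model"); // placeholder tag
    var model = await MLEdgeModel.Create(modelData);            // runs on a background worker
    Debug.Log("Model ready");
}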

I'll update this thread once we have an ETA.

@joseph-o3h (Author)

But even with asynchronous model creation it will take the same amount of time, won't it? It just won't block the main thread.

@olokobayusuf (Member)

> But even with asynchronous model creation it will take the same amount of time, won't it? It just won't block the main thread.

That's correct, though the time taken should be on the order of a few frames.

@joseph-o3h (Author) commented Dec 23, 2022

> That's correct, though the time taken should be on the order of a few frames.

Is this when inference is run on the CPU? That might improve the model load/compile time but could also impact run time performance.

NNAPI supports caching of compiled models (https://developer.android.com/ndk/reference/group/neural-networks#aneuralnetworkscompilation_setcaching). Is it possible to use this in NatML (if it is not already being used)?

@olokobayusuf (Member)

Hey @joseph-o3h, Happy New Year! I've got inline responses below:

> Is this when inference is run on the CPU? That might improve the model load/compile time but could also impact run time performance.

That's correct!

> NNAPI supports caching of compiled models (https://developer.android.com/ndk/reference/group/neural-networks#aneuralnetworkscompilation_setcaching). Is it possible to use this in NatML (if it is not already being used)?

NatML doesn't use NNAPI caching unfortunately. Adding support for IR caching is on the mid-to-longer term roadmap.

@olokobayusuf (Member)

We've had an engineering slowdown over the holidays, but we're picking back up now. ETA on the update with async model creation should be sometime next week.

@olokobayusuf (Member)

Okay, minor follow-up on the caching question: we're likely going to add support for this in the near term, starting with iOS, macOS, and Windows.

@olokobayusuf (Member)

Hey @joseph-o3h, we've updated model creation to be async in the NatML 1.1 update. For device-specific delays in creating the model, the culprit is likely building the NNAPI representation, so you can either keep the model on the CPU (we don't advise this) or hide the delay now that the process is async. I'm closing this issue; feel free to open another issue if you run into something similar.
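
Putting both options together as a sketch (this assumes the computeTarget field added in 1.0.18 is still available on MLModelData in 1.1; the CPU line is only for the not-advised path):

// Async creation hides the NNAPI compilation delay behind a background worker
var modelData = await MLModelData.FromHub(tag);
// Optional, not advised: force CPU execution so no NNAPI representation is built
modelData.computeTarget = MLModelData.ComputeTarget.CPUOnly;
var model = await MLEdgeModel.Create(modelData);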
