
Why does mobileNet.predict() hit a performance cliff after a certain number of runs? #145

Closed
nsthorat opened this issue Apr 7, 2018 · 10 comments
Labels
comp:core type:bug Something isn't working

@nsthorat
Contributor

nsthorat commented Apr 7, 2018

From @samwyi on April 5, 2018 22:21

TensorFlow.js version: 0.6.0

TensorFlow.js Core version: 0.6.0

Browser version: Chrome 65.0.3325.181 (Official Build) (64-bit)

Describe the problem or feature request

mobileNet.predict() hits a performance cliff after a certain number of runs.
On my MacBook Pro, the average over 100 runs is ~11ms, while the average over 200 runs climbs to ~46ms. A similar issue happens on my Android device. I wonder what causes the slowdown? Is there any way to avoid it? Thanks.

Code to reproduce the bug / link to feature request:

Change the cat.onload() code in tfjs-converter/demo/index.js to call mobileNet.predict() multiple times in a loop:

console.time('Subsequent predictions');
let result;
for (let i = 0; i < 200; i++) {
  result = mobileNet.predict(pixels);
}
console.timeEnd('Subsequent predictions');

Copied from original issue: tensorflow/tfjs-core#925

@nsthorat
Contributor Author

nsthorat commented Apr 7, 2018

Try await tf.nextFrame() between each prediction (you may have to wrap your code in a function marked as async).

async function run() {
  console.time('Subsequent predictions');
  let result;
  for (let i = 0; i < 200; i++) {
    result = mobileNet.predict(pixels);
    await tf.nextFrame();
  }
  console.timeEnd('Subsequent predictions');
}
run();

@samwyi

samwyi commented Apr 7, 2018

@nsthorat Adding "await tf.nextFrame()" really makes the performance numbers consistent on my laptop!

But on smartphones, the performance cliff still exists. After profiling with the Chrome performance dev tool, I noticed that after every 20-30 predictions there is a long wait of 30-60 seconds (not ms!), with the GPU shown as busy (solid green). Since I put "await tf.nextFrame()" between each prediction, there should be at most one inference running on the GPU at a time, so I wonder what is causing the long wait? Any ideas?

I tested on two Android phones; both have a similar issue.

@nsthorat
Contributor Author

nsthorat commented Apr 7, 2018

Ah, apologies: you probably also have a memory leak and need to dispose the result of each predict. Try this code:

async function run() {
  console.time('Subsequent predictions');
  let result;
  for (let i = 0; i < 200; i++) {
    result = mobileNet.predict(pixels);
    await tf.nextFrame();
    result.dispose();  // Release the result tensor's memory
  }
  console.timeEnd('Subsequent predictions');
}
run();
run();

Check out our section on memory in this tutorial: https://js.tensorflow.org/tutorials/core-concepts.html
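As an alternative to calling dispose() by hand, tf.tidy() can manage this automatically. A minimal sketch, assuming the same mobileNet and pixels variables as in the snippet above:

```javascript
// tf.tidy() disposes every intermediate tensor created inside its callback
// once the callback returns, so no manual dispose() calls are needed.
async function run() {
  console.time('Subsequent predictions');
  for (let i = 0; i < 200; i++) {
    tf.tidy(() => {
      const result = mobileNet.predict(pixels);
      // Use `result` here (e.g. read its values); it is disposed
      // automatically when this callback returns.
    });
    await tf.nextFrame();
  }
  console.timeEnd('Subsequent predictions');
}
run();
```

Note that a tensor returned from the tidy callback is kept alive, so return `result` from the callback if you need it outside the loop.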

@samwyi

samwyi commented Apr 9, 2018

Tried calling result.dispose() after each prediction. On my Android phone (Huawei BLN-L24, Android 7, GPU: Mali-T830MP2), it really eliminated the 60-second waits between groups of 20-30 predictions, BUT at the cost of a ~2-second wait for each result.dispose(). I'm surprised to see that releasing GPU memory takes much longer than running the model itself ;-) Is this an issue in TensorFlow.js or in the GPU driver?

@nsthorat
Contributor Author

Interesting, it really shouldn't take 2 seconds for a dispose(). Our dispose() actually just marks memory for reuse; it doesn't free the memory itself.

One thing to test: console.log tf.memory().numTensors at each tick and make sure the number of tensors isn't increasing.
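Spelled out, a sketch of that check (note that in the tf.js API, numTensors is a property of the object returned by tf.memory(), not a method):

```javascript
// Log the live-tensor count each iteration; if the count keeps climbing,
// something allocated by predict() is never being disposed.
async function run() {
  for (let i = 0; i < 200; i++) {
    const result = mobileNet.predict(pixels);
    result.dispose();
    console.log('live tensors after tick', i, ':', tf.memory().numTensors);
    await tf.nextFrame();
  }
}
run();
```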

@samwyi

samwyi commented Apr 10, 2018

tf.memory().numTensors increases by 4 after each mobileNet.predict() call. Calling dispose() doesn't seem to help :(
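A minimal probe for that leak (a sketch, assuming the same tf namespace, mobileNet, and pixels variables as in the snippets above):

```javascript
// Measure the tensor-count delta across one predict()/dispose() pair.
// A delta above zero means predict() allocates internal tensors that
// disposing the output alone does not release.
const before = tf.memory().numTensors;
const result = mobileNet.predict(pixels);
result.dispose();
const delta = tf.memory().numTensors - before;
console.log('tensors leaked per predict():', delta);
```

Intermediate tensors like these are exactly what wrapping the predict() call in tf.tidy() is meant to clean up.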

easadler pushed a commit to easadler/tfjs that referenced this issue Apr 12, 2018
* vectorize min/max/logsumexp/nan shaders

* vectorize reduce sum
@davidsoergel davidsoergel added type:bug Something isn't working comp:core labels May 10, 2018
@dsmilkov
Contributor

dsmilkov commented Jun 6, 2018

Hi Samwyi,

Can you share some simple code to reproduce this? That way we can take a closer look, especially at the number of tensors going up by 4 after each mobileNet.predict().

@RadEdje

RadEdje commented Jul 6, 2018

Hello, I just wanted to ask whether the performance cliff has been figured out on Android 5.0?
I built a proof-of-concept web app that uses the video camera of a phone, or the webcam of a desktop/laptop, to detect/recognize radiographic findings.

There are 2 versions of the web app:

https://radhorizon.com/SITES/RadLense/
(this uses tf.js version 0.10.0)

and

https://radhorizon.com/SITES/RadLense/index3.html
(this uses the latest TensorFlow.js, ver 0.11.7)

Both work on the latest Firefox, Opera, and Chrome on desktop, as well as the latest Chrome on Android 8.0.

The latest 0.11.7 is blazing fast compared to ver 0.10.0, I must say, but both versions seem to have the same problem: they only work on Android 8.0, not on Android 5.0.

I've tried numerous ways to debug and look for the cause of the problem:

No errors show up in the error log, so it's not a JavaScript issue.
The webcam runs, so the phone is detecting the camera and putting its data in a video element.
The AI/ML model loads properly, since I rigged the app to stop at the splash screen if it doesn't.

I had to manually insert console.log("check here"); statements to see which part of the app was stalling, since no actual errors were showing up in the console. That is how I narrowed it down to model.predict(), and how I found this thread. I tried the various solutions above, but it still does not seem to work. I'm just hoping to find out whether anyone has gotten tf.js to run on Android 5.0, or whether I should just wait for everyone to end up on Android 8.0. Thanks.

@nsthorat
Contributor Author

Hi, can you rerun this with the latest version? Thanks!

@nsthorat
Contributor Author

Closing this out due to inactivity.

nsthorat pushed a commit that referenced this issue Aug 19, 2019