This repository has been archived by the owner on Apr 18, 2023. It is now read-only.

[Windows/clDNN] The browser crashed when running Inception v4(TFlite) and Inception Resnet v2(TFlite) on Windows WebML #507

Closed
Christywl opened this issue Jan 22, 2019 · 11 comments
Labels: bug (Something isn't working), p1 (Highest priority), verified

Comments

@Christywl
Contributor

Christywl commented Jan 22, 2019

Test Env:
Chromium Version: nightly build 70.0.3503.0 (8b58220)
Platform: Windows 10 (Dell XPS 13) [CPU: Intel i5-8250U, GPU: Intel UHD Graphics 620 (driver: 25.20.100.6471), Memory: 8GB]

Expected Result:
Inception v4(TFlite) and Inception Resnet v2(TFlite) work.

Actual Result:
The browser crashed when running Inception v4(TFlite) and Inception Resnet v2(TFlite) on Windows with the WebML clDNN backend.

How to Reproduce:

  1. git clone https://github.com/intel/webml-polyfill
  2. npm i && npm run build
  3. Download the models
  4. npm start
  5. Visit http://127.0.0.1:8080/examples/image_classification/index.html
  6. Select Inception v4(TFlite) [or Inception Resnet v2(TFlite) ] and SUSTAINED_SPEED
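
For anyone who wants to script the command-line part of these steps, here is a minimal Python sketch. It is an illustration only, not part of the original report: the helper names `repro_steps` and `run_steps` are hypothetical, the working directory assumes the default folder name that `git clone` creates, and the model download (step 3) and the in-browser steps 5 and 6 stay manual.

```python
import shutil
import subprocess
from typing import List, Tuple

REPO_URL = "https://github.com/intel/webml-polyfill"
# Assumption: `git clone` is run without a target directory, so it creates
# a folder named after the repository.
REPO_DIR = "webml-polyfill"


def repro_steps() -> List[Tuple[List[str], str]]:
    """Return (argv, working directory) pairs for the scripted steps."""
    return [
        (["git", "clone", REPO_URL], "."),
        (["npm", "i"], REPO_DIR),
        (["npm", "run", "build"], REPO_DIR),
        # Step 3 (download the models) is manual and must be done
        # before the server is used; it is not scripted here.
        (["npm", "start"], REPO_DIR),
    ]


def run_steps() -> None:
    """Run each step in order and stop at the first failure."""
    for argv, cwd in repro_steps():
        # shutil.which resolves e.g. npm.cmd on Windows; fall back to the
        # bare name if the executable is not found on PATH.
        exe = shutil.which(argv[0]) or argv[0]
        # check=True raises CalledProcessError if a step fails, so later
        # steps never run against a broken checkout or build.
        subprocess.run([exe, *argv[1:]], cwd=cwd, check=True)
```

Calling `run_steps()` blocks on `npm start` while the local server is running; steps 5 and 6 are then done in the browser.
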

Christywl added the bug and p1 labels on Jan 22, 2019

@Christywl
Contributor Author

DenseNet(Onnx) has the same issue on Windows clDNN.

@huningxin
Contributor

@Christywl thanks for reporting this issue!

Is this a regression? According to https://github.com/intel/webml-polyfill/wiki/WebML-Examples-Results-on-Different-Backends-and-Platforms, all these models worked on 8755e6b. Is that correct?

@Christywl
Contributor Author

Christywl commented Jan 23, 2019

@huningxin, I retested and this is not a regression; it was my mistake in the previous testing. I'm now testing the examples on the newer build and code, and will update the table in the wiki later.

@huningxin
Contributor

That's fine. Thanks for the update. However, does it happen on Linux/clDNN?

@Christywl
Contributor Author

No, these examples work on Linux/clDNN.

@huningxin
Contributor

> No, these examples work on Linux/clDNN.

@Christywl, thanks! May I know whether the Linux and Windows machines under test have the same hardware configuration? I guess the crash may be related to a memory limit, but I don't know whether it is Windows-specific. So could you please help verify it on a Linux machine with the same amount of memory as the Windows one? Also, as far as I know, our Linux tests use the "--no-sandbox" option; could you please apply that on Windows as well for testing? Thanks!

@Christywl
Contributor Author

Christywl commented Jan 25, 2019

@huningxin, here are the Linux and Windows machine configurations:

  • Windows: Dell XPS 13 [CPU: Intel i5-8250U, GPU: Intel UHD Graphics 620, Memory: 8GB]
  • Linux: Dell XPS 13 [CPU: Intel i7-8550U, GPU: Intel UHD Graphics 620, Memory: 16GB]

I also tried on Windows with "--no-sandbox"; the issue still happened.

> So could you please help verify it on a Linux machine with the same amount of memory as the Windows one?

I don't have a Linux machine with 8GB of memory. But I tried another Windows and Linux setup with the same configuration [Dell Inspiron 13 7000 Series, CPU: i5-6200U, GPU: Intel HD Graphics 520, Memory: 4GB]; the crash only happened on Windows. The examples worked fine on Linux.

@huningxin
Contributor

> But I tried another Windows and Linux setup with the same configuration [Dell Inspiron 13 7000 Series, CPU: i5-6200U, GPU: Intel HD Graphics 520, Memory: 4GB]; the crash only happened on Windows. The examples worked fine on Linux.

That's very helpful. Thanks much @Christywl !

I plan to upgrade clDNN to the latest version (Drop 12.1), which has a memory leak fix that might be related to this issue. We can then verify whether this issue still happens on the latest clDNN version. I've opened #527 to track it.

@huningxin
Contributor

The root cause might be that the long compilation time of these models triggers the GPU watchdog, which kills the GPU process.

@Christywl , could you please help try again with "--disable-gpu-watchdog"? This workaround works in my environment.
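
As a convenience for re-running this test, the workaround launch can be sketched in Python. This is only an illustration: the helper names `chromium_command` and `launch` are hypothetical, and the binary path is an assumption that must point at the WebML-enabled Chromium build under test. Only the "--disable-gpu-watchdog" and "--no-sandbox" switches and the example URL come from this thread.

```python
import subprocess
from typing import List, Optional

# Example page from the reproduction steps in this issue.
EXAMPLE_URL = "http://127.0.0.1:8080/examples/image_classification/index.html"


def chromium_command(
    binary: str,
    url: str = EXAMPLE_URL,
    extra_flags: Optional[List[str]] = None,
) -> List[str]:
    """Build an argv list that starts Chromium with the GPU watchdog disabled."""
    flags = ["--disable-gpu-watchdog"]
    # Optional extra switches, e.g. ["--no-sandbox"] as used in the Linux tests.
    flags.extend(extra_flags or [])
    return [binary, *flags, url]


def launch(binary: str) -> subprocess.Popen:
    """Start the browser; `binary` is an assumed path to the build under test."""
    return subprocess.Popen(chromium_command(binary))
```

For example, `chromium_command("chrome.exe", extra_flags=["--no-sandbox"])` yields the argument list for a run with both switches. Disabling the watchdog is a diagnostic workaround, not a fix.
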

#514 has the same root cause. Please help verify as well.

To fix this, we may need to implement an off-main-thread ML service, as in #517.
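
To illustrate the suspected mechanism, here is a small deterministic simulation in Python. It is a sketch under stated assumptions, not Chromium's watchdog or the design in #517: the `Watchdog` and `FakeClock` classes, the function names, and the timeout and compile durations are all made up for illustration. It models a monitored thread that can only acknowledge the watchdog when it is not busy, and compares compiling on that thread with compiling elsewhere while the monitored thread keeps acknowledging.

```python
class FakeClock:
    """Manually advanced clock so the simulation is deterministic."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class Watchdog:
    """Reports a hang if no acknowledgement arrives within timeout_s."""

    def __init__(self, timeout_s: float, clock: FakeClock) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_ack = clock()

    def ack(self) -> None:
        self.last_ack = self.clock()

    def expired(self) -> bool:
        return self.clock() - self.last_ack > self.timeout_s


def compile_on_monitored_thread(compile_s: float, timeout_s: float) -> bool:
    """Return True if the watchdog fires when compilation blocks the thread."""
    clock = FakeClock()
    dog = Watchdog(timeout_s, clock)
    # The thread is busy for the whole compilation and cannot acknowledge.
    clock.advance(compile_s)
    return dog.expired()


def compile_off_thread(compile_s: float, timeout_s: float, slice_s: float = 1.0) -> bool:
    """Return True if the watchdog fires when compilation runs elsewhere."""
    clock = FakeClock()
    dog = Watchdog(timeout_s, clock)
    elapsed = 0.0
    fired = False
    while elapsed < compile_s:
        # The monitored thread only does short slices of other work,
        # then acknowledges the watchdog again.
        step = min(slice_s, compile_s - elapsed)
        clock.advance(step)
        elapsed += step
        fired = fired or dog.expired()
        dog.ack()
    return fired
```

With an assumed 10-second timeout and a 30-second compile, `compile_on_monitored_thread(30, 10)` reports a hang while `compile_off_thread(30, 10)` does not, which is the behaviour an off-main-thread service is meant to achieve.
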

@Christywl
Contributor Author

@huningxin, I tried again with "--disable-gpu-watchdog" on Windows: the examples worked and the browser didn't crash. SSDLite MobileNetV2(TFlite) in #514 also worked with this workaround.

@Christywl
Contributor Author

This issue has been fixed on the nightly build a5a8547.
