Result not correct when applying it to SSD (Single Shot MultiBox Detector) #10

heagoo · 2016-10-27T01:50:46Z

I merged to code of SSD (https://github.com/weiliu89/caffe/tree/ssd) and intel caffe, but when run ssd_detect with SSD300, the result is not correct if MKL2017_AS_DEFAULT_ENGINE is enabled.
After code merge, following 2 things have been confirmed:

when only with MKL, ssd_detect shows correct result
when enable MKL2017_AS_DEFAULT_ENGINE, cpp_classification work correctly with VGG16 (classification result is correct)
I guess SSD300 has a different input shape (3x300x300, not 3x224x224), thus make the convolution result is not correct.
Could you please help to look into it? Thanks!

nickltj86 · 2016-10-27T01:54:48Z

hi @heagoo would you be willing to share your work on the merge with only MKL? would be interesting to see the results and performance with intelcaffe. do you have rough performance numbers per frame?

heagoo · 2016-10-27T02:05:00Z

@nickltj86 The performance is not so good (262ms per frame with 32 cores), I think it could be much better after enable MKL2017_AS_DEFAULT_ENGINE, but unfortunately, the result is not correct yet.

nickltj86 · 2016-10-27T02:25:34Z

Thanks. Good to know the performance. I think even after enabling MKL2017, the performance may not increase too much. From the benchmark, the new layers which are not optimized with MKL or MKL2017 such as the mbox layer will cause the most delays. The performance of the common cnn layers are very optimized and seems like as fast as those running on CUDA. the other non-optimized layers are a different story.

Do you have a fork with the merge? I think the Intel folks will need that to evaluate it as well.

heagoo · 2016-10-27T07:53:53Z

Shared to:
https://github.com/heagoo/intel_caffe_ssd/

jdukat · 2016-10-28T14:15:32Z

Could you please try with MKL2017 engine enabled, but for layers that cause problems explicitly select engine CAFFE?
For reference, you can see here how to select different engine is selected layer:
https://github.com/intel/caffe/blob/master/models/mkl2017_googlenet_v1_knl/train_val.prototxt

We are working on fixes to simplify this process and avoid manual engine selection.

heagoo · 2016-10-28T15:13:40Z

@jdukat Yes, I've already do this, but the result is different from just using MKL, much different, and totally not correct. I tried:

only add MKL2017 engine to the first conv layer, works. (but seems still call the im2col implementation)
add MKL2017 to any other conv layers, NOT work.
It may be related with the DNN function. Is there any detail description for the MKL DNN functions?
And, the strange thing is that VGG16 works, do you think the input size matters? (VGG16 is 224x224, and SSD is 300x300 or 500x500)

jdukat · 2016-10-29T04:16:39Z

Input size matters for performance with MKL2017 for sure. It should not matter for correctness and it seems to be a bug. We should take a closer look at this, but at the moment I am not able to put this on top of my priorities.

I'll see what I can do about documentation for MKL DNN functions.

jdukat · 2016-10-30T19:27:19Z

MKL DNN API documentation

heagoo · 2016-10-31T11:29:40Z

Thanks!
Confirmed, it's a bug of MKL, we can solve the issue by downloading https://github.com/intel-mxnet/mkl-release/releases/download/self_containted_MKLGOLD/mklml_lnx_2017.0.1.20161005.tgz
And now, performance boost to 88ms under the same setting, great!

jdukat · 2016-11-02T14:22:46Z

Great, the same MKL version will be released for Caffe within next few days.
@heagoo do you plan to issue a pull request with your changes for Intel Caffe?

heagoo · 2016-11-05T12:40:58Z

Sure, will do it. @jdukat

ngaloppo · 2016-11-09T18:42:21Z

Is it confirmed that this bug exists in current master? If so, can we open this issue again to track the bug until it is fixed?

heagoo · 2016-11-10T14:47:05Z

This bug does not exist any more in the latest update with MKL 2017 update 1.

amoussawi · 2016-11-21T21:31:24Z

Hey @heagoo
Could you please tell me what intel processor are you using if you don't mind? Thanks :)

heagoo · 2016-11-25T03:59:08Z

I am using E5-2699 v4, the most powerful one, :)

…

On Tue, Nov 22, 2016 at 5:31 AM, Abdallah Moussawi ***@***.*** > wrote: Hey @heagoo <https://github.com/heagoo> Could you please tell me what intel processor are you using if you don't mind? Thanks :) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#10 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFsLw4b9l5JImymyMnvS54hWvYTrxmtUks5rAg2tgaJpZM4Kh3BG> .

-- -- "The first step is as good as half over."

heagoo closed this as completed Nov 2, 2016

coolbei mentioned this issue Jan 23, 2019

memory error with mkldnn concat layer #258

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Result not correct when applying it to SSD (Single Shot MultiBox Detector) #10

Result not correct when applying it to SSD (Single Shot MultiBox Detector) #10

heagoo commented Oct 27, 2016 •

edited

Loading

nickltj86 commented Oct 27, 2016

heagoo commented Oct 27, 2016 •

edited

Loading

nickltj86 commented Oct 27, 2016

heagoo commented Oct 27, 2016

jdukat commented Oct 28, 2016

heagoo commented Oct 28, 2016 •

edited

Loading

jdukat commented Oct 29, 2016

jdukat commented Oct 30, 2016

heagoo commented Oct 31, 2016 •

edited

Loading

jdukat commented Nov 2, 2016

heagoo commented Nov 5, 2016 •

edited

Loading

ngaloppo commented Nov 9, 2016

heagoo commented Nov 10, 2016

amoussawi commented Nov 21, 2016

heagoo commented Nov 25, 2016 via email

Result not correct when applying it to SSD (Single Shot MultiBox Detector) #10

Result not correct when applying it to SSD (Single Shot MultiBox Detector) #10

Comments

heagoo commented Oct 27, 2016 • edited Loading

nickltj86 commented Oct 27, 2016

heagoo commented Oct 27, 2016 • edited Loading

nickltj86 commented Oct 27, 2016

heagoo commented Oct 27, 2016

jdukat commented Oct 28, 2016

heagoo commented Oct 28, 2016 • edited Loading

jdukat commented Oct 29, 2016

jdukat commented Oct 30, 2016

heagoo commented Oct 31, 2016 • edited Loading

jdukat commented Nov 2, 2016

heagoo commented Nov 5, 2016 • edited Loading

ngaloppo commented Nov 9, 2016

heagoo commented Nov 10, 2016

amoussawi commented Nov 21, 2016

heagoo commented Nov 25, 2016 via email

heagoo commented Oct 27, 2016 •

edited

Loading

heagoo commented Oct 27, 2016 •

edited

Loading

heagoo commented Oct 28, 2016 •

edited

Loading

heagoo commented Oct 31, 2016 •

edited

Loading

heagoo commented Nov 5, 2016 •

edited

Loading