Reproducing NNPACK numbers on SKL i5-6600K #4
Comments
Note also that because convnet-benchmarks is using old-style prototxt, the NNPackConvolutionParameter message is not parsed correctly by caffe (e.g. to set algorithm: FFT_16x16) |
If the reported timings are per image (not per batch), then I take back my comments about not being able to reproduce them. However, it would be nice to add some notes to the README on how to enable NNPACK, as per my instructions above:
|
@ngaloppo The timings are per batch. The parameters of the networks are from |
@Maratyszcza thanks for the link to the Regarding the convolution algorithm: did you have to convert the prototxt to the new format to pick anything aside from |
I changed the defaults in Caffe.proto and recompiled Caffe for each algorithm |
So this is for an Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz. I'm getting pretty good results, in line with your numbers, except for conv2. I'm not sure whether that's related to the difference between i5 and i7, or to something else. See below:
|
@ngaloppo Do you use prototxt from |
@Maratyszcza Yes, from the |
How many threads are running here? Can you control the number of threads for NNPACK? Even when I set OMP_NUM_THREADS to 1, I can see multiple threads running in parallel in htop.
@anijain2305 NNPACK would use |
Yes, I also got results similar to those @ngaloppo reported: openblas + fft16x16 on an i7 machine.
@Maratyszcza Hi, unfortunately I also cannot reproduce the results in the NNPACK README.md on my i7-4720HQ machine. I used the --enable-psimd configuration and compiled the latest NNPACK version. For timing I chose nnpack-pr and modified a few lines of code to fit the new NNPACK interface. But when I add engine: NNPACK inside conv_param, the relevant convolution layers even become slower; backward is very fast because it is not implemented. I have tried several things but still can't solve this problem. Looking forward to your help, thanks. (I use the prototxt from the cpu branch of convnet-benchmark directly and use the caffe time command for timing.)
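For anyone repeating this experiment, here is a minimal Python sketch of adding engine: NNPACK to every convolution_param block of a prototxt without hand-editing. The helper name add_nnpack_engine is hypothetical, and the sketch assumes each block opens on its own line as "convolution_param {". It touches all convolution layers, including conv1, whereas the original report only changed conv2-conv5, and it does not check whether an engine is already set.

```python
import re

# Matches a line that opens a convolution_param block, capturing its indentation.
_CONV_PARAM_OPEN = re.compile(r"^([ \t]*)convolution_param\s*\{[ \t]*$", re.MULTILINE)


def add_nnpack_engine(prototxt: str) -> str:
    """Insert 'engine: NNPACK' as the first field of each convolution_param block.

    Assumption: blocks open on their own line; existing engine fields are not detected.
    """
    def _insert(match: re.Match) -> str:
        indent = match.group(1)
        return f"{match.group(0)}\n{indent}  engine: NNPACK"

    return _CONV_PARAM_OPEN.sub(_insert, prototxt)


if __name__ == "__main__":
    # Illustrative layer definition, not copied from convnet-benchmarks.
    sample = (
        "layer {\n"
        "  name: \"conv2\"\n"
        "  convolution_param {\n"
        "    num_output: 256\n"
        "  }\n"
        "}\n"
    )
    print(add_nnpack_engine(sample))
```

The rewritten prototxt can then be passed to caffe time in place of the original file.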
@wangxi123 If you want to reproduce results from README, don't use |
@Maratyszcza Well, I just want to measure the convolution speedup compared to not adding engine: NNPACK in conv_param, but it seems I don't get any speedup in the end; I don't know whether I left out some necessary step. That's OK, I will try again on another machine that has the AVX2 instruction set. Should I use the latest NNPACK with nnpack-pr? I hope you can recommend an NNPACK version for me.
@wangxi123 When you add |
@Maratyszcza I'm pleased to see some speedup (~1.3x) on my machine, which has the AVX2 instruction set. But when I change the algorithm DEFAULT config in proto/caffe.proto and recompile caffe, there seems to be little difference between the AUTO and FFT_16x16 options, which confuses me. What's more, when I run the WINOGRAD option, caffe crashes with the message Check failed: nnp_status_success == status (0 vs. 26). Is that expected? Thank you for your patience in answering.
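To make comparisons such as "~1.3x" or "little change between AUTO and FFT_16x16" concrete, a small Python sketch like the following could compute per-layer speedups from two sets of forward timings. The function name layer_speedups and all numbers are hypothetical placeholders, not measurements from this thread.

```python
from typing import Dict


def layer_speedups(baseline_ms: Dict[str, float], candidate_ms: Dict[str, float]) -> Dict[str, float]:
    """Return baseline/candidate time ratios for layers present in both runs.

    A ratio above 1.0 means the candidate run was faster for that layer.
    Layers with a non-positive candidate time are skipped.
    """
    return {
        layer: baseline_ms[layer] / candidate_ms[layer]
        for layer in baseline_ms
        if layer in candidate_ms and candidate_ms[layer] > 0
    }


if __name__ == "__main__":
    # Placeholder per-batch forward times in milliseconds (invented for illustration).
    default_engine = {"conv2": 130.0, "conv3": 90.0}
    nnpack_engine = {"conv2": 100.0, "conv3": 60.0}
    for layer, ratio in layer_speedups(default_engine, nnpack_engine).items():
        print(f"{layer}: {ratio:.2f}x")
```

Both runs should use the same batch size, since the timings discussed in this thread are per batch.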
@wangxi123 |
@Maratyszcza Yes, I got it. I modified conv2 to use 3x3 kernels with pad 1 to test the WINOGRAD algorithm, and I get ~1.6x speedup in alexnet and |
@wangxi123 In the current implementation of most convolution functions in NNPACK, you need quite a large batch size to get a speedup (at least 128, better 256). Note that it doesn't affect |
I'm having trouble reproducing the performance numbers for AlexNet in the NNPACK README.md. I'm using the nnpack-pr branch here, and timing using the caffe time invocation as in the convnet-benchmark scripts. I'm using the prototxt from convnet-benchmark. I added engine: NNPACK to conv2-conv5 and double checked that NNPACK is being invoked. There are a few open issues:
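When comparing caffe time runs like the ones discussed in this thread, it can help to pull the per-layer averages out of the log automatically. The Python sketch below assumes the log contains lines of the form "<layer> forward: <ms> ms." and "<layer> backward: <ms> ms." after the glog prefix; the function name parse_caffe_time and the sample log lines (including all numbers) are made up for illustration.

```python
import re
from typing import Dict

# Layer name, direction, and average time in ms, as assumed to appear in `caffe time` logs.
_LAYER_TIME = re.compile(r"(\S+)\s+(forward|backward):\s+([0-9.]+(?:[eE][+-]?[0-9]+)?)\s+ms")


def parse_caffe_time(log_text: str) -> Dict[str, Dict[str, float]]:
    """Collect per-layer forward/backward averages (ms) from a timing log."""
    timings: Dict[str, Dict[str, float]] = {}
    for line in log_text.splitlines():
        match = _LAYER_TIME.search(line)
        if match:
            layer, direction, ms = match.groups()
            timings.setdefault(layer, {})[direction] = float(ms)
    return timings


if __name__ == "__main__":
    # Invented sample lines; real logs carry different timestamps and values.
    sample_log = (
        "I0101 00:00:00.000000  1234 caffe.cpp:366]      conv1\tforward: 10.5 ms.\n"
        "I0101 00:00:00.000000  1234 caffe.cpp:369]      conv1\tbackward: 0.01 ms.\n"
        "I0101 00:00:00.000000  1234 caffe.cpp:366]      conv2\tforward: 20 ms.\n"
        "I0101 00:00:00.000000  1234 caffe.cpp:374] Average Forward pass: 30.5 ms.\n"
    )
    for layer, times in parse_caffe_time(sample_log).items():
        print(layer, times)
```

The summary lines such as "Average Forward pass" are deliberately not matched, so only per-layer entries end up in the result.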