NNPACK on Android costs more time in conv than OpenBLAS with a single thread #39
Comments
Which convolution parameters and algorithm do you use?
Default parameters: AUTO, BLOCK_BASED.
This is my prototxt.
@conansherry NNPACK only supports convolution with stride 1; when stride > 1, NNPACK also falls back to im2col + sgemm.
@austingg Oh, I see. I checked the source code, and you are right.
@conansherry Thanks for sharing your experiment results. I will run some further experiments on OpenBLAS with gfortran.
@Maratyszcza @austingg Does NNPACK only support specific kernel sizes like 3x3 or 16x16? In my new test, kernel size 5 with stride 1 produces wrong results.
@Maratyszcza @austingg Oh, I looked up the Caffe2 implementation. With the tuple_based transform strategy everything is OK. I also checked the source code in convolution-inference.c; the other modes are not implemented.
@Maratyszcza @austingg Benchmark results (screenshot in original) comparing: OpenBLAS with gfortran, NNPACK FFT16x16, NNPACK FFT8x8, NNPACK AUTO, NNPACK SGEMM.
@conansherry That's a pretty good result for a CNN application: only about 10 ms on a mobile device. According to my research, gfortran is only needed for LAPACK, and the conv layers only use GEMM. Have you run any experiments without gfortran? Correct me if I am wrong.
|
im2col + OpenBLAS sgemm takes about half the time of NNPACK in single-thread mode (multi-threading would interfere with other programs on a weak Android CPU).
Batch size = 1, using inference mode.
So I will continue to use OpenBLAS sgemm + im2col.