Gemmer::sgemm does not match caffe_cpu_gemm&lt;float&gt; completely #46
Comments
Is the 'loss' you mentioned the accuracy loss caused by quantization?
I mean that if the input matrix needs a transpose, this function gives a wrong result, which may make the loss bigger. Do you know how the transpose is handled in caffe's caffe_cpu_gemm?
Our Gemmer indeed does not support transposing matrices; we perform the transpose during model conversion in order to accelerate the matrix computation. @victorygogogo
`/**`
`}` in caffe2mdl.cpp @victorygogogo
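The idea described above (folding the transpose into model conversion) can be sketched as follows. This is a hedged illustration, not MDL's actual code: the `transpose` helper and its layout convention are hypothetical, standing in for what caffe2mdl.cpp reportedly does to the weight matrix once, offline, so the runtime sgemm never needs a transpose flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: transpose a row-major (rows x cols) weight matrix
// once at model-conversion time, producing a row-major (cols x rows) matrix.
// After this, the runtime gemm can always run in plain no-transpose mode.
std::vector<float> transpose(const std::vector<float>& src,
                             std::size_t rows, std::size_t cols) {
    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
    return dst;
}
```

Doing this once during conversion costs nothing at inference time, which is why it is preferable to a transpose flag inside the inner gemm loop.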
OK, thank you.
You're welcome.
I found that when I use this function, it is slower than the gemm in the OpenBLAS library. By the way, I am using NEON on a phone.
You found our gemm is slower than OpenBLAS?
OpenBLAS takes cache control and reuse, work sharing, and multithreading into account; those three factors speed up its gemm.
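The cache-reuse trick mentioned above can be illustrated with loop blocking (tiling): sub-blocks of A, B, and C are kept hot in cache while they are reused. This is a minimal sketch of the principle only; real libraries like OpenBLAS add panel packing, NEON/SIMD micro-kernels, and threading on top. The function name and tile size are illustrative.

```cpp
#include <algorithm>
#include <cstddef>

// C (m x n) += A (m x k) * B (k x n), all row-major.
// BS is the tile size; tuned implementations pick it to fit L1/L2 cache.
void gemm_blocked(const float* A, const float* B, float* C,
                  std::size_t m, std::size_t n, std::size_t k,
                  std::size_t BS = 64) {
    for (std::size_t i0 = 0; i0 < m; i0 += BS)
        for (std::size_t p0 = 0; p0 < k; p0 += BS)
            for (std::size_t j0 = 0; j0 < n; j0 += BS)
                // Work on one tile: A/B/C sub-blocks stay cache-resident.
                for (std::size_t i = i0; i < std::min(i0 + BS, m); ++i)
                    for (std::size_t p = p0; p < std::min(p0 + BS, k); ++p) {
                        float a = A[i * k + p];
                        for (std::size_t j = j0; j < std::min(j0 + BS, n); ++j)
                            C[i * n + j] += a * B[p * n + j];
                    }
}
```

The `i-p-j` inner ordering also keeps the innermost loop streaming over contiguous rows of B and C, which is friendlier to the hardware prefetcher than a naive `i-j-p` loop.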
@cocodark Yes, I use NEON, but some matrices need a transpose, so I do the transpose inside this function to match the behavior of cblas_sgemm. Even without the transpose, 30 test runs cost about 6.7 s; with cblas_sgemm they cost about 2.7 s.
@victorygogogo Excellent research work. Our gemm is currently accelerated with NEON; we will try the tricks mentioned by @wangshankun, such as cache control and reuse and multithreading, to make it faster. If you are interested in gemm optimization work, code contributions are appreciated.
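The multithreading trick planned above can be sketched like this: split the output rows of C across threads so each thread owns a disjoint slice and no locking is needed. This is a hedged illustration of the general approach, not paddle/MDL's actual implementation; the function name and the naive inner loop are placeholders for the NEON kernel.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// C (m x n) += A (m x k) * B (k x n), row-major, rows of C split by thread.
// Each thread writes only its own rows of C, so no synchronization is needed.
void gemm_parallel(const float* A, const float* B, float* C,
                   std::size_t m, std::size_t n, std::size_t k,
                   unsigned nthreads = 4) {
    auto worker = [&](std::size_t row_begin, std::size_t row_end) {
        for (std::size_t i = row_begin; i < row_end; ++i)
            for (std::size_t p = 0; p < k; ++p) {
                float a = A[i * k + p];
                for (std::size_t j = 0; j < n; ++j)
                    C[i * n + j] += a * B[p * n + j];
            }
    };
    std::vector<std::thread> pool;
    std::size_t chunk = (m + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk, hi = std::min(lo + chunk, m);
        if (lo < hi) pool.emplace_back(worker, lo, hi);
    }
    for (auto& th : pool) th.join();
}
```

In practice this row-splitting would be combined with the cache-blocked kernel, and the thread count would be chosen per device (big.LITTLE phones often do best with only the big cores).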
Gemmer::sgemm does not match caffe_cpu_gemm completely.
When you test a model, you can see that the loss increases.
I then found that Gemmer::sgemm does not support transposed matrices.
Can Baidu solve this?