Gemmer::sgemm not match caffe_cpu_gemm<float> completely #46

Closed

victorygogogo opened this issue Sep 27, 2017 · 11 comments

@victorygogogo

Gemmer::sgemm does not match caffe_cpu_gemm<float> completely. When you test a model, you can see that the loss increases. I then found that Gemmer::sgemm does not support transposed matrices.

Can Baidu fix this?

@cocodark
Contributor

Is the 'loss' you mentioned the accuracy loss caused by quantization?

@victorygogogo
Author

I mean that if an input matrix needs to be transposed, using this function gives a wrong result, which may make the loss bigger. Do you know this function from caffe, caffe_cpu_gemm?

```cpp
void caffe_cpu_gemm(const CBLAS_TRANSPOSE TransA,
                    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
                    const float alpha, const float* A, const float* B, const float beta,
                    float* C) {
  LOG(INFO) << "Running for caffe_cpu_gemm trans start";
  // Leading dimension of the *stored* matrix: A is M x K when not
  // transposed, K x M when transposed (and likewise for B).
  int lda = (TransA == CblasNoTrans) ? K : M;
  int ldb = (TransB == CblasNoTrans) ? N : K;
  cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
              ldb, beta, C, N);
}
```
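For reference, here is a naive sketch of what those transpose flags mean (illustration only: sgemm_ref is a made-up name, and the bools stand in for the CBLAS_TRANSPOSE enum):

```cpp
// Naive row-major sgemm illustrating the transpose semantics that
// cblas_sgemm provides: C = alpha * op(A) * op(B) + beta * C,
// where op(X) is X or its transpose depending on the flag.
void sgemm_ref(bool transA, bool transB, int M, int N, int K,
               float alpha, const float *A, const float *B,
               float beta, float *C) {
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k) {
                // op(A) is M x K, op(B) is K x N; the indexing follows the
                // same leading-dimension rule as in caffe_cpu_gemm above.
                float a = transA ? A[k * M + i] : A[i * K + k];
                float b = transB ? B[j * K + k] : B[k * N + j];
                acc += a * b;
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}
```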

@cocodark
Contributor

Indeed, our Gemmer does not support transposing matrices; we put the transpose step in the model-conversion stage in order to accelerate matrix manipulation. @victorygogogo

@cocodark
Contributor

```cpp
/**
 * transpose matrix in advance
 * @param data
 * @param shape
 * @return
 */
float *trans_matrix(const float *data, vector<int> shape) {
    int m = shape[0];
    int n = shape[1];

    float *trans = new float[m * n];

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            trans[i * m + j] = data[j * n + i];
        }
    }
    return trans;
}
```

in caffe2mdl.cpp @victorygogogo
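For instance, a hypothetical conversion-time usage (only trans_matrix above is from caffe2mdl.cpp; convert_weights and the shape here are made up):

```cpp
#include <vector>
using std::vector;

// Hypothetical sketch: pre-transpose a k x m weight matrix once during
// model conversion, so the runtime gemm never needs a transpose flag.
// trans_matrix allocates with new[], so the converter owns the buffer
// and must eventually delete[] it.
float *convert_weights(const float *weights, int k, int m) {
    vector<int> shape = {k, m};  // stored shape before transpose
    // The result is laid out m x k, row-major; at inference time
    // Gemmer::sgemm can use it directly, matching what a
    // caffe_cpu_gemm(CblasTrans, ...) call would compute on the original.
    return trans_matrix(weights, shape);
}
```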

@victorygogogo
Author

OK, thank you.

@cocodark
Contributor

You're welcome.

@victorygogogo
Author

I found that when I use this function, it is slower than the OpenBLAS gemm. By the way, I am using NEON on a phone.

@cocodark
Contributor

You found that our gemm is slower than OpenBLAS?

@wangshankun

OpenBLAS takes into account cache control and reuse, work sharing, and multithreading; those three factors speed up its gemm.
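A rough sketch of two of those tricks, cache blocking and multithreading (illustration only; real OpenBLAS kernels also pack panels and use hand-tuned assembly, and the tile sizes here are placeholders to be tuned per cache):

```cpp
#include <algorithm>

// Cache blocking: work on tiles small enough to stay in cache, so each
// loaded element of A and B is reused while hot. Multithreading: the
// OpenMP pragma splits the (i0, j0) tiles of C across threads; each
// thread owns distinct tiles, so there are no races. Compile with
// -fopenmp. C must be zero-initialized by the caller.
void sgemm_blocked(int M, int N, int K,
                   const float *A, const float *B, float *C) {
    const int BM = 64, BN = 64, BK = 64;  // tile sizes (tune per cache)
    #pragma omp parallel for collapse(2)
    for (int i0 = 0; i0 < M; i0 += BM) {
        for (int j0 = 0; j0 < N; j0 += BN) {
            for (int k0 = 0; k0 < K; k0 += BK) {
                const int iMax = std::min(i0 + BM, M);
                const int jMax = std::min(j0 + BN, N);
                const int kMax = std::min(k0 + BK, K);
                for (int i = i0; i < iMax; ++i)
                    for (int k = k0; k < kMax; ++k) {
                        const float a = A[i * K + k];
                        for (int j = j0; j < jMax; ++j)
                            C[i * N + j] += a * B[k * N + j];
                    }
            }
        }
    }
}
```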

@cocodark reopened this Sep 29, 2017
@victorygogogo
Author

@cocodark Yes, I use NEON, but some matrices need to be transposed, so I did the transpose inside this function to match cblas_sgemm. Even without the transpose, 30 runs cost about 6.7 s; with cblas_sgemm it costs about 2.7 s.
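For context, a comparison along those lines can be reproduced with a minimal harness like this (a sketch; run_gemm is a placeholder for invoking either Gemmer::sgemm or cblas_sgemm on identical inputs):

```cpp
#include <chrono>
#include <cstdio>

// Time 30 back-to-back calls of a gemm implementation, as in the
// measurement above, and return the elapsed wall-clock seconds.
template <typename F>
double seconds_for_30_runs(F run_gemm) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < 30; ++i) run_gemm();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    // Replace the lambda body with the real gemm call under test.
    double s = seconds_for_30_runs([] { /* gemm call here */ });
    std::printf("30 runs took %.2f s\n", s);
}
```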

@cocodark
Contributor

cocodark commented Sep 30, 2017

@victorygogogo, excellent research work. Currently our gemm is accelerated with NEON; we'll try the tricks mentioned by @wangshankun, such as cache control & reuse and multithreading, to make it faster. If you are interested in gemm optimization work, code contributions will be appreciated.
