A problem about openblas threads? #28

Closed
zazd opened this issue Jun 14, 2016 · 9 comments

Comments

@zazd

zazd commented Jun 14, 2016

I am using OpenBLAS. The demo contains this call:
caffeMobile.setNumThreads(4);
and the JNI code calls:
openblas_set_num_threads(num_threads);

It seems that this sets the number of cores that are used. However, changing the value passed to setNumThreads() anywhere from 1 to 6 makes no difference. For example, it takes 5 s to process one image with 1 thread, and still 5 s with 4 threads (or any other value).

My ARM device has 6 cores.

So, could you tell me more about what setNumThreads() does?

Thank you!

@sh1r0
Owner

sh1r0 commented Jun 15, 2016

With the OpenBLAS libs on armeabi-v7a, I have tested that the number of threads does affect performance.

@zazd
Author

zazd commented Jun 15, 2016

I tried it again just now, but still nothing seems to change.

What I did:
export USE_OPENBLAS=1
export ANDROID_ABI="armeabi-v7a-hard-softfp with NEON"
./build.sh <path/to/ndk>

@sh1r0
Owner

sh1r0 commented Jun 15, 2016

Did you try the prebuilt libs in this repo before? I'm pretty sure that they work with OpenMP support.

@zazd
Author

zazd commented Jun 15, 2016

What are "the prebuilt libs", and where can I find them?

@sh1r0
Owner

sh1r0 commented Jun 15, 2016

@zazd
Author

zazd commented Jun 16, 2016

Yes, I always do that, but it has no effect.
I ran this demo on a TK1 and found that the program always runs on a single CPU core. On other devices, such as an RK3288 board or a Xiaomi Box 3, changing the number of OpenBLAS threads has no effect either. I also ran a matrix multiplication test on the TK1: even after making sure that 4 threads were in use, it took the same time as with 1 thread. My guess is that the TK1 has only one core that can do floating-point computation.

So, could you tell me which device you used when the thread count did affect performance?
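
A comparison like the one described above can be scripted so that every thread count is timed the same way. The following is only a minimal sketch for a native command-line benchmark, not for the Android demo app. The helper names `run_with_threads` and `time_with_threads` and the binary `./matmul_bench` are hypothetical. `OPENBLAS_NUM_THREADS` and `OMP_NUM_THREADS` are the environment variables that OpenBLAS and OpenMP runtimes read; which of them takes effect depends on how the library was built.

```shell
#!/bin/sh
# Hypothetical helper: run a command with the BLAS/OpenMP thread count
# set through the environment. Usage: run_with_threads <n> <command...>
run_with_threads() {
    threads=$1
    shift
    env OPENBLAS_NUM_THREADS="$threads" OMP_NUM_THREADS="$threads" "$@"
}

# Hypothetical helper: print the elapsed wall-clock time in whole seconds
# for one run of the command with the given thread count.
time_with_threads() {
    start=$(date +%s)
    run_with_threads "$@" >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Example loop (./matmul_bench is a placeholder for your own test binary):
#   for n in 1 2 4; do
#       echo "$n thread(s): $(time_with_threads "$n" ./matmul_bench) s"
#   done
```

If the elapsed time is identical for every thread count, as reported here, that points to the BLAS backend ignoring the setting rather than to the hardware.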

@sh1r0
Owner

sh1r0 commented Jun 16, 2016

Sorry, I found that I made a mistake: build.sh built the caffe libs with Eigen even when USE_OPENBLAS=1 was set. As a result, the prebuilt libs were all built with Eigen instead of OpenBLAS. I have fixed the bug in the dev branch of caffe-android-lib. Since rebuilding the libs for every ABI is time-consuming, please build caffe-android-lib on your own. Sorry again for the inconvenience.
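
For illustration, the intended behavior (exactly one BLAS backend chosen from USE_OPENBLAS) could look roughly like the sketch below. This is not the actual content of build.sh or of the fix in the dev branch. The function name `select_blas_script` is hypothetical, and the script names are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the single BLAS build script
# that should run, based on the USE_OPENBLAS environment variable.
select_blas_script() {
    if [ "${USE_OPENBLAS:-0}" = "1" ]; then
        echo "build_Openblas.sh"   # name as written in this thread
    else
        echo "build_eigen.sh"      # default backend when OpenBLAS is not requested
    fi
}

# Example: show which script would be chosen for the current environment.
#   echo "BLAS build script: $(select_blas_script)"
```

The point of the sketch is only that the two backends are mutually exclusive: when USE_OPENBLAS=1 is exported, the Eigen step should not run at all.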

@zazd
Author

zazd commented Jun 17, 2016

Could you tell me what the mistake was so that I can build the lib myself? I am not familiar with shell scripting, so I could not find it.
Thank you!

@zazd
Author

zazd commented Jun 17, 2016

I found it. I removed build_eigen.sh from build.sh and use only build_Openblas.sh. It works.
Thank you!
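
After a rebuild like this, one rough way to check which backend ended up in the library is to search the built file for an OpenBLAS symbol name. This is only a heuristic sketch: the helper name `mentions_openblas` is hypothetical, the library path is whatever your build produced, and a stripped library may not contain the name even when OpenBLAS is linked in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given file contains the name of the
# OpenBLAS thread-control function used by the JNI code in this thread.
mentions_openblas() {
    grep -qF "openblas_set_num_threads" "$1"
}

# Example (replace the path with your own build output):
#   if mentions_openblas path/to/libcaffe.so; then
#       echo "library mentions OpenBLAS"
#   else
#       echo "no OpenBLAS symbol name found"
#   fi
```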

2 participants