Compile and run opencl version of caffe for android #22

Closed
DVEfremov opened this issue Jan 10, 2017 · 7 comments

@DVEfremov
Contributor

Steps to reproduce

  • run ck run program:caffe-time --target_os=android21-arm-v7a
  • select opencl version for caffe

Compilation now fails with errors like:

/home/daniil/CK-TOOLS/lib-clblast-development-android-ndk-4.9.x-android21-arm-v7a-32/src/src/utilities/utilities.cpp:208:34: error: 'stod' is not a member of 'std'
   auto val = static_cast<double>(std::stod(value));
                                  ^
/home/daniil/CK-TOOLS/lib-clblast-development-android-ndk-4.9.x-android21-arm-v7a-32/src/src/utilities/utilities.cpp:209:26: error: no matching function for call to 'std::complex<double>::complex(<brace-enclosed initializer list>)'
   return double2{val, val};
                          ^
/home/daniil/CK-TOOLS/lib-clblast-development-android-ndk-4.9.x-android21-arm-v7a-32/src/src/utilities/utilities.cpp:209:26: note: candidates are:
In file included from /home/daniil/CK-TOOLS/lib-clblast-development-android-ndk-4.9.x-android21-arm-v7a-32/src/src/utilities/utilities.hpp:22:0,
                 from /home/daniil/CK-TOOLS/lib-clblast-development-android-ndk-4.9.x-android21-arm-v7a-32/src/src/utilities/utilities.cpp:14:
/home/daniil/Soft/android-ndk-r13b/sources/cxx-stl/gnu-libstdc++/4.9/include/complex:1508:3: note: constexpr std::complex<double>::complex(const std::complex<long double>&)
   compl

This problem is also discussed at CNugteren/CLBlast#65

@DVEfremov DVEfremov self-assigned this Jan 10, 2017
@psyhtest
Member

@DVEfremov A single file implementing missing functions would be a good solution, which could be merged into CLBlast itself.
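
A minimal sketch of what such a single-file shim could look like, assuming the toolchain's C++ standard library lacks `std::stod` but still provides `std::strtod`. The `compat` namespace and the `parse_double2` helper are hypothetical names chosen for illustration; they are not part of CLBlast or the Android NDK.

```cpp
#include <cerrno>
#include <complex>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical namespace for illustration; not part of CLBlast or the NDK.
namespace compat {

// Fallback for std::stod, built on std::strtod.
// Throws the same exception types std::stod is specified to throw.
inline double stod(const std::string& text) {
  const char* begin = text.c_str();
  char* end = nullptr;
  errno = 0;
  const double value = std::strtod(begin, &end);
  if (end == begin) {
    throw std::invalid_argument("compat::stod: no conversion");
  }
  if (errno == ERANGE) {
    throw std::out_of_range("compat::stod: value out of range");
  }
  return value;
}

// Builds a complex value with the (real, imag) constructor instead of
// brace initialization, which is the form the compiler rejected above.
inline std::complex<double> parse_double2(const std::string& text) {
  const double val = compat::stod(text);
  return std::complex<double>(val, val);
}

}  // namespace compat
```

With this in place, a call such as `compat::stod("1.5")` would stand in for `std::stod("1.5")` on the affected toolchain. Whether CLBlast would accept a shim in this form is an open question.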

DVEfremov pushed a commit that referenced this issue Jan 18, 2017
- I've merged my wrapper program (without gflags) with the opencl version of caffe.cpp
  (the opencl version uses a different caffe API);
  it now works fine under ck without segfaults and produces the same json timing as the caffe-time version
gfursin added a commit that referenced this issue Jan 18, 2017
Compile and run opencl version of caffe for android #22
DVEfremov pushed a commit that referenced this issue Jan 24, 2017
- non-invasive changes to caffe.cpp from the BVLC/caffe opencl branch to remove gflags
  (gflags leads to segmentation faults on android)

- switch the package to a fork based on the BVLC/caffe opencl branch, with the changes required to get it working on android

- an experiment with disabling USE_INTEL_SPATIAL; it is not the cause, since the problem with the kernel still persists
DVEfremov pushed a commit that referenced this issue Jan 24, 2017
 - remove the useless disabling of USE_INTEL_SPATIAL (it's OFF by default)
gfursin added a commit that referenced this issue Jan 24, 2017
Compile and run opencl version of caffe for android #22 - intermediate version
@DVEfremov
Contributor Author

So I've fixed the issue with run-time CL code compilation

Build Status = -2 ( Err = -11 )
Log: error: built-in function 'native_powr' called with unsupported argument type
error: built-in function 'native_powr' called with unsupported argument type
error: built-in function 'native_powr' called with unsupported argument type
error: built-in function 'native_powr' called with unsupported argument type
error: Compiler frontend failed (error code 58)

with explicit abstract type cast to float:
DVEfremov/caffe@7148076
DVEfremov/caffe@a7694a1

because

looks strange and seems to lead to the unsupported argument type errors

@DVEfremov
Contributor Author

So I'm now facing another problem

I0124 19:50:40.501785 21124 caffe.cpp:420] Performing Forward
F0124 19:50:55.229424 21124 syncedmem.cpp:217] Check failed: mapped_ptr == cpu_ptr_ (0 vs. 0xaf8e8000) Device claims it support zero copy but failed to create correct user ptr buffer
*** Check failure stack trace: ***
Aborted 

and I've found a related issue:
rickyHong/caffe-for-clblast-branch#1

DVEfremov pushed a commit that referenced this issue Jan 25, 2017
@DVEfremov
Contributor Author

I've found the cause in my case

  1. I've added additional debug info
    DVEfremov/caffe@6cb389d

and got

      I0125 14:13:27.095819  5529 caffe.cpp:420] Performing Forward
      I0125 14:13:27.114055  5529 syncedmem.cpp:217] mapped_ptr: 0xa9e00000
      I0125 14:13:27.114166  5529 syncedmem.cpp:218] cpu_ptr_: 0xa9e00000
      I0125 14:13:27.132802  5529 syncedmem.cpp:217] mapped_ptr: 0x9a200000
      I0125 14:13:27.132863  5529 syncedmem.cpp:218] cpu_ptr_: 0x9a200000
      I0125 14:13:29.678921  5529 syncedmem.cpp:217] mapped_ptr: 0x98500000
      I0125 14:13:29.679060  5529 syncedmem.cpp:218] cpu_ptr_: 0x98500000
      I0125 14:13:29.691509  5529 syncedmem.cpp:217] mapped_ptr: 0x97900000
      I0125 14:13:29.691671  5529 syncedmem.cpp:218] cpu_ptr_: 0x97900000
      I0125 14:13:29.802301  5529 syncedmem.cpp:217] mapped_ptr: 0x95600000
      I0125 14:13:29.802474  5529 syncedmem.cpp:218] cpu_ptr_: 0x95600000
      I0125 14:13:29.806849  5529 syncedmem.cpp:217] mapped_ptr: 0x95300000
      I0125 14:13:29.807009  5529 syncedmem.cpp:218] cpu_ptr_: 0x95300000
      I0125 14:13:29.886665  5529 syncedmem.cpp:217] mapped_ptr: 0x93d00000
      I0125 14:13:29.886879  5529 syncedmem.cpp:218] cpu_ptr_: 0x93d00000
      I0125 14:13:33.813133  5529 syncedmem.cpp:217] mapped_ptr: 0x92c00000
      I0125 14:13:33.813302  5529 syncedmem.cpp:218] cpu_ptr_: 0x92c00000
      I0125 14:13:33.822790  5529 syncedmem.cpp:217] mapped_ptr: 0x92400000
      I0125 14:13:33.822999  5529 syncedmem.cpp:218] cpu_ptr_: 0x92400000
      I0125 14:13:33.897541  5529 syncedmem.cpp:217] mapped_ptr: 0x90d00000
      I0125 14:13:33.897711  5529 syncedmem.cpp:218] cpu_ptr_: 0x90d00000
      I0125 14:13:33.900748  5529 syncedmem.cpp:217] mapped_ptr: 0x90b00000
      I0125 14:13:33.900909  5529 syncedmem.cpp:218] cpu_ptr_: 0x90b00000
      I0125 14:13:33.941565  5529 syncedmem.cpp:217] mapped_ptr: 0x8ff00000
      I0125 14:13:33.941716  5529 syncedmem.cpp:218] cpu_ptr_: 0x8ff00000
      I0125 14:13:36.368584  5529 syncedmem.cpp:217] mapped_ptr: 0x8f600000
      I0125 14:13:36.368737  5529 syncedmem.cpp:218] cpu_ptr_: 0x8f600000
      I0125 14:13:38.099359  5529 syncedmem.cpp:217] mapped_ptr: 0x8ee00000
      I0125 14:13:38.099504  5529 syncedmem.cpp:218] cpu_ptr_: 0x8ee00000
      I0125 14:13:39.246372  5529 syncedmem.cpp:217] mapped_ptr: 0x8e800000
      I0125 14:13:39.246781  5529 syncedmem.cpp:218] cpu_ptr_: 0x8e800000
      I0125 14:13:39.247478  5529 syncedmem.cpp:217] mapped_ptr: 0xa917a000
      I0125 14:13:39.247591  5529 syncedmem.cpp:218] cpu_ptr_: 0xa917a000
      I0125 14:13:39.268347  5529 syncedmem.cpp:217] mapped_ptr: 0xa9405000
      I0125 14:13:39.268448  5529 syncedmem.cpp:218] cpu_ptr_: 0xa9405000
      I0125 14:13:42.451174  5529 syncedmem.cpp:217] mapped_ptr: 0x8e201000
      I0125 14:13:42.451611  5529 syncedmem.cpp:218] cpu_ptr_: 0x8e201000
      I0125 14:13:42.466282  5529 syncedmem.cpp:217] mapped_ptr: 0x8e2cb000
      I0125 14:13:42.466444  5529 syncedmem.cpp:218] cpu_ptr_: 0x8e2cb000
      I0125 14:13:43.275159  5529 syncedmem.cpp:217] mapped_ptr: 0x8dec9000
      I0125 14:13:43.275262  5529 syncedmem.cpp:218] cpu_ptr_: 0x8dec9000
      I0125 14:13:43.314358  5529 syncedmem.cpp:217] mapped_ptr: 0xa92d3000
      I0125 14:13:43.314492  5529 syncedmem.cpp:218] cpu_ptr_: 0xa92d3000
      I0125 14:13:43.852869  5529 syncedmem.cpp:217] mapped_ptr: 0
      I0125 14:13:43.852962  5529 syncedmem.cpp:218] cpu_ptr_: 0xaabba000
      F0125 14:13:43.855269  5529 syncedmem.cpp:220] Check failed: mapped_ptr == cpu_ptr_ (0 vs. 0xaabba000) Device claims it support zero copy but failed to create correct user ptr buffer
      *** Check failure stack trace: ***
      Aborted 
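
The last entries of the log differ from all the earlier ones: `mapped_ptr` is 0 rather than merely different from `cpu_ptr_`. A minimal sketch of a check that separates those two cases is given below. The names `MapStatus`, `ClassifyMapping` and `Describe` are hypothetical and are not taken from Caffe's `syncedmem.cpp`.

```cpp
#include <string>

// Hypothetical status type for illustration; not part of Caffe.
enum class MapStatus { kOk, kNullMapping, kAddressMismatch };

// Distinguishes "the mapping returned no pointer at all" from
// "the mapping returned a pointer other than the host buffer".
inline MapStatus ClassifyMapping(const void* mapped_ptr, const void* cpu_ptr) {
  if (mapped_ptr == nullptr) {
    return MapStatus::kNullMapping;
  }
  if (mapped_ptr != cpu_ptr) {
    return MapStatus::kAddressMismatch;
  }
  return MapStatus::kOk;
}

// Human-readable message for each case.
inline std::string Describe(MapStatus status) {
  switch (status) {
    case MapStatus::kOk:
      return "mapping matches the host pointer";
    case MapStatus::kNullMapping:
      return "mapping returned a null pointer (possibly out of memory)";
    case MapStatus::kAddressMismatch:
      return "mapping returned a different address than the host pointer";
  }
  return "unknown";
}
```

Reporting the null case separately would make a log like the one above easier to read, because the stock message mentions only zero-copy support.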

So on my device this simply means there is not enough memory.
I've fixed it by setting batch_size = 1 in deploy.prototxt

name: "AlexNet"
input: "data"
input_shape: {
  dim: 1
  dim: 3
  dim: 227
  dim: 227
}
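
As a rough illustration of why the batch dimension matters, the sketch below computes the size of the input blob alone for the shape above, assuming 4-byte float elements. `BlobBytes` is a hypothetical helper, and the batch of 10 used for comparison is only an example value; the thread does not state the original batch size. Intermediate activations also scale with the batch size, so the real saving is larger than this figure.

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed for an N x C x H x W blob.
// elem_size defaults to 4, assuming 32-bit float data.
inline std::size_t BlobBytes(std::size_t n, std::size_t c, std::size_t h,
                             std::size_t w, std::size_t elem_size = 4) {
  return n * c * h * w * elem_size;
}

// Input blob for the shape in deploy.prototxt above: 1 x 3 x 227 x 227.
inline std::size_t InputBytesBatch1() { return BlobBytes(1, 3, 227, 227); }

// Same shape with an example batch of 10, for comparison only.
inline std::size_t InputBytesBatch10() { return BlobBytes(10, 3, 227, 227); }
```

For batch 1 this gives 618,348 bytes; the example batch of 10 gives ten times as much, 6,183,480 bytes.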

@DVEfremov
Contributor Author

I've tested that the CPU version still works fine after all my changes.

@DVEfremov
Contributor Author

DVEfremov commented Jan 25, 2017

I've tested caffe-time-opencl with

  • Caffe model (net and weights) (deepscale, squeezenet, 1.1) - v1.1
  • Caffe model (net and weights) (deepscale, squeezenet, 1.0) - v1.0
  • Caffe model (net and weights) (bvlc, googlenet)
  • Caffe model (net and weights)

They all work fine.

DVEfremov pushed a commit that referenced this issue Jan 25, 2017
 - classification.cpp for opencl version of caffe
gfursin added a commit that referenced this issue Jan 25, 2017
Compile and run opencl version of caffe for android #22
DVEfremov pushed a commit that referenced this issue Jan 26, 2017
- classification.cpp for opencl version of caffe
@DVEfremov
Contributor Author

It works fine now and has been merged into the official branch.
