Cannot find any dGPU? I use A10-7850 #37

Open
x-xiaojian opened this issue Mar 18, 2016 · 36 comments

Comments

@x-xiaojian

Could not create logging file: No such file or directory
COULD NOT CREATE A LOGGINGFILE 20160319-002926.5520!Could not create logging file: No such file or directory
COULD NOT CREATE A LOGGINGFILE 20160319-002926.5520!Could not create logging file: No such file or directory
COULD NOT CREATE A LOGGINGFILE 20160319-002926.5520!Could not create logging file: No such file or directory
COULD NOT CREATE A LOGGINGFILE 20160319-002926.5520!F0319 00:29:26.820322 5520 device.cpp:95] Cannot find any dGPU!
*** Check failure stack trace: ***
@ 0x7fbf1b7d0ea4 (unknown)
@ 0x7fbf1b7d0deb (unknown)
@ 0x7fbf1b7d07bf (unknown)
@ 0x7fbf1b7d3a35 (unknown)
@ 0x7fbf1bad9a77 caffe::Device::Init()
@ 0x7fbf1badaedd caffe::Caffe::Caffe()
@ 0x409865 train()
@ 0x4069d1 main
@ 0x7fbf1aab0a40 (unknown)
@ 0x407019 _start
@ (nil) (unknown)
Aborted (core dumped)

@gujunli
Contributor

gujunli commented Mar 18, 2016

Do you have a GPU?


@x-xiaojian
Author

Only an APU.
This is my "clinfo" output:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.0 AMD-APP (1729.3)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon(TM) R7 Graphics
Device Topology: PCI[ B#0, D#1, F#0 ]
Max compute units: 8
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 720Mhz
Address bits: 64
Max memory allocation: 419168256
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 1676673024
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 419168256
Max global variable size: 377251328
Max global variable preferred total size: 1676673024
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 524288
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: Yes
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x7f4f7f5058f0
Name: Spectre
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 1729.3 (VM)
Profile: FULL_PROFILE
Version: OpenCL 2.0 AMD-APP (1729.3)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images

Device Type: CL_DEVICE_TYPE_CPU
Vendor ID: 1002h
Board name:
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 8
Preferred vector width double: 4
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 8
Native vector width double: 4
Max clock frequency: 3000Mhz
Address bits: 64
Max memory allocation: 3909534720
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 4096
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 15638138880
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Global
Local memory size: 32768
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 3909534720
Max global variable size: 1879048192
Max global variable preferred total size: 1879048192
Max read/write image args: 64
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
Coarse grain buffer: No
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 1
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: No
Profiling : No
Platform ID: 0x7f4f7f5058f0
Name: AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
Vendor: AuthenticAMD
Device OpenCL C version: OpenCL C 1.2
Driver version: 1729.3 (sse2,avx,fma4)
Profile: FULL_PROFILE
Version: OpenCL 1.2 AMD-APP (1729.3)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event

@mpekalski

mpekalski commented Mar 18, 2016 via email

@x-xiaojian
Author

F0321 18:29:37.102532 9402 device.cpp:95] Cannot find any dGPU!
*** Check failure stack trace: ***
@ 0x7efc76358ea4 (unknown)
@ 0x7efc76358deb (unknown)
@ 0x7efc763587bf (unknown)
@ 0x7efc7635ba35 (unknown)
@ 0x7efc76661a77 caffe::Device::Init()
@ 0x7efc76662edd caffe::Caffe::Caffe()
@ 0x409865 train()
@ 0x4069d1 main
@ 0x7efc75638a40 (unknown)
@ 0x407019 _start
@ (nil) (unknown)
Aborted (core dumped)

I have created a log folder in the Caffe directory, but there is still an error.

@tseckin

tseckin commented Mar 22, 2016

I have a similar problem. The cmake and make commands run successfully, and I have a Radeon 7850, but the make runtest command fails with the "Cannot find any dGPU!" error:

...............................
[ 98%] Building CXX object src/caffe/test/CMakeFiles/test.testbin.dir/test_random_number_generator.cpp.o
[100%] Linking CXX executable ../../../test/test.testbin
[100%] Built target test.testbin
Scanning dependencies of target runtest
Current device id: 0
F0322 11:27:36.232533  9290 device.cpp:95] Cannot find any dGPU! 
*** Check failure stack trace: ***
    @     0x7f6b9255fddd  (unknown)
    @     0x7f6b92561cc0  (unknown)
    @     0x7f6b9255f9ac  (unknown)
    @     0x7f6b925626be  (unknown)
    @     0x7f6b92ec70ab  caffe::Device::Init()
    @           0x6ecdb2  main
    @     0x7f6b8ce44700  __libc_start_main
    @           0x6f2139  _start
/bin/sh: line 1:  9290 Aborted                 (core dumped) /home/user/OpenCL-caffe-stable/build/test/test.testbin --gtest_shuffle --gtest_filter="-*GPU*"
src/caffe/test/CMakeFiles/runtest.dir/build.make:57: recipe for target 'src/caffe/test/CMakeFiles/runtest' failed
make[3]: *** [src/caffe/test/CMakeFiles/runtest] Error 134
CMakeFiles/Makefile2:328: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/all' failed
make[2]: *** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2
CMakeFiles/Makefile2:335: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/rule' failed
make[1]: *** [src/caffe/test/CMakeFiles/runtest.dir/rule] Error 2
Makefile:240: recipe for target 'runtest' failed

My clinfo command:

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 MESA 10.6.9
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD PITCAIRN
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 MESA 10.6.9
  Driver Version                                  10.6.9
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               16
  Max clock frequency                             860MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              In file included from <built-in>:296:
In file included from <command line>:2:
In file included from /usr/include/clc/clc.h:15:
/usr/include/clc/clctypes.h:3:10: fatal error: 'stddef.h' file not found

  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    32, Little-Endian
  Global memory size                              1073741824 (1024MiB)
  Error Correction support                        No
  Max memory allocation                           268435456 (256MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        268435456 (256MiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD PITCAIRN
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD PITCAIRN

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.3
  ICD loader Profile                              OpenCL 1.2

@janchk

janchk commented Nov 22, 2016

Lol. Same problem.

My clinfo

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 13.1.0-devel (git-151aeca 2016-11-13 xenial-oibaf-ppa)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD OLAND (DRM 2.43.0 / 4.4.0-47-generic, LLVM 3.9.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 13.1.0-devel (git-151aeca 2016-11-13 xenial-oibaf-ppa)
  Driver Version                                  13.1.0-devel
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               6
  Max clock frequency                             825MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
=== CL_PROGRAM_BUILD_LOG ===
<unknown>:0:0: in function sum void (float addrspace(1)*, float addrspace(1)*, float addrspace(1)*): unsupported call to function get_local_size
  Preferred work group size multiple              <unknown>:0:0: in function sum void (float addrspace(1)*, float addrspace(1)*, float addrspace(1)*): unsupported call to function get_local_size

  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2147483648 (2GiB)
  Error Correction support                        No
  Max memory allocation                           1503238553 (1.4GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        1503238553 (1.4GiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

My error log

F1122 23:19:05.806424 21980 device.cpp:95] Cannot find any dGPU! 
*** Check failure stack trace: ***
    @     0x7f4c8ee2a5cd  google::LogMessage::Fail()
    @     0x7f4c8ee2c433  google::LogMessage::SendToLog()
    @     0x7f4c8ee2a15b  google::LogMessage::Flush()
    @     0x7f4c8ee2ce1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f4c8f4c4c5b  caffe::Device::Init()
    @           0x6f04d2  main
    @     0x7f4c8cf1c830  __libc_start_main
    @           0x6f3f59  _start
    @              (nil)  (unknown)
Aborted (core dumped)
src/caffe/test/CMakeFiles/runtest.dir/build.make:57: recipe for target 'src/caffe/test/CMakeFiles/runtest' failed
make[3]: *** [src/caffe/test/CMakeFiles/runtest] Error 134
CMakeFiles/Makefile2:328: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/all' failed
make[2]: *** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2
CMakeFiles/Makefile2:335: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/rule' failed
make[1]: *** [src/caffe/test/CMakeFiles/runtest.dir/rule] Error 2
Makefile:240: recipe for target 'runtest' failed
make: *** [runtest] Error 2

@gujunli
Contributor

gujunli commented Nov 22, 2016

It seems that you have a really old GPU with OpenCL 1.1.
I suspect the error is caused by that incompatibility.
Thanks,
Junli
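
For readers who want to check the version angle themselves: the OpenCL specification formats platform and device version strings as "OpenCL <major>.<minor> <vendor-specific information>", so the number can be pulled out of the clinfo line and compared with a minimum. A minimal sketch, with the caveat that ClVersion, ParseClVersion and AtLeast are hypothetical helper names and that the thread never states which OpenCL version OpenCL Caffe actually requires, so the threshold passed to AtLeast is an assumption:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type: the numeric part of an OpenCL version string.
struct ClVersion {
  int major_version;
  int minor_version;
  bool valid;  // false when the string does not start with "OpenCL <n>.<n>"
};

// Parses strings such as "OpenCL 1.1 Mesa 13.1.0-devel (...)".
// The "OpenCL <major>.<minor> <vendor info>" layout comes from the OpenCL
// specification; everything after the numbers is ignored.
ClVersion ParseClVersion(const std::string& text) {
  ClVersion v = {0, 0, false};
  if (std::sscanf(text.c_str(), "OpenCL %d.%d",
                  &v.major_version, &v.minor_version) == 2) {
    v.valid = true;
  }
  return v;
}

// True when v is at least want_major.want_minor.
// ASSUMPTION: the caller picks the threshold; the minimum that OpenCL Caffe
// needs is not stated anywhere in this thread.
bool AtLeast(const ClVersion& v, int want_major, int want_minor) {
  if (!v.valid) return false;
  if (v.major_version != want_major) return v.major_version > want_major;
  return v.minor_version >= want_minor;
}
```

Feeding it the Clover string from the clinfo output above gives 1.1, and the AMD-APP string from the first report gives 2.0; whether 1.1 is really too old for this project is something only the maintainers can confirm.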


@gujunli
Contributor

gujunli commented Nov 22, 2016

Oh, the log shows a NULL platform pointer: OpenCL Caffe did not find your platform. I suggest you go to src/caffe/device.cpp, look at the logic in Device::Init(), and print out more info.
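
A note for anyone landing here with the same failure: all three clinfo outputs in this thread report unified memory for host and device (the APU's Spectre GPU, AMD PITCAIRN and AMD OLAND). Below is a minimal sketch of why that could matter, assuming Device::Init() only accepts a GPU that does not share memory with the host. That filter is an assumption and is not confirmed anywhere in this thread. DeviceRecord, PickDiscreteGpu, ApuDevices and CloverDevices are hypothetical names that stand in for the real OpenCL queries, so the sketch compiles without the OpenCL headers:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the fields clinfo prints; not an OpenCL type.
struct DeviceRecord {
  std::string name;
  bool is_gpu;               // "Device Type" is GPU
  bool host_unified_memory;  // "Unified memory for Host and Device"
};

// Returns the index of the first GPU that does not share memory with the
// host, or -1 when there is none.
// ASSUMPTION: this models the kind of filter that would end in
// "Cannot find any dGPU!"; it is not copied from device.cpp.
int PickDiscreteGpu(const std::vector<DeviceRecord>& devices) {
  for (std::size_t i = 0; i < devices.size(); ++i) {
    if (devices[i].is_gpu && !devices[i].host_unified_memory) {
      return static_cast<int>(i);
    }
  }
  return -1;
}

// Device lists transcribed from the clinfo outputs posted in this thread.
std::vector<DeviceRecord> ApuDevices() {
  return {{"Spectre", true, true},
          {"AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G", false, true}};
}

std::vector<DeviceRecord> CloverDevices() {
  return {{"AMD PITCAIRN", true, true}};
}
```

With the device lists posted above, PickDiscreteGpu returns -1 in both cases, which would match the "Cannot find any dGPU!" abort. Printing the same two fields for every real device inside Device::Init() is one way to check whether this is what actually happens on your machine.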

Sent from my iPhone

On Nov 22, 2016, at 12:45 PM, gujunli gujunli@gmail.com wrote:

It seems that you have a really old GPU with OpenCL 1.1.
I suspect the error is caused by the incompatibility
Thanks
Junli

Sent from my iPhone

On Nov 22, 2016, at 12:29 PM, janchk notifications@github.com wrote:

Lol. Same problem.

My clinfo

`Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 13.1.0-devel (git-151aeca 2016-11-13 xenial-oibaf-ppa)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA

Platform Name Clover
Number of devices 1
Device Name AMD OLAND (DRM 2.43.0 / 4.4.0-47-generic, LLVM 3.9.0)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 13.1.0-devel (git-151aeca 2016-11-13 xenial-oibaf-ppa)
Driver Version 13.1.0-devel
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 6
Max clock frequency 825MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
=== CL_PROGRAM_BUILD_LOG ===
:0:0: in function sum void (float addrspace(1), float addrspace(1), float addrspace(1)): unsupported call to function get_local_size
Preferred work group size multiple :0:0: in function sum void (float addrspace(1), float addrspace(1), float addrspace(1)): unsupported call to function get_local_size

Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 64, Little-Endian
Global memory size 2147483648 (2GiB)
Error Correction support No
Max memory allocation 1503238553 (1.4GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 32768 (32KiB)
Max constant buffer size 1503238553 (1.4GiB)
Max number of constant args 16
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Available Yes
Compiler Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64

NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [MESA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No platform
`

My error log

`F1122 23:19:05.806424 21980 device.cpp:95] Cannot find any dGPU!
*** Check failure stack trace: ***
@ 0x7f4c8ee2a5cd google::LogMessage::Fail()
@ 0x7f4c8ee2c433 google::LogMessage::SendToLog()
@ 0x7f4c8ee2a15b google::LogMessage::Flush()
@ 0x7f4c8ee2ce1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f4c8f4c4c5b caffe::Device::Init()
@ 0x6f04d2 main
@ 0x7f4c8cf1c830 __libc_start_main
@ 0x6f3f59 _start
@ (nil) (unknown)
Aborted (core dumped)
src/caffe/test/CMakeFiles/runtest.dir/build.make:57: recipe for target 'src/caffe/test/CMakeFiles/runtest' failed
make[3]: *** [src/caffe/test/CMakeFiles/runtest] Error 134
CMakeFiles/Makefile2:328: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/all' failed
make[2]: *** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2
CMakeFiles/Makefile2:335: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/rule' failed
make[1]: *** [src/caffe/test/CMakeFiles/runtest.dir/rule] Error 2
Makefile:240: recipe for target 'runtest' failed
make: *** [runtest] Error 2

`




janchk commented Nov 22, 2016

@gujunli, do you mean this?

cl_int Device::Init(int deviceId) {

  DisplayPlatformInfo();

  clGetPlatformIDs(0, NULL, &numPlatforms);
  cl_platform_id PlatformIDs[numPlatforms];
  clGetPlatformIDs(numPlatforms, PlatformIDs, NULL);

  size_t nameLen;
  cl_int res = clGetPlatformInfo(PlatformIDs[0], CL_PLATFORM_NAME, 64,
      platformName, &nameLen);
  if (res != CL_SUCCESS) {
    fprintf(stderr, "Err: Failed to Get Platform Info\n");
    return 0;
  }
  platformName[nameLen] = 0;

  GetDeviceInfo();
  cl_uint uiNumDevices;
  cl_bool unified_memory = false;
  clGetDeviceIDs(PlatformIDs[0], CL_DEVICE_TYPE_GPU, 0, NULL, &numDevices);
  uiNumDevices = numDevices;
  if (0 == uiNumDevices) {
    LOG(FATAL) << "Err: No GPU devices";
  } else {
    pDevices = (cl_device_id *) malloc(uiNumDevices * sizeof(cl_device_id));
    OCL_CHECK(
        clGetDeviceIDs(PlatformIDs[0], CL_DEVICE_TYPE_GPU, uiNumDevices,
            pDevices, &uiNumDevices));
    if (deviceId == -1) {
      int i;
      for (i = 0; i < (int) uiNumDevices; i++) {
        clGetDeviceInfo(pDevices[i], CL_DEVICE_HOST_UNIFIED_MEMORY,
            sizeof(cl_bool), &unified_memory, NULL);
        if (!unified_memory) { //skip iGPU
          //we pick the first dGPU we found
          pDevices[0] = pDevices[i];
          device_id = i;
          LOG(INFO) << "Picked default device type : dGPU " << device_id;
          break;
        }
      }
      if (i == uiNumDevices) {
        LOG(FATAL) << "Cannot find any dGPU! ";
      }
    } else if (deviceId >= 0 && deviceId < uiNumDevices) {
      pDevices[0] = pDevices[deviceId];
      device_id = deviceId;
      LOG(INFO) << "Picked device type : GPU " << device_id;
    } else {
      LOG(FATAL) << "  Invalid GPU deviceId! ";
    }
  }

  Context = clCreateContext(NULL, 1, pDevices, NULL, NULL, NULL);
  if (NULL == Context) {
    fprintf(stderr, "Err: Failed to Create Context\n");
    return 0;
  }
  CommandQueue = clCreateCommandQueue(Context, pDevices[0],
      CL_QUEUE_PROFILING_ENABLE, NULL);
  CommandQueue_helper = clCreateCommandQueue(Context, pDevices[0],
      CL_QUEUE_PROFILING_ENABLE, NULL);
  if (NULL == CommandQueue || NULL == CommandQueue_helper) {
    fprintf(stderr, "Err: Failed to Create Commandqueue\n");
    return 0;
  }
  BuildProgram (oclKernelPath);
  row = clblasRowMajor;
  col = clblasColumnMajor;
  return 0;
}

It probably can't get into

for (i = 0; i < (int) uiNumDevices; i++) {
        clGetDeviceInfo(pDevices[i], CL_DEVICE_HOST_UNIFIED_MEMORY,
            sizeof(cl_bool), &unified_memory, NULL);
        if (!unified_memory) { //skip iGPU
          //we pick the first dGPU we found
          pDevices[0] = pDevices[i];
          device_id = i;
          LOG(INFO) << "Picked default device type : dGPU " << device_id;
          break;
        }

Because uiNumDevices == 0, which comes from clGetDeviceIDs.
But I have no idea how to deal with this.
BTW, I have Ubuntu 16.04 and a Radeon HD 8750M, if that matters.


janchk commented Nov 22, 2016

Or maybe OCL_CHECK fails for &uiNumDevices?

Contributor

gujunli commented Nov 23, 2016

You can insert a few printfs in this file to track down whether the GPUs are detected correctly. E.g., first print out the number of devices:

  clGetDeviceIDs(PlatformIDs[0], CL_DEVICE_TYPE_GPU, 0, NULL, &numDevices);
  uiNumDevices = numDevices;
  printf("#device %u \n", uiNumDevices);
  if (0 == uiNumDevices) {
    LOG(FATAL) << "Err: No GPU devices";
  } else {
    pDevices = (cl_device_id *) malloc(uiNumDevices * sizeof(cl_device_id));



janchk commented Nov 23, 2016

@gujunli I've found the problem. The variable unified_memory == 1, so execution never gets into

if (!unified_memory) { //skip iGPU
          //we pick the first dGPU we found
          pDevices[0] = pDevices[i];
          device_id = i;
          LOG(INFO) << "Picked default device type : dGPU " << device_id;
          break;

What does the //skip iGPU comment mean?
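
To illustrate the question: the default-device branch (deviceId == -1) of the Device::Init() code quoted earlier treats any GPU that reports CL_DEVICE_HOST_UNIFIED_MEMORY as an integrated GPU and skips it. Below is a minimal sketch of that loop with the OpenCL calls replaced by a plain struct so it compiles without an OpenCL SDK. `FakeDevice` and `pick_default_device` are hypothetical names used only for illustration; they are not part of the Caffe source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpenCL device; it holds only the one
// property the selection loop looks at.
struct FakeDevice {
  std::string name;
  bool host_unified_memory;  // what CL_DEVICE_HOST_UNIFIED_MEMORY would report
};

// Mirrors the deviceId == -1 branch: return the index of the first device
// that does NOT report unified memory, or -1 if every device reports it.
// The -1 case corresponds to LOG(FATAL) << "Cannot find any dGPU!".
int pick_default_device(const std::vector<FakeDevice>& devices) {
  for (std::size_t i = 0; i < devices.size(); ++i) {
    if (!devices[i].host_unified_memory) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

With a single device whose driver reports unified memory (the unified_memory == 1 case above), `pick_default_device` returns -1. In the quoted source, the branch for an explicit deviceId >= 0 does not perform this check.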

Contributor

gujunli commented Nov 24, 2016 via email


janchk commented Nov 24, 2016

@gujunli
Here is what I got after commenting out the if (!unified_memory) check:

Current device id: 0
#device_afterclget 1 
#device_afteroclcheck 1 
#device_beforecycle 1 
unified 1 
Err: Failed to build program
Note: Google Test filter = -*GPU*
Note: Randomizing tests' orders with a seed of 37283 .
[==========] Running 1297 tests from 201 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from SoftmaxLayerTest/2, where TypeParam = caffe::GPUDevice<float>
[ RUN      ] SoftmaxLayerTest/2.TestGradient
#device_afterclget 1 
#device_afteroclcheck 1 
#device_beforecycle 1 
unified 1 
Err: Failed to build program
*** Aborted at 1479990162 (unix time) try "date -d @1479990162" if you are using GNU date ***
PC: @     0x7f27137fe578 clCreateKernel
*** SIGSEGV (@0x110) received by PID 29204 (TID 0x7f271712bac0) from PID 272; stack trace: ***
    @     0x7f27168403e0 (unknown)
    @     0x7f27137fe578 clCreateKernel
    @     0x7f2716cb32a2 caffe::SyncedMemory::ocl_setup()
    @     0x7f2716caea4b caffe::Blob<>::Reshape()
    @     0x7f2716caec6f caffe::Blob<>::Reshape()
    @     0x7f2716caed0c caffe::Blob<>::Blob()
    @           0x78ed8c caffe::SoftmaxLayerTest<>::SoftmaxLayerTest()
    @           0x78efbb testing::internal::TestFactoryImpl<>::CreateTest()
    @           0xa7bfb3 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0xa74c93 testing::TestInfo::Run()
    @           0xa74e25 testing::TestCase::Run()
    @           0xa769bf testing::internal::UnitTestImpl::RunAllTests()
    @           0xa76ce3 testing::UnitTest::Run()
    @           0x6f04df main
    @     0x7f27146fd830 __libc_start_main
    @           0x6f3f59 _start
    @                0x0 (unknown)
Segmentation fault (core dumped)
src/caffe/test/CMakeFiles/runtest.dir/build.make:57: recipe for target 'src/caffe/test/CMakeFiles/runtest' failed
make[3]: *** [src/caffe/test/CMakeFiles/runtest] Error 139
CMakeFiles/Makefile2:328: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/all' failed
make[2]: *** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2
CMakeFiles/Makefile2:335: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/rule' failed
make[1]: *** [src/caffe/test/CMakeFiles/runtest.dir/rule] Error 2
Makefile:240: recipe for target 'runtest' failed
make: *** [runtest] Error 2

Here is what lspci | grep VGA prints on my system:

00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M] (rev ff)

I want to use Caffe with my AMD discrete graphics.

I also tried DRI_PRIME=1 make runtest, but it had no effect.


naibaf7 commented Nov 24, 2016

Maybe try this, since the AMD branch is no longer actively maintained:
https://github.com/bvlc/caffe/tree/opencl


janchk commented Nov 24, 2016

@naibaf7
I tried this branch, but I have no idea what to do next.

F1124 19:42:10.999312 22298 syncedmem.cpp:201] Check failed: mapped_ptr == cpu_ptr_ (0x7fb1e00db000 vs. 0x208ab20) Device claims it support zero copy but failed to create correct user ptr buffer
*** Check failure stack trace: ***
    @     0x7fb1dee3c5cd  google::LogMessage::Fail()
    @     0x7fb1dee3e433  google::LogMessage::SendToLog()
    @     0x7fb1dee3c15b  google::LogMessage::Flush()
    @     0x7fb1dee3ee1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fb1dfc44bae  caffe::SyncedMemory::mutable_gpu_data()
    @           0xb93f06  caffe::RandomNumberGeneratorTest_TestRngUniformTimesUniformGPU_Test<>::TestBody_Impl()
    @           0xdedaf3  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0xde68aa  testing::Test::Run()
    @           0xde69f8  testing::TestInfo::Run()
    @           0xde6b05  testing::TestCase::Run()
    @           0xde858f  testing::internal::UnitTestImpl::RunAllTests()
    @           0xde88c3  testing::UnitTest::Run()
    @           0x8be529  main
    @     0x7fb1dcd23830  __libc_start_main
    @           0x8c4ce9  _start
    @              (nil)  (unknown)
Aborted (core dumped)
src/caffe/test/CMakeFiles/runtest.dir/build.make:57: recipe for target 'src/caffe/test/CMakeFiles/runtest' failed
make[3]: *** [src/caffe/test/CMakeFiles/runtest] Error 134
CMakeFiles/Makefile2:328: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/all' failed
make[2]: *** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2
CMakeFiles/Makefile2:335: recipe for target 'src/caffe/test/CMakeFiles/runtest.dir/rule' failed
make[1]: *** [src/caffe/test/CMakeFiles/runtest.dir/rule] Error 2
Makefile:240: recipe for target 'runtest' failed
make: *** [runtest] Error 2

This is what I got, along with a lot of output above it.
Could Oibaf's driver be the reason for that?


naibaf7 commented Nov 24, 2016

@janchk
Yes, if BOTH branches fail, it's definitely a driver problem.
The driver seems to report unified (zero-copy) memory between your CPU and GPU, but this is not actually the case, so it's broken.
It seems you are using the CLOVER OpenCL implementation. It never worked for me, and I usually disable it on my system. Use an FGLRX or AMDGPU-PRO implementation instead for better results.


janchk commented Nov 24, 2016

@naibaf7 As far as I know, FGLRX does not work on Ubuntu 16.04, and AMDGPU-PRO does not support my graphics card (Radeon HD 8750M). Should I use an older version of Ubuntu instead?

Contributor

gujunli commented Nov 24, 2016 via email


naibaf7 commented Nov 24, 2016

@janchk
Unfortunately yes: Ubuntu 14.04, until an AMDGPU-PRO driver becomes available for the Mars (HD8750M) chip. You might be able to make it work if you find a firmware binary (https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu lists the supported chipsets) and compile your own kernel with AMDGPU CIK support.
Another option is an alternative OS such as Fedora 25 with a downgraded XORG (1.7) and kernel (4.4) and a modified FGLRX, but that is also highly complicated. I can send you the modified FGLRX and instructions if you want.

Or use Windows; the Caffe OpenCL branch will be available for Windows by the end of January at the latest.


janchk commented Nov 25, 2016

@naibaf7 I would like to try the method you suggested, if it's not too much trouble for you.


naibaf7 commented Nov 25, 2016

Here is something that works on Fedora 23/24/25, instructions and all: https://github.com/imageguy/fglrx-for-Fedora
You basically just need to install Fedora as you like, and downgrade with:

sudo dnf downgrade --allowerasing --releasever=21 xorg-x11-server-Xorg xorg-x11-server-common

Make sure to never update those packages or you'll get no GPU.
Also download some 4.7.x kernel from here (final version, no RC kernel):
http://koji.fedoraproject.org/koji/packageinfo?packageID=8
and install it before installing the AMD GPU driver.

This is how you install the driver in the ZIP below:

./ati-installer.sh 15.302 --install

Here is a driver I already patched that you can try: http://tingy.pw/fglrx.zip (won't be up forever...)
As far as I remember that was patched for 4.4 to 4.7, but you gotta try and see what happens. It's OpenCL 2.0 and actually still faster than AMDGPU-PRO and CLOVER... ;).


janchk commented Nov 25, 2016

@naibaf7
Here is my installation progress. I probably made some mistakes.
1) Installed Fedora 25.
2) Downgraded Xorg.
3) Installed kernel-4.7.9-200.fc24.x86_64.rpm (core, modules, main).
4) Booted with this kernel.
5) Tried ./ati-installer.sh 15.302 --install and got the error "Please install the required pre-requisites before". I found them listed for Ubuntu but not for Fedora.
6) Tried with --force and got another error.
P.S. Maybe I should move this to a different thread.


naibaf7 commented Nov 25, 2016

@janchk
What pre-requisites is it complaining about? What is the error with force? The steps you did seem correct so far!


janchk commented Nov 26, 2016

@naibaf7 This is what I get when I install without --force:

Supported adapter detected.
Check if system has the tools required for installation.
Uninstalling any previously installed drivers.
Unloading drm module...
rmmod: ERROR: Module drm is in use by: i915 drm_kms_helper
[Message] Kernel Module : Trying to install a precompiled kernel module.
[Message] Kernel Module : Precompiled kernel module version mismatched.
[Message] Kernel Module : Found kernel module build environment, generating kernel module now.
AMD kernel module generator version 2.1
Error:
kernel includes at /lib/modules/4.7.9-200.fc24.x86_64/build/include do not match current kernel.
they are versioned as ""
instead of "4.7.9-200.fc24.x86_64".
you might need to adjust your symlinks:
- /usr/include
- /usr/src/linux
[Error] Kernel Module : Failed to compile kernel module - please consult readme.
[Reboot] Kernel Module : dracut

I have solved the problem with the pre-requisites, and this is what I got. I get the same log with the --force flag.


naibaf7 commented Nov 26, 2016

Oooh, this already looks quite good.
3)install kernel-4.7.9-200.fc24.x86_64.rpm (core, modules, main)
You may have forgotten to install the kernel-headers and/or kernel-devel package for this kernel.
What does uname -r give you?
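
The installer error above complains about the headers under /lib/modules/4.7.9-200.fc24.x86_64/build, so one quick sanity check is whether that build directory exists for the running kernel. The sketch below is only an illustration and assumes a Linux system with the usual /lib/modules/<release>/build layout; the function names are hypothetical helpers, not part of the fglrx installer.

```cpp
#include <string>
#include <sys/stat.h>
#include <sys/utsname.h>

// Path where a kernel module build looks for the headers of a given
// kernel release (the string printed by `uname -r`).
std::string kernel_build_dir(const std::string& release) {
  return "/lib/modules/" + release + "/build";
}

// Release string of the running kernel (same value as `uname -r`),
// or an empty string if the uname() call fails.
std::string running_kernel_release() {
  struct utsname info;
  if (uname(&info) != 0) {
    return "";
  }
  return info.release;
}

// True if `path` exists and is a directory. stat() follows symlinks,
// so a dangling `build` symlink is reported as missing.
bool is_directory(const std::string& path) {
  struct stat st;
  return stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}
```

Calling `is_directory(kernel_build_dir(running_kernel_release()))` returns false when the build directory is absent, which is what one would expect if the matching kernel-devel package is not installed (an assumption about how the distribution packages the headers).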


janchk commented Nov 26, 2016

@naibaf7 Yup, I didn't install either of them. uname -r returns 4.7.9-200.fc24.x86_64


janchk commented Nov 26, 2016

@naibaf7 After reinstalling the system and repeating the other steps, I got this fglrx-install.log. Then GNOME crashed and forced me to reboot, and the system got stuck at the boot log.


naibaf7 commented Nov 26, 2016

@janchk
Hmm, then the driver had not been patched up to 4.7 prior to installation, sorry.
If you look at the chunks that fail during compilation, they should all be handled by the patches (.diff) from here:
https://github.com/imageguy/fglrx-for-Fedora
If you still have the nerve, you can uninstall fglrx from rescue or console mode (using the installer) and reinstall it after applying the above patches.


janchk commented Nov 26, 2016

@naibaf7
This is what I get when I try to patch the fglrx that you sent:

[root@localhost fglrx-install.MtTczW]# patch -p1 </root/Downloads/fglrx-for-Fedora-master/fglrx_kernel_4.7.diff
patching file common/lib/modules/fglrx/build_mod/firegl_public.c
Hunk #1 succeeded at 615 (offset -16 lines).
Hunk #2 succeeded at 3200 (offset -20 lines).
Hunk #3 succeeded at 3218 (offset -20 lines).
Hunk #4 succeeded at 3229 (offset -20 lines).
Hunk #5 succeeded at 3405 (offset -20 lines).
Hunk #6 succeeded at 3415 (offset -20 lines).
Hunk #7 succeeded at 4502 with fuzz 2 (offset -20 lines).
Hunk #8 succeeded at 4525 with fuzz 2 (offset -15 lines).
Hunk #9 succeeded at 4560 with fuzz 2 (offset -11 lines).
Hunk #10 succeeded at 4582 with fuzz 2 (offset -6 lines).
Hunk #11 FAILED at 6475.
1 out of 11 hunks FAILED -- saving rejects to file common/lib/modules/fglrx/build_mod/firegl_public.c.rej
patching file common/lib/modules/fglrx/build_mod/firegl_public.h

And this is what I get when patching the driver from here: https://github.com/imageguy/fglrx-for-Fedora

[root@localhost fglrx-install.f1TQFY]# patch -p1 </root/Downloads/fglrx-for-Fedora-master/fglrx_kernel_4.7.diff
patching file common/lib/modules/fglrx/build_mod/firegl_public.c
patching file common/lib/modules/fglrx/build_mod/firegl_public.h

Should I worry about "Hunk #11 FAILED at 6475."?

Update: yes, I should. Another boot failed; here is the second log:
fglrx-install1.txt


naibaf7 commented Nov 26, 2016

Hmm, the only reason I can imagine for it still failing after the latest patches is that the kernel (XSTATE_FP, __fgl_cmpxchg, fpu_xsave(fpu)) changed too substantially between 4.7.2 (the latest tested kernel) and 4.7.9.
It seems trying the 4.7.2-FC24 kernel is the last thing you can do :( No one bothered to update the patches after that.


janchk commented Dec 3, 2016

@naibaf7 Thank you for your support. I have solved the problem: I used Ubuntu 15.10, the latest AMD drivers from the official site, and your branch of Caffe with ViennaCL. Oh, so much more performance compared to CPU computation.
screenshot from 2016-12-03 16-58-12


naibaf7 commented Dec 3, 2016

@janchk
Cool! Are you happy with the performance? You can try to enable LibDNN compilation in the Makefile and switch from ViennaCL to clBLAS (from AMD) to get even more performance out of it :)


janchk commented Dec 3, 2016

@naibaf7 MANY thanks to you, dude! With my old configuration I ran into the error
F1203 19:00:07.040632 5162 syncedmem.cpp:215] Check failed: 0 == err (0 vs. -61) OpenCL buffer allocation of size 2707024896 failed.
when trying to handle big files.
Following your advice, I fixed it!
And I got much more performance, of course.
P.S. I would be glad to test your Windows branch once it is released.


naibaf7 commented Dec 3, 2016

@janchk OK, watch out for updates on my OpenCL branch, it will show when Windows is ready :)
Glad I could help, enjoy Caffe :)

It is true that the error F1203 19:00:07.040632 5162 syncedmem.cpp:215] Check failed: 0 == err (0 vs. -61) OpenCL buffer allocation of size 2707024896 failed. occurs because you ran out of GPU memory.
However, LibDNN uses much less memory for convolutions than the Caffe engine, so you can load bigger models. It's similar to nVidia's cuDNN.
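
As a side note on the log line above: error code -61 is CL_INVALID_BUFFER_SIZE in the OpenCL headers, which the specification associates with a buffer size of 0 or one larger than the device's CL_DEVICE_MAX_MEM_ALLOC_SIZE. So a single very large buffer can be rejected because of the per-allocation limit, which is often smaller than the total global memory. Below is a minimal sketch of that comparison without any OpenCL calls. The function names are hypothetical, and the limit used in the note after the code is taken from the clinfo output posted earlier in this thread purely as an illustrative value; janchk's actual limit is not given here.

```cpp
#include <cstdint>

// OpenCL error code seen in the log; -61 is CL_INVALID_BUFFER_SIZE in CL/cl.h.
constexpr int kClInvalidBufferSize = -61;

// True if a single buffer of `requested` bytes fits within the device's
// per-allocation limit `max_alloc` (CL_DEVICE_MAX_MEM_ALLOC_SIZE).
// A zero-sized request is also treated as invalid.
bool fits_single_allocation(std::uint64_t requested, std::uint64_t max_alloc) {
  return requested > 0 && requested <= max_alloc;
}

// Convert bytes to GiB for readable diagnostics.
double to_gib(std::uint64_t bytes) {
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

For example, `fits_single_allocation(2707024896ULL, 1503238553ULL)` is false: the failed request of about 2.5 GiB is larger than the 1.4 GiB limit that clinfo printed for the device shown earlier.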

@dasha-5555-5

Hi! I get this problem when I try to run runtest on Ubuntu 16.04:
make runtest
[100%] Built target proto
[100%] Built target caffe
[100%] Built target gtest
[100%] Linking CXX executable ../../../test/test.testbin
[100%] Built target test.testbin
Current device id: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
#device 1
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
X server found. dri2 connection failed!
DRM_IOCTL_I915_GEM_APERTURE failed: Invalid argument
Assuming 131072kB available aperture size.
May lead to reduced performance or incorrect rendering.
get chip id failed: -1 [22]
param: 4, val: 0
F0521 19:01:47.042328 4111 device.cpp:96] Cannot find any dGPU!
*** Check failure stack trace: ***
@ 0x7f88209955cd google::LogMessage::Fail()
@ 0x7f8820997433 google::LogMessage::SendToLog()
@ 0x7f882099515b google::LogMessage::Flush()
@ 0x7f8820997e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8820fcdd3b caffe::Device::Init()
@ 0x6f1742 main
@ 0x7f881e6d2830 __libc_start_main
@ 0x6f3f09 _start
@ (nil) (unknown)
Aborted (core dumped)
Any help would be appreciated.
