Incorrect Backprop OpenACC code #5
Comments
GCC rejects the 2D array sections in the data clause, so we may have to rewrite the code to use 1D arrays.
backprop.c: In function ‘bpnn_adjust_weights’:
backprop.c:296:14: error: array section is not contiguous in ‘map’ clause
296 | present(w[0:nly][0:ndelta],oldw[0:nly][0:ndelta])
In the corresponding OpenMP GPU version, compilation with LLVM succeeds, but execution leads to an illegal memory access on the device. The cause is probably the same.
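A minimal sketch of the 1D rewrite suggested above, assuming the weights are stored row-major in one contiguous allocation. The function name adjust_weights_flat, the IDX macro, the eta and momentum parameters, and the 0-based loop bounds are hypothetical simplifications, not the code in backprop.c. It uses copy instead of present so that it stands alone without an enclosing data region; the point is only that each clause names a single contiguous range such as w[0:nly*ndelta], which should avoid the "array section is not contiguous" error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: row-major index into a flattened matrix with ncols columns. */
#define IDX(row, col, ncols) ((row) * (ncols) + (col))

/* Simplified, 0-based weight update on flattened nly x ndelta arrays.
 * eta and momentum are illustrative parameters, not the constants in backprop.c. */
void adjust_weights_flat(const float *delta, int ndelta,
                         const float *ly, int nly,
                         float *w, float *oldw,
                         float eta, float momentum)
{
    assert(delta != NULL && ly != NULL && w != NULL && oldw != NULL);

    /* Every array section below is one contiguous 1D range. */
    #pragma acc parallel loop collapse(2) \
        copyin(delta[0:ndelta], ly[0:nly]) \
        copy(w[0:nly*ndelta], oldw[0:nly*ndelta])
    for (int k = 0; k < nly; k++) {
        for (int j = 0; j < ndelta; j++) {
            float new_dw = eta * delta[j] * ly[k]
                         + momentum * oldw[IDX(k, j, ndelta)];
            w[IDX(k, j, ndelta)] += new_dw;
            oldw[IDX(k, j, ndelta)] = new_dw;
        }
    }
}
```

Without OpenACC enabled the pragma is ignored and the loops run on the host, which makes it easy to compare results against the 2D version before moving to the device.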
For the corresponding OpenMP GPU version, as of commit 45a787f the program now compiles with clang, but with the following warnings.
imagenet.c:18:32: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
units[k] = (float)rand() / RAND_MAX;
~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX 2147483647
^~~~~~~~~~
1 warning generated.
imagenet.c:18:32: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
units[k] = (float)rand() / RAND_MAX;
~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX 2147483647
^~~~~~~~~~
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:43:66: note: used here
__DEVICE__ void __attribute__((overloadable)) __brkpt(int __a) { __brkpt(); }
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1189:10: note: used here
return __bool2mask(__vseteq2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1189:22: note: used here
return __bool2mask(__vseteq2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1199:22: note: used here
return __bool2mask(__vseteq4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1209:22: note: used here
return __bool2mask(__vsetges2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1219:22: note: used here
return __bool2mask(__vsetges4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1229:22: note: used here
return __bool2mask(__vsetgeu2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1239:22: note: used here
return __bool2mask(__vsetgeu4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1249:22: note: used here
return __bool2mask(__vsetgts2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1259:22: note: used here
return __bool2mask(__vsetgts4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1269:22: note: used here
return __bool2mask(__vsetgtu2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1279:22: note: used here
return __bool2mask(__vsetgtu4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1289:22: note: used here
return __bool2mask(__vsetles2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1299:22: note: used here
return __bool2mask(__vsetles4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1309:22: note: used here
return __bool2mask(__vsetleu2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1319:22: note: used here
return __bool2mask(__vsetleu4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1329:22: note: used here
return __bool2mask(__vsetlts2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1339:22: note: used here
return __bool2mask(__vsetlts4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1349:22: note: used here
return __bool2mask(__vsetltu2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1359:22: note: used here
return __bool2mask(__vsetltu4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1369:22: note: used here
return __bool2mask(__vsetne2(__a, __b), 16);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1379:22: note: used here
return __bool2mask(__vsetne4(__a, __b), 8);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1399:21: note: used here
unsigned mask = __vcmpgts2(__a, __b);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1493:60: note: used here
__DEVICE__ unsigned int __vneg2(unsigned int __a) { return __vsub2(0, __a); }
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1502:60: note: used here
__DEVICE__ unsigned int __vneg4(unsigned int __a) { return __vsub4(0, __a); }
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1511:10: note: used here
return __vsubss2(0, __a);
^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1521:10: note: used here
return __vsubss4(0, __a);
^
1 warning generated.
The warnings related to those headers seem to be a known bug. It was mentioned here.
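For the imagenet.c warning specifically, a minimal sketch of one common way to make the conversion explicit. The function name fill_random_units is hypothetical; only the expression on the assignment line corresponds to the line quoted in the warning. An explicit cast on RAND_MAX states that the rounding to float is intended, which should silence -Wimplicit-const-int-float-conversion without changing the computed values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fill units[0..n-1] with pseudo-random values in [0, 1]. */
void fill_random_units(float *units, int n)
{
    assert(units != NULL || n == 0);

    for (int k = 0; k < n; k++) {
        /* Casting RAND_MAX explicitly makes the int-to-float conversion intentional. */
        units[k] = (float)rand() / (float)RAND_MAX;
    }
}
```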
The OpenMP GPU offloading version was created from scratch based on the official OpenMP version. The OpenACC version was then created based on the OpenMP GPU offloading version. The warnings above don't affect compilation or execution; they will be revisited later. The original OpenACC version doesn't work at all, so it has been abandoned.
At runtime, the function bpnn_layerforward in the file backprop.c throws the following error:
It seems that the data is not fully mapped.
In the file backprop_kernel.c, the related data mapping is copyin(input_weights[0:in][0:hid], hidden_weights[0:hid][0:out]) and the function call is bpnn_layerforward(input_units, hidden_units, input_weights, in, hid);.
In the function causing the problem, the data mapping is:
#pragma acc parallel loop present(l1[0:n1],l2[0:n2],conn[0:n1*n2])
where conn is input_weights.
We can't make the two data clauses (copyin and present) use the same mapping indices; otherwise, at runtime, it throws the error:
Failing in Thread:1 call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
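One way the two mappings could be made consistent, sketched under the assumption that input_weights is stored as one flat row-major array of in*hid floats. The enclosing data region and the present clause in the kernel then describe the same contiguous section. The names layerforward_flat and run_forward are hypothetical, and the bias unit and activation function are left out, so this is an illustration and not the code in backprop.c or backprop_kernel.c.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified forward pass: l2[j] = sum over k of conn[k*n2 + j] * l1[k].
 * The bias unit and activation function are omitted for brevity. */
void layerforward_flat(const float *l1, float *l2, const float *conn,
                       int n1, int n2)
{
    /* conn is expected on the device as the 1D section conn[0:n1*n2]. */
    #pragma acc parallel loop present(l1[0:n1], l2[0:n2], conn[0:n1*n2])
    for (int j = 0; j < n2; j++) {
        float sum = 0.0f;
        for (int k = 0; k < n1; k++)
            sum += conn[k * n2 + j] * l1[k];
        l2[j] = sum;
    }
}

/* Hypothetical caller: maps input_weights with the same 1D shape
 * that the kernel's present clause refers to. */
void run_forward(const float *input_units, float *hidden_units,
                 const float *input_weights, int in, int hid)
{
    assert(input_units != NULL && hidden_units != NULL && input_weights != NULL);

    #pragma acc data copyin(input_units[0:in], input_weights[0:in*hid]) \
                     copyout(hidden_units[0:hid])
    {
        layerforward_flat(input_units, hidden_units, input_weights, in, hid);
    }
}
```

The design choice here is to give up the float ** layout entirely, so that the section in the data region and the section in the kernel are the same contiguous range and no 2D-to-1D reinterpretation is needed.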