Incorrect Backprop OpenACC code #5

Closed
ouankou opened this issue Sep 3, 2022 · 3 comments

ouankou commented Sep 3, 2022

At runtime, the function bpnn_layerforward in the file backprop.c throws the following error:

FATAL ERROR: variable in data clause is partially present on the device: name=conn
 file: <some path>/openacc/backprop/backprop.c bpnn_layerforward line:229

It seems that the data is not fully mapped: the two data clauses describe the same array with different shapes.
In the file backprop_kernel.c, the related data mapping is copyin(input_weights[0:in][0:hid], hidden_weights[0:hid][0:out]) and the function call is bpnn_layerforward(input_units, hidden_units, input_weights, in, hid);.
In the function causing the problem, the data mapping is #pragma acc parallel loop present(l1[0:n1],l2[0:n2],conn[0:n1*n2]), where conn is input_weights.

We can't simply make the two clauses use the same mapping indices either; doing so throws this error at runtime:

Failing in Thread:1
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
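
One way to make the caller's mapping and the callee's present clause agree is to keep the weights in a single contiguous buffer and name the same flat section in both places. The sketch below is a simplified illustration, not the benchmark's code: layerforward_flat and forward_on_device are hypothetical names, the bias unit and activation function are left out, and row-major storage is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layer: l2[j] = sum over k of conn[k*n2 + j] * l1[k].
   conn is ONE contiguous row-major buffer of n1*n2 floats (an assumption). */
void layerforward_flat(const float *l1, float *l2, const float *conn,
                       int n1, int n2)
{
    assert(n1 > 0 && n2 > 0);

    /* The sections named here match the ones mapped by the caller below. */
    #pragma acc parallel loop present(l1[0:n1], l2[0:n2], conn[0:n1*n2])
    for (int j = 0; j < n2; j++) {
        float sum = 0.0f;
        #pragma acc loop seq
        for (int k = 0; k < n1; k++)
            sum += conn[(size_t)k * n2 + j] * l1[k];
        l2[j] = sum;
    }
}

/* Hypothetical caller: maps each array once, with the same flat shape
   that the callee expects to find present on the device. */
void forward_on_device(const float *l1, float *l2, const float *conn,
                       int n1, int n2)
{
    #pragma acc data copyin(l1[0:n1], conn[0:n1*n2]) copyout(l2[0:n2])
    {
        layerforward_flat(l1, l2, conn, n1, n2);
    }
}
```

When built without OpenACC support, the pragmas are ignored and the loops run on the host, which makes it easy to check the flat indexing before trying it on a device.
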
ouankou added the bug label Sep 3, 2022
ouankou self-assigned this Sep 3, 2022

ouankou commented Sep 4, 2022

GCC doesn't support this 2D array section and rejects it as non-contiguous, so we may have to rewrite the code to use a 1D array.

backprop.c: In function ‘bpnn_adjust_weights’:
backprop.c:296:14: error: array section is not contiguous in ‘map’ clause
  296 |       present(w[0:nly][0:ndelta],oldw[0:nly][0:ndelta])

In the corresponding OpenMP GPU version, compilation with LLVM succeeds, but execution leads to an illegal memory access on the device, probably for the same reason.
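
A 1D rewrite could start from a helper like the one below. It is a minimal sketch that assumes the weights are currently stored as an array of separately allocated rows (float **), which is what the non-contiguous section error suggests. flatten_rows and flat_at are hypothetical names, not functions from the benchmark.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy an array of separately allocated rows into one contiguous,
   row-major buffer. Returns NULL on allocation failure; the caller
   owns the result and must free() it. */
float *flatten_rows(float *const *rows, int nrows, int ncols)
{
    assert(rows != NULL && nrows > 0 && ncols > 0);

    float *flat = malloc((size_t)nrows * (size_t)ncols * sizeof *flat);
    if (flat == NULL)
        return NULL;

    for (int i = 0; i < nrows; i++)
        memcpy(flat + (size_t)i * ncols, rows[i],
               (size_t)ncols * sizeof *flat);
    return flat;
}

/* Read element (i, j) of the flat buffer; replaces rows[i][j]. */
static inline float flat_at(const float *flat, int ncols, int i, int j)
{
    return flat[(size_t)i * ncols + j];
}
```

A buffer built this way can be described by a single contiguous section such as flat[0:nrows*ncols], which is the shape a data or map clause can express without a second dimension.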


ouankou commented Sep 5, 2022

With commit 45a787f, the corresponding OpenMP GPU version now compiles with clang, but with the following warnings:

imagenet.c:18:32: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
    units[k] = (float)rand() / RAND_MAX;
                             ~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX        2147483647
                        ^~~~~~~~~~
1 warning generated.
imagenet.c:18:32: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
    units[k] = (float)rand() / RAND_MAX;
                             ~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX        2147483647
                        ^~~~~~~~~~
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:43:66: note: used here
__DEVICE__ void __attribute__((overloadable)) __brkpt(int __a) { __brkpt(); }
                                                                 ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1189:10: note: used here
  return __bool2mask(__vseteq2(__a, __b), 16);
         ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1189:22: note: used here
  return __bool2mask(__vseteq2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1199:22: note: used here
  return __bool2mask(__vseteq4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1209:22: note: used here
  return __bool2mask(__vsetges2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1219:22: note: used here
  return __bool2mask(__vsetges4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1229:22: note: used here
  return __bool2mask(__vsetgeu2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1239:22: note: used here
  return __bool2mask(__vsetgeu4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1249:22: note: used here
  return __bool2mask(__vsetgts2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1259:22: note: used here
  return __bool2mask(__vsetgts4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1269:22: note: used here
  return __bool2mask(__vsetgtu2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1279:22: note: used here
  return __bool2mask(__vsetgtu4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1289:22: note: used here
  return __bool2mask(__vsetles2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1299:22: note: used here
  return __bool2mask(__vsetles4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1309:22: note: used here
  return __bool2mask(__vsetleu2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1319:22: note: used here
  return __bool2mask(__vsetleu4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1329:22: note: used here
  return __bool2mask(__vsetlts2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1339:22: note: used here
  return __bool2mask(__vsetlts4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1349:22: note: used here
  return __bool2mask(__vsetltu2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1359:22: note: used here
  return __bool2mask(__vsetltu4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1369:22: note: used here
  return __bool2mask(__vsetne2(__a, __b), 16);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1379:22: note: used here
  return __bool2mask(__vsetne4(__a, __b), 8);
                     ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1399:21: note: used here
    unsigned mask = __vcmpgts2(__a, __b);
                    ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1493:60: note: used here
__DEVICE__ unsigned int __vneg2(unsigned int __a) { return __vsub2(0, __a); }
                                                           ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1502:60: note: used here
__DEVICE__ unsigned int __vneg4(unsigned int __a) { return __vsub4(0, __a); }
                                                           ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1511:10: note: used here
  return __vsubss2(0, __a);
         ^
/opt/llvm/llvm-14.x-install/lib/clang/14.0.0/include/__clang_cuda_device_functions.h:1521:10: note: used here
  return __vsubss4(0, __a);
         ^
1 warning generated.

The notes pointing into those CUDA headers seem to be a known bug, which was mentioned here:
https://www.mail-archive.com/llvm-bugs@lists.llvm.org/msg53641.html
LLVM 15 shows the same warning. If we remove the stdlib.h include from imagenet.c, the header warnings go away.
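
Separately from the header notes, the warning at imagenet.c:18 is about RAND_MAX being converted to float implicitly. Making that conversion explicit should silence it, since the diagnostic only targets implicit conversions. The sketch below assumes this; fill_random_units is a hypothetical wrapper around the statement shown in the warning, not the actual function in imagenet.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill units[0..n-1] with pseudo-random values in [0, 1]. */
void fill_random_units(float *units, int n)
{
    assert(units != NULL || n == 0);

    for (int k = 0; k < n; k++) {
        /* Explicit cast: the int-to-float conversion of RAND_MAX is
           now stated in the source instead of happening implicitly. */
        units[k] = (float)rand() / (float)RAND_MAX;
    }
}
```

The computed values are unchanged, because the implicit conversion already rounded RAND_MAX to the same float.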


ouankou commented Sep 6, 2022

The OpenMP GPU offloading version was created from scratch based on the official OpenMP version. The OpenACC version was then created based on the OpenMP GPU offloading version. The warnings above don't affect compilation or execution; they will be revisited later.

The original OpenACC version doesn't work at all, so it's abandoned.

ouankou closed this as completed Sep 6, 2022