Incorrect Kmeans OpenACC code #3

Closed
ouankou opened this issue Sep 3, 2022 · 1 comment

ouankou (Collaborator) commented Sep 3, 2022

The OpenACC directives themselves seem fine and can be mapped to OpenMP code. However, the source code contains many errors.

For example:

  1. Missing headers/declarations: cluster(), allocateMemory(), deallocateMemory(), etc.
  2. Mismatched data types: float* features vs. float** features.
  3. Functions with incorrect code: allocateMemory() and deallocateMemory() use undeclared variables.
  4. Mixed C++ and CUDA code: the CUDA API tex1Dfetch() is called.
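
To illustrate item 2, here is a minimal sketch of how a flat float* buffer and a float** row view relate. The helper name make_row_view and its parameters are hypothetical and do not come from the benchmark source. It only shows why the two types cannot be used interchangeably without a conversion step.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the benchmark): build a float** row view
 * over a flat, row-major float* buffer of npoints x nfeatures values.
 * The caller frees the returned pointer array, not the flat buffer. */
float **make_row_view(float *flat, int npoints, int nfeatures)
{
    assert(flat != NULL && npoints > 0 && nfeatures > 0);

    float **rows = malloc((size_t)npoints * sizeof *rows);
    if (rows == NULL)
        return NULL;

    /* Row i starts at offset i * nfeatures in the flat buffer. */
    for (int i = 0; i < npoints; i++)
        rows[i] = flat + (size_t)i * nfeatures;

    return rows;
}
```

With such a view, rows[i][j] and flat[i * nfeatures + j] refer to the same element, so a function declared with one type can only accept the other through a conversion like this.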

We can either fix them all or start over from the official OpenMP CPU version.

ouankou added the bug ("Something isn't working") label Sep 3, 2022
ouankou self-assigned this Sep 3, 2022
ouankou (Collaborator, Author) commented Sep 9, 2022

The OpenMP GPU offloading version was written from scratch, based on the official OpenMP and CUDA versions. The OpenACC version was then derived from the OpenMP GPU offloading version.
The original OpenMP CPU version uses omp_get_thread_num() to split the work into chunks across threads. OpenACC has no equivalent API, so we have to calculate the chunk id manually (see commit passlab/NeoRodinia@3d88f2a).
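
As a rough illustration of manual chunking (this is not the code from the commit above), the sketch below makes the chunk id the index of an explicit outer loop, so no thread-id query is needed. The function name chunked_partial_sums, its parameters, and the per-chunk sum are assumptions made for the example. The real kernel accumulates per-cluster partial centers instead.

```c
#include <assert.h>

/* Sketch only: sum `data` into one partial result per chunk.
 * The chunk id is the outer loop index, which replaces the role that
 * omp_get_thread_num() plays in the OpenMP CPU version.
 * All names here are hypothetical, not taken from the repository. */
void chunked_partial_sums(const float *data, int n, float *partial, int nchunks)
{
    assert(n >= 0 && nchunks > 0);

    /* Ceiling division so that every element belongs to some chunk. */
    int chunk_size = (n + nchunks - 1) / nchunks;

    #pragma acc parallel loop copyin(data[0:n]) copyout(partial[0:nchunks])
    for (int chunk = 0; chunk < nchunks; chunk++) {
        int begin = chunk * chunk_size;
        int end = begin + chunk_size;
        if (end > n)
            end = n;            /* the last chunk may be shorter */

        float sum = 0.0f;
        #pragma acc loop seq
        for (int i = begin; i < end; i++)
            sum += data[i];

        partial[chunk] = sum;   /* each chunk writes only its own slot */
    }
}
```

Because each chunk writes to its own slot of partial, the chunks do not race with each other, and the host can combine the partial results afterwards. Without an OpenACC compiler the pragmas are ignored and the loops run serially with the same result.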

The original OpenACC version doesn't work at all, so it has been abandoned.

ouankou closed this as completed Sep 9, 2022