CUDA: Race condition in parallel_scan #2681

Closed
crtrott opened this issue Jan 27, 2020 · 3 comments
Labels
Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)

Comments

@crtrott
Member

crtrott commented Jan 27, 2020

The CUDA parallel_scan implementation had a bug: the required inter-block memory fence (__threadfence()) was not called. This could manifest as a wrong scan result.
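
For readers unfamiliar with why an inter-block fence matters, here is a minimal sketch of the general pattern. It is not the Kokkos code and not the fix from the PRs; it uses a simpler block-wise sum rather than a scan, and every name in it (`block_sum`, `blocks_done`, `block_sum_host`, `kThreads`) is hypothetical. Each block publishes a partial result to global memory, calls `__threadfence()`, and only then bumps a counter; the last block to arrive combines the partial results. Without the fence, the counter update could become visible to another block before the partial result does.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Illustrative sketch only; all names here are hypothetical.
constexpr int kThreads = 256;  // block size; must be a power of two

// Number of blocks that have published their partial sum.
__device__ unsigned int blocks_done = 0;

__global__ void block_sum(const int* in, int n, volatile int* partial, int* out) {
  __shared__ int scratch[kThreads];
  __shared__ bool is_last_block;

  const int tid = threadIdx.x;
  const int gid = blockIdx.x * blockDim.x + tid;
  scratch[tid] = (gid < n) ? in[gid] : 0;
  __syncthreads();

  // Tree reduction within the block (intra-block sync only).
  for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
    if (tid < stride) scratch[tid] += scratch[tid + stride];
    __syncthreads();
  }

  if (tid == 0) {
    partial[blockIdx.x] = scratch[0];
    // Inter-block fence: make the partial sum visible device-wide
    // BEFORE signalling completion through the counter.
    __threadfence();
    const unsigned int ticket = atomicAdd(&blocks_done, 1u);
    is_last_block = (ticket == gridDim.x - 1);
  }
  __syncthreads();

  // The last block to finish combines all published partial sums.
  if (is_last_block && tid == 0) {
    int total = 0;
    for (unsigned int b = 0; b < gridDim.x; ++b) total += partial[b];
    *out = total;
    blocks_done = 0;  // reset so the kernel can be launched again
  }
}

// Host helper; assumes a non-empty input and omits error checking for brevity.
int block_sum_host(const std::vector<int>& data) {
  const int n = static_cast<int>(data.size());
  const int blocks = (n + kThreads - 1) / kThreads;
  int *d_in = nullptr, *d_partial = nullptr, *d_out = nullptr;
  cudaMalloc(&d_in, n * sizeof(int));
  cudaMalloc(&d_partial, blocks * sizeof(int));
  cudaMalloc(&d_out, sizeof(int));
  cudaMemcpy(d_in, data.data(), n * sizeof(int), cudaMemcpyHostToDevice);
  block_sum<<<blocks, kThreads>>>(d_in, n, d_partial, d_out);
  int result = 0;
  cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_partial);
  cudaFree(d_out);
  return result;
}
```

Removing the `__threadfence()` call from this sketch would not necessarily fail on every run or every GPU, which is typical of this class of race and is why a reliable reproducer is valuable.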

@crtrott added the labels "Bug" and "bug - fix pushed to develop branch" on Jan 27, 2020
@ndellingwood
Contributor

PRs #2666 and #2668

@crtrott self-assigned this on Jan 28, 2020
@crtrott
Member Author

crtrott commented Jan 28, 2020

@stanmoore1 found the reproducer which made it possible to find this thing ...

@ndellingwood
Contributor

Closing this issue as this is now in master

2 participants