cuda.view_64bit test hangs on Power8+Kepler37 system - develop and 2.9.00 branches #2771

Closed
ndellingwood opened this issue Feb 17, 2020 · 4 comments
Assignees
Labels
Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos)

Comments

@ndellingwood
Contributor

ndellingwood commented Feb 17, 2020

This was detected on the White testbed (Power8+Kepler37) when enabling the large_memory_tests in nightlies.

Reproducer instructions (UVM enabled):

module load cuda/9.2.88 cmake/3.12.3 gcc/7.2.0

export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1

<KOKKOS_PATH>/generate_makefile.bash --arch=Kepler37,Power8 --compiler=${KOKKOS_PATH}/bin/nvcc_wrapper --with-cuda-options="force_uvm,enable_lambda" --with-options=enable_large_mem_tests --with-cuda --kokkos-path=<KOKKOS_PATH>
@janciesko janciesko self-assigned this Feb 19, 2020
@janciesko
Contributor

Reproduced. Investigating.

@crtrott crtrott added the Blocks Promotion and Bug labels Feb 26, 2020
@crtrott crtrott self-assigned this Feb 26, 2020
@crtrott
Member

crtrott commented Feb 29, 2020

Found it. This is a general bug, not specific to that system :-(. View initialization only ever used the default index type for its execution policy, which means 32-bit indices internally. So when you hand it a size_t extent that is legitimately larger than 2B, you are in trouble. Have a fix coming.
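To illustrate the failure mode described above, here is a minimal standalone C++ sketch (not Kokkos code) of what happens when a 64-bit element count is pushed through a 32-bit index. The helper names `fits_in_int32` and `narrowed_extent` are hypothetical, and the sketch assumes the internal index is 32 bits wide, as the comment states; it does not reproduce the actual View initialization code.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Kokkos): true if an extent can be
// represented by a signed 32-bit loop index without loss.
constexpr bool fits_in_int32(std::uint64_t n) {
  return n <= static_cast<std::uint64_t>(
                  std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: the extent a 32-bit unsigned index would see after
// narrowing. Unsigned narrowing is well-defined and wraps modulo 2^32.
constexpr std::uint32_t narrowed_extent(std::uint64_t n) {
  return static_cast<std::uint32_t>(n);
}

// 3 billion elements is a valid size_t extent on a 64-bit system,
// but it does not fit into a signed 32-bit index.
static_assert(!fits_in_int32(3000000000ULL), "3B exceeds INT32_MAX");

// An extent of 2^32 + 5 silently collapses to 5 once narrowed to 32 bits,
// so a loop driven by the narrowed value would cover the wrong range.
static_assert(narrowed_extent(4294967301ULL) == 5u, "wraps modulo 2^32");
```

In Kokkos itself, the index type of an execution policy can be chosen with the `Kokkos::IndexType<...>` policy trait; whether the fix mentioned here used that mechanism is not stated in this thread.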

@crtrott
Member

crtrott commented Feb 29, 2020

Note this is not about a memory size larger than 2GB, but about more than 2B elements (i.e. typically 8GB or 16GB allocations). The only realistic way to get in trouble on Summit or Sierra is probably by allocating a View of char. But on the 32GB Volta cards inside DGX boxes it is not hard to imagine a big graph with more than 2B elements being allocated, e.g. the list of all edges as an Nx2 array with N>1B.
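The element-count versus byte-count distinction can be checked with a little arithmetic. The sketch below is plain C++ with hypothetical helper names (`element_count`, `bytes_needed`); it assumes 4-byte and 8-byte element types as the cases behind the "8GB or 16GB" figures, and takes the 32-bit limit to be 2^31 elements.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.
constexpr std::uint64_t element_count(std::uint64_t rows, std::uint64_t cols) {
  return rows * cols;
}

constexpr std::uint64_t bytes_needed(std::uint64_t elements,
                                     std::uint64_t element_size) {
  return elements * element_size;
}

// Number of elements at which a signed 32-bit index stops being sufficient.
constexpr std::uint64_t kInt32ElementLimit = std::uint64_t{1} << 31;

// 2^31 elements of a 4-byte type occupy 8 GiB, of an 8-byte type 16 GiB,
// and of a 1-byte type (char) only 2 GiB.
static_assert(bytes_needed(kInt32ElementLimit, 4) == 8589934592ULL, "8 GiB");
static_assert(bytes_needed(kInt32ElementLimit, 8) == 17179869184ULL, "16 GiB");
static_assert(bytes_needed(kInt32ElementLimit, 1) == 2147483648ULL, "2 GiB");

// An N x 2 edge list crosses the limit as soon as N exceeds 2^30 (about 1B).
static_assert(element_count((std::uint64_t{1} << 30) + 1, 2) >
                  kInt32ElementLimit,
              "edge list exceeds the 32-bit index range");
```

This is why a View of char is the cheapest way to hit the problem: it needs the fewest bytes per element to pass the 2B-element mark.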

@crtrott crtrott added the InDevelop label and removed the Blocks Promotion label Mar 1, 2020
@ndellingwood
Contributor Author

@crtrott the fix from #2819 did not resolve the hang+timeout in view_64bit tests on White.
This still occurs with cuda/9.2.88+gcc/7.2.0 builds with UVM enabled.

@janciesko janciesko added this to the Tentative 3.1 Release milestone Mar 4, 2020
@crtrott crtrott closed this as completed Apr 14, 2020