Deadlock on GPU when using NEW and matrix size is large #527

Open
QingleiCao opened this issue Apr 25, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@QingleiCao
Contributor

Describe the bug

A deadlock occurs when running on GPU. It happens when NEW is used and the matrix is too large to fit into GPU memory.


QingleiCao added the bug label Apr 25, 2023
@QingleiCao
Contributor Author

This seems to be because the handling of PARSEC_FLOW_ACCESS_WRITE in device_cuda_module.c does not account for NEW in some cases where the matrix cannot fit into GPU memory. I still need to look into this further.
