Fix debug_launcher issues #413

Merged
merged 3 commits into from
May 31, 2022

muellerzr
Collaborator

Fix Debug Launcher + CUDA breakage with tests

What does this add?

This PR fixes the tests that run debug_launcher. They must be run only on the CPU: if any previous test ran on the GPU, CUDA has already been initialized for this Python process, and torch then breaks with a multiprocessing error.
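The failure mode described above can be checked for before starting worker processes. The following is a minimal sketch, not code from this PR: the helper names `cuda_already_initialized` and `safe_to_fork` are hypothetical, and it assumes the launcher forks its workers from the current process. It only relies on `torch.cuda.is_initialized()`, and treats a missing torch install as "CUDA not initialized".

```python
def cuda_already_initialized() -> bool:
    """Report whether CUDA has been initialized in this Python process.

    Hypothetical helper: returns False when torch is not installed.
    """
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_initialized()


def safe_to_fork() -> bool:
    """Hypothetical guard: forking workers is only safe before CUDA is initialized."""
    return not cuda_already_initialized()


if __name__ == "__main__":
    if safe_to_fork():
        print("CUDA not initialized in this process; forked workers should start cleanly.")
    else:
        print("CUDA already initialized; forked workers are likely to fail.")
```

A guard like this only reports the state of the process. It does not undo a CUDA initialization that an earlier test has already triggered.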

Who is it for?

closes #402

Why is it needed?

Currently, running all of the tests crashes the CUDA process because it tries to reinitialize CUDA. As a workaround, the Makefile had been split into 2 separate calls to pytest. Given this breakage, and since these tests really only run on the CPU, they are now under a require_cpu tag and will only run if CUDA is not available.

Note: If these tests are run first, nothing breaks. To keep things simple for the user, though, we are going with this approach.
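To illustrate how a tag that "only runs if CUDA is not available" can work, here is a minimal sketch built on `unittest.skipUnless`. It is an assumption about the general technique, not the implementation of require_cpu in accelerate: the names `cuda_is_available` and `cpu_only` are hypothetical, and the `cuda_available` parameter exists only so the behavior can be exercised without a GPU.

```python
import unittest


def cuda_is_available() -> bool:
    """Return True if torch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def cpu_only(test_case, cuda_available=None):
    """Hypothetical stand-in for a CPU-only tag: skip the test when CUDA is available.

    `cuda_available` overrides detection; it is an illustration-only parameter.
    """
    if cuda_available is None:
        cuda_available = cuda_is_available()
    return unittest.skipUnless(
        not cuda_available, "test requires a CPU-only environment"
    )(test_case)


class ExampleTests(unittest.TestCase):
    @cpu_only
    def test_runs_without_cuda(self):
        # Runs on CPU-only machines and is reported as skipped when CUDA is available.
        self.assertEqual(1 + 1, 2)


if __name__ == "__main__":
    unittest.main()
```

Skipping, rather than failing, keeps a single pytest invocation green on GPU machines while the same tests still run on CPU-only CI runners.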

What parts of the API does this impact?

User-facing:

Nothing

Internal structure:

If a test should only be run on the CPU, decorate it with:

from accelerate.test_utils import require_cpu

@require_cpu
def some_test_for_cpu_only(self):
    ...

@muellerzr muellerzr added the enhancement New feature or request label May 31, 2022
@muellerzr muellerzr requested a review from sgugger May 31, 2022 18:16
HuggingFaceDocBuilderDev commented May 31, 2022

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment

Thanks a lot!

@muellerzr muellerzr merged commit 3b51d6e into main May 31, 2022
@muellerzr muellerzr deleted the fix-debug_launcher branch May 31, 2022 18:59
Development

Successfully merging this pull request may close these issues.

Issues with debug_launcher and CUDA
3 participants