Fix mp serialization for integer nn.Parameter on CUDA #56529

Closed

Conversation

@Alvant (Contributor) commented Apr 20, 2021

Fixes #56342
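For context, a minimal sketch of the scenario this fixes, reconstructed from the linked issue rather than copied from this PR (the queue/process structure and names like consumer are illustrative): an integer nn.Parameter on CUDA is sent to a child process, and before this change the rebuild on the receiving side could fail with "Only Tensors of floating point dtype can require gradients", since integer parameters can only exist with requires_grad=False.

import torch
import torch.multiprocessing as mp

def consumer(queue):
    # The parameter is deserialized (rebuilt) here in the child process;
    # that rebuild step is where the reported error occurred before the fix.
    param = queue.get()
    print(param.device, param.dtype, param.requires_grad)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    # Integer tensors cannot require grad, so the Parameter must be
    # created with requires_grad=False.
    iparam = torch.nn.Parameter(
        torch.tensor(0, dtype=torch.int64, device="cuda"),
        requires_grad=False,
    )
    queue = mp.Queue()
    proc = mp.Process(target=consumer, args=(queue,))
    proc.start()
    queue.put(iparam)
    proc.join()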

@facebook-github-bot (Contributor) commented Apr 20, 2021

💊 CI failures summary and remediations

As of commit 49a95e7 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-scanned failure(s)

2 failures not recognized by patterns:

Job                                    Step      Action
GitHub Actions python2-setup-compat   Unknown   🔁 rerun
GitHub Actions auto-label-rocm        Unknown   🔁 rerun

This comment was automatically generated by Dr. CI.


@ngimel (Collaborator) left a comment

Thanks for the fix

@facebook-github-bot (Contributor) commented:

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -833,12 +833,21 @@ def test_cuda_parameter_sharing(self):
    @unittest.skipIf(NO_MULTIPROCESSING_SPAWN, "Disabled for environments that \
                     don't support multiprocessing with spawn start method")
    def test_integer_parameter_serialization(self):
        iparam = torch.nn.Parameter(torch.tensor(0, dtype=torch.int64), requires_grad=False)
        for device in ['cpu', 'cuda']:
@ngimel (Collaborator) commented on the diff:

Test errors are real, you can skip this test if cuda ipc is not available, like the tests above. Also, nit, prefer using tuples ('cpu', 'cuda') over lists.

@Alvant (Contributor, Author) replied:

@ngimel oops! my mistake. Thanks for the advice.

I split the test in two: one for CPU and one for CUDA. The CPU test runs regardless of whether CUDA IPC is available; the CUDA test is skipped when CUDA is not available.
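A rough sketch of what that split could look like (the method names, the shared helper, and the TEST_CUDA_IPC guard are assumptions based on this thread, not the exact merged test code):

def _test_integer_parameter_serialization(self, device):
    # Hypothetical shared helper: build the integer parameter on the given
    # device and round-trip it through a spawned child process.
    param = torch.nn.Parameter(
        torch.tensor(0, dtype=torch.int64, device=device),
        requires_grad=False)
    ...

def test_integer_parameter_serialization_cpu(self):
    # Runs whether or not CUDA IPC is available.
    self._test_integer_parameter_serialization(device='cpu')

@unittest.skipIf(not TEST_CUDA_IPC, "CUDA IPC not available")
def test_integer_parameter_serialization_cuda(self):
    self._test_integer_parameter_serialization(device='cuda')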

@codecov bot commented Apr 21, 2021

Codecov Report

Merging #56529 (49a95e7) into master (ea4af15) will increase coverage by 0.24%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##           master   #56529      +/-   ##
==========================================
+ Coverage   77.54%   77.78%   +0.24%     
==========================================
  Files        1923     1923              
  Lines      190853   190854       +1     
==========================================
+ Hits       147996   148457     +461     
+ Misses      42857    42397     -460     

@facebook-github-bot (Contributor) commented:

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented:

@ngimel merged this pull request in bac4cfd.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Fixes pytorch#56342

Pull Request resolved: pytorch#56529

Reviewed By: albanD

Differential Revision: D27896094

Pulled By: ngimel

fbshipit-source-id: fe817781eb7139ea57c78acfd56e7c11b61eb4ed

Successfully merging this pull request may close these issues.

Multiprocessing with tensors on GPU: "Only Tensors of floating point dtype can require gradients"
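That error message comes from nn.Parameter itself: its constructor defaults to requires_grad=True, which integer tensors reject. A one-line illustration (the exact wording varies across PyTorch versions):

import torch

# Omitting requires_grad=False raises:
# RuntimeError: Only Tensors of floating point dtype can require gradients
torch.nn.Parameter(torch.tensor(0, dtype=torch.int64))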