Fix mp serialization for integer nn.Parameter on CUDA #56529

Closed

Conversation

@Alvant (Contributor) commented Apr 20, 2021

Fixes #56342
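For context, a minimal sketch of the scenario this fixes, reconstructed from the linked issue rather than copied from this PR (the queue/process structure and names like consumer are illustrative): an integer nn.Parameter on CUDA is sent to a child process, and before this change the rebuild on the receiving side could fail with "Only Tensors of floating point dtype can require gradients", since integer parameters can only exist with requires_grad=False.

import torch
import torch.multiprocessing as mp

def consumer(queue):
    # The parameter is deserialized (rebuilt) here in the child process;
    # that rebuild step is where the reported error occurred before the fix.
    param = queue.get()
    print(param.device, param.dtype, param.requires_grad)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    # Integer tensors cannot require grad, so the Parameter must be
    # created with requires_grad=False.
    iparam = torch.nn.Parameter(
        torch.tensor(0, dtype=torch.int64, device="cuda"),
        requires_grad=False,
    )
    queue = mp.Queue()
    proc = mp.Process(target=consumer, args=(queue,))
    proc.start()
    queue.put(iparam)
    proc.join()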

@facebook-github-bot (Contributor) commented Apr 20, 2021

💊 CI failures summary and remediations

As of commit 49a95e7 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-scanned failure(s)

2 failures not recognized by patterns:

Job                                    Step      Action
GitHub Actions python2-setup-compat   Unknown   🔁 rerun
GitHub Actions auto-label-rocm        Unknown   🔁 rerun

This comment was automatically generated by Dr. CI.


@ngimel (Collaborator) left a comment

Thanks for the fix

@facebook-github-bot (Contributor) commented:

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -833,12 +833,21 @@ def test_cuda_parameter_sharing(self):
    @unittest.skipIf(NO_MULTIPROCESSING_SPAWN, "Disabled for environments that \
                     don't support multiprocessing with spawn start method")
    def test_integer_parameter_serialization(self):
        iparam = torch.nn.Parameter(torch.tensor(0, dtype=torch.int64), requires_grad=False)
        for device in ['cpu', 'cuda']:
@ngimel (Collaborator) commented on the diff:

Test errors are real, you can skip this test if cuda ipc is not available, like the tests above. Also, nit, prefer using tuples ('cpu', 'cuda') over lists.

@Alvant (Contributor, Author) replied:

@ngimel oops! my mistake. Thanks for the advice.

I split the test in two: one for CPU and one for CUDA. The CPU test runs regardless of whether CUDA IPC is available; the CUDA test is skipped when CUDA is not available.
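A rough sketch of what that split could look like (the method names, the shared helper, and the TEST_CUDA_IPC guard are assumptions based on this thread, not the exact merged test code):

def _test_integer_parameter_serialization(self, device):
    # Hypothetical shared helper: build the integer parameter on the given
    # device and round-trip it through a spawned child process.
    param = torch.nn.Parameter(
        torch.tensor(0, dtype=torch.int64, device=device),
        requires_grad=False)
    ...

def test_integer_parameter_serialization_cpu(self):
    # Runs whether or not CUDA IPC is available.
    self._test_integer_parameter_serialization(device='cpu')

@unittest.skipIf(not TEST_CUDA_IPC, "CUDA IPC not available")
def test_integer_parameter_serialization_cuda(self):
    self._test_integer_parameter_serialization(device='cuda')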

@codecov bot commented Apr 21, 2021

Codecov Report

Merging #56529 (49a95e7) into master (ea4af15) will increase coverage by 0.24%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##           master   #56529      +/-   ##
==========================================
+ Coverage   77.54%   77.78%   +0.24%     
==========================================
  Files        1923     1923              
  Lines      190853   190854       +1     
==========================================
+ Hits       147996   148457     +461     
+ Misses      42857    42397     -460     

@facebook-github-bot (Contributor) commented:

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented:

@ngimel merged this pull request in bac4cfd.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Fixes pytorch#56342

Pull Request resolved: pytorch#56529

Reviewed By: albanD

Differential Revision: D27896094

Pulled By: ngimel

fbshipit-source-id: fe817781eb7139ea57c78acfd56e7c11b61eb4ed

Successfully merging this pull request may close these issues.

Multiprocessing with tensors on GPU: "Only Tensors of floating point dtype can require gradients"
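That error message comes from nn.Parameter itself: its constructor defaults to requires_grad=True, which integer tensors reject. A one-line illustration (the exact wording varies across PyTorch versions):

import torch

# Omitting requires_grad=False raises:
# RuntimeError: Only Tensors of floating point dtype can require gradients
torch.nn.Parameter(torch.tensor(0, dtype=torch.int64))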