Add full support for serialization of MPS Tensors #79465

Closed
wants to merge 2 commits into from

Conversation

albanD
Collaborator

@albanD albanD commented Jun 13, 2022

Fix #79384

@facebook-github-bot
Contributor

facebook-github-bot commented Jun 13, 2022

❌ 2 New Failures

As of commit 085e441 (more details on the Dr. CI page):

  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-focal-py3.7-gcc7 / test (backwards_compat, 1, 1, linux.2xlarge) (1/2)

Step: "Test"

2022-06-14T17:44:45.6477283Z Author: PyTorch Team
2022-06-14T17:44:45.6477687Z Author-email: packages@pytorch.org
2022-06-14T17:44:45.6477930Z License: BSD-3
2022-06-14T17:44:45.6478221Z Location: /opt/conda/lib/python3.7/site-packages
2022-06-14T17:44:45.6478486Z Requires: typing-extensions
2022-06-14T17:44:45.6478938Z Required-by: 
2022-06-14T17:44:45.6895278Z + python check_forward_backward_compatibility.py --existing-schemas nightly_schemas.txt
2022-06-14T17:44:46.3123932Z Traceback (most recent call last):
2022-06-14T17:44:46.3124308Z   File "check_forward_backward_compatibility.py", line 345, in <module>
2022-06-14T17:44:46.3124606Z     s = parse_schema(line.strip())
2022-06-14T17:44:46.3124817Z RuntimeError: 
2022-06-14T17:44:46.3125081Z Unknown custom class type c10d.ProcessGroup. Please ensure it is registered.:
2022-06-14T17:44:46.3126038Z c10d::broadcast(__torch__.torch.classes.c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3, int _4) -> __torch__.torch.classes.c10d.Work _0
2022-06-14T17:44:46.3126456Z                                              ~~~~~~~~~~~~ <--- HERE
2022-06-14T17:44:46.3126594Z 
2022-06-14T17:44:46.4305612Z + cleanup
2022-06-14T17:44:46.4306039Z + retcode=1
2022-06-14T17:44:46.4306333Z + set +x
2022-06-14T17:44:46.4340295Z ##[error]Process completed with exit code 1.
2022-06-14T17:44:46.4369470Z Prepare all required actions
2022-06-14T17:44:46.4369744Z Getting action download info

See GitHub Actions build pull / linux-focal-py3.7-clang7-asan / build (2/2)

Step: "Pull docker image"

2022-06-14T17:36:08.1996779Z checking for dlltool... no
2022-06-14T17:36:08.1998840Z checking how to associate runtime and link libraries... printf %s\n
2022-06-14T17:36:08.2003685Z checking for ar... ar
2022-06-14T17:36:08.2245357Z checking for archiver @FILE support... @
2022-06-14T17:36:08.2248640Z checking for strip... strip
2022-06-14T17:36:08.2252816Z checking for ranlib... ranlib
2022-06-14T17:36:08.2949133Z checking command to parse /usr/bin/nm -B output from gcc object... ok
2022-06-14T17:36:08.2961780Z checking for sysroot... no
2022-06-14T17:36:08.3001351Z checking for a working dd... /usr/bin/dd
2022-06-14T17:36:08.3036797Z checking how to truncate binary pipes... /usr/bin/dd bs=4096 count=1
2022-06-14T17:36:08.3173914Z ./configure: line 6876: /usr/bin/file: No such file or directory
2022-06-14T17:36:08.3186882Z checking for mt... no
2022-06-14T17:36:08.3220732Z checking if : is a manifest tool... no
2022-06-14T17:36:08.3421591Z checking how to run the C preprocessor... gcc -E
2022-06-14T17:36:08.4407783Z checking for ANSI C header files... yes
2022-06-14T17:36:08.4626863Z checking for sys/types.h... yes
2022-06-14T17:36:08.4876893Z checking for sys/stat.h... yes
2022-06-14T17:36:08.5130890Z checking for stdlib.h... yes
2022-06-14T17:36:08.5393263Z checking for string.h... yes
2022-06-14T17:36:08.5661743Z checking for memory.h... yes
2022-06-14T17:36:08.5922590Z checking for strings.h... yes

This comment was automatically generated by Dr. CI (expand for details).

@albanD albanD added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jun 13, 2022
Collaborator

@kulinseth kulinseth left a comment

Looks good.

@albanD
Collaborator Author

albanD commented Jun 14, 2022

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions

Hey @albanD.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@zengk95
Contributor

zengk95 commented Jun 14, 2022

@pytorchbot revert -m "this broke X linux-xenial-py3.7-clang7-onnx / test (default, 1, 2, linux.2xlarge). Not sure why since it passed on pull." -c landrace

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here

pytorchmergebot added a commit that referenced this pull request Jun 14, 2022
This reverts commit 64c2a27.

Reverted #79465 on behalf of https://github.com/zengk95 due to this broke X linux-xenial-py3.7-clang7-onnx / test (default, 1, 2, linux.2xlarge). Not sure why since it passed on pull.
@albanD albanD reopened this Jun 14, 2022
@albanD
Collaborator Author

albanD commented Jun 14, 2022

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@Birch-san

thanks so much @albanD; very expedient!

@malfet
Contributor

malfet commented Jun 14, 2022

@pytorchbot merge -f

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions

Hey @albanD.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 16, 2022
Summary:
Fix #79384

Pull Request resolved: #79465
Approved by: https://github.com/kulinseth, https://github.com/malfet

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/64c2a275c4d463b936b9469da948a666e016bbb8

Reviewed By: osalpekar

Differential Revision: D37156509

Pulled By: malfet

fbshipit-source-id: 3c7ece64b0b519662bc7e5f19873bf579c6ffd93
Labels
ciflow/trunk (Trigger trunk jobs on your pull request), cla signed, Merged, Reverted
Development

Successfully merging this pull request may close these issues.

torch.load() fails on MPS backend ("don't know how to restore data location")
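
For context on that error message: as far as I understand PyTorch's serialization code, each saved storage carries a location tag (such as "cpu"), and loading walks a registry of per-backend restore functions until one accepts the tag. The "don't know how to restore data location" error is what you get when none does. The toy below models that lookup in plain Python. It is a simplified sketch, not PyTorch's code and not the change made in this PR: `FakeStorage`, `RESTORERS`, `register_restorer`, `restore_location`, `restore_cpu`, and `restore_mps` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FakeStorage:
    """Hypothetical stand-in for a tensor storage; it only tracks a device name."""
    device: str = "cpu"


# A restorer returns a storage if it recognizes the tag, otherwise None.
Restorer = Callable[[FakeStorage, str], Optional[FakeStorage]]
RESTORERS: List[Restorer] = []


def register_restorer(fn: Restorer) -> Restorer:
    """Add a backend-specific restore function to the (toy) registry."""
    RESTORERS.append(fn)
    return fn


@register_restorer
def restore_cpu(storage: FakeStorage, tag: str) -> Optional[FakeStorage]:
    # CPU data needs no move, so the storage is returned unchanged.
    return storage if tag == "cpu" else None


def restore_mps(storage: FakeStorage, tag: str) -> Optional[FakeStorage]:
    # Deliberately NOT registered here, to mimic a backend with no restore support.
    if tag == "mps":
        return FakeStorage(device="mps")
    return None


def restore_location(storage: FakeStorage, tag: str) -> FakeStorage:
    """Try each registered restorer in order; fail if none accepts the tag."""
    for fn in RESTORERS:
        result = fn(storage, tag)
        if result is not None:
            return result
    raise RuntimeError(
        f"don't know how to restore data location (tagged with {tag})"
    )
```

With only `restore_cpu` registered, `restore_location(FakeStorage(), "mps")` raises the `RuntimeError`. After calling `register_restorer(restore_mps)`, the same call returns a storage whose `device` is `"mps"`. On a real install, the `map_location` argument of `torch.load` is the documented way to redirect storages to a different device at load time.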
7 participants