Skip to content

Conversation

kumpera
Copy link
Contributor

@kumpera kumpera commented Aug 15, 2022

The planners come with default implementations in default_planner.py.

The default planners expose their core functionality as separate functions
to make it easy for other checkpoint implementations to use this functionality.

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Aug 15, 2022

🔗 Helpful links

❌ 1 New Failures

As of commit 4e0377a (more details on the Dr. CI page):

Expand to see more
  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-bionic-cuda11.6-py3.10-gcc7 / test (default, 2, 4, linux.4xlarge.nvidia.gpu) (1/1)

Step: "Unknown" (full log | diagnosis details)

2022-08-18T14:47:53.4564050Z test_add_done_ca...arg() takes 0 positional arguments but 1 was given
2022-08-18T14:47:53.4535343Z   /opt/conda/lib/python3.10/unittest/suite.py(122): run
2022-08-18T14:47:53.4535704Z   /opt/conda/lib/python3.10/unittest/suite.py(84): __call__
2022-08-18T14:47:53.4536162Z   /opt/conda/lib/python3.10/site-packages/xmlrunner/runner.py(67): run
2022-08-18T14:47:53.4536559Z   /opt/conda/lib/python3.10/unittest/main.py(271): runTests
2022-08-18T14:47:53.4536928Z   /opt/conda/lib/python3.10/unittest/main.py(101): __init__
2022-08-18T14:47:53.4537420Z   /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py(783): run_tests
2022-08-18T14:47:53.4537855Z   /var/lib/jenkins/workspace/test/test_futures.py(331): <module>
2022-08-18T14:47:53.4538083Z 
2022-08-18T14:47:53.4538185Z ok (1.591s)
2022-08-18T14:47:53.4557090Z   test_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.003s)
2022-08-18T14:47:53.4564050Z   test_add_done_callback_no_arg_error_is_ignored (__main__.TestFuture) ... [E pybind_utils.h:212] Got the following error when running the callback: TypeError: TestFuture.test_add_done_callback_no_arg_error_is_ignored.<locals>.no_arg() takes 0 positional arguments but 1 was given
2022-08-18T14:47:53.4565050Z ok (0.001s)
2022-08-18T14:47:53.4578509Z   test_add_done_callback_simple (__main__.TestFuture) ... ok (0.001s)
2022-08-18T14:47:53.4637081Z   test_chained_then (__main__.TestFuture) ... ok (0.006s)
2022-08-18T14:47:53.5657981Z   test_collect_all (__main__.TestFuture) ... ok (0.102s)
2022-08-18T14:47:53.5666306Z   test_done (__main__.TestFuture) ... ok (0.001s)
2022-08-18T14:47:53.5680034Z   test_done_exception (__main__.TestFuture) ... ok (0.001s)
2022-08-18T14:47:53.5699197Z   test_interleaving_then_and_add_done_callback_maintains_callback_order (__main__.TestFuture) ... ok (0.002s)
2022-08-18T14:47:53.5709963Z   test_interleaving_then_and_add_done_callback_propagates_error (__main__.TestFuture) ... [E pybind_utils.h:212] Got the following error when running the callback: ValueError: Expected error
2022-08-18T14:47:53.5710469Z 
2022-08-18T14:47:53.5710631Z At:

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

Copy link
Collaborator

@wanchaol wanchaol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looking good! some comments and questions inlined.

Copy link
Collaborator

@wanchaol wanchaol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for addressing comments!

@kumpera
Copy link
Contributor Author

kumpera commented Aug 17, 2022

@pytorchmergebot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@pytorchmergebot
Copy link
Collaborator

Merge failed
Reason: Refusing to merge as mandatory check(s) pull failed for rule Distributed
Raised by https://github.com/pytorch/pytorch/actions/runs/2875878521

@kumpera
Copy link
Contributor Author

kumpera commented Aug 17, 2022

@pytorchmergebot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Successfully rebased land_planner onto refs/remotes/origin/master, please pull locally before adding more changes (for example, via git checkout land_planner && git pull --rebase)

@kumpera
Copy link
Contributor Author

kumpera commented Aug 17, 2022

@pytorchmergebot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@pytorchmergebot
Copy link
Collaborator

Merge failed
Reason: Refusing to merge as mandatory check(s) pull failed for rule Distributed
Raised by https://github.com/pytorch/pytorch/actions/runs/2877107957

@kumpera
Copy link
Contributor Author

kumpera commented Aug 18, 2022

@pytorchmergebot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

Rodrigo Kumpera added 3 commits August 18, 2022 12:50
The planners come with default implementations in default_planner.py.

The default planners expose their core functionality as separate functions
to make it easy for other checkpoint implementations to use this functionality.
@pytorchmergebot
Copy link
Collaborator

Successfully rebased land_planner onto refs/remotes/origin/master, please pull locally before adding more changes (for example, via git checkout land_planner && git pull --rebase)

@kumpera
Copy link
Contributor Author

kumpera commented Aug 18, 2022

@pytorchmergebot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@pytorchmergebot
Copy link
Collaborator

Merge failed
Reason: Refusing to merge as mandatory check(s) pull failed for rule Distributed
Raised by https://github.com/pytorch/pytorch/actions/runs/2884228296

@kumpera
Copy link
Contributor Author

kumpera commented Aug 18, 2022

@pytorchmergebot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@github-actions
Copy link
Contributor

Hey @kumpera.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Aug 19, 2022
…83419) (#83419)

Summary:
The planners come with default implementations in default_planner.py.

The default planners expose their core functionality as separate functions
to make it easy for other checkpoint implementations to use this functionality.

Pull Request resolved: #83419
Approved by: https://github.com/wanchaol

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/d11d3dd036b4a7098ab3b4d333fcb556b97b4860

Reviewed By: atalman

Differential Revision: D38852478

Pulled By: kumpera

fbshipit-source-id: 460acbb38a5edbe1fd517395e77fb451026a2ee2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla signed Merged oncall: distributed Add this issue/PR to distributed oncall triage queue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants