add test cases for GradScaler on CPU #109994

CaoE · 2023-09-25T06:39:44Z

Stack from ghstack (oldest at bottom):

-> add test cases for GradScaler on CPU #109994

[ghstack-poisoned]

pytorch-bot · 2023-09-25T06:39:47Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109994

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c2b2fec with merge base a43c283:

BROKEN TRUNK - The following job failed, but the failure was already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, linux.2xlarge) (gh)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.


ezyang · 2024-02-01T19:59:31Z

test/test_torch.py

+        with self.assertWarns(FutureWarning):
+            scaler.step(o1)
+        scaler.step(o2)
+        scaler.update()


You should seriously just consider making a dedicated test file for this though

CaoE · 2024-02-02

Do you mean a dedicated test file, such as test_grad_scaler.py, for all of the GradScaler-related tests above? That works for me. I will move these tests to a new file, test_grad_scaler.py.
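For readers following the diff above, here is a rough, pure-Python sketch of the dynamic loss-scaling behaviour that calls like `scaler.step(o1)`, `scaler.step(o2)` and `scaler.update()` exercise. It is not PyTorch's implementation: `ToyGradScaler`, its attributes, and the `apply_update` callback are hypothetical names made up for illustration, and gradients are plain floats rather than tensors. The default values are assumed to mirror GradScaler's documented defaults.

```python
import math


class ToyGradScaler:
    """Illustrative stand-in for dynamic loss scaling (hypothetical, not torch code)."""

    def __init__(self, init_scale=2.0**16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0      # consecutive update() calls with no inf/nan seen
        self._found_inf = False   # set by step(), consumed by update()

    def step(self, scaled_grads, apply_update):
        """Unscale the gradients; call apply_update only if all are finite."""
        grads = [g / self.scale for g in scaled_grads]
        if all(math.isfinite(g) for g in grads):
            apply_update(grads)
            return True
        self._found_inf = True    # this optimizer step is skipped
        return False

    def update(self):
        """Shrink the scale after an overflow, grow it after enough clean steps."""
        if self._found_inf:
            self.scale *= self.backoff_factor
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps == self.growth_interval:
                self.scale *= self.growth_factor
                self._good_steps = 0
        self._found_inf = False


# Two "optimizers" share one scaler, loosely mirroring step(o1); step(o2); update().
scaler = ToyGradScaler(init_scale=4.0, growth_interval=2)
applied = []
ok1 = scaler.step([8.0, 4.0], applied.append)           # finite: update applied
ok2 = scaler.step([float("inf"), 4.0], applied.append)  # overflow: update skipped
scaler.update()                                         # overflow seen: scale backs off
```

In this sketch, an overflow in any `step` since the last `update` causes the scale to back off, which is why a single `update()` call follows both steps.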

ezyang · 2024-02-01T19:59:53Z

test/test_torch.py

+    @skipIfTorchDynamo("Failed running call_function for sparse_coo_tensor. See https://github.com/pytorch/pytorch/issues/118856")
+    @onlyNativeDeviceTypes
+    @dtypes(torch.float)
+    def test_grad_scaling_unscale_sparse(self, device, dtype):


Any code changes besides changing device? (A diff perhaps?)

CaoE · 2024-02-02

No, only the device changed.
Even the simplified test below fails (it also fails on CUDA with device="cuda"), with this error:
torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in method sparse_coo_tensor of type object at 0x7f105afdb9a0>(*(FakeTensor(..., size=(2, 3), dtype=torch.int64), FakeTensor(..., size=(3,)), (2, 3)), **{'device': 'cpu', 'dtype': torch.float32}): The tensor has a non-zero number of elements, but its data is not allocated yet. Caffe2 uses a lazy allocation, so you will need to call mutable_data() or raw_mutable_data() to actually allocate memory.

    @onlyNativeDeviceTypes
    @dtypes(torch.float)
    def test_grad_scaling_unscale_sparse(self, device, dtype):
        i = torch.tensor([[0, 1, 1], [2, 0, 2]], device="cpu", dtype=torch.int64)
        v = torch.tensor([16., 32., 64.], device="cpu", dtype=torch.float)
        s = torch.sparse_coo_tensor(i, v, torch.Size([2, 3]), device="cpu", dtype=torch.float)
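As a reading aid for the snippet above, the following pure-Python sketch shows which dense matrix the COO indices `i` and values `v` describe, and what dividing by a scale would do to it. It deliberately avoids torch, so it says nothing about the Dynamo failure itself. The helper `coo_to_dense` and the scale value 4.0 are assumptions chosen for illustration, not part of the PR.

```python
def coo_to_dense(indices, values, size):
    """Expand COO data (row indices, column indices, values) into a dense 2-D list."""
    rows, cols = size
    dense = [[0.0] * cols for _ in range(rows)]
    for r, c, v in zip(indices[0], indices[1], values):
        dense[r][c] += v  # duplicate coordinates accumulate
    return dense


# Same indices, values and shape as in the simplified test.
i = [[0, 1, 1], [2, 0, 2]]
v = [16.0, 32.0, 64.0]
dense = coo_to_dense(i, v, (2, 3))

# Unscaling divides every stored value by the current scale (4.0 is arbitrary here).
scale = 4.0
unscaled = [[x / scale for x in row] for row in dense]
```

The three stored entries land at positions (0, 2), (1, 0) and (1, 2); every other position is zero.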

ezyang · 2024-02-02T21:45:01Z

@pytorchbot merge

pytorchmergebot · 2024-02-02T21:47:38Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Pull Request resolved: #109994 Approved by: https://github.com/jgong5, https://github.com/ezyang

add test cases for gradscaler on CPU

e7003ea

[ghstack-poisoned]

CaoE mentioned this pull request Sep 25, 2023

add _amp_foreach_non_finite_check_and_unscale_cpu_ and _amp_update_scale_cpu_ kernels on CPU #109281

Closed

pytorch-bot bot added the topic: not user facing label Sep 25, 2023

CaoE mentioned this pull request Sep 25, 2023

add GradScaler on CPU #109993

Closed

CaoE added a commit that referenced this pull request Sep 25, 2023

add test cases for gradscaler on CPU

6e48f89

ghstack-source-id: 3164a84 Pull Request resolved: #109994

CaoE marked this pull request as draft September 25, 2023 06:44

pytorchbot added the open source label Sep 25, 2023

Update on "add test cases for gradscaler on CPU"

7a8477a

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Sep 26, 2023

add test cases for gradscaler on CPU

7e4266a

ghstack-source-id: 56f1c48 Pull Request resolved: #109994

Update on "add test cases for gradscaler on CPU"

605dc7c

[ghstack-poisoned]

Update on "add test cases for gradscaler on CPU"

c0232e5

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Sep 27, 2023

add test cases for gradscaler on CPU

aebb5cc

ghstack-source-id: b3bb224 Pull Request resolved: #109994

CaoE added the ciflow/trunk, ciflow/periodic, ciflow/inductor, and ciflow/slow labels Sep 27, 2023

Update on "add test cases for gradscaler on CPU"

b10c665

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Oct 7, 2023

add test cases for gradscaler on CPU

d453f5b

ghstack-source-id: 81dcc9a Pull Request resolved: #109994

Update on "add test cases for gradscaler on CPU"

f449fbf

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Dec 1, 2023

add test cases for gradscaler on CPU

cc8fbc1

ghstack-source-id: 8cf3826 Pull Request resolved: #109994

Update on "add test cases for gradscaler on CPU"

86db41c

[ghstack-poisoned]

Update on "add test cases for gradscaler on CPU"

54d3920

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Dec 4, 2023

add test cases for gradscaler on CPU

4c61174

ghstack-source-id: 9fba887 Pull Request resolved: #109994

Update on "add test cases for gradscaler on CPU"

bcbb2b7

[ghstack-poisoned]

Update on "add test cases for gradscaler on CPU"

d4e5d0c

[ghstack-poisoned]

Update on "add test cases for GradScaler on CPU"

65ffbc0

[ghstack-poisoned]

Update on "add test cases for GradScaler on CPU"

f1ee826

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Jan 29, 2024

add test cases for gradscaler on CPU

cb20dd6

ghstack-source-id: f02a56a Pull Request resolved: #109994

Update on "add test cases for GradScaler on CPU"

7fc000f

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Jan 30, 2024

add test cases for gradscaler on CPU

1c1092d

ghstack-source-id: 0cb9838 Pull Request resolved: #109994

Update on "add test cases for GradScaler on CPU"

e333860

[ghstack-poisoned]

Update on "add test cases for GradScaler on CPU"

bbf1829

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Jan 31, 2024

add test cases for gradscaler on CPU

689d2e7

ghstack-source-id: 02fad89 Pull Request resolved: #109994

Update on "add test cases for GradScaler on CPU"

cbc19f2

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Feb 1, 2024

add test cases for gradscaler on CPU

249411e

ghstack-source-id: 717129b Pull Request resolved: #109994

CaoE requested a review from ezyang February 1, 2024 01:56

CaoE mentioned this pull request Feb 1, 2024

torch._dynamo.exc.TorchRuntimeError: Failed running call_function of torch.sparse_coo_tensor #118856

Closed

Update on "add test cases for GradScaler on CPU"

c2b2fec

[ghstack-poisoned]

CaoE added a commit that referenced this pull request Feb 1, 2024

add test cases for gradscaler on CPU

42c094f

ghstack-source-id: 2a83d48 Pull Request resolved: #109994

ezyang approved these changes Feb 1, 2024


ezyang reviewed Feb 1, 2024


pytorchmergebot added the merging label Feb 2, 2024

pytorchmergebot closed this in 113138a Feb 2, 2024

pytorchmergebot added Merged and removed merging labels Feb 2, 2024

pytorch-bot bot pushed a commit that referenced this pull request Feb 8, 2024

add test cases for GradScaler on CPU (#109994)

814bb5b

Pull Request resolved: #109994 Approved by: https://github.com/jgong5, https://github.com/ezyang

soulitzer mentioned this pull request Feb 13, 2024

DISABLED test_grad_scaling_autocast_cuda (__main__.TestTorchDeviceTypeCUDA) #119154

Closed

github-actions bot deleted the gh/CaoE/40/head branch March 4, 2024 01:54
