Fix FakeTensor device creation for MPS #144796

malfet · 2025-01-14T20:09:37Z

Stack from ghstack (oldest at bottom):

This promotes torch.device("mps") to torch.device("mps:0") but skips the is_initialized check, as MPS does not really support multi-GPU right now.

This fixes GPUTests.test_remove_no_ops_mps
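The promotion described above can be sketched without PyTorch installed. This is a simplified illustration, not the code in torch/_subclasses/fake_tensor.py: `Device`, `normalize_device`, and the two callback parameters are hypothetical stand-ins, and the fallback to index 0 for an uninitialized backend is an assumption about the surrounding branch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for torch.device: a type plus an optional index."""

    type: str
    index: Optional[int] = None

    def __str__(self) -> str:
        return self.type if self.index is None else f"{self.type}:{self.index}"


def normalize_device(
    device: Device,
    is_initialized: Callable[[str], bool],
    current_device: Callable[[str], int],
) -> Device:
    """Give an index-less accelerator device an explicit index."""
    if device.index is not None:
        return device  # already explicit, nothing to promote
    # MPS never reaches the initialization query (short-circuit on the type
    # check), so it always falls through to index 0.
    if device.type != "mps" and is_initialized(device.type):
        return Device(device.type, current_device(device.type))
    # Assumed fallback: an uninitialized backend maps to index 0.
    return Device(device.type, 0)


# Usage: the callbacks stand in for per-backend queries.
promoted = normalize_device(Device("mps"), lambda t: False, lambda t: 0)
print(promoted)  # mps:0
```

Because the type check comes first, the initialization callback is never invoked for "mps", which mirrors the intent of skipping is_initialized for a backend that exposes only one device.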

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @BoyuanFeng

[ghstack-poisoned]

pytorch-bot · 2025-01-14T20:09:42Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/144796


❌ 1 New Failure, 2 Pending, 1 Unrelated Failure

As of commit c799cef with merge base 95b41d2:

NEW FAILURE - The following job has failed:

pull / linux-jammy-py3.10-clang15-asan / test (default, 5, 6, linux.4xlarge) (gh)
Process completed with exit code 137.

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, linux.2xlarge) (gh) (trunk failure)
backends/xnnpack/test/ops/test_conv1d.py::TestConv1d::test_qs8_conv1d_batchnorm_seq

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang · 2025-01-14T21:32:28Z

torch/_subclasses/fake_tensor.py

            and device.index is None
        ):
-            if getattr(torch, device.type).is_initialized():
+            if device.type != "mps" and getattr(torch, device.type).is_initialized():


what's going on here?

malfet · 2025-01-14

Is it documented anywhere that torch.device.is_initialized() should be available? I can implement that function, but I'm unsure what it actually means. That memory was allocated on the GPU?

ezyang · 2025-01-15

I mean, we have torch.cuda.is_initialized, which means the CUDA context has been initialized. If your accelerator has no meaningful concept of initialization, you can just always return True.
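The alternative suggested in this reply (a backend that always reports itself as initialized, so the generic code path needs no special case) might look roughly like the sketch below. It is not what this PR does, since the PR special-cases "mps" instead. `fake_backend`, `fake_torch`, and `resolve_index` are hypothetical names, and the `current_device` attribute on the stand-in backend is an assumption made for illustration.

```python
from types import SimpleNamespace


def _always_initialized() -> bool:
    # There is no lazily created context to wait for, so the backend
    # reports itself as ready unconditionally.
    return True


# Hypothetical stand-in for a backend module that exposes one device.
fake_backend = SimpleNamespace(
    is_initialized=_always_initialized,
    current_device=lambda: 0,  # single-device backend: index is always 0
)

# Hypothetical stand-in for the top-level namespace that holds backend modules.
fake_torch = SimpleNamespace(mps=fake_backend)


def resolve_index(namespace: SimpleNamespace, device_type: str) -> int:
    """Look up the device index generically, with no per-backend branch."""
    backend = getattr(namespace, device_type)
    return backend.current_device() if backend.is_initialized() else 0


index = resolve_index(fake_torch, "mps")
```

With a backend shaped like this, the caller can keep a single `getattr(...).is_initialized()` check for every device type.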

Update

2b269ba

[ghstack-poisoned]

This was referenced Jan 14, 2025

[MPSInductor] Add dummy properties #144509

Closed

[BE] Extend test_remove_no_ops #144795

Closed

pytorch-bot bot added ciflow/inductor module: inductor labels Jan 14, 2025

malfet requested review from ezyang and jansel January 14, 2025 20:11

malfet mentioned this pull request Jan 14, 2025

[MPSInductor] Add min/max to MetalExprPrinter #144798

Closed

malfet added topic: not user facing topic category ciflow/mps Run MPS tests (subset of trunk) labels Jan 14, 2025

ezyang reviewed Jan 14, 2025

ezyang approved these changes Jan 14, 2025

malfet added 4 commits January 14, 2025 13:47

Update

eff07a5

[ghstack-poisoned]

Update

3a5b4c9

[ghstack-poisoned]

Update

33581ef

[ghstack-poisoned]

Update

c799cef

[ghstack-poisoned]

This was referenced Jan 15, 2025

[MPSInductor] Support abs in MetalPrintExpr #144826

Closed

[MPSInductor] Implement pow() #144827

Closed

pytorchmergebot closed this in 9610a22 Jan 15, 2025

pytorchmergebot added the Merged label Jan 15, 2025

pytorchmergebot pushed a commit that referenced this pull request Jan 15, 2025

[MPSInductor] Support abs in MetalPrintExpr (#144826)

d2ca816

Pull Request resolved: #144826 Approved by: https://github.com/dcci ghstack dependencies: #144509, #144798, #144795, #144796

github-actions bot deleted the gh/malfet/126/head branch February 15, 2025 02:04
