[MPS] Skip virtualized devices (#111576) (#112265)
* check in (#111875)

check in impl

address comments, skip test on rocm

unused

* [MPS] Skip virtualized devices (#111576)

Skip devices that do not support `MTLGPUFamilyMac2`, such as the "Apple Paravirtual device" that has started to appear in GitHub CI (see https://github.com/malfet/deleteme/actions/runs/6577012044/job/17867739464#step:3:18):
```
Found device Apple Paravirtual device isLowPower false supports Metal false
```

The first attempt to allocate memory on such a device fails with:
```
RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 1.70 GB). Tried to allocate 0 bytes on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
```

Fixes #111449

Pull Request resolved: #111576
Approved by: https://github.com/atalman, https://github.com/clee2000, https://github.com/huydhn

* Revert "check in (#111875)"

This reverts commit 2f502cc.

---------

Co-authored-by: eqy <eddiey@nvidia.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
3 people committed Oct 28, 2023
1 parent f82d6e4 commit 2dc37f4
Showing 1 changed file with 10 additions and 3 deletions.
13 changes: 10 additions & 3 deletions aten/src/ATen/mps/MPSDevice.mm
@@ -93,10 +93,17 @@ static inline MTLLanguageVersion getMetalLanguageVersion(const id<MTLDevice>& de
     NSArray* devices = [MTLCopyAllDevices() autorelease];
     for (unsigned long i = 0; i < [devices count]; i++) {
       id<MTLDevice> device = devices[i];
-      if (![device isLowPower]) { // exclude Intel GPUs
-        _mtl_device = [device retain];
-        break;
+      if ([device isLowPower]) { // exclude Intel GPUs
+        continue;
       }
+      if (![device supportsFamily:MTLGPUFamilyMac2]) {
+        // Exclude devices that does not support Metal 2.0
+        // Virtualised MPS device on MacOS 12.6 should fail this check
+        TORCH_WARN("Skipping device ", [[device name] UTF8String], " that does not support Metal 2.0");
+        continue;
+      }
+      _mtl_device = [device retain];
+      break;
     }
     TORCH_INTERNAL_ASSERT_DEBUG_ONLY(_mtl_device);
   }
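
The control flow of the new selection loop can be tried outside Metal with a small plain C++ sketch. This is an illustration under stated assumptions, not the code in `MPSDevice.mm`: `FakeDevice` and `pickDevice` are hypothetical names, and the two boolean fields merely stand in for the results of `[device isLowPower]` and `[device supportsFamily:MTLGPUFamilyMac2]`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for id<MTLDevice>; not part of Metal or PyTorch.
struct FakeDevice {
  std::string name;
  bool isLowPower;    // stands in for [device isLowPower]
  bool supportsMac2;  // stands in for [device supportsFamily:MTLGPUFamilyMac2]
};

// Returns the index of the first device that passes both checks, or -1 if
// none does. Names of devices rejected by the family check are appended to
// `skipped` (if provided), mimicking where the real code emits TORCH_WARN.
int pickDevice(const std::vector<FakeDevice>& devices,
               std::vector<std::string>* skipped = nullptr) {
  for (std::size_t i = 0; i < devices.size(); ++i) {
    const FakeDevice& d = devices[i];
    if (d.isLowPower) {
      continue;  // same early exit as the "exclude Intel GPUs" branch
    }
    if (!d.supportsMac2) {
      if (skipped != nullptr) {
        skipped->push_back(d.name);
      }
      continue;  // virtualized devices are expected to land here
    }
    return static_cast<int>(i);  // first acceptable device wins
  }
  return -1;
}
```

Given a low-power device, a paravirtual device without `MTLGPUFamilyMac2` support, and a regular GPU in that order, `pickDevice` returns the index of the third entry and records only the paravirtual device as skipped. With only the paravirtual device present it returns -1, which corresponds to the case where `_mtl_device` stays unset in the real code.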