DISABLED test_check_inplace_nn_CELU_mps_float32 (__main__.TestModuleMPS) #111449

huydhn · 2023-10-18T01:24:48Z

Platforms: mac, macos

This test was disabled because it is failing on main branch (recent examples).

This test has been failing on macOS x86 for a while (see https://hud.pytorch.org/pytorch/pytorch/commit/973c87b320b5e7489f18b719d5b1c57a2051ae10). The error is:

RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 1.70 GB). Tried to allocate 0 bytes on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev

pytorch-bot · 2023-10-18T01:24:52Z

Hello there! From the DISABLED prefix in this issue title, it looks like you are attempting to disable a test in PyTorch CI. The information I have parsed is below:

Test name: test_check_inplace_nn_CELU_mps_float32 (__main__.TestModuleMPS)
Platforms for which to skip the test: mac, macos
Disabled by huydhn

Within ~15 minutes, test_check_inplace_nn_CELU_mps_float32 (__main__.TestModuleMPS) will be disabled in PyTorch CI for these platforms: mac, macos. Please verify that your test name looks correct, e.g., test_cuda_assert_async (__main__.TestCuda).

To modify the platforms list, please include a line in the issue body, like below. The default action will disable the test for all platforms if no platforms list is specified.

Platforms: case-insensitive, list, of, platforms

We currently support the following platforms: asan, dynamo, inductor, linux, mac, macos, rocm, slow, win, windows.
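
As an illustration of the `Platforms:` convention described above, here is a minimal sketch of how such a line could be parsed. This is not the bot's actual implementation: `parse_platforms` and `SUPPORTED_PLATFORMS` are hypothetical names, and silently dropping unsupported platform names is an assumption.

```python
import re

# Platform names copied from the bot's message above.
SUPPORTED_PLATFORMS = {
    "asan", "dynamo", "inductor", "linux", "mac",
    "macos", "rocm", "slow", "win", "windows",
}


def parse_platforms(issue_body):
    """Return the set of platforms named on a 'Platforms:' line.

    Hypothetical helper: if no such line exists, every supported platform
    is returned, mirroring the documented default of disabling the test
    everywhere.
    """
    for line in issue_body.splitlines():
        match = re.match(r"\s*platforms:\s*(.*)", line, flags=re.IGNORECASE)
        if match:
            names = {p.strip().lower() for p in match.group(1).split(",") if p.strip()}
            # Assumption: names outside the supported list are ignored.
            return names & SUPPORTED_PLATFORMS
    return set(SUPPORTED_PLATFORMS)
```

For the body of this issue, `parse_platforms("Platforms: mac, macos")` would yield the two macOS entries.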

malfet · 2023-10-18T04:21:45Z

Hmm, this is weird: we should not be running any MPS tests on GitHub Actions runners, as they do not have access to MPS hardware.

Skip devices that do not support `MTLGPUFamilyMac2`, for example something called "Apple Paravirtual device", which started to appear in GitHub CI, from https://github.com/malfet/deleteme/actions/runs/6577012044/job/17867739464#step:3:18

```
Found device Apple Paravirtual device isLowPower false supports Metal false
```

The first attempt to allocate memory on such a device will fail with:

```
RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 1.70 GB). Tried to allocate 0 bytes on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
```

Fixes #111449

Pull Request resolved: #111576
Approved by: https://github.com/atalman, https://github.com/clee2000, https://github.com/huydhn
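
The fix described in this commit message lives inside the MPS backend. For test suites running against builds that predate it, a Python-side guard might look like the sketch below. It assumes that the failure surfaces as a `RuntimeError` on the first allocation, as in the traceback in this issue. `mps_is_usable` is a hypothetical helper name, not a PyTorch API.

```python
try:
    import torch
except ImportError:  # PyTorch not installed: treat MPS as unavailable
    torch = None


def mps_is_usable():
    """Return True only if a tiny tensor can actually be allocated on MPS.

    Hypothetical helper: checking availability alone may not be enough on
    virtualized runners, so this also probes with a one-element allocation.
    """
    if torch is None:
        return False
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is None or not mps_backend.is_available():
        return False
    try:
        # On a paravirtual device the first allocation is what fails.
        torch.zeros(1, device="mps")
    except RuntimeError:
        return False
    return True
```

A test module could then skip MPS cases with something like `unittest.skipUnless(mps_is_usable(), "no usable MPS device")`.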

* check in (#111875)

  check in impl address comments, skip test on rocm unused
* [MPS] Skip virtualized devices (#111576)
* Revert "check in (#111875)"

  This reverts commit 2f502cc.

Co-authored-by: eqy <eddiey@nvidia.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>

…E` env var (#131)

* use MPS backend if available and use_device is None (prev would default to CPU in that case); also fix type errors
* revert torch.det for volume to torch.dot and torch.cross (which have MPS support)
* try PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 in CI to avoid OOM error

  E RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 7.93 GB). Tried to allocate 512 bytes on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
* add support for CHGNET_DEVICE environment variable
* test CHGNET_DEVICE in test_model_load_version_params(); update deprecated ruff lint config
* set CHGNET_DEVICE=cpu in test.yml since no MPS hardware available even on macos-14 runners pytorch/pytorch#111449 (comment)
* fix setting CHGNET_DEVICE env var on windows
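
The device-selection behavior this commit message describes could be sketched roughly as follows. This is not chgnet's actual code: `resolve_device` and its parameters are hypothetical, and the precedence order (explicit argument, then `CHGNET_DEVICE`, then CUDA, then MPS, then CPU) is an assumption made for illustration.

```python
import os


def resolve_device(use_device=None, *, mps_available=False,
                   cuda_available=False, env=None):
    """Pick a device string from an explicit choice, an env var, or hardware.

    Hypothetical sketch. Availability flags are passed in so the logic can
    be exercised on machines without any accelerator.
    """
    env = os.environ if env is None else env
    if use_device is not None:
        return use_device  # an explicit argument always wins (assumption)
    override = env.get("CHGNET_DEVICE")
    if override:
        return override  # e.g. CHGNET_DEVICE=cpu on CI runners without MPS
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With `CHGNET_DEVICE=cpu` set, as in the `test.yml` change above, this sketch returns `"cpu"` even when an MPS device is reported as available.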

huydhn added the triaged and module: mps labels Oct 18, 2023

pytorch-bot bot added the skipped label Oct 18, 2023

pytorch-bot bot added the module: macos and skipped labels and removed the skipped label Oct 18, 2023

malfet self-assigned this Oct 18, 2023

malfet mentioned this issue Oct 19, 2023

[MPS] Skip virtualized devices #111576

Closed

pytorchmergebot closed this as completed in ca5f6f7 Oct 19, 2023

janosh mentioned this issue Feb 28, 2024

GitHub Action: New M1 runner available to all plans, including open source 🚀 actions/runner-images#9254

Closed

12 tasks

janosh added a commit to CederGroupHub/chgnet that referenced this issue Feb 28, 2024

set CHGNET_DEVICE=cpu in test.yml since no MPS hardware available even on macos-14 runners pytorch/pytorch#111449 (comment)

8d6dcea

JacksonBurns mentioned this issue Apr 29, 2024

Docker images for v2 chemprop/chemprop#841

Merged

IgorTatarnikov mentioned this issue May 9, 2024

It/keras3 pytorch brainglobe/cellfinder#396

Merged

5 tasks
