
Conversation

@RohanM (Contributor) commented May 24, 2022

Implements the aten::count_nonzero.dim_IntList operator (as used by torch.count_nonzero) for MPS.
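For reference, a minimal usage sketch of the call this enables (an assumed example, not code from this PR; it requires an MPS-capable Mac and a PyTorch build with the MPS backend):

```python
import torch

# Small tensor on the MPS device; values are illustrative only.
x = torch.tensor([[0., 1., 2.],
                  [3., 0., 0.]], device="mps")

# Reduce over dim 0 only -> per-column counts: tensor([1, 1, 1], device='mps:0')
print(torch.count_nonzero(x, dim=0))

# Reduce over a list of dims (the dim_IntList overload) -> tensor(3, device='mps:0')
print(torch.count_nonzero(x, dim=(0, 1)))
```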

@facebook-github-bot (Contributor)

Hi @RohanM!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot (Contributor) commented May 24, 2022

❌ 1 New Failure

As of commit 43721d8 (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-focal-py3.7-gcc7-mobile-lightweight-dispatch-build / build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

2022-05-30T22:23:57.0194277Z ##[error]Process completed with exit code 137.
2022-05-30T22:22:16.6823550Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/x8-lut/gen/lut-avx512skx-vpshufb-x256.c.o
2022-05-30T22:22:16.8425480Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2-k-over-64.c.o
2022-05-30T22:22:16.9458439Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2-k-over-2048.c.o
2022-05-30T22:22:17.0090121Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-4.c.o
2022-05-30T22:22:17.0878877Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-8.c.o
2022-05-30T22:22:17.1649046Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-16.c.o
2022-05-30T22:22:17.2265169Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-64.c.o
2022-05-30T22:22:17.3130902Z [ 98%] Building C object confu-deps/XNNPACK/CMakeFiles/all_microkernels.dir/src/tables/exp2minus-k-over-2048.c.o
2022-05-30T22:22:17.4211278Z [ 98%] Built target all_microkernels
2022-05-30T22:22:17.4286681Z [ 98%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCodegenUnboxedKernels_8.cpp.o
2022-05-30T22:23:57.0194277Z ##[error]Process completed with exit code 137.
2022-05-30T22:23:57.0726907Z Prepare all required actions
2022-05-30T22:23:57.0792532Z ##[group]Run ./.github/actions/teardown-linux
2022-05-30T22:23:57.0792843Z with:
2022-05-30T22:23:57.0793049Z env:
2022-05-30T22:23:57.0793273Z   IN_CI: 1
2022-05-30T22:23:57.0793503Z   IS_GHA: 1
2022-05-30T22:23:57.0793725Z ##[endgroup]
2022-05-30T22:23:57.0836478Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2022-05-30T22:23:57.0836866Z .github/scripts/wait_for_ssh_to_drain.sh
2022-05-30T22:23:57.1559425Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@RohanM RohanM force-pushed the mps-count-nonzero branch 2 times, most recently from 508fd86 to 618fe66 on May 24, 2022 10:33
@facebook-github-bot (Contributor)

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@RohanM RohanM marked this pull request as ready for review May 24, 2022 11:59
@RohanM RohanM requested a review from bdhirsh as a code owner May 24, 2022 11:59
@RohanM RohanM force-pushed the mps-count-nonzero branch from 9912822 to 6c7ada3 on May 24, 2022 12:04
@RohanM (Contributor, Author) commented May 30, 2022

@albanD @kulinseth Hello! First-time PyTorch contributor here. I've written this PR and could use a review; are you able to assist?

@kulinseth (Collaborator)

> @albanD @kulinseth Hello! First-time PyTorch contributor here. I've written this PR and could use a review; are you able to assist?

Thanks @RohanM for your contribution, much appreciated! Overall, the PR looks good.
How did you test your changes (was it on an M1)? Also, since you refactored the reduction ops, did you run the entire test_mps.py suite to make sure there are no regressions?

@kulinseth kulinseth requested review from albanD and kulinseth May 30, 2022 17:52
@kulinseth kulinseth added this to the 1.12.0 milestone May 30, 2022
@kulinseth kulinseth added the ciflow/trunk (Trigger trunk jobs on your pull request) label May 30, 2022
@RohanM RohanM force-pushed the mps-count-nonzero branch from 6c7ada3 to 43721d8 on May 30, 2022 22:13
@RohanM (Contributor, Author) commented May 30, 2022

Thanks! Yes, I developed and tested on my 2021 MBP running Monterey 12.4. That's a good point; I'm guessing the MPS tests aren't running on CI at the moment. I rebased, re-ran `python setup.py develop`, and then ran test_mps.py. Everything's looking green, and the warnings look unrelated (a standalone sanity check is also sketched after the test output below):

❯ date
Tue 31 May 2022 09:16:28 AEST
❯ gl -1
* 43721d8e51 (HEAD -> mps-count-nonzero, origin/mps-count-nonzero) MPS: Fix casting for count_nonzero  Rohan Mitchell, 6 days ago
❯ pytest test/test_mps.py
=================================================================================== test session starts ===================================================================================
platform darwin -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0
rootdir: /Users/rohanmitchell/dev/pytorch, configfile: pytest.ini
plugins: anyio-3.5.0, hypothesis-6.29.3
collected 171 items

test/test_mps.py .................................................................................................................................................................. [ 94%]
..s......                                                                                                                                                                           [100%]

==================================================================================== warnings summary =====================================================================================
test/test_mps.py::MPSReluTest::testNpRelu
test/test_mps.py::MPSReluTest::testNumbersCPU
test/test_mps.py::MPSReluTest::testNumbersGPU
test/test_mps.py::MPSLeakyReluTest::testNpLeakyRelu
test/test_mps.py::MPSLeakyReluTest::testNumbersCPU
test/test_mps.py::TestMPS::testNumbersGPU
test/test_mps.py::TestMPS::test_addmm
test/test_mps.py::TestMPS::test_local_scalar_dense_mps
test/test_mps.py::TestMPS::test_mm
  /Users/rohanmitchell/dev/pytorch/torch/testing/_deprecated.py:35: FutureWarning: torch.testing.assert_allclose() is deprecated since 1.12 and will be removed in 1.14. Use torch.testing.assert_close() instead. For detailed upgrade instructions see https://github.com/pytorch/pytorch/issues/61844.
    warnings.warn(msg, FutureWarning)

test/test_mps.py::TestMPS::test_nhwc_operation
  /Users/rohanmitchell/dev/pytorch/test/test_mps.py:575: UserWarning: The dst MTL buffer in copy_to_mps is non-contiguous (Triggered internally at  ../aten/src/ATen/native/mps/operations/Copy.mm:381.)
    x = cpu_x.detach().clone().to('mps').requires_grad_()

test/test_mps.py::TestNLLLoss::test_embedding_dense_backward
  /Users/rohanmitchell/dev/pytorch/torch/nn/functional.py:2054: UserWarning: The operator 'aten::embedding_renorm_' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at  ../aten/src/ATen/mps/MPSFallback.mm:11.)
    torch.embedding_renorm_(weight, input, max_norm, norm_type)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================= short test summary info =================================================================================
SKIPPED [1] test/test_mps.py:4201: Backward of lstm returns wrong result
====================================================================== 170 passed, 1 skipped, 11 warnings in 28.10s =======================================================================
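For anyone reproducing this locally, a quick standalone check (the shapes and dim lists below are arbitrary assumptions, not the actual test_mps.py cases) could look like:

```python
import torch

# Compare count_nonzero on MPS against the CPU reference for a few dim lists.
cpu_x = (torch.rand(4, 3, 5) > 0.5).to(torch.float32)
mps_x = cpu_x.to("mps")

for dims in [(0,), (0, 1), (1, 2), (0, 1, 2)]:
    cpu_out = torch.count_nonzero(cpu_x, dim=dims)
    mps_out = torch.count_nonzero(mps_x, dim=dims)
    # Results come back as int64 counts; compare on the CPU.
    torch.testing.assert_close(cpu_out, mps_out.cpu())

print("count_nonzero on MPS matches the CPU reference")
```

(torch.testing.assert_close is the non-deprecated counterpart of the assert_allclose calls flagged in the warnings above.)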

@ejguan ejguan added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) and module: mps (Related to Apple Metal Performance Shaders framework) labels May 31, 2022
@kulinseth (Collaborator) left a comment

Looks good, and thanks for testing it.

@albanD (Collaborator) commented May 31, 2022

@pytorchbot merge this please

@github-actions (Contributor)

Hey @RohanM.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 1, 2022
Summary:
- See: #77764

Implements the `aten::count_nonzero.dim_IntList` operator (as used by [torch.count_nonzero](https://pytorch.org/docs/stable/generated/torch.count_nonzero.html)) for [MPS](https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/).

Pull Request resolved: #78169
Approved by: https://github.com/malfet, https://github.com/kulinseth, https://github.com/albanD

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/f42b42d3eb9af4ea1d09f00a13e9b6dc9efcc0f8

Reviewed By: seemethere

Differential Revision: D36815110

Pulled By: seemethere

fbshipit-source-id: 1ef7e0f317a75dae2c7fcd610cb56c5812d89da2
atalman pushed a commit to atalman/pytorch that referenced this pull request Jun 6, 2022
malfet pushed a commit that referenced this pull request Jun 7, 2022

Labels

ciflow/trunk (Trigger trunk jobs on your pull request), cla signed, Merged, module: mps (Related to Apple Metal Performance Shaders framework), open source, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


8 participants