[MPS] Add test consistency from OpInfo based tests from PR 78504 #79532

Closed
kulinseth wants to merge 7 commits

kulinseth
Collaborator

@kulinseth kulinseth commented Jun 14, 2022

No description provided.

@facebook-github-bot
Contributor

facebook-github-bot commented Jun 14, 2022

❌ 1 New Failures

As of commit 6285ed9 (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit / build-and-test (1/1)

Step: "Setup Linux"

2022-07-01T22:47:35.7955064Z ##[error]Process completed with exit code 125.
2022-07-01T22:47:35.7399693Z ##[group]Run docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
2022-07-01T22:47:35.7411921Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-07-01T22:47:35.7412206Z env:
2022-07-01T22:47:35.7412441Z   GIT_DEFAULT_BRANCH: master
2022-07-01T22:47:35.7412795Z   ALPINE_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine
2022-07-01T22:47:35.7413095Z ##[endgroup]
2022-07-01T22:47:35.7703497Z Unable to find image '308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine:latest' locally
2022-07-01T22:47:35.7939318Z docker: Error response from daemon: Head "https://308535385114.dkr.ecr.us-east-1.amazonaws.com/v2/tool/alpine/manifests/latest": no basic auth credentials.
2022-07-01T22:47:35.7939764Z See 'docker run --help'.
2022-07-01T22:47:35.7955064Z ##[error]Process completed with exit code 125.
2022-07-01T22:47:35.7986582Z Prepare all required actions
2022-07-01T22:47:35.8030991Z ##[group]Run ./.github/actions/teardown-linux
2022-07-01T22:47:35.8031195Z with:
2022-07-01T22:47:35.8031333Z env:
2022-07-01T22:47:35.8031503Z   GIT_DEFAULT_BRANCH: master
2022-07-01T22:47:35.8031686Z ##[endgroup]
2022-07-01T22:47:35.8046171Z ##[group]Run .github/scripts/wait_for_ssh_to_drain.sh
2022-07-01T22:47:35.8058173Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-07-01T22:47:35.8058393Z env:

Contributor

@malfet malfet left a comment

This looks very much identical to #79521.
Also, the change cannot be landed as is, as pprint is not a standard dependency.

And it would be nice if it mentioned the issue it fixes.

@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jun 14, 2022
@kulinseth
Collaborator Author

This looks very much identical to #79521. Also, the change cannot be landed as is, as pprint is not a standard dependency.

Unfortunately, it turned out to be the same underlying issue: a correctness problem that we noticed in the constant_pad_nd operation. I have approved #79521; after it merges, I will rebase and check in the test_consistency tests.

@kulinseth kulinseth changed the title [MPS] Fix chaining of View tensor when shape is different from Parent shape [MPS] Add test consistency from OpInfo based tests from PR 78504 Jun 15, 2022
@kulinseth
Collaborator Author

Rebased the changes; the PR now contains only the test consistency changes in test_mps. @albanD, please take a look.
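For readers unfamiliar with the approach, the idea behind a consistency test is to run the same operation on a reference backend and on the backend under test, then compare the results within a tolerance. The sketch below only illustrates that idea in plain Python. `check_consistency`, its parameters, and the toy padding functions are hypothetical names and are not the code added to test_mps.py.

```python
import math
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def check_consistency(
    reference: Callable[[Vector], Vector],
    candidate: Callable[[Vector], Vector],
    samples: Iterable[Vector],
    rel_tol: float = 1e-5,
    abs_tol: float = 1e-8,
) -> List[Tuple[Vector, Vector, Vector]]:
    """Return (input, expected, actual) for every sample whose outputs differ."""
    mismatches = []
    for sample in samples:
        expected = reference(sample)
        actual = candidate(sample)
        # Outputs agree only if they have the same length and are close elementwise.
        same = len(expected) == len(actual) and all(
            math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
            for e, a in zip(expected, actual)
        )
        if not same:
            mismatches.append((sample, expected, actual))
    return mismatches


# Hypothetical stand-ins for a reference and a device implementation of zero padding.
def pad_reference(xs: Vector) -> Vector:
    return [0.0, *xs, 0.0]


def pad_buggy(xs: Vector) -> Vector:
    return [0.0, *xs]  # drops the trailing pad element


samples = [[1.0, 2.0], [3.0]]
for sample, expected, actual in check_consistency(pad_reference, pad_buggy, samples):
    print(f"mismatch for {sample}: expected {expected}, got {actual}")
```

Returning the mismatches rather than asserting on the first one makes it easier to see every failing sample in a single run.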

@albanD albanD added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 15, 2022
Collaborator

@albanD albanD left a comment

Thanks for the update!

@kulinseth
Collaborator Author

File "/Users/gharunner/actions-runner/_work/pytorch/pytorch/conda-test-env-2519578557/lib/python3.8/site-packages/torchgen/utils.py", line 34, in <module>
    from yaml import CSafeLoader as Loader
ModuleNotFoundError: No module named 'yaml'

How do we enable the yaml dependency for MPS, @albanD?
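The traceback above comes from `torchgen/utils.py` importing `yaml`, a module provided by the PyYAML package. How the MPS test environment is provisioned is not shown in this thread, so the following is only a sketch of one way to detect the problem early. `missing_modules` is a hypothetical helper, not part of PyTorch.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "yaml" is the import name; the package that provides it is PyYAML.
missing = missing_modules(["yaml"])
if missing:
    print("Missing test dependencies:", ", ".join(missing))
```

Running a check like this before the test suite starts reports the missing package directly instead of failing partway through an import chain.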

@kulinseth
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions

Hey @kulinseth.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@frank-wei
Contributor

The third-party libs are overridden by this PR. I don't think that is intentional; please revert it.
cc @kulinseth @malfet @albanD

@malfet
Contributor

malfet commented Jun 30, 2022

@pytorchbot revert -m "Unintended submodules updates" -c weird

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here

@pytorchmergebot
Collaborator

@kulinseth your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Jun 30, 2022
…504 (#79532)"

This reverts commit c71886e.

Reverted #79532 on behalf of https://github.com/malfet due to Unintended submodules updates
@malfet malfet reopened this Jun 30, 2022
Contributor

@malfet malfet left a comment

LGTM, but please undo submodule updates (which aren't needed, are they?)

@kulinseth
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Collaborator

Merge failed due to Command git -C /home/runner/actions-runner/_work/pytorch/pytorch rebase origin/master returned non-zero exit code 1

Rebasing (1/1)
Auto-merging test/test_mps.py
CONFLICT (content): Merge conflict in test/test_mps.py
error: could not apply 86304f0358... [MPS] Add test consistency from OpInfo based tests from PR 78504 (#79532)
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 86304f0358... [MPS] Add test consistency from OpInfo based tests from PR 78504 (#79532)

Raised by https://github.com/pytorch/pytorch/actions/runs/2597502193

@kulinseth
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Collaborator

Merge failed due to Refusing to merge as mandatory check(s) pull failed for rule superuser
Raised by https://github.com/pytorch/pytorch/actions/runs/2602174903

@albanD
Collaborator

albanD commented Jul 4, 2022

@pytorchbot merge -f

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions

github-actions bot commented Jul 4, 2022

Hey @kulinseth.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jul 6, 2022
) (#79532)

Summary:
Pull Request resolved: #79532
Approved by: https://github.com/albanD, https://github.com/malfet

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/76cff182428fbd165b5725f3de29dbd91a1512fa

Reviewed By: mehtanirav

Differential Revision: D37604752

Pulled By: mehtanirav

fbshipit-source-id: 9a96726acd97c4fc811e680564a29efb33d0c340
atalman added a commit that referenced this pull request Jul 25, 2022
* MPS: Fixes (#78930)

Cast integer to float in UnaryOps
Add tensor dtype in key generation
Enable FP16 scalars and use placeholder for alpha tensor in add/sum ops

Fixes #ISSUE_NUMBER

Pull Request resolved: #78930
Approved by: https://github.com/albanD

* MPS: Binary cast fix by proper type promotion and remove spurious copy warning (#79185)

Fixes #78019, #78020
Fixes #79185
Pull Request resolved: #79185
Approved by: https://github.com/albanD, https://github.com/razarmehr

* MPS: add exponential op (#79188)

Add exponential distribution

Fixes #ISSUE_NUMBER

Pull Request resolved: #79188
Approved by: https://github.com/razarmehr, https://github.com/albanD

* [MPS] Delete unused vars from OperationUtils.mm

Pull Request resolved: #79514

Approved by: https://github.com/kulinseth, https://github.com/albanD

* [MPS] Fix getDefaultGenerator and copy_kernel_mps

Returning reference to stack memory is really bad

Pull Request resolved: #79515

Approved by: https://github.com/albanD

* [MPS][BE]Do not use `new/delete[]` in `chainViewOperation`

`std::array` will do just fine

Pull Request resolved: #79516

Approved by: https://github.com/albanD

* [MPS] Support stride of stride

Fixes #79181

Pull Request resolved: #79521

Approved by: https://github.com/kulinseth

* MPS: TopK raise an error if K>16 (#79677)

* Error out in TopK when k>16.
* Add a test case too.

Fixes #78915

Pull Request resolved: #79677
Approved by: https://github.com/albanD
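As a rough illustration of the guard described in this entry, the check amounts to rejecting unsupported values of k before any work is dispatched. The sketch below is plain Python, not the MPS implementation. `check_topk_k`, `MAX_SUPPORTED_K`, the exception type, and the message text are all assumptions made for the example.

```python
MAX_SUPPORTED_K = 16  # limit stated in the commit message; the constant name is hypothetical


def check_topk_k(k: int, max_k: int = MAX_SUPPORTED_K) -> None:
    """Raise if k exceeds the largest value the backend is assumed to support."""
    if k > max_k:
        raise RuntimeError(f"topk: k={k} is not supported; k must be <= {max_k}")


check_topk_k(16)  # accepted: within the limit
try:
    check_topk_k(17)
except RuntimeError as err:
    print(err)
```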

* [MPS]: Add fix for squeezed input axes handling in BCE loss (#79676)

Fixes #79527

Pull Request resolved: #79676
Approved by: https://github.com/razarmehr, https://github.com/albanD

* MPS: Add amax and amin Ops with tests (#79682)

* Add amax and amin with tests

Fixes #ISSUE_NUMBER

Pull Request resolved: #79682
Approved by: https://github.com/albanD

* [MPS] Fix torch.uint8 support (#80049)

`ScalarType.Byte` should be cast to `MPSDataTypeUInt8`
Add support for `torch.int8` as well, and test those conversions in `TestMPS.test_to`

Fixes #80006

Pull Request resolved: #80049
Approved by: https://github.com/albanD

* [MPS] Fix binary ops between int32 tensor with int64 scalar (#80220)

For some reason, tensor *op* scalar does not follow the normal binary promotion rules,
so cast the output tensor to the expected type if needed.
It seems that one should have cast the input tensors to the expected output tensor type, but that does not really work for boolean binary ops, so...
Add output tensor type/shape to the cached graph key
Extend `TestMPS.test_add_scalars` to test for this regression

Fixes #79835

Pull Request resolved: #80220
Approved by: https://github.com/albanD

* [MPS] Add equal operator (#80195)

Which, in essence, is a composite of `eq`->`all`->`item`
`native/mps/operators/Equal.cpp` is an almost verbatim copy of `native/cuda/Equal.cpp`

Fix codegen by generating MPSFunctions headers

Pull Request resolved: #80195
Approved by: https://github.com/albanD
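The `eq`->`all`->`item` composition mentioned in this entry can be sketched on plain Python lists: compare elementwise, reduce with "all", and return a single boolean. This is an illustration only, not the contents of `Equal.cpp`. The early return for mismatched lengths is an assumption about how differing shapes are treated.

```python
from typing import Sequence


def equal(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True only if both sequences have the same length and the same elements."""
    if len(a) != len(b):
        # Assumption: inputs of different shape are never equal.
        return False
    elementwise = [x == y for x, y in zip(a, b)]  # "eq": one boolean per element
    return all(elementwise)  # "all" then "item": reduce to a single Python bool


print(equal([1.0, 2.0], [1.0, 2.0]))
print(equal([1.0, 2.0], [1.0, 3.0]))
```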

* [MPS] add `aten::normal.Tensor_float` `aten::normal.float_Tensor` `aten::normal.Tensor_Tensor` (#80297)

Fixes #ISSUE_NUMBER

Pull Request resolved: #80297
Approved by: https://github.com/albanD, https://github.com/kulinseth

* [MPS] Add flip (#80214)

Fixes #ISSUE_NUMBER

Pull Request resolved: #80214
Approved by: https://github.com/DenisVieriu97, https://github.com/albanD

* [MPS] Add logical ops (#80216)

This PR adds `logical_not`, `logical_and`, `logical_or`, `logical_xor`.
Pull Request resolved: #80216
Approved by: https://github.com/albanD, https://github.com/kulinseth

* [MPS] Add glu (#79866)

Adds mps op for `aten::glu.out`.

Pull Request resolved: #79866
Approved by: https://github.com/kulinseth, https://github.com/albanD

* [MPS] Fix std/var cache issue (#80502)

Use `getTensorsStringKey` which has tensor shape info added as part of the key to prevent cache lookup issue when the shape of input tensor is changed.

Fixes #80499

Pull Request resolved: #80502
Approved by: https://github.com/malfet, https://github.com/kulinseth
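To see why a cache key without shape information can return a stale graph, consider the toy model below. It is plain Python, not the MPS graph cache. `GraphCache`, `build_mean_graph`, and `mean_with_key` are hypothetical names, and the "graph" is just a closure specialized to the input length at build time.

```python
from typing import Callable, Dict, List


class GraphCache:
    """Toy cache mapping a string key to a compiled 'graph' (here, a closure)."""

    def __init__(self) -> None:
        self._graphs: Dict[str, Callable[[List[float]], float]] = {}

    def lookup_or_build(
        self, key: str, build: Callable[[], Callable[[List[float]], float]]
    ) -> Callable[[List[float]], float]:
        if key not in self._graphs:
            self._graphs[key] = build()
        return self._graphs[key]


def build_mean_graph(n: int) -> Callable[[List[float]], float]:
    # The element count is baked in when the graph is built.
    return lambda xs: sum(xs) / n


def mean_with_key(cache: GraphCache, xs: List[float], include_shape: bool) -> float:
    key = f"mean:float:{len(xs)}" if include_shape else "mean:float"
    graph = cache.lookup_or_build(key, lambda: build_mean_graph(len(xs)))
    return graph(xs)


stale = GraphCache()
mean_with_key(stale, [2.0, 4.0], include_shape=False)
# Reuses the graph built for 2 elements, so the divisor is wrong.
print(mean_with_key(stale, [3.0, 3.0, 3.0, 3.0], include_shape=False))

fixed = GraphCache()
mean_with_key(fixed, [2.0, 4.0], include_shape=True)
# A different shape produces a different key, so a new graph is built.
print(mean_with_key(fixed, [3.0, 3.0, 3.0, 3.0], include_shape=True))
```

The trade-off is that more specific keys mean more cached graphs, so the key should carry only what the graph is actually specialized on.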

* Add scatter support for view operations (#79939)

* Add scatter support for view operations; #78074, #78886, #79672
* Update test_slicing_replace_column to properly test different sizes
* Handle in-place changes for binary ops; add new testcase
* Add new view ops testing scatter; add MPSDebugConfig.h config file for debugging purposes
* Merge gatherViewTensor and scatterViewTensor into a generic function
* Add scatter on demand in scatterViewOperation instead of caching it into a generic graph
* Create separate graphs for scatter and gather;
* Create scatter graph at scatter time

Fixes #ISSUE_NUMBER

Pull Request resolved: #79939
Approved by: https://github.com/razarmehr

* MPS: Fix handling of 1D tensors in linear backward (#80759)

Fixes #79784

Pull Request resolved: #80759
Approved by: https://github.com/ezyang

* [MPS] Move the View ops to a separate file and reduce the number of graphs created (#80491)

This is dependent on the PR to go in first: #79939

Remove the data_ptr from the View Graph key which reduces the number of
graphs created significantly.

Don't wait when copying from MPS to MPS tensors

Pull Request resolved: #80491
Approved by: https://github.com/malfet

* [MPS] Add softplus backward (#79873)

Fixes #ISSUE_NUMBER

Pull Request resolved: #79873
Approved by: https://github.com/malfet

* [MPS] Add argmin (#80828)

This PR

1. adds argmin
2. refactors `reduction_type` in `ReduceOps.mm` with enum.

Co-authored by Kulin Seth <kulinseth@gmail.com>
Pull Request resolved: #80828
Approved by: https://github.com/malfet

* [MPS] Fix LSTM batch_first output transposed (#80597)

The output of LSTM with `batch_first` should be transposed back to batch first format.

Fixes #80306

Pull Request resolved: #80597
Approved by: https://github.com/kulinseth
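The layout change described in this entry is a swap of the first two axes, from (seq_len, batch, features) to (batch, seq_len, features). The sketch below shows that swap on nested Python lists. It assumes the intermediate result is in sequence-first order, and `to_batch_first` is a hypothetical helper rather than the code from the fix.

```python
from typing import List

Tensor3D = List[List[List[float]]]


def to_batch_first(seq_first: Tensor3D) -> Tensor3D:
    """Swap the first two axes: (seq_len, batch, features) -> (batch, seq_len, features)."""
    # zip(*...) groups together, for each batch index, its entry from every time step.
    return [list(per_batch) for per_batch in zip(*seq_first)]


# seq_len=2, batch=3, features=1
seq_first = [
    [[1.0], [2.0], [3.0]],  # time step 0
    [[4.0], [5.0], [6.0]],  # time step 1
]
print(to_batch_first(seq_first))
```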

* [MPS][BE] Introduce MPSUnaryCachedGraph (#81033)

I.e. CachedGraph that has input and output tensors
Also, add `MPSGraphCache::LookUpAs` template, which combines LookUp with
static_cast to target type

Pull Request resolved: #81033
Approved by: https://github.com/kulinseth

* [MPS] Add test consistency from OpInfo based tests from PR 78504 (#79532)

Pull Request resolved: #79532
Approved by: https://github.com/albanD, https://github.com/malfet

* [MPS] Add huber loss (#80163)

Fixes #ISSUE_NUMBER

Pull Request resolved: #80163
Approved by: https://github.com/kulinseth, https://github.com/malfet

* Remove two tests dependent on the MPS serialization checkin.

* Fix lint error (FLAKE8) F401

* Remove the serialization test from test_mps as its support is not there in 1.12.1.

Co-authored-by: Kulin Seth <kulinseth@gmail.com>
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
Co-authored-by: Kulin Seth <kulin_seth@apple.com>
Co-authored-by: Abhishek Pathak <abhipathak97@gmail.com>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
Co-authored-by: qqaatw <qqaatw@gmail.com>
Co-authored-by: Ramin Azarmehr <razarmehr@apple.com>
Labels
ciflow/trunk Trigger trunk jobs on your pull request cla signed Merged open source Reverted triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

8 participants