
Conversation

atalman and others added 30 commits August 28, 2023 11:55
* [CI] Release only changes for 2.1 release

* include circle script

* release only changes for test-infra

* More test-infra related
…ch#108139)

For the test_conv_bn_fuse dynamic case, we always fuse bn with convolution, and there is only an external convolution call, not loops, so it would fail when we do a dynamic loop vars check. This PR skips this case.
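
For context, a hedged sketch (mine, not the actual test_conv_bn_fuse case) of the kind of fused conv+bn module this refers to: after fusion the compiled graph is a single external convolution call with no generated loops, so a dynamic loop-variable check has nothing to inspect.

```python
# Hedged sketch of the fused conv+bn scenario described above; the module and
# shapes are illustrative, not the actual test case.
import torch
import torch.nn as nn

class ConvBN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.bn = nn.BatchNorm2d(8)

    def forward(self, x):
        return self.bn(self.conv(x))

model = ConvBN().eval()
# With dynamic shapes, bn is folded into conv and the graph lowers to a single
# external convolution call rather than generated loops.
compiled = torch.compile(model, dynamic=True)
out = compiled(torch.randn(1, 3, 32, 32))
```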

Pull Request resolved: pytorch#108113
Approved by: https://github.com/huydhn
…8075) (pytorch#108177)

It should be `${{ inputs.build_environment }}`, although I wonder why not just clean up the artifacts directory for all builds instead of just `aarch64`.
Pull Request resolved: pytorch#108075
Approved by: https://github.com/atalman, https://github.com/seemethere
…ytorch#108200)

There are more issues than I expected at the beginning:

* Triton was uploaded on `main` instead of `nightly` and the release branch
* The environment `conda-aws-upload` wasn't used correctly in either the wheel or the conda upload
* The conda update wasn't run in a separate ephemeral runner
* Duplicated upload logic; it should have just used `bash .circleci/scripts/binary_upload.sh` instead
* Handle `CONDA_PYTORCHBOT_TOKEN` and `CONDA_PYTORCHBOT_TOKEN_TEST` tokens in a similar way as pytorch/test-infra#4530

Part of pytorch#108154
…de (pytorch#108203) (pytorch#108251)

This is the follow-up of pytorch#108187 to set the correct release version without a commit hash for the triton wheel and conda binaries when building them in release mode.

### Testing

* With commit hash (nightly): https://github.com/pytorch/pytorch/actions/runs/6019021716
* Without commit hash (by adding `--release` to the PR): https://github.com/pytorch/pytorch/actions/runs/6019378616
Pull Request resolved: pytorch#108203
Approved by: https://github.com/atalman
The channel should be `nightly` for nightly builds and `test` for release candidates.  There are two bugs:

* The shell needs to be set to `bash` explicitly; otherwise, GHA uses `sh`, which doesn't recognize `[[`, as shown in https://github.com/pytorch/pytorch/actions/runs/6030476858/job/16362717792#step:6:10
* `${GITHUB_REF_NAME}` is unquoted.  This is basically https://www.shellcheck.net/wiki/SC2248, but it wasn't caught by actionlint, and shellcheck doesn't work with workflow YAML files.  I will think about how to add a lint rule for this later.

### Testing

https://github.com/pytorch/pytorch/actions/runs/6031330411 confirms that the channel is set correctly.

Pull Request resolved: pytorch#108291
Approved by: https://github.com/osalpekar, https://github.com/atalman
…h#95271) (pytorch#108216)

### Motivation

- Add channels_last3d support for mkldnn conv and mkldnn deconv.
- Use `ideep::convolution_transpose_forward::compute_v3` instead of `ideep::convolution_transpose_forward::compute`.  `compute_v3` uses `is_channels_last` to tell ideep whether to use channels last or not, aligning with PyTorch's memory format check.

### Testing
1 socket (28 cores):

- memory format: torch.contiguous_format

module | shape | forward / ms | backward / ms
-- | -- | -- | --
conv3d | input size: (32, 32, 10, 100, 100), weight size: (32, 32, 3, 3, 3) | 64.56885 | 150.1796
conv3d | input size: (32, 16, 10, 200, 200), weight size: (16, 16, 3, 3, 3) | 100.6754 | 231.8883
conv3d | input size: (16, 4, 5, 300, 300), weight size: (4, 4, 3, 3, 3) | 19.31751 | 68.31131

module | shape | forward / ms | backward / ms
-- | -- | -- | --
ConvTranspose3d | input size: (32, 32, 10, 100, 100), weight size: (32, 32, 3, 3, 3) | 122.7646 | 207.5125
ConvTranspose3d | input size: (32, 16, 10, 200, 200), weight size: (16, 16, 3, 3, 3) | 202.4542 | 368.5492
ConvTranspose3d | input size: (16, 4, 5, 300, 300), weight size: (4, 4, 3, 3, 3) | 122.959 | 84.62577

- memory format: torch.channels_last_3d

module | shape | forward / ms | backward / ms
-- | -- | -- | --
conv3d | input size: (32, 32, 10, 100, 100), weight size: (32, 32, 3, 3, 3) | 40.06993 | 114.317
conv3d | input size: (32, 16, 10, 200, 200), weight size: (16, 16, 3, 3, 3) | 49.08249 | 133.4079
conv3d | input size: (16, 4, 5, 300, 300), weight size: (4, 4, 3, 3, 3) | 5.873911 | 17.58647

module | shape | forward / ms | backward / ms
-- | -- | -- | --
ConvTranspose3d | input size: (32, 32, 10, 100, 100), weight size: (32, 32, 3, 3, 3) | 88.4246 | 208.2269
ConvTranspose3d | input size: (32, 16, 10, 200, 200), weight size: (16, 16, 3, 3, 3) | 140.0725 | 270.4172
ConvTranspose3d | input size: (16, 4, 5, 300, 300), weight size: (4, 4, 3, 3, 3) | 23.0223 | 37.16972
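
As a usage illustration (not part of the PR), a minimal sketch exercising the channels_last_3d path with the first shape from the tables above:

```python
# Minimal sketch: run a Conv3d in channels_last_3d memory format, using
# input (32, 32, 10, 100, 100) and weight (32, 32, 3, 3, 3) as in the first table row.
import torch
import torch.nn as nn

x = torch.randn(32, 32, 10, 100, 100).to(memory_format=torch.channels_last_3d)
conv = nn.Conv3d(32, 32, kernel_size=3).to(memory_format=torch.channels_last_3d)
out = conv(x)
print(out.is_contiguous(memory_format=torch.channels_last_3d))
```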

Pull Request resolved: pytorch#95271
Approved by: https://github.com/jgong5, https://github.com/cpuhrsch
…ytorch#108385)

Addresses [issue pytorch#106085](pytorch#106085).

In `torch/nn/modules/rnn.py`:
- Adds a documentation string to the RNNBase class.
- Adds parameters to the `__init__` methods for the RNN, LSTM, and GRU classes.
- Adds type annotations to the `__init__` methods for RNN, LSTM, and GRU.

In `torch/ao/nn/quantized/dynamic/modules/rnn.py`:
- Adds type specifications to `_FLOAT_MODULE` attributes in RNNBase, RNN, LSTM, and GRU classes.
> This resolves a `mypy` assignment error `Incompatible types in assignment (expression has type "Type[LSTM]", base class "RNNBase" defined the type as "Type[RNNBase]")` that seemed to be a result of the fully specified type annotations in `torch/nn/modules/rnn.py`.
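
As an illustration only, a hedged sketch of the kind of class-level annotation that avoids the quoted mypy error; the class names below are stand-ins, not the actual quantized module definitions:

```python
# Hedged sketch: annotating _FLOAT_MODULE with the base type lets subclasses
# assign nn.LSTM / nn.GRU without the "Incompatible types in assignment" error.
from typing import Type
import torch.nn as nn

class QuantRNNBase:  # stand-in for torch.ao.nn.quantized.dynamic.modules.rnn.RNNBase
    _FLOAT_MODULE: Type[nn.RNNBase] = nn.RNNBase

class QuantLSTM(QuantRNNBase):  # stand-in for the dynamic quantized LSTM
    _FLOAT_MODULE: Type[nn.RNNBase] = nn.LSTM
```
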
Pull Request resolved: pytorch#106222
Approved by: https://github.com/mikaylagawarecki
…8410)

By refactoring `_local_scalar_dense_mps` to use `_empty_like` to allocate the CPU tensor.
Also, print a more reasonable error message when the dst dim is less than the src dim in `mps_copy_`.

This fixes regression introduced by pytorch#105617 and adds regression test.
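
A hedged sketch of the kind of regression check this adds (the device availability guard is mine, not from the PR):

```python
# Hedged sketch: .item() on an MPS tensor goes through _local_scalar_dense_mps,
# which this change refactors to allocate the CPU scalar via _empty_like.
import torch

if torch.backends.mps.is_available():
    t = torch.tensor(3.5, device="mps")
    assert t.item() == 3.5
```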

### <samp>🤖 Generated by Copilot at abd06e6</samp>

> _Sing, O Muse, of the valiant deeds of the PyTorch developers_
> _Who strive to improve the performance and usability of tensors_
> _And who, with skill and wisdom, fixed a bug in the MPS backend_
> _That caused confusion and dismay to many a user of `item()`_

Fixes pytorch#107867

Pull Request resolved: pytorch#107913
Approved by: https://github.com/albanD

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
…S, and changed the error message (pytorch#107758) (pytorch#108365)

New message when an invalid option is provided:
![invalid option error message](https://github.com/pytorch/pytorch/assets/6355099/8b61534a-ee55-431e-94fe-2ffa25b7fd5c)

`TORCH_LOGS="help"`:
![TORCH_LOGS help output](https://github.com/pytorch/pytorch/assets/6355099/72e8939c-92fa-4141-8114-79db71451d42)

`TORCH_LOGS="+help"`:
![TORCH_LOGS +help output](https://github.com/pytorch/pytorch/assets/6355099/2cdc94ac-505a-478c-aa58-0175526075d2)

Pull Request resolved: pytorch#107758
Approved by: https://github.com/ezyang, https://github.com/mlazos
ghstack dependencies: pytorch#106192
…n set as an installation requirement yet (pytorch#108424) (pytorch#108471)

The dependency was added twice before in CUDA and ROCm binaries, once as an installation dependency from builder and the other as an extra dependency for dynamo, for example:

```
Requires-Python: >=3.8.0
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: filelock
Requires-Dist: typing-extensions
Requires-Dist: sympy
Requires-Dist: networkx
Requires-Dist: jinja2
Requires-Dist: fsspec
Requires-Dist: pytorch-triton (==2.1.0+e6216047b8)
Provides-Extra: dynamo
Requires-Dist: pytorch-triton (==2.1.0+e6216047b8) ; extra == 'dynamo'
Requires-Dist: jinja2 ; extra == 'dynamo'
Provides-Extra: opt-einsum
Requires-Dist: opt-einsum (>=3.3) ; extra == 'opt-einsum'
```
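
As an aside (not from the PR), one way to spot this kind of duplicate in an installed wheel is via `importlib.metadata`; a hedged sketch:

```python
# Hedged sketch: list the triton requirements recorded in an installed torch
# wheel's metadata; before this fix the pin appeared both unconditionally and
# under the 'dynamo' extra.
from importlib import metadata

reqs = metadata.requires("torch") or []
print([r for r in reqs if "triton" in r])
```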

In the previous release, we needed to remove this part from `setup.py` to build release binaries (pytorch#96010).  With this change, that step isn't needed anymore because the dependency will come from builder.

### Testing

Using the draft pytorch#108374 for testing and manually inspecting the wheel artifacts at https://github.com/pytorch/pytorch/actions/runs/6045878399 (I don't want to go through all of `ciflow/binaries` again):

* torch-2.1.0.dev20230901+cu121-cp39-cp39-linux_x86_64
```
Requires-Python: >=3.8.0
Description-Content-Type: text/markdown
Requires-Dist: filelock
Requires-Dist: typing-extensions
Requires-Dist: sympy
Requires-Dist: networkx
Requires-Dist: jinja2
Requires-Dist: fsspec
Requires-Dist: pytorch-triton (==2.1.0+e6216047b8) <-- This will be 2.1.0 on the release branch after pytorch/builder#1515
Provides-Extra: dynamo
Requires-Dist: jinja2 ; extra == 'dynamo'
Provides-Extra: opt-einsum
Requires-Dist: opt-einsum (>=3.3) ; extra == 'opt-einsum'
```

* torch-2.1.0.dev20230901+cu121.with.pypi.cudnn-cp39-cp39-linux_x86_64
```
Requires-Python: >=3.8.0
Description-Content-Type: text/markdown
Requires-Dist: filelock
Requires-Dist: typing-extensions
Requires-Dist: sympy
Requires-Dist: networkx
Requires-Dist: jinja2
Requires-Dist: fsspec
Requires-Dist: pytorch-triton (==2.1.0+e6216047b8)
Requires-Dist: nvidia-cuda-nvrtc-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cuda-runtime-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cuda-cupti-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cudnn-cu12 (==8.9.2.26) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cublas-cu12 (==12.1.3.1) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cufft-cu12 (==11.0.2.54) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-curand-cu12 (==10.3.2.106) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cusolver-cu12 (==11.4.5.107) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-cusparse-cu12 (==12.1.0.106) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-nccl-cu12 (==2.18.1) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: nvidia-nvtx-cu12 (==12.1.105) ; platform_system == "Linux" and platform_machine == "x86_64"
Requires-Dist: triton (==2.1.0) ; platform_system == "Linux" and platform_machine == "x86_64" <--This is 2.1.0 because it already has pytorch#108423, but the package doesn't exist yet atm
Provides-Extra: dynamo
Requires-Dist: jinja2 ; extra == 'dynamo'
Provides-Extra: opt-einsum
Requires-Dist: opt-einsum (>=3.3) ; extra == 'opt-einsum'
```

* torch-2.1.0.dev20230901+rocm5.6-cp38-cp38-linux_x86_64
```
Requires-Python: >=3.8.0
Description-Content-Type: text/markdown
Requires-Dist: filelock
Requires-Dist: typing-extensions
Requires-Dist: sympy
Requires-Dist: networkx
Requires-Dist: jinja2
Requires-Dist: fsspec
Requires-Dist: pytorch-triton-rocm (==2.1.0+34f8189eae) <-- This will be 2.1.0 on the release branch after pytorch/builder#1515
Provides-Extra: dynamo
Requires-Dist: jinja2 ; extra == 'dynamo'
Provides-Extra: opt-einsum
Requires-Dist: opt-einsum (>=3.3) ; extra == 'opt-einsum'
```

Pull Request resolved: pytorch#108424
Approved by: https://github.com/atalman
…ytorch#108523)

* When the byteorder record is missing, load as little endian by default

Fixes pytorch#101688

* Add test for warning

Also change warning type from DeprecationWarning
to UserWarning to make it visible by default.
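
For context, a hedged sketch (mine, not the added test) of why the warning-category change matters:

```python
# Hedged illustration of default warning visibility, not the actual test.
# DeprecationWarning is filtered out by default when raised from library code
# (such as torch.serialization); UserWarning is shown by default everywhere.
import warnings

warnings.warn("byteorder record is missing, assuming little endian", DeprecationWarning)
warnings.warn("byteorder record is missing, assuming little endian", UserWarning)
```
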
…8143) (pytorch#108258)

Pull Request resolved: pytorch#108143
Approved by: https://github.com/andrewor14

Co-authored-by: Tugsbayasgalan Manlaibaatar <tmanlaibaatar@fb.com>
) (pytorch#108255)

Summary: This commit adds a public-facing
`torch.ao.quantization.move_model_to_eval` util function
for QAT users. Instead of calling model.eval() on an exported
model (which doesn't work, see
pytorch#103681), the user
would call this new util function instead. This ensures special
ops such as dropout and batchnorm (not supported yet) will have
the right behavior when the graph is later used for inference.

Note: Support for an equivalent `move_model_to_train` will be
added in the future. This is difficult to do for dropout
currently because the eval pattern of dropout is simply a clone
op, which we cannot just match and replace with a dropout op.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_model_to_eval

Reviewers: jerryzh168, kimishpatel

Subscribers: jerryzh168, kimishpatel, supriyar

Differential Revision: [D48814735](https://our.internmc.facebook.com/intern/diff/D48814735)
Pull Request resolved: pytorch#108184
Approved by: https://github.com/jerryzh168
…h#108593)

Building Docker images on trunk is failing at the moment (https://github.com/pytorch/pytorch/actions/runs/6033657019/job/16370683676) with the following error:

```
+ conda_reinstall numpy=1.24.4
+ as_jenkins conda install -q -n py_3.10 -y --force-reinstall numpy=1.24.4
+ sudo -E -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env PATH=/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 conda install -q -n py_3.10 -y --force-reinstall numpy=1.24.4
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... unsuccessful initial attempt using frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - numpy=1.24.4

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch
```

This was pulled in by pandas 2.1.0, released yesterday: https://pypi.org/project/pandas/2.1.0
Pull Request resolved: pytorch#108355
Approved by: https://github.com/kit1980, https://github.com/atalman, https://github.com/malfet
…ytorch#108141) (pytorch#108327)

A previous PR, pytorch#106274, decomposes `aten.dropout` and creates a `clone()` in `eval()` mode or when `p=0`. This makes many SDPA-related models fail to match the fused_attention pattern matchers.

This PR adds new fused_attention pattern matchers with an additional clone to re-enable the SDPA op matching.
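
A hedged sketch (mine, not a test from the PR) of the kind of attention module affected, where the dropout becomes a `clone()` in the traced pattern:

```python
# Hedged sketch of an SDPA-style pattern whose dropout decomposes to a clone()
# in eval mode / p=0 after pytorch#106274.
import torch
import torch.nn.functional as F

class Attention(torch.nn.Module):
    def forward(self, q, k, v):
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        attn = F.dropout(torch.softmax(scores, dim=-1), p=0.0, training=False)
        return attn @ v

q = k = v = torch.randn(2, 8, 16, 64)
out = torch.compile(Attention().eval())(q, k, v)
```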

Pull Request resolved: pytorch#108141
Approved by: https://github.com/jgong5, https://github.com/eellison
…rch#108596)

Fixes pytorch#103142

Pull Request resolved: pytorch#108292
Approved by: https://github.com/albanD

Co-authored-by: Kurt Mohler <kmohler@quansight.com>
This PR fixes the new_empty_strided op to become replicate from sharding
when necessary; this is a quick fix to resolve pytorch#107661.

We'll need to think more about the behavior of this op when it comes to
sharding. One possibility is to follow the input sharding, but given that the
output shape of this op might not be the same as the input's, it's hard to
say we should follow the input sharding; further improvement is needed once
we figure out the op syntax.
Pull Request resolved: pytorch#107835
Approved by: https://github.com/fduwjj
@pytorch-bot

pytorch-bot bot commented Feb 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/147119

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f4c686b with merge base 138e289:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the following labels on Feb 13, 2025: `module: cpu`, `module: dynamo`, `module: inductor`, `module: mkldnn`, `oncall: distributed`, `release notes: quantization`, `release notes: releng`
@linux-foundation-easycla

CLA Missing ID CLA Not Signed

@facebook-github-bot facebook-github-bot added the following labels on Feb 13, 2025: `oncall: jit`, `module: rocm`, `fx`
@janeyx99
Contributor

Looks like the PR brought in a lot of old diffs, which is likely a mistake that subscribed a bunch of people. Closing this PR--please open a new clean one with only your intended changes accompanied by a descriptive PR body.

@janeyx99 janeyx99 closed this Feb 13, 2025