Pull in master #4

Merged
merged 2,739 commits into skyw:master on Jan 19, 2021
Conversation

skyw (Owner) commented Jan 19, 2021

Fixes #{issue number}

r-barnes and others added 30 commits January 7, 2021 15:39
Summary: Pull Request resolved: #50079

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25718694

fbshipit-source-id: f535fb879bcd4cb4ea715adfd90bbffa3fcc1150
Summary:
Pull Request resolved: #49944

Upgrades type annotations from Python 2 style to Python 3 style
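
For illustration, the kind of mechanical change involved (hypothetical function, not taken from the diff):

```python
import torch

# Python 2 style: types live in a trailing `# type:` comment
def scale(x, factor):
    # type: (torch.Tensor, float) -> torch.Tensor
    return x * factor

# Python 3 style: the same signature with inline annotations
def scale(x: torch.Tensor, factor: float) -> torch.Tensor:
    return x * factor
```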

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25717539

fbshipit-source-id: c621e2712e87eaed08cda48eb0fb224f6b0570c9
…SGD hook signatures (#50197)

Summary:
Pull Request resolved: #50197

Remove the extra comma after "bucket".
ghstack-source-id: 119513484

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D25823117

fbshipit-source-id: acf048f7cb732c23cba3a81ccce1e70f6b9f4299
Summary:
closes gh-49704

Pull Request resolved: #49705

Reviewed By: mruberry

Differential Revision: D25725352

Pulled By: malfet

fbshipit-source-id: 05a7041c9caffde4a5c1eb8af0d13697075103af
Summary:
In `batch_norm_gather_stats_with_counts_cuda`, use `input.scalar_type()` if `running_mean` is not defined.
In the `SyncBatchNorm` forward function, create the count tensor with `torch.float32` type if `running_mean` is None.
Also fix a few typos.
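
A hedged Python sketch of the dtype fallback described above (variable names are illustrative, not the actual forward code):

```python
import torch

running_mean = None  # e.g. when running stats are not tracked
# With running_mean undefined there is no stats dtype to inherit,
# so the count tensor falls back to torch.float32.
count_dtype = running_mean.dtype if running_mean is not None else torch.float32
count = torch.full((2,), 9.0, dtype=count_dtype)
```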

Pull Request resolved: #50126

Test Plan:
```
python -c "import torch; print(torch.batch_norm_gather_stats_with_counts(torch.randn(1, 3, 3, 3, device='cuda'), mean=torch.ones(2, 3, device='cuda'), invstd=torch.ones(2, 3, device='cuda'), running_mean=None, running_var=None, momentum=.1, eps=1e-5, counts=torch.ones(2, device='cuda')))"
```

Fixes #49730

Reviewed By: ngimel

Differential Revision: D25797930

Pulled By: malfet

fbshipit-source-id: 22a91e3969b5e9bbb7969d9cc70b45013a42fe83
Summary:
Pull Request resolved: #49766

Devirtualizing this seems like a decent performance improvement on
internal benchmarks.

The *reason* this is a performance improvement is twofold:
1) virtual calls are a bit slower than regular calls
2) virtual functions in `TensorImpl` can't be inlined

Test Plan: internal benchmark

Reviewed By: hlu1

Differential Revision: D25602321

fbshipit-source-id: d61556456ccfd7f10c6ebdc3a52263b438a2aef1
…49767)

Summary:
Pull Request resolved: #49767

I'm told that the base implementation should work fine. Let's validate that in an intermediate diff before removing it.
ghstack-source-id: 119528066

Test Plan: CI

Reviewed By: ezyang, bhosmer

Differential Revision: D25686830

fbshipit-source-id: f931394d3de6df7f6c5c68fe8ab711d90d3b12fd
Summary:
Pull Request resolved: #49770

The performance cost of making this commonly called method virtual doesn't seem worth the benefit of having uses of undefined tensors crash slightly earlier (they'll still fail to dispatch).
ghstack-source-id: 119528065

Test Plan: framework overhead benchmarks

Reviewed By: ezyang

Differential Revision: D25687465

fbshipit-source-id: 89aabce165a594be401979c04236114a6f527b59
Summary:
Pull Request resolved: #49906

This commit modifies RPC Message to inherit from `torch::CustomClassHolder`,
and wraps a Message in an IValue in `RpcAgent::send()`.

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25719518

Pulled By: mrshenli

fbshipit-source-id: 694e40021e49e396da1620a2f81226522341550b
…ls.* (#49927)

Summary: Pull Request resolved: #49927

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25724241

Pulled By: mrshenli

fbshipit-source-id: d608e448f5224e41fbb0b5be6b9ac51a587f25b4
Summary: Pull Request resolved: #49960

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25730530

Pulled By: mrshenli

fbshipit-source-id: 5d54572c653592d79c40aed616266c87307a1ad8
…9995)

Summary: Pull Request resolved: #49995

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25745301

Pulled By: mrshenli

fbshipit-source-id: b5e3a7e0b377496924847d8d70d61de32e2d87f4
Summary: Pull Request resolved: #50004

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25750602

Pulled By: mrshenli

fbshipit-source-id: 06854a77f4fb5cc4c34a1ede843301157ebf7309
…50005)

Summary: Pull Request resolved: #50005

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25750663

Pulled By: mrshenli

fbshipit-source-id: 6d97156b61d82aa19dd0567ca72fe04bd7b5d1e7
Summary: Pull Request resolved: #50020

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25752968

Pulled By: mrshenli

fbshipit-source-id: 138d37e204b6f9a584633cfc79fd44c8c9c00f41
Summary: Pull Request resolved: #50023

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753217

Pulled By: mrshenli

fbshipit-source-id: 5a98473c17535c8f92043abe143064e7fca4413b
Summary: Pull Request resolved: #50024

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753386

Pulled By: mrshenli

fbshipit-source-id: fdca051b805762a2c88f965ceb3edf1c25d40a56
Summary: Pull Request resolved: #50025

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753587

Pulled By: mrshenli

fbshipit-source-id: a5d4106a10d1b0d3e4c406751795f19af8afd120
Summary: Pull Request resolved: #50026

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753588

Pulled By: mrshenli

fbshipit-source-id: a6fcda7830901dd812fbf0489b001e6bd9673780
Summary: Pull Request resolved: #50027

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753815

Pulled By: mrshenli

fbshipit-source-id: 85b9b03fec52b4175288ac3a401285607744b451
Summary: Pull Request resolved: #50028

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753887

Pulled By: mrshenli

fbshipit-source-id: 40718349c2def262a16aaa24c167c0b540cddcb1
Summary: Pull Request resolved: #50029

Test Plan:
buck run mode/opt -c=python.package_style=inplace //caffe2/torch/fb/training_toolkit/examples:ctr_mbl_feed_april_2020 -- local-preset --flow-entitlement pytorch_ftw_gpu --secure-group oncall_pytorch_distributed

Before:

```
...

I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 14000.0
I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 74.60101318359375
I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 74.60101318359375

...

I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 20000.0
I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 64.0
I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 64.64917755126953

...
```

After:

```
...

I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 14000.0
I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 72.56404876708984
I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 72.56404876708984

...

I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 20000.0
I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 73.07617950439453
I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 73.07617950439453

...
```

Reviewed By: lw

Differential Revision: D25774915

Pulled By: mrshenli

fbshipit-source-id: 1128c3c2df9d76e36beaf171557da86e82043eb9
Summary:
Pull Request resolved: #47507

This introduces a new SizesAndStrides class as a helper for
TensorImpl, in preparation for changing its representation.
ghstack-source-id: 119313559

Test Plan:
Added new automated tests as well.

Run framework overhead benchmarks. Results seem to be neutral-ish.

Reviewed By: ezyang

Differential Revision: D24762557

fbshipit-source-id: 6cc0ede52d0a126549fb51eecef92af41c3e1a98
Summary:
Pull Request resolved: #47508

This moves SizesAndStrides to a specialized representation
that is 5 words smaller in the common case of tensor rank 5 or less.
ghstack-source-id: 119313560

Test Plan:
SizesAndStridesTest added in previous diff passes under
ASAN + UBSAN.

Run framework overhead benchmarks. Looks more or less neutral.

Reviewed By: ezyang

Differential Revision: D24772023

fbshipit-source-id: 0a75fd6c2daabb0769e2f803e80e2d6831871316
Summary:
Excludes sm_86 GPU devices from using cuDNN persistent RNN.

This is because there are some hard-to-detect edge cases that throw exceptions with cuDNN 8.0.5 on the Nvidia A40 GPU.
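
A hedged sketch of the kind of gating this implies, using the public device-capability API (the actual check lives in the C++ RNN code; the function name is illustrative):

```python
import torch

def can_use_persistent_rnn() -> bool:
    # sm_86 (e.g. the A40) is excluded because of hard-to-detect
    # cuDNN 8.0.5 edge cases that raise exceptions.
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return not (major == 8 and minor == 6)
```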

Pull Request resolved: #49534

Reviewed By: mruberry

Differential Revision: D25632378

Pulled By: mrshenli

fbshipit-source-id: cbe78236d85d4d0c2e4ca63a3fc2c4e2de662d9e
Summary:
Pull Request resolved: #50131

Noticed that in the internal diff for
#49069 there was a clang-tidy warning to
use emplace instead of push_back. This can save us a copy, since emplace
constructs the element in place and avoids the unnecessary temporary.
ghstack-source-id: 119560979

Test Plan: CI

Reviewed By: pritamdamania87

Differential Revision: D25800134

fbshipit-source-id: 243e57318f5d6e43de524d4e5409893febe6164c
Test Plan: revert-hammer

Differential Revision: D25687465 (4de6b27)

Original commit changeset: 89aabce165a5

fbshipit-source-id: fa5def17209d1691e68b1245fa0873fd03e88eaa
Summary:
This solves a race condition where the worker thread might
see a partially initialized graph_task.

Fixes #49652

I don't know how to reliably trigger the race, so I didn't add any test. But the ROCm build flakiness (it just happens to race more often on ROCm builds) should disappear after this PR.
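
A generic Python sketch of the publish-after-initialization pattern such a fix follows (illustrative only; the actual fix is in the C++ autograd engine):

```python
import threading

_lock = threading.Lock()
_tasks = {}

def publish_task(key, build_task):
    task = build_task()      # fully initialize first...
    with _lock:
        _tasks[key] = task   # ...then publish, so a worker thread can
                             # never observe a partially built task
```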

Pull Request resolved: #50164

Reviewed By: zou3519

Differential Revision: D25824954

Pulled By: albanD

fbshipit-source-id: 6a3391753cb2afd2ab415d3fb2071a837cc565bb
Summary:
Reference: #42515

Pull Request resolved: #50093

Reviewed By: H-Huang

Differential Revision: D25803549

Pulled By: mruberry

fbshipit-source-id: e6f245b5e728f2dca6072f8c359f03dff63aa14d
Summary:
Remove outdated comment and update to use new paths.

Pull Request resolved: #50166

Reviewed By: zou3519

Differential Revision: D25824942

Pulled By: albanD

fbshipit-source-id: 7dc694891409e80e1804eddcdcc50cc21b60f822
malfet and others added 29 commits January 15, 2021 23:39
Summary:
Fixes `docstring of torch.distributed.rpc.RRef.remote:14: WARNING: Field list ends without a blank line; unexpected unindent.` by indenting the multiline field list.
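
A hedged sketch of the docstring pattern being fixed (hypothetical field list, not the actual `RRef.remote` docstring):

```python
def remote(self):
    """
    :returns: a handle to the remote value. The continuation of a long
        field-list entry must be indented like this line, otherwise
        Sphinx reports "Field list ends without a blank line".
    """
```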

Pull Request resolved: #50651

Reviewed By: SplitInfinity

Differential Revision: D25935839

Pulled By: malfet

fbshipit-source-id: e2613ae75334d01ab57f4b071cb0fddf80c6bd78
Summary:
Adds the rest of the ops.

Pull Request resolved: #50643

Reviewed By: pbelevich

Differential Revision: D25936346

Pulled By: Chillee

fbshipit-source-id: 4e2a7afbeabde51991c39d187a8c35e766950ffe
Summary:
Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: #50629

Reviewed By: albanD

Differential Revision: D25935005

Pulled By: rohan-varma

fbshipit-source-id: e0969afecac2f319833189a7a8897d78068a2cda
Summary:
Fixes #42588
The contiguity check used to be for the memory format suggested by `grad_output->suggest_memory_format()`, but the invariant guaranteed by derivatives.yaml is `input->suggest_memory_format()`.

Pull Request resolved: #50659

Reviewed By: mruberry

Differential Revision: D25938921

Pulled By: ngimel

fbshipit-source-id: a945bfef6ce3d91b17e7ff96babe89ffd508939a
…st_recurrent (#50668)

Summary:
Pull Request resolved: #50668

GPU initialization is sometimes slow

Test Plan: buck test mode/opt //caffe2/caffe2/python:hypothesis_test -- --exact 'caffe2/caffe2/python:hypothesis_test - test_recurrent (caffe2.caffe2.python.hypothesis_test.TestOperators)' --run-disabled

Reviewed By: hl475

Differential Revision: D25939037

fbshipit-source-id: 832700cf42ece848cda66dd629a06ecda207f086
…ispatch for CPU min/max pointwise ops (#50465)

Summary:
Fixes #50064

**PROBLEM DESCRIPTION:**
1. Had not removed dtype checks for complex types in the previous PR (#50347) for this issue.
These type-checks were added in #36377, but are no longer necessary,
as we now rely upon dispatch macros to produce error messages.
2. dtype checks in `clamp_max()` and `clamp_min()` for complex inputs had not been removed either.
3. For min/max pointwise ops in TensorCompareKernel.cpp, complex dispatch had not been removed for min/max functions.

### **FIX DESCRIPTION:**
**FIX SUMMARY:**
1. Removed the dtype checks added in #36377, and added 3 more in TensorCompare.cpp.
2. Removed the dtype checks for complex inputs in `clamp_max()` and `clamp_min()`.
3. Disabled complex dispatch for min/max pointwise ops in TensorCompareKernel.cpp.
4. The error messages of the exceptions raised when min/max ops are not implemented are now checked to contain the text _not support_ (which is also present in _not supported_) or _not implemented_, so that the messages stay informative.

**REASON FOR NOT CHANGING DISPATCH FOR CUDA AND CLAMP OPS**:

As for the CUDA min/max operations, their kernels do not seem to be compiled & dispatched for complex types anyway, so no further changes seem to be required. Basically, the dispatch macros currently being used don't have cases for complex types.

For example,

1. the reduce CUDA ops use [`AT_DISPATCH_ALL_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L548-L575) in [ReduceMinMaxKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceMinMaxKernel.cu), and that macro doesn't allow complex types.

2. In [MinMaxElementwiseKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu), the CUDA pointwise ops use [`AT_DISPATCH_FLOATING_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L240-L263) for non-integral & non-boolean types, and this macro doesn't have a case for complex types either.

3. [clamp CUDA ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/UnaryOpsKernel.cu#L170-L211) use `AT_DISPATCH_ALL_TYPES_AND2`, which doesn't have a case for complex types.

Similarly, [CPU clamp min/max ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp#L428-L458) use the `AT_DISPATCH_ALL_TYPES_AND` dispatch macro, which doesn't have a case for complex types.

**REASON FOR ADDING 3 dtype CHECKS:**
There are a few cases in which the methods corresponding to `min_stub()` or `max_stub()` are not called, so dispatch macros don't get invoked, resulting in no exceptions being raised. Hence, `dtype` checks are necessary at 3 places to raise exceptions:

1. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L342
2. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L422
3. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L389

The first dtype check requirement can be verified from the following example Python code based on `test_complex_unsupported()`:
```
import unittest
import torch

class MyTestCase(unittest.TestCase):

    def test_1(self):
        t = torch.tensor((1 + 1j), device='cpu', dtype=torch.complex128)
        with self.assertRaises(Exception):
            torch.max(t, dim=0)

if __name__ == '__main__':
    unittest.main()
```

Pull Request resolved: #50465

Reviewed By: mruberry

Differential Revision: D25938106

Pulled By: ngimel

fbshipit-source-id: 95e2df02ba8583fa3ce87d4a2fdcd60b912dda46
Summary:
Introduced an operator variant to OpInfo

Context: split out of #49158

cc mruberry

Pull Request resolved: #50370

Reviewed By: mrshenli

Differential Revision: D25897821

Pulled By: mruberry

fbshipit-source-id: 4387ea10607dbd7209842b685f1794bcb31f434e
Summary:
Reopen PR for #46975

Pull Request resolved: #50007

Reviewed By: mruberry

Differential Revision: D25850808

Pulled By: ngimel

fbshipit-source-id: a232e02949182b7d3799448d24ad54a9e0bcf95c
…50632)

Summary:
Pull Request resolved: #50632

I'll port the following method tests in follow-up PRs:
`'baddbmm', 'addbmm', 'addmv', 'addr'`
After the tests are ported to OpInfo based tests, it would also be much easier to add tests with complex alpha and beta values.
Edit: it seems hard to port the broadcasting-variant tests, because one ends up skipping `test_inplace_grad` and `test_variant_consistency_eager` even in the case where the inputs do not need to be broadcast.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D25947471

Pulled By: anjali411

fbshipit-source-id: 9faa7f1fd55a1269bad282adac2b39d19bfa4591
Summary:
- Related to #44937
- Use `resize_output` instead of `resize_as`
- Tidy `native_functions.yaml`: move the in-place variant `pow_` next to the other `pow` entries
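
For reference, a hedged sketch of the user-visible `out=` resizing behavior that the `resize_output` path backs:

```python
import torch

out = torch.empty(0)
# `out` is resized to the result's shape rather than relying on resize_as.
torch.pow(torch.arange(4.0), 2.0, out=out)
print(out)  # tensor([0., 1., 4., 9.])
```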

Pull Request resolved: #46830

Reviewed By: mrshenli

Differential Revision: D24567702

Pulled By: anjali411

fbshipit-source-id: a352422c9d4e356574dbfdf21fb57f7ca7c6075d
…_allcompare (#50696)

Summary:
Pull Request resolved: #50696

Set no deadline for test_allcompare.

Test Plan: buck test mode/dev //caffe2/caffe2/python:lazy_dyndep_test -- --exact 'caffe2/caffe2/python:lazy_dyndep_test - test_allcompare (caffe2.caffe2.python.lazy_dyndep_test.TestLazyDynDepAllCompare)' --run-disabled

Reviewed By: hl475

Differential Revision: D25947800

fbshipit-source-id: d2043f97128e257ef06ebca9b68262bb1c0c5e6b
Summary:
Pull Request resolved: #50564

When an RPC was sent, the associated future was stored in two maps:
pendingResponseMessage_ and timeoutMap_. Once the response was received, the
entry was only removed from pendingResponseMessage_ and not timeoutMap_. The
pollTimedOutRPCs method then eventually removed the entry from timeoutMap_
after the timeout duration had passed.

However, in scenarios with a large timeout and a large number of RPCs in
flight, it is very easy for the timeoutMap_ to grow without bound. This was
discovered in #50522.

To fix this issue, I've added some code to clean up timeoutMap_ as well once
we receive a response.
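
A hedged Python sketch of the two-map bookkeeping fix (illustrative only; the real code is C++ in the RPC agent):

```python
pending = {}    # message id -> future
timeouts = {}   # expiry time -> set of message ids

def on_response(msg_id, expiry):
    # Previously only `pending` was cleaned here; `timeouts` kept
    # growing until the (possibly very long) timeout expired.
    pending.pop(msg_id, None)
    ids = timeouts.get(expiry)
    if ids is not None:
        ids.discard(msg_id)
        if not ids:
            del timeouts[expiry]
```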
ghstack-source-id: 119925182

Test Plan:
1) Unit test added.
2) Tested with repro in #50522

Closes #50522

Reviewed By: mrshenli

Differential Revision: D25919650

fbshipit-source-id: a0a42647e706d598fce2ca2c92963e540b9d9dbb
Summary: Pull Request resolved: #50674

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25941964

Pulled By: mrshenli

fbshipit-source-id: b53454efdce01f7c06f67dfb890d3c3bdc2c648f
Summary: Pull Request resolved: #50675

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25941963

Pulled By: mrshenli

fbshipit-source-id: 205786d7366f36d659a3a3374081a458cfcb4dd1
Summary:
Fixes #24991

I used a value of 0.75 as suggested in the forums by Thomas: https://discuss.pytorch.org/t/calculate-gain-tanh/20854/6

I verified that the value keeps the gradient stable for a 100-layer network.

Code to reproduce (from [jpeg729](https://discuss.pytorch.org/t/calculate-gain-tanh/20854/4)):
```python
import torch
import torch.nn.functional as F

a = torch.randn(1000, 1000, requires_grad=True)
b = a
print(f"in: {a.std().item():.4f}")
for i in range(100):
    l = torch.nn.Linear(1000, 1000, bias=False)
    torch.nn.init.xavier_normal_(l.weight, torch.nn.init.calculate_gain("selu"))
    b = F.selu(l(b))
    if i % 10 == 0:
        print(f"out: {b.std().item():.4f}", end=" ")
        a.grad = None
        b.sum().backward(retain_graph=True)
        print(f"grad: {a.grad.abs().mean().item():.4f}")
```
Output:
```
in: 1.0008
out: 0.7968 grad: 0.6509
out: 0.3127 grad: 0.2760
out: 0.2404 grad: 0.2337
out: 0.2062 grad: 0.2039
out: 0.2056 grad: 0.1795
out: 0.2044 grad: 0.1977
out: 0.2005 grad: 0.2045
out: 0.2042 grad: 0.2273
out: 0.1944 grad: 0.2034
out: 0.2085 grad: 0.2464
```
I included the necessary documentation change, and it passes the _test_calculate_gain_nonlinear_ unit test.

Pull Request resolved: #50664

Reviewed By: mruberry

Differential Revision: D25942217

Pulled By: ngimel

fbshipit-source-id: 29ff1be25713484fa7c516df71b12fdaecfb9af8
Summary:
Signed-off-by: Kyle Chen <kylechen@amd.com>

cc: jeffdaily

Pull Request resolved: #50557

Reviewed By: mruberry

Differential Revision: D25941432

Pulled By: ngimel

fbshipit-source-id: 534fc8a91a48fa8b3b397e63423cd8347b41bbe2
…#50711)

Summary:
Pull Request resolved: #50711

As title, missed a few of these.

Test Plan: Imported from OSS

Reviewed By: yf225

Differential Revision: D25949363

Pulled By: suo

fbshipit-source-id: 197743fe7097d2ac894421a99c072696c3b8cd70
Summary:
Pull Request resolved: #50611

Removed the unused old-style code to prevent it from being used.
Added all autograd/gen_pyi sources to mypy-strict.ini config.

Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Confirmed clean mypy-strict run:
```
mypy --config mypy-strict.ini
```

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25929730

Pulled By: ljk53

fbshipit-source-id: 1fc94436fd4a6b9b368ee0736e99bfb3c01d38ef
Summary:
As per title. Partially fixes #49421.
These functions appear to be dead code.

Pull Request resolved: #50489

Reviewed By: mruberry

Differential Revision: D25948912

Pulled By: ngimel

fbshipit-source-id: 108723bd4c76cbc3535eba902d6f74597bfdfa58
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/tensorpipe](https://github.com/pytorch/tensorpipe).

New submodule commit: pytorch/tensorpipe@eabfe52

Pull Request resolved: #50684

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: lw

Differential Revision: D25944553

fbshipit-source-id: e2bbcc48472cd79df89d87a0e61dcffa783c659d
…50199)

Summary:
Reference: #50013

Pull Request resolved: #50199

Reviewed By: ngimel

Differential Revision: D25949791

Pulled By: mruberry

fbshipit-source-id: 10eaf2d749fac8c08847f50461e72ad1c75c61e3
…50592)

Summary:
Pull Request resolved: #50592

This adds a `check_batched_grad=False` option to gradcheck and gradgradcheck.
It defaults to False because gradcheck is a public API and I don't want
to break any existing non-PyTorch users of gradcheck.
This:
- runs grad twice with two grad outputs, a & b
- runs a vmapped grad with torch.stack([a, b])
- compares the results of the above against each other.

Furthermore:
- `check_batched_grad=True` is set to be the default for
gradcheck/gradgradcheck inside of test_autograd.py. This is done by
reassigning to the gradcheck object inside test_autograd
- I manually added `check_batched_grad=False` to gradcheck instances
that don't support batched grad.
- I added a denylist for operations that don't support batched grad.
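
A hedged usage sketch of the new flag on a simple op:

```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# Besides the usual numerical-vs-analytical check, this also verifies
# that a vmapped backward over stacked grad outputs matches running
# backward once per grad output.
assert gradcheck(torch.sin, (x,), check_batched_grad=True)
```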

Question:
- Should we have a testing only gradcheck (e.g.,
torch.testing.gradcheck) that has different defaults from our public
API, torch.autograd.gradcheck?

Future:
- The future plan for this is to repeat the above for test_nn.py (the
autogenerated test will require a denylist)
- Finally, we can repeat the above for all pytorch test files that use
gradcheck.

Test Plan: - run tests

Reviewed By: albanD

Differential Revision: D25925942

Pulled By: zou3519

fbshipit-source-id: 4803c389953469d0bacb285774c895009059522f
Summary:
This PR adds `torch.linalg.slogdet`.

Changes compared to the original torch.slogdet:

- Complex input now works as in NumPy
- Added out= variant (allocates temporary and makes a copy for now)
- Updated `slogdet_backward` to work with complex input

Ref. #42666
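
A hedged usage example of the new function:

```python
import torch

A = torch.randn(3, 3, dtype=torch.complex128)
sign, logabsdet = torch.linalg.slogdet(A)
# For complex input `sign` is a complex number of modulus 1 (as in
# NumPy); det(A) == sign * exp(logabsdet).
assert torch.allclose(sign * logabsdet.exp(), torch.linalg.det(A))
```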

Pull Request resolved: #49194

Reviewed By: VitalyFedyunin

Differential Revision: D25916959

Pulled By: mruberry

fbshipit-source-id: cf9be8c5c044870200dcce38be48cd0d10e61a48
Summary: Pull Request resolved: #50732

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25954041

Pulled By: mrshenli

fbshipit-source-id: b2eeb1a77753cb8696613bfdc7bbc5001ae4c972
Summary: Pull Request resolved: #50387

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D25947496

Pulled By: anjali411

fbshipit-source-id: c70886a73378501421ff94cdc0dc737f1738bf6f
…und (#33884)

Summary:
Pull Request resolved: #33884

Mitigates #5261.

It's not possible for us to support cudnn RNN double backwards due to
limitations in the cudnn API. This PR makes it so that we raise an error
message if users try to get the double backward on a cudnn RNN; in the
error message we suggest using the non-cudnn RNN.

Test Plan: - added some tests to check the error message

Reviewed By: albanD

Differential Revision: D20143544

Pulled By: zou3519

fbshipit-source-id: c2e49b3d8bdb9b34b561f006150e4c7551a78fac
Summary:
Pull Request resolved: #48719

Attempt to break this PR (#33019) into two parts. As per our discussion with eellison, the first part is to make sure our aten::slice operator takes optional parameters for begin/step/end. This will help with refactoring ir_emitter.cpp for generic handling of list and slice striding. Once this PR is merged, we will submit a second PR with the compiler change.
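
A hedged TorchScript sketch of a slice whose bounds are omitted and therefore reach aten::slice as optional arguments (the function name is illustrative):

```python
import torch

@torch.jit.script
def drop_first_row(x: torch.Tensor) -> torch.Tensor:
    # `end` is omitted here, so it is lowered to aten::slice as an
    # optional None rather than a sentinel integer.
    return x[1:]

print(drop_first_row(torch.arange(12.0).reshape(3, 4)))
```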

Test Plan:
None for this PR, but new tests will be added for the second part.

Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D25929902

fbshipit-source-id: 5385df04e6d61ded0699b09bbfec6691396b56c3
Summary:
This PR helps with #50513 by reducing the complexity of our `mypy` test suite and making it easier to reproduce on the command line. Previously, to reproduce how `mypy` was actually run on tracked source files (ignoring the doctest typechecking) in CI, you technically needed to run 9 different commands with various arguments:
```
$ mypy --cache-dir=.mypy_cache/normal --check-untyped-defs --follow-imports silent
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/module_list.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/namedtuple.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/opt_size.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/size.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/tensor_copy.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/torch_cuda_random.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/torch_optim.py
$ mypy --cache-dir=.mypy_cache/strict --config mypy-strict.ini
```
Now you only have to run 2 much simpler commands:
```
$ mypy
$ mypy --config mypy-strict.ini
```
One reason this is useful is because it will make it easier to integrate PyTorch's `mypy` setup into editors (remaining work on this to be done in a followup PR).

Also, as shown in the test plan, this also reduces the time it takes to run `test/test_type_hints.py` incrementally, by reducing the number of times `mypy` is invoked while still checking the same set of files with the same configs.

(Because this PR merges `test_type_hint_examples` (added in #34595) into `test_run_mypy` (added in #36584), I've added some people involved in those PRs as reviewers, in case there's a specific reason they weren't combined in the first place.)

Pull Request resolved: #50631

Test Plan:
Run this twice (the first time is to warm the cache):
```
$ python test/test_type_hints.py -v
```

- *Before:*
  ```
  test_doc_examples (__main__.TestTypeHints)
  Run documentation examples through mypy. ... ok
  test_run_mypy (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy.ini ... ok
  test_run_mypy_strict (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy-strict.ini ... ok
  test_type_hint_examples (__main__.TestTypeHints)
  Runs mypy over all the test examples present in ... ok

  ----------------------------------------------------------------------
  Ran 4 tests in 5.090s

  OK
  ```
  You can also just run `mypy` to see how many files it checks:
  ```
  $ mypy --cache-dir=.mypy_cache/normal --check-untyped-defs --follow-imports silent
  Success: no issues found in 1192 source files
  ```
- *After:*
  ```
  test_doc_examples (__main__.TestTypeHints)
  Run documentation examples through mypy. ... ok
  test_run_mypy (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy.ini ... ok
  test_run_mypy_strict (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy-strict.ini ... ok

  ----------------------------------------------------------------------
  Ran 3 tests in 2.404s

  OK
  ```
  Now `mypy` checks 7 more files, which is the number in `test/type_hint_tests`:
  ```
  $ mypy
  Success: no issues found in 1199 source files
  ```

Reviewed By: zou3519

Differential Revision: D25932660

Pulled By: samestep

fbshipit-source-id: 26c6f00f338e7b44954e5ed89522ce24e2fdc5f0
Summary:
Pull Request resolved: #50615

The method tests for some of the ops have been ported to the new OpInfo based tests. This PR removes those op names from `complex_list` in `test_autograd.py`

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D25931268

Pulled By: anjali411

fbshipit-source-id: 4d08626431c61c34cdca18044933e4f5b9b25232
@skyw skyw merged commit 88ff3ac into skyw:master Jan 19, 2021