Pull in master #4

Merged
merged 2,739 commits into skyw:master on Jan 19, 2021
Conversation

skyw (Owner) commented Jan 19, 2021

Fixes #{issue number}

r-barnes and others added 30 commits January 7, 2021 15:39
Summary: Pull Request resolved: #50079

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25718694

fbshipit-source-id: f535fb879bcd4cb4ea715adfd90bbffa3fcc1150
Summary:
Pull Request resolved: #49944

Upgrades type annotations from Python 2 style to Python 3 style
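
For illustration, the kind of mechanical change involved (hypothetical function, not taken from the diff):

```python
import torch

# Python 2 style: types live in a trailing `# type:` comment
def scale(x, factor):
    # type: (torch.Tensor, float) -> torch.Tensor
    return x * factor

# Python 3 style: the same signature with inline annotations
def scale(x: torch.Tensor, factor: float) -> torch.Tensor:
    return x * factor
```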

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25717539

fbshipit-source-id: c621e2712e87eaed08cda48eb0fb224f6b0570c9
…SGD hook signatures (#50197)

Summary:
Pull Request resolved: #50197

Remove the extra comma after "bucket".
ghstack-source-id: 119513484

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D25823117

fbshipit-source-id: acf048f7cb732c23cba3a81ccce1e70f6b9f4299
Summary:
closes gh-49704

Pull Request resolved: #49705

Reviewed By: mruberry

Differential Revision: D25725352

Pulled By: malfet

fbshipit-source-id: 05a7041c9caffde4a5c1eb8af0d13697075103af
Summary:
In `batch_norm_gather_stats_with_counts_cuda`, use `input.scalar_type()` if `running_mean` is not defined.
In the `SyncBatchNorm` forward function, create the count tensor with `torch.float32` type if `running_mean` is None.
Also fix a few typos.
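
A hedged Python sketch of the dtype fallback described above (variable names are illustrative, not the actual forward code):

```python
import torch

running_mean = None  # e.g. when running stats are not tracked
# With running_mean undefined there is no stats dtype to inherit,
# so the count tensor falls back to torch.float32.
count_dtype = running_mean.dtype if running_mean is not None else torch.float32
count = torch.full((2,), 9.0, dtype=count_dtype)
```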

Pull Request resolved: #50126

Test Plan:
```
python -c "import torch; print(torch.batch_norm_gather_stats_with_counts(torch.randn(1, 3, 3, 3, device='cuda'), mean=torch.ones(2, 3, device='cuda'), invstd=torch.ones(2, 3, device='cuda'), running_mean=None, running_var=None, momentum=.1, eps=1e-5, counts=torch.ones(2, device='cuda')))"
```

Fixes #49730

Reviewed By: ngimel

Differential Revision: D25797930

Pulled By: malfet

fbshipit-source-id: 22a91e3969b5e9bbb7969d9cc70b45013a42fe83
Summary:
Pull Request resolved: #49766

Devirtualizing this seems like a decent performance improvement on
internal benchmarks.

The *reason* this is a performance improvement is twofold:
1) virtual calls are a bit slower than regular calls
2) virtual functions in `TensorImpl` can't be inlined

Test Plan: internal benchmark

Reviewed By: hlu1

Differential Revision: D25602321

fbshipit-source-id: d61556456ccfd7f10c6ebdc3a52263b438a2aef1
…49767)

Summary:
Pull Request resolved: #49767

I'm told that the base implementation should work fine. Let's validate that in an intermediate diff before removing it.
ghstack-source-id: 119528066

Test Plan: CI

Reviewed By: ezyang, bhosmer

Differential Revision: D25686830

fbshipit-source-id: f931394d3de6df7f6c5c68fe8ab711d90d3b12fd
Summary:
Pull Request resolved: #49770

The performance cost of making this commonly called method virtual doesn't seem worth the benefit of having uses of undefined tensors crash slightly earlier (they'll still fail to dispatch).
ghstack-source-id: 119528065

Test Plan: framework overhead benchmarks

Reviewed By: ezyang

Differential Revision: D25687465

fbshipit-source-id: 89aabce165a594be401979c04236114a6f527b59
Summary:
Pull Request resolved: #49906

This commit modifies RPC Message to inherit from `torch::CustomClassHolder`,
and wraps a Message in an IValue in `RpcAgent::send()`.

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25719518

Pulled By: mrshenli

fbshipit-source-id: 694e40021e49e396da1620a2f81226522341550b
…ls.* (#49927)

Summary: Pull Request resolved: #49927

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25724241

Pulled By: mrshenli

fbshipit-source-id: d608e448f5224e41fbb0b5be6b9ac51a587f25b4
Summary: Pull Request resolved: #49960

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25730530

Pulled By: mrshenli

fbshipit-source-id: 5d54572c653592d79c40aed616266c87307a1ad8
…9995)

Summary: Pull Request resolved: #49995

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25745301

Pulled By: mrshenli

fbshipit-source-id: b5e3a7e0b377496924847d8d70d61de32e2d87f4
Summary: Pull Request resolved: #50004

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25750602

Pulled By: mrshenli

fbshipit-source-id: 06854a77f4fb5cc4c34a1ede843301157ebf7309
…50005)

Summary: Pull Request resolved: #50005

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25750663

Pulled By: mrshenli

fbshipit-source-id: 6d97156b61d82aa19dd0567ca72fe04bd7b5d1e7
Summary: Pull Request resolved: #50020

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25752968

Pulled By: mrshenli

fbshipit-source-id: 138d37e204b6f9a584633cfc79fd44c8c9c00f41
Summary: Pull Request resolved: #50023

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753217

Pulled By: mrshenli

fbshipit-source-id: 5a98473c17535c8f92043abe143064e7fca4413b
Summary: Pull Request resolved: #50024

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753386

Pulled By: mrshenli

fbshipit-source-id: fdca051b805762a2c88f965ceb3edf1c25d40a56
Summary: Pull Request resolved: #50025

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753587

Pulled By: mrshenli

fbshipit-source-id: a5d4106a10d1b0d3e4c406751795f19af8afd120
Summary: Pull Request resolved: #50026

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753588

Pulled By: mrshenli

fbshipit-source-id: a6fcda7830901dd812fbf0489b001e6bd9673780
Summary: Pull Request resolved: #50027

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753815

Pulled By: mrshenli

fbshipit-source-id: 85b9b03fec52b4175288ac3a401285607744b451
Summary: Pull Request resolved: #50028

Test Plan: Imported from OSS

Reviewed By: lw

Differential Revision: D25753887

Pulled By: mrshenli

fbshipit-source-id: 40718349c2def262a16aaa24c167c0b540cddcb1
Summary: Pull Request resolved: #50029

Test Plan:
buck run mode/opt -c=python.package_style=inplace //caffe2/torch/fb/training_toolkit/examples:ctr_mbl_feed_april_2020 -- local-preset --flow-entitlement pytorch_ftw_gpu --secure-group oncall_pytorch_distributed

Before:

```
...

I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 14000.0
I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 74.60101318359375
I0107 11:03:10.434000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 74.60101318359375

...

I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 20000.0
I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 64.0
I0107 11:05:12.132000 3831111 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 64.64917755126953

...
```

After:

```
...

I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 14000.0
I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 72.56404876708984
I0107 11:53:03.858000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 72.56404876708984

...

I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|total_examples 20000.0
I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|window_qps 73.07617950439453
I0107 11:54:24.612000 53693 print_publisher.py:23  master      ] Publishing batch metrics: qps-qps|lifetime_qps 73.07617950439453

...
```

Reviewed By: lw

Differential Revision: D25774915

Pulled By: mrshenli

fbshipit-source-id: 1128c3c2df9d76e36beaf171557da86e82043eb9
Summary:
Pull Request resolved: #47507

This introduces a new SizesAndStrides class as a helper for
TensorImpl, in preparation for changing its representation.
ghstack-source-id: 119313559

Test Plan:
Added new automated tests as well.

Run framework overhead benchmarks. Results seem to be neutral-ish.

Reviewed By: ezyang

Differential Revision: D24762557

fbshipit-source-id: 6cc0ede52d0a126549fb51eecef92af41c3e1a98
Summary:
Pull Request resolved: #47508

This moves SizesAndStrides to a specialized representation
that is 5 words smaller in the common case of tensor rank 5 or less.
ghstack-source-id: 119313560

Test Plan:
SizesAndStridesTest added in previous diff passes under
ASAN + UBSAN.

Run framework overhead benchmarks. Looks more or less neutral.

Reviewed By: ezyang

Differential Revision: D24772023

fbshipit-source-id: 0a75fd6c2daabb0769e2f803e80e2d6831871316
Summary:
Excludes sm_86 GPU devices from using cuDNN persistent RNN.

This is because there are some hard-to-detect edge cases that throw exceptions with cuDNN 8.0.5 on the Nvidia A40 GPU.
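
A hedged sketch of the kind of gating this implies, using the public device-capability API (the actual check lives in the C++ RNN code; the function name is illustrative):

```python
import torch

def can_use_persistent_rnn() -> bool:
    # sm_86 (e.g. the A40) is excluded because of hard-to-detect
    # cuDNN 8.0.5 edge cases that raise exceptions.
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return not (major == 8 and minor == 6)
```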

Pull Request resolved: #49534

Reviewed By: mruberry

Differential Revision: D25632378

Pulled By: mrshenli

fbshipit-source-id: cbe78236d85d4d0c2e4ca63a3fc2c4e2de662d9e
Summary:
Pull Request resolved: #50131

Noticed that in the internal diff for
#49069 there was a clang-tidy warning to
use emplace instead of push_back. This can save us a copy, since emplace
constructs the element in place and avoids the unnecessary temporary.
ghstack-source-id: 119560979

Test Plan: CI

Reviewed By: pritamdamania87

Differential Revision: D25800134

fbshipit-source-id: 243e57318f5d6e43de524d4e5409893febe6164c
Test Plan: revert-hammer

Differential Revision: D25687465 (4de6b27)

Original commit changeset: 89aabce165a5

fbshipit-source-id: fa5def17209d1691e68b1245fa0873fd03e88eaa
Summary:
This solves a race condition where the worker thread might
see a partially initialized graph_task.

Fixes #49652

I don't know how to reliably trigger the race, so I didn't add any test. But the ROCm build flakiness (it just happens to race more often on ROCm builds) should disappear after this PR.
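
A generic Python sketch of the publish-after-initialization pattern such a fix follows (illustrative only; the actual fix is in the C++ autograd engine):

```python
import threading

_lock = threading.Lock()
_tasks = {}

def publish_task(key, build_task):
    task = build_task()      # fully initialize first...
    with _lock:
        _tasks[key] = task   # ...then publish, so a worker thread can
                             # never observe a partially built task
```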

Pull Request resolved: #50164

Reviewed By: zou3519

Differential Revision: D25824954

Pulled By: albanD

fbshipit-source-id: 6a3391753cb2afd2ab415d3fb2071a837cc565bb
Summary:
Reference: #42515

Pull Request resolved: #50093

Reviewed By: H-Huang

Differential Revision: D25803549

Pulled By: mruberry

fbshipit-source-id: e6f245b5e728f2dca6072f8c359f03dff63aa14d
Summary:
Remove outdated comment and update to use new paths.

Pull Request resolved: #50166

Reviewed By: zou3519

Differential Revision: D25824942

Pulled By: albanD

fbshipit-source-id: 7dc694891409e80e1804eddcdcc50cc21b60f822
malfet and others added 29 commits January 15, 2021 23:39
Summary:
Fixes `docstring of torch.distributed.rpc.RRef.remote:14: WARNING: Field list ends without a blank line; unexpected unindent.` by indenting the multiline field list.
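
A hedged sketch of the docstring pattern being fixed (hypothetical field list, not the actual `RRef.remote` docstring):

```python
def remote(self):
    """
    :returns: a handle to the remote value. The continuation of a long
        field-list entry must be indented like this line, otherwise
        Sphinx reports "Field list ends without a blank line".
    """
```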

Pull Request resolved: #50651

Reviewed By: SplitInfinity

Differential Revision: D25935839

Pulled By: malfet

fbshipit-source-id: e2613ae75334d01ab57f4b071cb0fddf80c6bd78
Summary:
Adds the rest of the ops.

Pull Request resolved: #50643

Reviewed By: pbelevich

Differential Revision: D25936346

Pulled By: Chillee

fbshipit-source-id: 4e2a7afbeabde51991c39d187a8c35e766950ffe
Summary:
Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: #50629

Reviewed By: albanD

Differential Revision: D25935005

Pulled By: rohan-varma

fbshipit-source-id: e0969afecac2f319833189a7a8897d78068a2cda
Summary:
Fixes #42588
The contiguity check used to be for the memory format suggested by `grad_output->suggest_memory_format()`, but the invariant guaranteed by derivatives.yaml is `input->suggest_memory_format()`.

Pull Request resolved: #50659

Reviewed By: mruberry

Differential Revision: D25938921

Pulled By: ngimel

fbshipit-source-id: a945bfef6ce3d91b17e7ff96babe89ffd508939a
…st_recurrent (#50668)

Summary:
Pull Request resolved: #50668

GPU initialization is sometimes slow

Test Plan: buck test mode/opt //caffe2/caffe2/python:hypothesis_test -- --exact 'caffe2/caffe2/python:hypothesis_test - test_recurrent (caffe2.caffe2.python.hypothesis_test.TestOperators)' --run-disabled

Reviewed By: hl475

Differential Revision: D25939037

fbshipit-source-id: 832700cf42ece848cda66dd629a06ecda207f086
…ispatch for CPU min/max pointwise ops (#50465)

Summary:
Fixes #50064

**PROBLEM DESCRIPTION:**
1. Had not removed dtype checks for complex types in the previous PR (#50347) for this issue.
These type-checks were added in #36377, but are no longer necessary,
as we now rely upon dispatch macros to produce error messages.
2. dtype checks in `clamp_max()` and `clamp_min()` for complex inputs had not been removed either.
3. For min/max pointwise ops in TensorCompareKernel.cpp, complex dispatch had not been removed for min/max functions.

### **FIX DESCRIPTION:**
**FIX SUMMARY:**
1. Removed the dtype checks added in #36377, and added 3 more in TensorCompare.cpp.
2. Removed the dtype checks for complex inputs in `clamp_max()` and `clamp_min()`.
3. Disabled complex dispatch for min/max pointwise ops in TensorCompareKernel.cpp.
4. The error messages of the exceptions raised when min/max ops are not implemented are now checked to contain the text _not support_ (which is also present in _not supported_) or _not implemented_, so that the messages stay informative.

**REASON FOR NOT CHANGING DISPATCH FOR CUDA AND CLAMP OPS**:

As for the CUDA min/max operations, their kernels do not seem to be compiled & dispatched for complex types anyway, so no further changes seem to be required. Basically, the dispatch macros currently being used don't have cases for complex types.

For example,

1. the reduce CUDA ops use [`AT_DISPATCH_ALL_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L548-L575) in [ReduceMinMaxKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceMinMaxKernel.cu), and that macro doesn't allow complex types.

2. In [MinMaxElementwiseKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu), the CUDA pointwise ops use [`AT_DISPATCH_FLOATING_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L240-L263) for non-integral & non-boolean types, and this macro doesn't have a case for complex types either.

3. [clamp CUDA ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/UnaryOpsKernel.cu#L170-L211) use `AT_DISPATCH_ALL_TYPES_AND2`, which doesn't have a case for complex types.

Similarly, [CPU clamp min/max ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp#L428-L458) use the `AT_DISPATCH_ALL_TYPES_AND` dispatch macro, which doesn't have a case for complex types.

**REASON FOR ADDING 3 dtype CHECKS:**
There are a few cases in which the methods corresponding to `min_stub()` or `max_stub()` are not called, so dispatch macros don't get invoked, resulting in no exceptions being raised. Hence, `dtype` checks are necessary at 3 places to raise exceptions:

1. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L342
2. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L422
3. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L389

The first dtype check requirement can be verified from the following example Python code based on `test_complex_unsupported()`:
```
import unittest
import torch

class MyTestCase(unittest.TestCase):

    def test_1(self):
        t = torch.tensor((1 + 1j), device='cpu', dtype=torch.complex128)
        with self.assertRaises(Exception):
            torch.max(t, dim=0)

if __name__ == '__main__':
    unittest.main()
```

Pull Request resolved: #50465

Reviewed By: mruberry

Differential Revision: D25938106

Pulled By: ngimel

fbshipit-source-id: 95e2df02ba8583fa3ce87d4a2fdcd60b912dda46
Summary:
Introduced an operator variant to OpInfo

Context: split out of #49158

cc mruberry

Pull Request resolved: #50370

Reviewed By: mrshenli

Differential Revision: D25897821

Pulled By: mruberry

fbshipit-source-id: 4387ea10607dbd7209842b685f1794bcb31f434e
Summary:
Reopen PR for #46975

Pull Request resolved: #50007

Reviewed By: mruberry

Differential Revision: D25850808

Pulled By: ngimel

fbshipit-source-id: a232e02949182b7d3799448d24ad54a9e0bcf95c
…50632)

Summary:
Pull Request resolved: #50632

I'll port the following method tests in follow-up PRs:
`'baddbmm', 'addbmm', 'addmv', 'addr'`
After the tests are ported to OpInfo based tests, it would also be much easier to add tests with complex alpha and beta values.
Edit: it seems hard to port the broadcasting-variant tests, because one ends up skipping `test_inplace_grad` and `test_variant_consistency_eager` even in the case where the inputs do not need to be broadcast.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D25947471

Pulled By: anjali411

fbshipit-source-id: 9faa7f1fd55a1269bad282adac2b39d19bfa4591
Summary:
- Related to #44937
- Use `resize_output` instead of `resize_as`
- Tidy `native_functions.yaml`: move the in-place variant `pow_` next to the other `pow` entries
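
For reference, a hedged sketch of the user-visible `out=` resizing behavior that the `resize_output` path backs:

```python
import torch

out = torch.empty(0)
# `out` is resized to the result's shape rather than relying on resize_as.
torch.pow(torch.arange(4.0), 2.0, out=out)
print(out)  # tensor([0., 1., 4., 9.])
```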

Pull Request resolved: #46830

Reviewed By: mrshenli

Differential Revision: D24567702

Pulled By: anjali411

fbshipit-source-id: a352422c9d4e356574dbfdf21fb57f7ca7c6075d
…_allcompare (#50696)

Summary:
Pull Request resolved: #50696

Set no deadline for test_allcompare.

Test Plan: buck test mode/dev //caffe2/caffe2/python:lazy_dyndep_test -- --exact 'caffe2/caffe2/python:lazy_dyndep_test - test_allcompare (caffe2.caffe2.python.lazy_dyndep_test.TestLazyDynDepAllCompare)' --run-disabled

Reviewed By: hl475

Differential Revision: D25947800

fbshipit-source-id: d2043f97128e257ef06ebca9b68262bb1c0c5e6b
Summary:
Pull Request resolved: #50564

When an RPC was sent, the associated future was stored in two maps:
pendingResponseMessage_ and timeoutMap_. Once the response was received, the
entry was only removed from pendingResponseMessage_ and not timeoutMap_. The
pollTimedOutRPCs method then eventually removed the entry from timeoutMap_
after the timeout duration had passed.

However, in scenarios with a large timeout and a large number of RPCs in
flight, it is very easy for the timeoutMap_ to grow without bound. This was
discovered in #50522.

To fix this issue, I've added some code to clean up timeoutMap_ as well once
we receive a response.
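
A hedged Python sketch of the two-map bookkeeping fix (illustrative only; the real code is C++ in the RPC agent):

```python
pending = {}    # message id -> future
timeouts = {}   # expiry time -> set of message ids

def on_response(msg_id, expiry):
    # Previously only `pending` was cleaned here; `timeouts` kept
    # growing until the (possibly very long) timeout expired.
    pending.pop(msg_id, None)
    ids = timeouts.get(expiry)
    if ids is not None:
        ids.discard(msg_id)
        if not ids:
            del timeouts[expiry]
```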
ghstack-source-id: 119925182

Test Plan:
1) Unit test added.
2) Tested with repro in #50522

Closes #50522

Reviewed By: mrshenli

Differential Revision: D25919650

fbshipit-source-id: a0a42647e706d598fce2ca2c92963e540b9d9dbb
Summary: Pull Request resolved: #50674

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25941964

Pulled By: mrshenli

fbshipit-source-id: b53454efdce01f7c06f67dfb890d3c3bdc2c648f
Summary: Pull Request resolved: #50675

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25941963

Pulled By: mrshenli

fbshipit-source-id: 205786d7366f36d659a3a3374081a458cfcb4dd1
Summary:
Fixes #24991

I used a value of 0.75 as suggested in the forums by Thomas: https://discuss.pytorch.org/t/calculate-gain-tanh/20854/6

I verified that the value keeps the gradient stable for a 100-layer network.

Code to reproduce (from [jpeg729](https://discuss.pytorch.org/t/calculate-gain-tanh/20854/4)):
```python
import torch
import torch.nn.functional as F

a = torch.randn(1000, 1000, requires_grad=True)
b = a
print(f"in: {a.std().item():.4f}")
for i in range(100):
    l = torch.nn.Linear(1000, 1000, bias=False)
    torch.nn.init.xavier_normal_(l.weight, torch.nn.init.calculate_gain("selu"))
    b = F.selu(l(b))
    if i % 10 == 0:
        print(f"out: {b.std().item():.4f}", end=" ")
        a.grad = None
        b.sum().backward(retain_graph=True)
        print(f"grad: {a.grad.abs().mean().item():.4f}")
```
Output:
```
in: 1.0008
out: 0.7968 grad: 0.6509
out: 0.3127 grad: 0.2760
out: 0.2404 grad: 0.2337
out: 0.2062 grad: 0.2039
out: 0.2056 grad: 0.1795
out: 0.2044 grad: 0.1977
out: 0.2005 grad: 0.2045
out: 0.2042 grad: 0.2273
out: 0.1944 grad: 0.2034
out: 0.2085 grad: 0.2464
```
I included the necessary documentation change, and it passes the _test_calculate_gain_nonlinear_ unit test.

Pull Request resolved: #50664

Reviewed By: mruberry

Differential Revision: D25942217

Pulled By: ngimel

fbshipit-source-id: 29ff1be25713484fa7c516df71b12fdaecfb9af8
Summary:
Signed-off-by: Kyle Chen <kylechen@amd.com>

cc: jeffdaily

Pull Request resolved: #50557

Reviewed By: mruberry

Differential Revision: D25941432

Pulled By: ngimel

fbshipit-source-id: 534fc8a91a48fa8b3b397e63423cd8347b41bbe2
…#50711)

Summary:
Pull Request resolved: #50711

As title, missed a few of these.

Test Plan: Imported from OSS

Reviewed By: yf225

Differential Revision: D25949363

Pulled By: suo

fbshipit-source-id: 197743fe7097d2ac894421a99c072696c3b8cd70
Summary:
Pull Request resolved: #50611

Removed the unused old-style code to prevent it from being used.
Added all autograd/gen_pyi sources to mypy-strict.ini config.

Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Confirmed clean mypy-strict run:
```
mypy --config mypy-strict.ini
```

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25929730

Pulled By: ljk53

fbshipit-source-id: 1fc94436fd4a6b9b368ee0736e99bfb3c01d38ef
Summary:
As per title. Partially fixes #49421.
These functions appear to be dead code.

Pull Request resolved: #50489

Reviewed By: mruberry

Differential Revision: D25948912

Pulled By: ngimel

fbshipit-source-id: 108723bd4c76cbc3535eba902d6f74597bfdfa58
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/tensorpipe](https://github.com/pytorch/tensorpipe).

New submodule commit: pytorch/tensorpipe@eabfe52

Pull Request resolved: #50684

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: lw

Differential Revision: D25944553

fbshipit-source-id: e2bbcc48472cd79df89d87a0e61dcffa783c659d
…50199)

Summary:
Reference: #50013

Pull Request resolved: #50199

Reviewed By: ngimel

Differential Revision: D25949791

Pulled By: mruberry

fbshipit-source-id: 10eaf2d749fac8c08847f50461e72ad1c75c61e3
…50592)

Summary:
Pull Request resolved: #50592

This adds a `check_batched_grad=False` option to gradcheck and gradgradcheck.
It defaults to False because gradcheck is a public API and I don't want
to break any existing non-PyTorch users of gradcheck.
This:
- runs grad twice with two grad outputs, a & b
- runs a vmapped grad with torch.stack([a, b])
- compares the results of the above against each other.

Furthermore:
- `check_batched_grad=True` is set to be the default for
gradcheck/gradgradcheck inside of test_autograd.py. This is done by
reassigning to the gradcheck object inside test_autograd
- I manually added `check_batched_grad=False` to gradcheck instances
that don't support batched grad.
- I added a denylist for operations that don't support batched grad.
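
A hedged usage sketch of the new flag on a simple op:

```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# Besides the usual numerical-vs-analytical check, this also verifies
# that a vmapped backward over stacked grad outputs matches running
# backward once per grad output.
assert gradcheck(torch.sin, (x,), check_batched_grad=True)
```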

Question:
- Should we have a testing only gradcheck (e.g.,
torch.testing.gradcheck) that has different defaults from our public
API, torch.autograd.gradcheck?

Future:
- The future plan for this is to repeat the above for test_nn.py (the
autogenerated test will require a denylist)
- Finally, we can repeat the above for all pytorch test files that use
gradcheck.

Test Plan: - run tests

Reviewed By: albanD

Differential Revision: D25925942

Pulled By: zou3519

fbshipit-source-id: 4803c389953469d0bacb285774c895009059522f
Summary:
This PR adds `torch.linalg.slogdet`.

Changes compared to the original torch.slogdet:

- Complex input now works as in NumPy
- Added out= variant (allocates temporary and makes a copy for now)
- Updated `slogdet_backward` to work with complex input

Ref. #42666
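
A hedged usage example of the new function:

```python
import torch

A = torch.randn(3, 3, dtype=torch.complex128)
sign, logabsdet = torch.linalg.slogdet(A)
# For complex input `sign` is a complex number of modulus 1 (as in
# NumPy); det(A) == sign * exp(logabsdet).
assert torch.allclose(sign * logabsdet.exp(), torch.linalg.det(A))
```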

Pull Request resolved: #49194

Reviewed By: VitalyFedyunin

Differential Revision: D25916959

Pulled By: mruberry

fbshipit-source-id: cf9be8c5c044870200dcce38be48cd0d10e61a48
Summary: Pull Request resolved: #50732

Test Plan: Imported from OSS

Reviewed By: beauby

Differential Revision: D25954041

Pulled By: mrshenli

fbshipit-source-id: b2eeb1a77753cb8696613bfdc7bbc5001ae4c972
Summary: Pull Request resolved: #50387

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D25947496

Pulled By: anjali411

fbshipit-source-id: c70886a73378501421ff94cdc0dc737f1738bf6f
…und (#33884)

Summary:
Pull Request resolved: #33884

Mitigates #5261.

It's not possible for us to support cudnn RNN double backwards due to
limitations in the cudnn API. This PR makes it so that we raise an error
message if users try to get the double backward on a cudnn RNN; in the
error message we suggest using the non-cudnn RNN.

Test Plan: - added some tests to check the error message

Reviewed By: albanD

Differential Revision: D20143544

Pulled By: zou3519

fbshipit-source-id: c2e49b3d8bdb9b34b561f006150e4c7551a78fac
Summary:
Pull Request resolved: #48719

Attempt to break this PR (#33019) into two parts. As per our discussion with eellison, the first part is to make sure our aten::slice operator takes optional parameters for begin/step/end. This will help with refactoring ir_emitter.cpp for generic handling of list and slice striding. Once this PR is merged, we will submit a second PR with the compiler change.
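
A hedged TorchScript sketch of a slice whose bounds are omitted and therefore reach aten::slice as optional arguments (the function name is illustrative):

```python
import torch

@torch.jit.script
def drop_first_row(x: torch.Tensor) -> torch.Tensor:
    # `end` is omitted here, so it is lowered to aten::slice as an
    # optional None rather than a sentinel integer.
    return x[1:]

print(drop_first_row(torch.arange(12.0).reshape(3, 4)))
```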

Test Plan:
None for this PR, but new tests will be added for the second part.

Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D25929902

fbshipit-source-id: 5385df04e6d61ded0699b09bbfec6691396b56c3
Summary:
This PR helps with #50513 by reducing the complexity of our `mypy` test suite and making it easier to reproduce on the command line. Previously, to reproduce how `mypy` was actually run on tracked source files (ignoring the doctest typechecking) in CI, you technically needed to run 9 different commands with various arguments:
```
$ mypy --cache-dir=.mypy_cache/normal --check-untyped-defs --follow-imports silent
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/module_list.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/namedtuple.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/opt_size.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/size.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/tensor_copy.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/torch_cuda_random.py
$ mypy --cache-dir=.mypy_cache/examples --follow-imports silent --check-untyped-defs test/type_hint_tests/torch_optim.py
$ mypy --cache-dir=.mypy_cache/strict --config mypy-strict.ini
```
Now you only have to run 2 much simpler commands:
```
$ mypy
$ mypy --config mypy-strict.ini
```
One reason this is useful is because it will make it easier to integrate PyTorch's `mypy` setup into editors (remaining work on this to be done in a followup PR).

Also, as shown in the test plan, this also reduces the time it takes to run `test/test_type_hints.py` incrementally, by reducing the number of times `mypy` is invoked while still checking the same set of files with the same configs.

(Because this PR merges `test_type_hint_examples` (added in #34595) into `test_run_mypy` (added in #36584), I've added some people involved in those PRs as reviewers, in case there's a specific reason they weren't combined in the first place.)

Pull Request resolved: #50631

Test Plan:
Run this twice (the first time is to warm the cache):
```
$ python test/test_type_hints.py -v
```

- *Before:*
  ```
  test_doc_examples (__main__.TestTypeHints)
  Run documentation examples through mypy. ... ok
  test_run_mypy (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy.ini ... ok
  test_run_mypy_strict (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy-strict.ini ... ok
  test_type_hint_examples (__main__.TestTypeHints)
  Runs mypy over all the test examples present in ... ok

  ----------------------------------------------------------------------
  Ran 4 tests in 5.090s

  OK
  ```
  You can also just run `mypy` to see how many files it checks:
  ```
  $ mypy --cache-dir=.mypy_cache/normal --check-untyped-defs --follow-imports silent
  Success: no issues found in 1192 source files
  ```
- *After:*
  ```
  test_doc_examples (__main__.TestTypeHints)
  Run documentation examples through mypy. ... ok
  test_run_mypy (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy.ini ... ok
  test_run_mypy_strict (__main__.TestTypeHints)
  Runs mypy over all files specified in mypy-strict.ini ... ok

  ----------------------------------------------------------------------
  Ran 3 tests in 2.404s

  OK
  ```
  Now `mypy` checks 7 more files, which is the number in `test/type_hint_tests`:
  ```
  $ mypy
  Success: no issues found in 1199 source files
  ```

Reviewed By: zou3519

Differential Revision: D25932660

Pulled By: samestep

fbshipit-source-id: 26c6f00f338e7b44954e5ed89522ce24e2fdc5f0
Summary:
Pull Request resolved: #50615

The method tests for some of the ops have been ported to the new OpInfo based tests. This PR removes those op names from `complex_list` in `test_autograd.py`

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D25931268

Pulled By: anjali411

fbshipit-source-id: 4d08626431c61c34cdca18044933e4f5b9b25232
@skyw skyw merged commit 88ff3ac into skyw:master Jan 19, 2021