Commits on Nov 6, 2018
  1. Append parameters when checking graphs for TorchScript Methods (#13553)

    apaszke authored and facebook-github-bot committed Nov 6, 2018
    Summary:
    Also, add an assertion in the GraphExecutor to make sure we don't
    access memory out of bounds.
    Pull Request resolved: #13553
    
    Differential Revision: D12924796
    
    Pulled By: soumith
    
    fbshipit-source-id: ea2a134084538484178b8ebad33d6716a8e1d633
Commits on Nov 5, 2018
  1. Stop depending on static analysis of tensor types in graph fuser (#13387)

    apaszke authored and facebook-github-bot committed Nov 5, 2018
    Summary:
    Built on top of #13108, so please review only the last commit.
    
    This makes the graph fuser ignore input types (device/scalar type) when considering graphs for fusion, making it much more robust to shape-prop failures. Those properties are now checked at run time, as part of the kernel validation. This should enable graph fusions in `jit_premul` and `jit_multilayer` timelines in our benchmarks.
    
    One regression is that I've disabled fusions of comparison ops (and `type_as`). That's because there's no good way to ensure that those fusions are valid, and they have been a source of bugs (I filed #13384).
    
    cc ngimel mruberry zdevito zou3519
    Pull Request resolved: #13387
    
    Differential Revision: D12888104
    
    Pulled By: zou3519
    
    fbshipit-source-id: c233ea599679c34ac70fb4d8b8497c60aad9e480
Commits on Nov 1, 2018
  1. Re-enabled mm+add tree batching in the JIT (#13228)

    apaszke authored and facebook-github-bot committed Nov 1, 2018
    Summary:
    I've had to generously increase the range of the CreateADSubgraphs pass, because even though it collapses the RNN loop to a single differentiable subgraph and a few other nodes, the range uses the distances in the original graph...
    
    cc zdevito zou3519
    Pull Request resolved: #13228
    
    Differential Revision: D12871316
    
    Pulled By: zou3519
    
    fbshipit-source-id: 32da6f30f7821e4339034f1a4dec41ed0849abfb
Commits on Sep 26, 2018
  1. Fix ONNX bug, add symbolic for full

    apaszke authored and facebook-github-bot committed Sep 26, 2018
    Summary: Pull Request resolved: #12052
    
    Differential Revision: D10044910
    
    Pulled By: apaszke
    
    fbshipit-source-id: 015ef372966d7594e1b450e348d457429f6ef20d
  2. Enable tracing of tensor factories with an out argument

    apaszke authored and facebook-github-bot committed Sep 26, 2018
    Summary: Pull Request resolved: #12051
    
    Differential Revision: D10044890
    
    Pulled By: apaszke
    
    fbshipit-source-id: 2d794bf408875600bc71f354f0b4961d6b715094
Commits on Sep 25, 2018
  1. Eliminate no-op adds and muls in peephole pass (#11801)

    apaszke authored and facebook-github-bot committed Sep 25, 2018
    Summary:
    Because we emit a lot of them in our symbolic AD. This brings down the backward time of an LSTM I'm testing from 14.2ms to 12.5ms (a ~12% improvement).
    Pull Request resolved: #11801
    
    Differential Revision: D9916815
    
    Pulled By: apaszke
    
    fbshipit-source-id: 2d9cb886c424ccd43b9f996aad89950d3bddf494
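
    As an illustration of the idea (a toy Python sketch, not the actual C++ pass, which operates on the JIT IR), eliminating `x + 0` and `x * 1` patterns looks roughly like this:

    ```python
    # Toy peephole pass over nested ('op', lhs, rhs) tuples; illustrative only.
    def peephole(expr):
        if not isinstance(expr, tuple):
            return expr
        op, lhs, rhs = expr
        lhs, rhs = peephole(lhs), peephole(rhs)
        if op == 'add':
            if rhs == 0: return lhs  # x + 0 -> x
            if lhs == 0: return rhs  # 0 + x -> x
        if op == 'mul':
            if rhs == 1: return lhs  # x * 1 -> x
            if lhs == 1: return rhs  # 1 * x -> x
        return (op, lhs, rhs)

    print(peephole(('add', ('mul', 'x', 1), 0)))  # -> 'x'
    ```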
Commits on Sep 24, 2018
  1. Stop moving constants into DifferentiableSubgraphs (#11809)

    apaszke authored and facebook-github-bot committed Sep 24, 2018
    Summary:
    Or even taking them as inputs. Doing so prevents optimizations from happening
    either inside the differentiable subgraphs or in the surrounding graph.
    Pull Request resolved: #11809
    
    Differential Revision: D10009680
    
    Pulled By: apaszke
    
    fbshipit-source-id: face638566228e470a6deec48dc2aa3a1cce26d4
Commits on Sep 21, 2018
  1. Specialize ArgumentSpecs on tuple elements too (#11863)

    apaszke authored and facebook-github-bot committed Sep 21, 2018
    Summary:
    This is pretty important, because the common pattern of passing LSTM hidden states as a tuple completely trashes the performance of a network.

    Cleans up all our propagation/undef specialization passes, at the cost of increased complexity in `ArgumentSpec` and `GraphExecutor`. An alternative would be to simply flatten all tuple inputs to a graph ahead of time, but that might just end up being confusing in the future (you never know if you're working with a graph that can have tuples or not).
    Pull Request resolved: #11863
    
    Differential Revision: D9992814
    
    Pulled By: apaszke
    
    fbshipit-source-id: 0a565a3b23e32f8fa72c0534e07c1ce6187739fc
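
    A minimal sketch (hypothetical function, modern annotation syntax) of the call pattern this affects — an LSTM hidden state passed as an (h, c) tuple into a script function:

    ```python
    import torch
    from typing import Tuple

    @torch.jit.script
    def lstm_step(x: torch.Tensor,
                  hidden: Tuple[torch.Tensor, torch.Tensor]) -> torch.Tensor:
        h, c = hidden
        return x * h + c  # stand-in for the real cell math

    h = c = torch.zeros(1, 8)
    out = lstm_step(torch.ones(1, 8), (h, c))
    ```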
  2. Minor JIT improvements (#11654)

    apaszke authored and facebook-github-bot committed Sep 21, 2018
    Summary:
    - Disable addmm fusion. The reason for this is explained in the comment.
    - Tiny change in `stack.h` that lets us avoid constructing an unnecessary temporary `IValue` on the (C++) stack (it will only get created on the interpreter stack directly).
    - Fixed a correctness issue in requires_grad propagation
    Pull Request resolved: #11654
    
    Reviewed By: colesbury
    
    Differential Revision: D9813739
    
    Pulled By: apaszke
    
    fbshipit-source-id: 23e83bc8605802f39bfecf447efad9239b9421c3
  3. Stop tracing _out overloads (#11910)

    apaszke authored and facebook-github-bot committed Sep 21, 2018
    Summary:
    They aren't recognized anywhere in the JIT
    Pull Request resolved: #11910
    
    Differential Revision: D9979968
    
    Pulled By: apaszke
    
    fbshipit-source-id: bb2505a14e3b1e54d5c243f99c80a4f4d918b204
  4. Pop stashed IntList in resize_, warn about its usage when tracing.

    apaszke authored and facebook-github-bot committed Sep 21, 2018
    Summary: Pull Request resolved: #11909
    
    Differential Revision: D9979595
    
    fbshipit-source-id: 07b1027bd6bd1605a31afd4f57bcd58e307fa41e
Commits on Sep 19, 2018
  1. Improve autograd profiler performance (#11773)

    apaszke authored and facebook-github-bot committed Sep 19, 2018
    Summary:
    To illustrate the benefits of this commit, I'll use the time/iter I got from one of the JIT benchmarks on my machine.
    
    | Run                                          | Time                    |
    |----------------------------------------------|-------------------------|
    | No profiler                                  | 45ms                    |
    | With profiler                                | 56ms                    |
    | Use `clock_gettime` instead of `std::chrono` | 48ms                    |
    | Touch all pages on block allocation          | 48ms (less jitter)      |
    | Use `const char*` instead of `std::string`   | 47ms (even less jitter) |
    Pull Request resolved: #11773
    
    Differential Revision: D9886858
    
    Pulled By: apaszke
    
    fbshipit-source-id: 58f926f09e95df0b11ec687763a72b06b66991d0
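
    For context, the numbers above measure the cost of wrapping the benchmark in the profiler, roughly as in this usage sketch:

    ```python
    import torch

    x = torch.randn(128, 128)
    with torch.autograd.profiler.profile() as prof:
        for _ in range(100):
            y = x.mm(x)
    print(prof.key_averages().table(sort_by='cpu_time_total'))
    ```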
Commits on Sep 14, 2018
  1. Implement requires_grad propagation in the JIT (#11586)

    apaszke authored and facebook-github-bot committed Sep 14, 2018
    Summary:
    Previously, we would pretty much assume that all floating point tensors do require grad, which might result in some unnecessary compute.
    
    I don't really like the fact that `TensorType` uses `tensor.is_variable() && tensor.requires_grad()` to infer the value of `requires_grad`, but changing constants to keep variables turns out to be pretty hard. I got halfway there, but it would still need some more work.
    Pull Request resolved: #11586
    
    Reviewed By: ezyang
    
    Differential Revision: D9813648
    
    Pulled By: apaszke
    
    fbshipit-source-id: 77f77756d18ff7632fca3aa68ce855e1d7f3bdb8
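
    A small sketch of the case this helps: when only some inputs require grad, backward work for the others can be skipped instead of being assumed necessary for every floating point tensor.

    ```python
    import torch

    def f(x, w):
        return (x * w).sum()

    x = torch.randn(4)                      # requires_grad=False
    w = torch.randn(4, requires_grad=True)  # only w needs a gradient
    f(x, w).backward()
    print(x.grad, w.grad)                   # None, tensor([...])
    ```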
Commits on Sep 12, 2018
  1. Improve tracer warnings (#11545)

    apaszke authored and facebook-github-bot committed Sep 12, 2018
    Summary:
    Also, fix a performance bug in `ensureUnique`. Previously it formatted the warning string even though we weren't tracing, so all that work would *always* happen in the hot path and be for nothing.
    
    A sample of what the new warnings look like:
    ```
    tmp.py:4: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      int(x)
    tmp.py:5: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
      torch.tensor([1.])
    tmp.py:6: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator add_. This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
      torch.split(y, 2, dim=1)[0].add_(2)
    
    ```
    Pull Request resolved: #11545
    
    Differential Revision: D9782975
    
    Pulled By: apaszke
    
    fbshipit-source-id: 5b3abd31366e59c69e0b7ff278042b5563deb5a9
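
    For reference, a minimal script reconstructed from the sample output above (the `tmp.py` line numbers are assumptions) that triggers all three warnings when traced:

    ```python
    import torch

    def f(x, y):
        int(x)                               # tensor -> Python int
        torch.tensor([1.])                   # constant registered in the trace
        torch.split(y, 2, dim=1)[0].add_(2)  # in-place op on one of several views
        return y

    torch.jit.trace(f, (torch.ones(1), torch.ones(4, 4)))
    ```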
  2. Make .to() methods native functions (to fix JIT tracing)

    apaszke authored and facebook-github-bot committed Sep 12, 2018
    Summary: Pull Request resolved: #11491
    
    Differential Revision: D9771121
    
    Pulled By: apaszke
    
    fbshipit-source-id: 08d11101fb12093f8cf913b06359adddf3af9da7
  3. Release GIL when calling into JIT interpreter

    apaszke authored and facebook-github-bot committed Sep 12, 2018
    Summary: Pull Request resolved: #11541
    
    Differential Revision: D9777909
    
    Pulled By: apaszke
    
    fbshipit-source-id: d0217e203721262f3f131b54ea78f898df0b54ec
  4. Allow tracing random functions (only when using default generators) (#11539)

    apaszke authored and facebook-github-bot committed Sep 12, 2018
    Summary:
    Fixes #11504.
    
    zdevito, neerajprad, fritzo
    Pull Request resolved: #11539
    
    Differential Revision: D9777897
    
    Pulled By: apaszke
    
    fbshipit-source-id: 56983260f5b93da7d5540a6242769ea7bd50eb06
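
    A usage sketch (hypothetical function): random ops that draw from the default generator can now be traced.

    ```python
    import torch

    def add_noise(x):
        return x + torch.randn_like(x)  # uses the default generator

    # check_trace=False because the output is intentionally nondeterministic
    traced = torch.jit.trace(add_noise, torch.ones(2, 2), check_trace=False)
    print(traced(torch.zeros(2, 2)))
    ```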
Commits on Sep 11, 2018
  1. Remove time prefix from rsync (#11525)

    apaszke authored and facebook-github-bot committed Sep 11, 2018
    Summary:
    This fails with zsh saying "time: command not found".
    
    cc soumith
    Pull Request resolved: #11525
    
    Differential Revision: D9772522
    
    Pulled By: apaszke
    
    fbshipit-source-id: b80d108fa6b174d68ada08a9fdbf7260ee37e08f
  2. Add support for tracing strings (#11506)

    apaszke authored and facebook-github-bot committed Sep 11, 2018
    Summary:
    This enabled `torch.einsum` both in tracing and in script mode. It's used all over Pyro at the moment, and is needed for any use of the JIT in there.
    
    Fixes #11157.
    
    zdevito fritzo neerajprad
    Pull Request resolved: #11506
    
    Differential Revision: D9764787
    
    Pulled By: apaszke
    
    fbshipit-source-id: 9b5251b9e7c5897034602bd07ff67b425d33326c
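
    For example (a sketch; the essential bit is the string equation argument now being representable in the trace):

    ```python
    import torch

    def outer_prod(x, y):
        return torch.einsum('bi,bj->bij', x, y)

    traced = torch.jit.trace(outer_prod, (torch.randn(2, 3), torch.randn(2, 4)))
    ```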
  3. Improve shape analysis to cover all most commonly used ops (#11358)

    apaszke authored and facebook-github-bot committed Sep 11, 2018
    Summary:
    [Here's a list](https://gist.github.com/apaszke/f0821840bdcc67a977832dc58acc1b85) of ops that are in `register_aten_ops.cpp`, but aren't supported in shape prop. Everything else should work now.
    Pull Request resolved: #11358
    
    Differential Revision: D9753693
    
    Pulled By: apaszke
    
    fbshipit-source-id: efeae0126ce16cb56b8797fc5246405588bcae3c
Commits on Sep 10, 2018
  1. Improve support for tracing sizes, add more tracer warnings (#11288)

    apaszke authored and facebook-github-bot committed Sep 10, 2018
    Summary:
    Many constructors like `torch.zeros` or `torch.randn` didn't support
    size tracing correctly, which is fixed by this patch. The same issue has
    been fixed in the legacy tensor constructors.
    
    Additionally, new tensor constructors, which do not participate in
    tracing (most notably `torch.tensor`, `torch.as_tensor` and
    `torch.from_numpy`) raise a warning when they are used.
    
    Finally, entering a traceable operation disables the tracing in its body.
    This is needed because
    
    zdevito
    Pull Request resolved: #11288
    
    Reviewed By: ezyang
    
    Differential Revision: D9751183
    
    Pulled By: apaszke
    
    fbshipit-source-id: 51444a39d76a3e164adc396c432fd5ee3c8d5f7f
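
    A sketch of the pattern this fixes: a constructor size derived from an input's shape should stay dynamic in the trace rather than being baked in as a constant.

    ```python
    import torch

    def pad_col(x):
        # x.size(0) should be traced as a dynamic value
        return torch.cat([x, torch.zeros(x.size(0), 1)], dim=1)

    traced = torch.jit.trace(pad_col, torch.ones(3, 2))
    print(traced(torch.ones(5, 2)).shape)  # generalizes to a new batch size
    ```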
Commits on Sep 5, 2018
  1. Improve error message to include return types too (#11245)

    apaszke authored and facebook-github-bot committed Sep 5, 2018
    Summary:
    Fixes #11057.
    Pull Request resolved: #11245
    
    Differential Revision: D9652698
    
    Pulled By: apaszke
    
    fbshipit-source-id: 4c5006e32e599c35367aa5acfae45de3ab8ac176
  2. Port PackedSequences functions to C++ (#11224)

    apaszke authored and facebook-github-bot committed Sep 5, 2018
    Summary:
    zdevito
    Pull Request resolved: #11224
    
    Differential Revision: D9652703
    
    Pulled By: apaszke
    
    fbshipit-source-id: 558e39457e590cad07516e5bb2ecb12789564950
  3. Treat numerical differences as warnings instead of errors when tracing (#11246)

    apaszke authored and facebook-github-bot committed Sep 5, 2018
    Summary:
    Also, make `torch.isclose` work with integral tensors and refactor `_check_trace` a bit.
    
    zdevito
    Pull Request resolved: #11246
    
    Differential Revision: D9652701
    
    Pulled By: apaszke
    
    fbshipit-source-id: fb0bdbfd1952e45e153541e4d471b423a5659f25
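
    A quick sketch of the `torch.isclose` behavior on integral tensors this enables:

    ```python
    import torch

    a = torch.tensor([1, 2, 3])
    b = torch.tensor([1, 2, 4])
    print(torch.isclose(a, b))  # tensor([ True,  True, False])
    ```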
Commits on Sep 2, 2018
  1. Disable -Werror on macOS test build (#11090)

    apaszke authored and facebook-github-bot committed Sep 2, 2018
    Summary:
    cc goldsborough
    Pull Request resolved: #11090
    
    Reviewed By: soumith
    
    Differential Revision: D9582525
    
    Pulled By: apaszke
    
    fbshipit-source-id: 5d2c6e930e7b09f0ed5a35fbf4fe36b8845a2580
Commits on Aug 31, 2018
  1. Lower trivial differentiable subgraphs (#11110)

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    Summary:
    zdevito
    Pull Request resolved: #11110
    
    Differential Revision: D9616408
    
    Pulled By: apaszke
    
    fbshipit-source-id: f1ae77d698bf0ada32f2c1c3f587e46a4f57a867
  2. Warn about non-traceable behavior when tracing (#11088)

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    Summary:
    zdevito
    Pull Request resolved: #11088
    
    Differential Revision: D9585527
    
    Pulled By: apaszke
    
    fbshipit-source-id: 29a03cb152d83b626f748fff4501ac9e139994c2
  3. Unbreak the build

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    
    fbshipit-source-id: 861021dbe88f84d1a8bd80e04dd684527384629f
  4. Fix a bug in addmm fusion in the JIT (#11100)

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    Summary:
    Fixes #10839.
    
    zdevito
    Pull Request resolved: #11100
    
    Differential Revision: D9585533
    
    Pulled By: apaszke
    
    fbshipit-source-id: 19e2710c8fc113f577faf14c080d8c89afbe23c4
  5. Change specialization rules in GraphExecutors (#10977)

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    Summary:
    **Review last commit only.** Stacked on top of #10949.
    
    This commit fixes a number of issues connected to caching
    differentiability status of graphs inside graph executors,
    and changes the rules for optimization of differentiable subgraphs.
    Previously every one of those was instantiated as a separate graph
    executor, but now they are simply heavier-optimized graph regions,
    and graph executors are only instantiated for their backward.
    
    zdevito
    Pull Request resolved: #10977
    
    Differential Revision: D9600626
    
    Pulled By: apaszke
    
    fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3
  6. Don't flatten output lists in the JIT IR (#10949)

    apaszke authored and facebook-github-bot committed Aug 31, 2018
    Summary:
    Operators like aten::chunk used to return a number of tensors, but
    now return a list. To make it easier to do shape prop through
    aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
    which behaves like the previous implementation (has a variable length
    output list).
    
    The downside of this PR is that the introduction of more lists into the IR causes the LSTM and MiLSTM graphs to be considered non-differentiable by the graph executor. I verified that they are still optimized correctly, and my next patch (which changes how specialization/differentiation works) will restore those.
    
    zdevito
    Pull Request resolved: #10949
    
    Reviewed By: zdevito
    
    Differential Revision: D9556823
    
    Pulled By: apaszke
    
    fbshipit-source-id: 33e63b17fc7247cac6cfc05eb7eb9bf069b499ee
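
    The user-facing op is unchanged; a quick sketch of the call whose IR representation moved from a fixed number of outputs to a list:

    ```python
    import torch

    x = torch.arange(6.)
    a, b, c = torch.chunk(x, 3)  # represented as a tensor list in the IR now
    ```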
Commits on Aug 30, 2018
  1. Expose arbitrary cpp autograd functions to Python (#11082)

    apaszke authored and facebook-github-bot committed Aug 30, 2018
    Summary:
    This is needed because the JIT declares some custom autograd functions.
    
    colesbury
    Pull Request resolved: #11082
    
    Differential Revision: D9580456
    
    Pulled By: apaszke
    
    fbshipit-source-id: 6bf00c1188a20b2ee6ecf60e5a0099f8263ad55a
  2. Add entry for torch/lib/pythonX.Y in .gitignore (#11083)

    apaszke authored and facebook-github-bot committed Aug 30, 2018
    Summary:
    I've had `torch/lib/python3.6` show up as part of the build for some time now. It's not ignored, which means I need to be extra careful about checking in files, or I end up with a thousand of them in my index.
    Pull Request resolved: #11083
    
    Differential Revision: D9580453
    
    Pulled By: apaszke
    
    fbshipit-source-id: 369e4fe87962696532d111b24f2a4a99b9572bf2
Commits on Aug 29, 2018
  1. Make it possible to disable JIT using env variables (#10867)

    apaszke authored and facebook-github-bot committed Aug 29, 2018
    Summary:
    zdevito
    Pull Request resolved: #10867
    
    Differential Revision: D9556882
    
    Pulled By: apaszke
    
    fbshipit-source-id: 04c0ca875d15d37dd9ac05ac7b515cd899ddb7e4
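
    A usage sketch, assuming the `PYTORCH_JIT` variable this adds: setting it to 0 before `torch` is imported makes script functions run as plain Python, which is handy for debugging.

    ```python
    import os
    os.environ['PYTORCH_JIT'] = '0'  # must happen before `import torch`

    import torch

    @torch.jit.script
    def f(x):
        return x + 1  # executes eagerly when the JIT is disabled

    print(f(torch.ones(2)))
    ```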
Commits on Aug 26, 2018
  1. Prevent JIT from overspecializing to every single size configuration (#10844)

    apaszke authored and facebook-github-bot committed Aug 26, 2018
    Summary:
    Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details.
    
    Summary of changes:
    
    - Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The motivation behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make the transition easier, `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail.
    - Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches argument only at the level of the new `TensorType`.
    - Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`.
    - The fuser used to rely heavily on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implemented using an optimized method exploiting algebraic properties of shapes under broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`.
    
    zdevito ezyang mruberry ngimel csarofeen
    Pull Request resolved: #10844
    
    Differential Revision: D9498705
    
    Pulled By: apaszke
    
    fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61