[JIT] Support enough of closures to write autograd functions #15411

zdevito · 2018-12-19T22:12:29Z

This PR adds enough of the infrastructure for closures (inner script functions) to let us express symbolic gradients with them. We never actually run graphs that contain these closures: the symbolic_script infrastructure extracts them from the original forward graph and turns them into discrete forward/backward pairs. This cuts down on the type annotations needed to write forward/backward pairs and aligns closely with the "differentiator" function approach to expressing reverse-mode AD.

Example:

This code:

import torch

r = torch.jit.CompilationUnit(
'''
def mul_forward(self, other):
    def backward(grad_output):
        grad_self = (grad_output * other).sum_to_size(self.size())
        grad_other = (grad_output * self).sum_to_size(other.size())
        return grad_self, grad_other
    return self * other, backward
''')

print(r.module.code)

Will produce this graph (pretty printed for clarity):

def mul_forward(self,
    self: Tensor,
    other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
  backward = (self.__lambda, (other, self))
  return (torch.mul(self, other), backward)


def __lambda(self,
    context: Tuple[Tensor, Tensor],
    grad_output: Tensor) -> Tuple[Tensor, Tensor]:
  other, self, = context
  grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
  grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
  return (grad_self, grad_other)
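
The (self.__lambda, (other, self)) pair in the printed graph reads as a closure-converted form of the inner function: a plain function plus a tuple of the values it captured. A minimal plain-Python sketch of that idea follows. It assumes floats in place of tensors, so the sum_to_size step is dropped, and the names mul_forward_closure, mul_forward_converted, and lifted_backward are hypothetical, not part of the JIT.

```python
# Plain-Python sketch of closure conversion. Floats stand in for tensors,
# so the sum_to_size broadcasting step from the example is omitted.
# All function names here are hypothetical illustrations.

def mul_forward_closure(a, b):
    """Forward written with an inner backward function that captures a and b."""
    def backward(grad_output):
        # a and b come from the enclosing scope
        return grad_output * b, grad_output * a
    return a * b, backward


def lifted_backward(context, grad_output):
    """Lifted form: the captured values arrive as an explicit tuple."""
    b, a = context  # same order as the (other, self) tuple in the printed graph
    return grad_output * b, grad_output * a


def mul_forward_converted(a, b):
    """Closure-converted forward: returns a (function, context) pair."""
    return a * b, (lifted_backward, (b, a))


out1, bwd = mul_forward_closure(3.0, 4.0)
out2, (fn, ctx) = mul_forward_converted(3.0, 4.0)

grads_closure = bwd(2.0)        # call the real closure
grads_converted = fn(ctx, 2.0)  # call the lifted function with its context
```

Both forms compute the same gradients; the converted form simply makes the captured values visible as data, which is what allows them to be returned from the forward graph.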

symbolic_script will then do some modifications to remove the unsupported prim::Function node, yielding:

# same as before but with self.__lambda removed
def mul_forward(self,
    self: Tensor,
    other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
  return (torch.mul(self, other), (other, self))


# just the captured closure, unchanged
def backward(self,
    context: Tuple[Tensor, Tensor],
    grad_output: Tensor) -> Tuple[Tensor, Tensor]:
  other, self, = context
  grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
  grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
  return (grad_self, grad_other)
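
As a rough illustration of how such a split pair could be consumed, the sketch below runs a forward that returns only the context tuple, feeds that tuple to a standalone backward, and compares the result with a central finite difference. It is plain Python with floats standing in for tensors (no sum_to_size), and mul_forward, mul_backward, and numeric_grad are hypothetical names chosen for this example rather than symbolic_script internals.

```python
# Sketch of a discrete forward/backward pair, checked numerically.
# Floats stand in for tensors; all names are hypothetical.

def mul_forward(a, b):
    # Return the result plus the captured context, ordered (other, self).
    return a * b, (b, a)


def mul_backward(context, grad_output):
    # Unpack the context in the same order the forward packed it.
    b, a = context
    return grad_output * b, grad_output * a


def numeric_grad(f, x, eps=1e-6):
    # Central finite difference approximation of df/dx.
    return (f(x + eps) - f(x - eps)) / (2 * eps)


a, b = 3.0, 4.0
out, ctx = mul_forward(a, b)
grad_a, grad_b = mul_backward(ctx, 1.0)

num_a = numeric_grad(lambda x: x * b, a)
num_b = numeric_grad(lambda x: a * x, b)
```

The only contract between the two functions is the order of values in the context tuple, which is why the forward and backward unpack it identically.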

ailzhang

Thanks!! This is definitely much cleaner than the old separate forward, backward way!
Nit: The compiled graph in the description has an additional self printed in the first line, is this intended?

def mul_forward(self,
    self: Tensor,

zdevito · 2018-12-19T23:27:15Z

The extra self is a bug; David already has a fix landing for it. It shouldn't affect correctness for our purposes.

ailzhang · 2018-12-19T23:29:04Z

torch/csrc/jit/symbolic_script.cpp

-      });
+    const std::vector<std::string> functions = {
+      R"(
+        def mul(self, other):


Hmmm how does the CompilationUnit differentiate when other is a Tensor or Scalar?

facebook-github-bot

@zdevito has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Pull Request resolved: pytorch/pytorch#15411
Differential Revision: D13523340
Pulled By: zdevito
fbshipit-source-id: 4d4a269460e595b16802c00ec55ae00e3e682d49

zdevito added 3 commits December 19, 2018 10:33

refactor

1787596

closure functionality

a7cc585

add the closure bits

d2c3332

facebook-github-bot added the oncall: jit label Dec 19, 2018

expect tests

982fd65

zdevito force-pushed the pr/closure_test branch from 3c552a6 to 982fd65 Compare December 19, 2018 22:15

zdevito requested review from ailzhang and apaszke December 19, 2018 22:17

ailzhang reviewed Dec 19, 2018

ailzhang approved these changes Dec 19, 2018

ailzhang reviewed Dec 19, 2018

facebook-github-bot reviewed Dec 19, 2018

facebook-github-bot reviewed Dec 20, 2018

facebook-github-bot closed this in 1a2ec10 Dec 20, 2018

ezyang added the merged label Jun 25, 2019
