Conversation

@zou3519 (Contributor) commented Jan 11, 2023

Stack from ghstack:

We don't actually need `output_shapes` to implement
`generate_vmap_rule=True` support for autograd.Function.

- We only needed it in the vjp (backward) case: autograd automatically
  reduces each grad_input to the shape of its corresponding input, and we
  need to replicate that behavior. To do so, we recorded the original
  input shapes so we know what shape to reduce each grad_input to (see
  the sketch after this list).
- There is no such behavior for forward-mode AD, so we don't need to
  pass `output_shapes` to `reductify`.
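
To make the replicated behavior concrete, here is a minimal, self-contained illustration (a sketch using `Tensor.sum_to_size`, not functorch's actual code; the shapes are made up for the example):

```python
import torch

# Sketch only (not the functorch implementation): when an input was broadcast
# in the forward pass, its grad_input comes back with the expanded shape, and
# autograd reduces it to the input's original shape by summing over the
# broadcast dimensions.
x = torch.randn(3, 1)               # original input, shape (3, 1)
grad_input = torch.randn(2, 3, 4)   # grad w.r.t. the broadcasted input

reduced = grad_input.sum_to_size(*x.shape)  # sum over the broadcast dims
assert reduced.shape == x.shape             # back to (3, 1)
```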

This PR simplifies the API of `reductify` and `reductify_leaf`. Instead
of accepting both `input_shape_without_bdim` and `allow_expanded_grad`,
they now take a single argument,
`reduce_to_input_shape_without_bdim`:

- if it is None, then we don't do anything;
- if it is a shape, then we reduce the grad to the provided shape (see
  the sketch after this list).
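
A minimal sketch of the new contract (hypothetical simplified signature; the real `reductify_leaf` also takes the bdim and batch-size arguments shown in the review comment below):

```python
import torch

# Hedged sketch, not the actual functorch implementation.
def reductify_leaf_sketch(grad_input, reduce_to_input_shape_without_bdim=None):
    if reduce_to_input_shape_without_bdim is None:
        # Forward-mode AD case: no reduction behavior to replicate.
        return grad_input
    # Backward (vjp) case: reduce the grad to the recorded input shape.
    return grad_input.sum_to_size(*reduce_to_input_shape_without_bdim)

g = torch.randn(2, 3, 4)
assert reductify_leaf_sketch(g).shape == (2, 3, 4)       # None: unchanged
assert reductify_leaf_sketch(g, (3, 1)).shape == (3, 1)  # shape: reduced
```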

Test Plan:

- updated the original unit tests
- wait for the test suite

@pytorch-bot bot commented Jan 11, 2023
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92024

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d281147:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@soulitzer (Contributor) left a comment

Good catch! Have a minor comment on the naming of the flag.

The comment is on this signature change:

```diff
-def reductify(grad_input, grad_input_bdim, input_bdim, input_shape_without_bdim, batch_size,
-              allow_expanded_grad=True):
+def reductify(grad_input, grad_input_bdim, input_bdim, batch_size,
+              reduce_to_input_shape_without_bdim=None):
```
nit: `reduce_to_input_shape_without_bdim` sounds like a bool because it begins with a verb (when it is supposed to be a shape). Just to shuffle that around, maybe `{target,input}_shape_without_bdim{_to_reduce_to,}` is better?
