[functorch] Fix torch.cat batching rule #86932
Conversation
The bug was discovered in #86842. torch.cat has an edge case where it ignores all tensors of shape [0]. So if any of the BatchedTensors have logical shape [0] but physical shape [B, 0], then we coerce them to shape [0] by slicing them.

Why don't we just ignore those Tensors? We need to propagate requires_grad-ness somehow (e.g. if the BatchedTensor wraps a Tensor of shape [B, 0] that requires grad, then the output must require grad).

Test Plan:
- new tests
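For context, here is a minimal sketch of the edge case (illustrative only, not code from the PR; it assumes the functorch-era vmap API and the legacy torch.cat behavior in effect at the time, where 1-D empty tensors of shape [0] are skipped during shape checking):

```python
import torch
from functorch import vmap  # functorch-era API; torch.vmap in later releases

B = 4
x = torch.randn(B, 0, requires_grad=True)  # logical shape [0] per batch element
y = torch.randn(B, 2, 3)                   # logical shape [2, 3]

def f(x, y):
    # Per batch element this is torch.cat of a [0] tensor with a [2, 3] tensor:
    # the legacy rule skips the [0] tensor, so the result has shape [2, 3].
    return torch.cat([x, y])

out = vmap(f)(x, y)
print(out.shape)          # expected: torch.Size([4, 2, 3])
print(out.requires_grad)  # expected: True, matching eager-mode torch.cat
```

Slicing the physical [B, 0] tensor down to shape [0] (rather than dropping it from the list) keeps it in the autograd graph of the physical torch.cat call, which is how the requires_grad-ness of x reaches the output.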
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86932
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 Failures, 1 Pending as of commit 167c1a6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Should we also port this batching rule to not use the old API? (doesn't need to be done in this PR, just wondering).
Yes. The problem is that the vmap codegen doesn't handle operators that accept TensorList yet.
@pytorchbot merge -f "failures look unrelated (they occur in the non-functorch shards) (also there are no logs)"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.