Conversation

frank-wei
Contributor

@frank-wei frank-wei commented Jun 9, 2022

The fairseq diff is split into two parts.
The first diff (this one)
This diff creates a mask left-align function to check the mask condition for nested tensors. It is necessary for TorchScript deployment.
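For illustration, here is a minimal sketch of the left-alignment condition such a function checks. The helper name below is illustrative only; the actual op added by this PR is the private `_nested_tensor_from_mask_left_aligned`.

```python
import torch

def mask_is_left_aligned(mask: torch.Tensor) -> bool:
    # mask: (batch, seq_len) boolean mask where True marks valid (non-padded) tokens.
    # "Left aligned" means each row's True values form a contiguous prefix,
    # i.e. no valid token appears after the first padding position.
    lengths = mask.sum(dim=1, keepdim=True)                          # valid tokens per row
    prefix = torch.arange(mask.size(1), device=mask.device) < lengths
    return bool((mask == prefix).all())
```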

The second diff (D37082681)
Fork the inference path inside the forward function. If the checkpoint file has been loaded and we are running inference, we deploy BT (BetterTransformer); otherwise, the original fairseq path takes over.
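A rough, hypothetical sketch of what such a forward-path fork could look like (the class, attribute, and method names here are placeholders, not actual fairseq APIs; the real change lives in D37082681):

```python
import torch
import torch.nn as nn

class EncoderWithBTFork(nn.Module):
    # Placeholder module: `bt_encoder` and `fairseq_encoder` stand in for the
    # BetterTransformer fast path and the original fairseq path respectively.
    def __init__(self, bt_encoder: nn.Module, fairseq_encoder: nn.Module):
        super().__init__()
        self.bt_encoder = bt_encoder
        self.fairseq_encoder = fairseq_encoder
        self.checkpoint_loaded = False  # set to True once an exported checkpoint is loaded

    def forward(self, x: torch.Tensor, padding_mask: torch.Tensor) -> torch.Tensor:
        # Take the BT fast path only at inference time with a loaded checkpoint;
        # otherwise keep the original fairseq path (e.g. for training).
        if not self.training and self.checkpoint_loaded:
            return self.bt_encoder(x, padding_mask)
        return self.fairseq_encoder(x, padding_mask)
```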

Reviewed By: mikekgfb

Differential Revision: D36057338

@facebook-github-bot
Contributor

facebook-github-bot commented Jun 9, 2022

✅ No Failures (0 Pending)

As of commit 9c62434 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D36057338

frank-wei pushed a commit to frank-wei/fairseq that referenced this pull request Jun 10, 2022
Summary:
X-link: pytorch/pytorch#79186

Pull Request resolved: facebookresearch#4468

as titled
Fork the inference path inside the forward function. If the checkpoint file has been loaded and we are running inference, we deploy BT; otherwise, the original fairseq path takes over.

In summary:
Accuracy: there is an accuracy loss due to fp16; the maximum diff is around 0.009. If we set it to fp32, there is no accuracy loss.
Perf: the current fairseq has a similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors:
For V100, the speedup reaches 2.46x

Reviewed By: mikekgfb

Differential Revision: D36057338

fbshipit-source-id: 229e72e6050bf70ddedcda8f47d158526910557f
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D36057338

@frank-wei frank-wei requested review from ngimel and jbschlosser June 10, 2022 17:49
@erichan1 erichan1 self-requested a review June 10, 2022 21:16
@frank-wei frank-wei requested a review from zrphercule June 10, 2022 21:17
Contributor

@erichan1 erichan1 left a comment

LGTM!

Contributor

@jbschlosser jbschlosser left a comment

Confused about this - I just see this PR adding a _nested_tensor_from_mask_left_aligned op. Some questions:

  • How does this relate to the PR summary? AFAICT just adding this op won't enable its use.
  • How does this differ from _nested_tensor_from_mask - does it just avoid checks by making the assumption the mask is left-aligned? If so, couldn't we avoid quite a bit of duplication?

@frank-wei
Contributor Author

frank-wei commented Jun 10, 2022

Confused about this - I just see this PR adding a _nested_tensor_from_mask_left_aligned op. Some questions:

  • How does this relate to the PR summary? AFAICT just adding this op won't enable its use.

The diff has now been split into two parts. Part 1 only includes this op change and is associated with D36057338. Part 2 only includes the fairseq change (D37082681) but depends on Part 1. The internal CI tests show the op runs well.

  • How does this differ from _nested_tensor_from_mask - does it just avoid checks by making the assumption the mask is left-aligned? If so, couldn't we avoid quite a bit of duplication?

It is the front part of the "_nested_tensor_from_mask" implementation, but its purpose is to check left alignment in advance.
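In other words, the check can run ahead of the conversion. A rough sketch of how the two ops could fit together, assuming the private signatures `torch._nested_tensor_from_mask_left_aligned(t, mask) -> bool` and `torch._nested_tensor_from_mask(t, mask) -> Tensor` (the exact signatures and the True-means-padded mask convention are assumptions here, not confirmed by this PR):

```python
import torch

def maybe_convert_to_nested(x: torch.Tensor, padding_mask: torch.Tensor) -> torch.Tensor:
    # padding_mask: (batch, seq_len), True for padded positions (assumed fairseq convention).
    # The new op only validates that the mask of valid tokens is left aligned,
    # so it can gate the nested-tensor conversion in a TorchScript-friendly way.
    if torch._nested_tensor_from_mask_left_aligned(x, padding_mask.logical_not()):
        return torch._nested_tensor_from_mask(x, padding_mask.logical_not())
    return x
```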

Summary:
The fairseq diff is split into two parts.
The first diff (this one)
This diff creates a mask left-align function to check the mask condition for nested tensors. It is necessary for TorchScript deployment.

The second diff (D37082681)
Fork the inference path inside the forward function. If the checkpoint file has been loaded and we are running inference, we deploy BT; otherwise, the original fairseq path takes over.

Perf in summary:
Accuracy: there is an accuracy loss due to fp16; the maximum diff is around 0.009. If we set it to fp32, there is no accuracy loss.
Perf: the current fairseq has a similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors:
For V100, the speedup reaches 2.46x

Test Plan: In D37082681

Reviewed By: mikekgfb

Differential Revision: D36057338

fbshipit-source-id: 2b824481481b8972d168f2751afa77cc9b3cbe02
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D36057338

@frank-wei frank-wei changed the title from "[transformer] BT enablement on fairseq" to "[transformer] BT enablement on fairseq - pytorch change" Jun 10, 2022
@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions
Contributor

Hey @frank-wei.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 12, 2022
Summary:
Pull Request resolved: #79186

The fairseq diff is split into two parts.
The first diff (this one)
This diff creates a mask left-align function to check the mask condition for nested tensors. It is necessary for TorchScript deployment.

The second diff (D37082681)
Fork the inference path inside the forward function. If the checkpoint file has been loaded and we are running inference, we deploy BT; otherwise, the original fairseq path takes over.

Perf in summary:
Accuracy: there is an accuracy loss due to fp16; the maximum diff is around 0.009. If we set it to fp32, there is no accuracy loss.
Perf: the current fairseq has a similar speed to the vanilla version. After the enablement, the speedup is similar to the standalone BT test.
With batch size = 64:
For V100, the speedup reaches 1.23x
For A100, the speedup reaches 1.38x

After enabling nested tensors:
For V100, the speedup reaches 2.46x

Test Plan: In D37082681

Reviewed By: mikekgfb

Differential Revision: D36057338

fbshipit-source-id: 0ba75c254ccc4b4a29702ab0e18a36b5d0e1d832
@erichan1 erichan1 added the "release notes: nn", "topic: bug fixes", and "topic: performance" labels Jun 13, 2022
@erichan1 erichan1 mentioned this pull request Jul 21, 2022
erichan1 pushed a commit that referenced this pull request Jul 21, 2022
The fairseq diff is split into two parts.
The first diff (this one)
This diff creates a mask left-align function to check the mask condition for nested tensors. It is necessary for TorchScript deployment.

The second diff (D37082681)
Fork the inference path inside the forward function. If the checkpoint file has been loaded and we are running inference, we deploy BT; otherwise, the original fairseq path takes over.

Reviewed By: mikekgfb

Differential Revision: D36057338

Pull Request resolved: #79186
Approved by: https://github.com/erichan1
@erichan1 erichan1 mentioned this pull request Jul 21, 2022