
MHA: Fix regression and apply bias flag to both in/out proj #52537

Closed · wants to merge 4 commits

Conversation

jbschlosser (Contributor)

Fixes #52257

Background

Reverts MHA behavior for the bias flag to that of v1.5: the flag enables or disables both the in and out projection biases.

Updates the type annotations for both the in and out projection biases from Tensor to Optional[Tensor] for torch.jit.script usage.
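As a rough illustration (not code from this PR, and the dimensions below are arbitrary), the Optional[Tensor] annotations are what allow a module constructed with bias=False, whose bias attributes hold None, to be compiled with torch.jit.script:

```python
import torch
import torch.nn as nn

# Hedged sketch: embed_dim and num_heads are arbitrary example values.
mha = nn.MultiheadAttention(embed_dim=8, num_heads=2, bias=False)

# With bias=False the bias attributes hold None, so typing them as
# Optional[Tensor] lets the TorchScript compiler accept the module.
scripted = torch.jit.script(mha)
```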

Note: With this change, _LinearWithBias defined in torch/nn/modules/linear.py is no longer utilized. Completely removing it would require updates to quantization logic in the following files:

test/quantization/test_quantized_module.py
torch/nn/quantizable/modules/activation.py
torch/nn/quantized/dynamic/modules/linear.py
torch/nn/quantized/modules/linear.py
torch/quantization/quantization_mappings.py

This PR takes a conservative initial approach and leaves these files unchanged.

Is it safe to fully remove _LinearWithBias?

Test Plan

python test/test_nn.py TestNN.test_multihead_attn_no_bias
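For reference, a minimal sketch of the property this test exercises (the actual test is test_multihead_attn_no_bias in test/test_nn.py; the code below is illustrative, not the test itself):

```python
import torch.nn as nn

def check_multihead_attn_no_bias() -> None:
    # With bias=False, neither the in projection nor the out projection
    # should register a bias parameter.
    mha = nn.MultiheadAttention(embed_dim=16, num_heads=4, bias=False)
    assert mha.in_proj_bias is None
    assert mha.out_proj.bias is None

check_multihead_attn_no_bias()
```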

BC-Breaking Note

In v1.6, the behavior of MultiheadAttention's bias flag was incorrectly changed to affect only the in projection layer. That is, setting bias=False would fail to disable the bias for the out projection layer. This regression has been fixed, and the bias flag now correctly applies to both the in and out projection layers.
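As an illustrative sketch of the user-visible difference (not code from the PR), downstream code that reads the out projection bias should now expect None when bias=False:

```python
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4, bias=False)

# On v1.6/v1.7 (the regression), out_proj.bias was still a Parameter here;
# with this fix (matching v1.5), it is None and must be handled as such.
out_bias = mha.out_proj.bias
if out_bias is not None:
    print(out_bias.shape)  # only reachable on the regressed releases
```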

facebook-github-bot (Contributor) commented Feb 19, 2021

💊 CI failures summary and remediations

As of commit 75dfb0f (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚

gchanan added the module: bc-breaking (Related to a BC-breaking change) label Feb 19, 2021
gchanan (Contributor) commented Feb 19, 2021

I don't think we need to worry about removing _LinearWithBias.

facebook-github-bot (Contributor) left a comment:

@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

codecov bot commented Feb 22, 2021

Codecov Report

Merging #52537 (75dfb0f) into master (941ebec) will increase coverage by 0.00%.
The diff coverage is 100.00%.

```
@@           Coverage Diff           @@
##           master   #52537   +/-   ##
=======================================
  Coverage   80.64%   80.64%
=======================================
  Files        1969     1969
  Lines      215943   215944    +1
=======================================
+ Hits       174137   174138    +1
  Misses      41806    41806
```

```
@@ -65,7 +65,7 @@ def __init__(self, embed_dim: int, num_heads: int,
# TODO: The use of the `_LinearWithBias` increases the quantization noise
```

Review comment (Contributor):

is this comment still relevant?

jbschlosser (Contributor Author) replied:

I went back and forth on this, but I left it in for the case where an older model is loaded and then quantized. Is this valid? The comment should probably be at least updated to reflect that out_proj could or could not be a _LinearWithBias.
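For context, a hypothetical check along the lines of the comment above, assuming a release in which torch/nn/modules/linear.py still defines _LinearWithBias (not code from this PR):

```python
# Private helper; only importable while the class still exists.
from torch.nn.modules.linear import _LinearWithBias

def out_proj_is_legacy(mha) -> bool:
    # A fully pickled MultiheadAttention saved with an older release may still
    # carry a _LinearWithBias out projection; newly constructed modules do not.
    return isinstance(mha.out_proj, _LinearWithBias)
```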

gchanan (Contributor) left a comment:

lgtm. @vkuzo mind taking a look at the quantization code?

facebook-github-bot (Contributor):

@jbschlosser merged this pull request in a39b1c4.

vkuzo requested a review from z-a-f February 22, 2021 22:51
vkuzo (Contributor) commented Feb 22, 2021

cc @z-a-f as an fyi

aocsa pushed a commit to Quansight/pytorch that referenced this pull request Mar 15, 2021
…52537)

Summary:
Fixes pytorch#52257

## Background
Reverts MHA behavior for `bias` flag to that of v1.5: flag enables or disables both in and out projection biases.

Updates type annotations for both in and out projection biases from `Tensor` to `Optional[Tensor]` for `torch.jit.script` usage.

Note: With this change, `_LinearWithBias` defined in `torch/nn/modules/linear.py` is no longer utilized. Completely removing it would require updates to quantization logic in the following files:
```
test/quantization/test_quantized_module.py
torch/nn/quantizable/modules/activation.py
torch/nn/quantized/dynamic/modules/linear.py
torch/nn/quantized/modules/linear.py
torch/quantization/quantization_mappings.py
```
This PR takes a conservative initial approach and leaves these files unchanged.

**Is it safe to fully remove `_LinearWithBias`?**

Pull Request resolved: pytorch#52537

Test Plan:
```
python test/test_nn.py TestNN.test_multihead_attn_no_bias
```

## BC-Breaking Note
In v1.6, the behavior of `MultiheadAttention`'s `bias` flag was incorrectly changed to affect only the in projection layer. That is, setting `bias=False` would fail to disable the bias for the out projection layer. This regression has been fixed, and the `bias` flag now correctly applies to both the in and out projection layers.

Reviewed By: bdhirsh

Differential Revision: D26583639

Pulled By: jbschlosser

fbshipit-source-id: b805f3a052628efb28b89377a41e06f71747ac5b
xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
bhosmer added a commit that referenced this pull request May 26, 2021
Fixes a post-1.8 regression in nn.MultiheadAttention + quantization scriptability introduced in #52537. Passes the new test introduced in that PR, and fixes the repro found by @ngimel [here](https://gist.github.com/bhosmer/ef517d0774f2f10336b8140116fd6b62). 

Per comments in #52537 there's definitely a carnal dependency between quantization and the `_LinearWithBias` class by name that I'm reinstating here, but there may be cleaner ways to solve this - I don't really know what I'm doing 😁 . 

@jbschlosser @z-a-f LMK if you have ideas, happy to change this as desired. It'd be nice to get a fix into 1.9.

_[Update: now using a better name instead of `_LinearWithBias`, but this remains a short-term fix to re-suppress a quantization API usage error that should properly be raised upstream. See issue #58969]_

Differential Revision: [D28593830](https://our.internmc.facebook.com/intern/diff/D28593830)

[ghstack-poisoned]
Labels
cla signed · Merged · module: bc-breaking (Related to a BC-breaking change)
Projects
None yet
Development
Successfully merging this pull request may close these issues:
MultiHeadAttention bias=False regression
5 participants