conv_pointwise_type parameter #6476
for more information, see https://pre-commit.ci
Signed-off-by: Dima Rekesh <bmwshop@gmail.com>
titu1994
left a comment
Looks good, but this needs to be documented in the config section of the NeMo ASR docs.
Should we add the docs to this PR or do a separate PR for them?
Same PR would be good.
The only condition that needs to be met is that **the final layer of the acoustic model must have the hidden dimension defined in ``model_defaults.enc_hidden``**.

Conformer Encoder
Is this for Fast Conformer? Some of the flags are for Fast Conformer.
    self.pointwise_conv2 = nn.Conv1d(
        in_channels=dw_conv_input_dim, out_channels=d_model, kernel_size=1, stride=1, padding=0, bias=True
    )
elif self.conv_pointwise_type == 'linear':
I wonder: a Conv1d with kernel size 1 is mathematically equivalent to a linear layer, so what if we silently convert the conv kernel to a linear-compatible weight shape and, instead of using an explicit Linear layer, call functional.linear()?
My assumption is that we don't want to break checkpoint compatibility, and we don't need to. We can take a model trained with Conv1d and use its weight in F.linear() after weight conversion. That way old models stay checkpoint-compatible and still get the speedup.
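A minimal sketch of the weight conversion suggested above, assuming plain PyTorch and made-up channel sizes (`d_in`, `d_out`, and the tensor shapes are illustrative, not taken from the PR). It only checks that the two paths agree numerically; it does not measure any speedup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
d_in, d_out = 8, 4

# A pointwise convolution like the one in the diff (kernel_size=1).
conv = nn.Conv1d(in_channels=d_in, out_channels=d_out, kernel_size=1, stride=1, padding=0, bias=True)

x = torch.randn(2, d_in, 5)  # (batch, channels, time)

# Reference path: the original Conv1d forward pass.
y_conv = conv(x)

# Conv1d stores its weight as (out_channels, in_channels, kernel_size).
# With kernel_size=1, dropping the last axis gives a Linear-style weight.
w = conv.weight.squeeze(-1)  # (d_out, d_in)

# F.linear acts on the last dimension, so move channels last and back again.
y_lin = F.linear(x.transpose(1, 2), w, conv.bias).transpose(1, 2)

# Largest elementwise difference between the two paths.
max_abs_diff = (y_conv - y_lin).abs().max().item()
print(max_abs_diff)
```

Because the conversion happens at call time, the stored parameter keeps its Conv1d shape, so an existing `state_dict` would still load unchanged in this sketch.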
titu1994
left a comment
Minor comments
@bmwshop
What does this PR do?
Adds the conv_pointwise_type parameter to ConformerEncoder, enabling the use of nn.Linear() for a 10% acceleration.
Collection: ASR
Changelog
Adding the conv_pointwise_type parameter to
Usage
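As a rough illustration of what the option selects, here is a standalone sketch that is not NeMo code. `PointwiseProjection` is a hypothetical helper, and the value name `'conv1d'` is an assumption; only `'linear'` appears in the diff.

```python
import torch
import torch.nn as nn


class PointwiseProjection(nn.Module):
    """Hypothetical helper: a pointwise projection over the channel axis."""

    def __init__(self, d_in: int, d_out: int, conv_pointwise_type: str = "conv1d"):
        super().__init__()
        self.conv_pointwise_type = conv_pointwise_type
        if conv_pointwise_type == "conv1d":  # assumed name for the existing behaviour
            self.proj = nn.Conv1d(d_in, d_out, kernel_size=1, stride=1, padding=0, bias=True)
        elif conv_pointwise_type == "linear":
            self.proj = nn.Linear(d_in, d_out, bias=True)
        else:
            raise ValueError(f"Unknown conv_pointwise_type: {conv_pointwise_type!r}")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time).
        if self.conv_pointwise_type == "linear":
            # nn.Linear acts on the last axis, so move channels last and back.
            return self.proj(x.transpose(1, 2)).transpose(1, 2)
        return self.proj(x)


x = torch.randn(2, 8, 5)
out_conv = PointwiseProjection(8, 4, conv_pointwise_type="conv1d")(x)
out_linear = PointwiseProjection(8, 4, conv_pointwise_type="linear")(x)
print(out_conv.shape, out_linear.shape)
```

Both variants map `(batch, channels, time)` to the same output shape, so the rest of the block would not need to know which one is in use.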
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information