Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forcing conv subsampling to 32 bit #4293

Merged
merged 6 commits into from
Jun 1, 2022
Merged

Conversation

bmwshop
Copy link
Collaborator

@bmwshop bmwshop commented May 31, 2022

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.
Forces 2D conv subsampling to fp32

Collection: [Note which collection this PR will affect]
ASR

Changelog

  • Add specific line by line info of high level changes in this PR.
    One liner change in nemo/collections/asr/modules/conformer_encoder.py

Usage

  • You can potentially add a usage example below
    No changes
# Add a code snippet demonstrating how to use this 

No changes

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@bmwshop bmwshop requested review from titu1994 and VahidooX May 31, 2022 22:53
@lgtm-com
Copy link

lgtm-com bot commented May 31, 2022

This pull request introduces 1 alert when merging 37ac555 into f6936ce - view on LGTM.com

new alerts:

  • 1 for Syntax error

# added in order to prevent slowdown in bfloat16 with CUDNN v8 API
# to be removed once the above is fixed in cudnn
with torch.cuda.amp.autocast(dtype=torch.float32)
audio_signal, length = self.pre_encode(audio_signal, length)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to explicitly cast the input to float here otherwise it does nothing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is my understanding that there's no need .. e.g. https://pytorch.org/docs/stable/amp.html

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is. Autocast only is disabled. If the input to the function is still fp16, then the operation will still be done in fp16. That's why we needed the explicit cast in SE too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the response and discussion regarding this SE PR - #2030

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm can you print out the input tensor, weight tensor and output tensor dtypes with and without explicit cast and see which one works properly ?

Copy link
Collaborator

@titu1994 titu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm.

Copy link
Collaborator

@VahidooX VahidooX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@titu1994 titu1994 merged commit ed35577 into main Jun 1, 2022
gkucsko pushed a commit to gkucsko/NeMo that referenced this pull request Jun 2, 2022
* forcing conv subsampling to 32 bit

* extra character

* comments added

* moving autocast into subsampling

* lint

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Signed-off-by: Georg Kucsko <gkucsko@gmail.com>
@ericharper ericharper deleted the force_conv_subsampl_32bit branch September 22, 2022 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants