Fix hanging issue by multiprocessing in SD tutorial and add ETA for VAD processing #4405

Merged
fayejf merged 9 commits into r1.10.0 from mp_eta on Jun 23, 2022

@fayejf fayejf (Collaborator) commented Jun 21, 2022

What does this PR do?

Fixes a hang/deadlock caused by multiprocessing in the speaker diarization (SD) tutorial and adds a tqdm progress bar that shows an ETA, to reduce false alarms about hanging.
Reflects the review comments from #4378.

Collection: ASR

Changelog

  • Cherry-pick the updates from #4317 into r1.10.0 (the code is modified in place here to avoid a sign-off issue).
  • Add instructions and set num_workers=1 in the SD tutorial as a workaround, because Python multiprocessing can hang in some cases.
  • Add a tqdm progress bar that shows an ETA for VAD processing.
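
The two changes above can be sketched together as follows. This is a minimal illustration, not the code from this PR: `process_file` and `run_with_eta` are hypothetical names, and the per-file work is a placeholder. It assumes that `num_workers=1` should bypass the process pool entirely and that tqdm may or may not be installed.

```python
import multiprocessing
from typing import Callable, Iterable, List

try:
    from tqdm import tqdm  # third-party progress bar that prints an ETA
except ImportError:
    # Fallback so the sketch still runs without tqdm: a pass-through wrapper.
    def tqdm(iterable, **kwargs):
        return iterable


def process_file(path: str) -> int:
    """Hypothetical stand-in for per-file VAD work; returns the path length."""
    return len(path)


def run_with_eta(func: Callable, items: Iterable, num_workers: int = 1) -> List:
    """Apply func to every item while reporting progress.

    With num_workers <= 1 no child processes are created, which is the
    workaround assumed here for environments where multiprocessing hangs.
    """
    items = list(items)
    if num_workers <= 1:
        # Serial path: plain loop wrapped in a progress bar.
        return [func(item) for item in tqdm(items, total=len(items), desc="VAD")]
    with multiprocessing.Pool(processes=num_workers) as pool:
        # imap yields results lazily in input order, so the bar advances
        # as results arrive instead of only after all work has finished.
        return list(tqdm(pool.imap(func, items), total=len(items), desc="VAD"))


if __name__ == "__main__":
    files = [f"audio_{i}.wav" for i in range(8)]
    print(run_with_eta(process_file, files, num_workers=1))
```

Passing a larger `num_workers` takes the pool path. Keeping the serial branch separate means the workaround never spawns a child process, rather than creating a pool of size one.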

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>

lgtm-com bot commented Jun 21, 2022

This pull request introduces 1 alert when merging 3a21d4a into faaf02f - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fayejf and others added 3 commits June 22, 2022 10:48
Signed-off-by: fayejf <fayejf07@gmail.com>
Signed-off-by: fayejf <fayejf07@gmail.com>

lgtm-com bot commented Jun 22, 2022

This pull request introduces 1 alert when merging 00b3adc into 40f7fb8 - view on LGTM.com

new alerts:

  • 1 for Unused local variable


lgtm-com bot commented Jun 22, 2022

This pull request introduces 1 alert when merging 4f47b44 into ba57ea9 - view on LGTM.com

new alerts:

  • 1 for Unused local variable


lgtm-com bot commented Jun 22, 2022

This pull request introduces 1 alert when merging 55683f6 into 641c3ce - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@fayejf fayejf merged commit fc13bda into r1.10.0 Jun 23, 2022
@fayejf fayejf deleted the mp_eta branch June 23, 2022 02:49
ericharper pushed a commit that referenced this pull request Jun 24, 2022
…AD processing (#4405)

* cherry-pick pr 4317 and avoid signoff issue

Signed-off-by: fayejf <fayejf07@gmail.com>

* workaround for mp nb issue

Signed-off-by: fayejf <fayejf07@gmail.com>

* tdqm for mp functions in vad_utils

Signed-off-by: fayejf <fayejf07@gmail.com>

* style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove

Signed-off-by: fayejf <fayejf07@gmail.com>
ericharper added a commit that referenced this pull request Jun 27, 2022
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix ASR Typos in tutorials (#4384)

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (#4383)

Signed-off-by: tbartley94 <tbartley@nvidia.com>

Co-authored-by: tbartley94 <tbartley@nvidia.com>
(cherry picked from commit 0322b15)

Co-authored-by: Travis Bartley <Travismbartley@gmail.com>

* Punctuation and capitalization tests race condition (#4399)

* Add draft of race condition fixes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Minor improvements

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* More race condition fixes

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Improve error message

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Improve error message

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Improve error message

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix tutorial typos and docs (#4415)

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add reconfigure on validation epoch start (#4393)

* Add reconfigure on validation epoch start

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove pdb

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* switch branch (#4424)

Signed-off-by: fayejf <fayejf07@gmail.com>

* Add ASR Scores to Docs (#4412)

* Fix link

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Correct model card

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add ASR Results to Docs

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Update info

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Update info

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Re-apply fixes from r1.9.0 (#4425)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Replace all with /content/ (#4427)

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix hanging issue by multiprocessing in SD tutorial and add ETA for VAD processing (#4405)

* cherry-pick pr 4317 and avoid signoff issue

Signed-off-by: fayejf <fayejf07@gmail.com>

* workaround for mp nb issue

Signed-off-by: fayejf <fayejf07@gmail.com>

* tdqm for mp functions in vad_utils

Signed-off-by: fayejf <fayejf07@gmail.com>

* style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* reflect comment

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove

Signed-off-by: fayejf <fayejf07@gmail.com>

* [NLP] P&C Fix multi node cache issue, add pynini guard (#4410)

* add sleep to fix multi node cache issue, add pynini guard

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix lgtm

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add tempfile

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* savfe tmp file to the same dir

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: PeganovAnton <peganoff2@mail.ru>

* fix the notebook (#4438)

Signed-off-by: Yi Dong <yidong@nvidia.com>

* update nemo version dialogue tutorial (#4437)

* docs: add table overflow handling for nested sections (#4441)

Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>

* Docs: Decrease Font Size on Tables  (#4444)

* docs: add table overflow handling for nested sections

* docs: set table font-size to small

Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>

* unify intent slot dataset util functions in tutorials (#4445)

* Notebook bug fix: add subfolder (#4442)

* add subfolder

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* exp_dir update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Fix typo in HiFi-GAN config's max steps (#4446)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Updated notebook to fix batch configuration and precision bugs (#4447)

* Updated notebook to fix batch configuration and precision bugs

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Deleted cell outputs

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Set datasets back to full dataset

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Travis Bartley <Travismbartley@gmail.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yi Dong <43824965+yidong72@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: Nick Goncharenko <8766167+nickolyamba@users.noreply.github.com>
Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Davood-M pushed a commit to Davood-M/NeMo that referenced this pull request Aug 9, 2022
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Hainan Xu <hainanx@nvidia.com>