[ASR] Add official ASR CTC example to `examples/pytorch/speech-recognition` #13620

Merged

patrickvonplaten merged 33 commits into huggingface:master from patrickvonplaten:add_asr_example

Sep 24, 2021

Contributor

patrickvonplaten commented Sep 17, 2021

This PR adds a generic CTC speech recognition example. It has been tested with single-GPU and distributed training on Common Voice and is currently being tested on Librispeech.

Once `datasets` has merged https://github.com/huggingface/datasets/pull/2324/files and made a new release, I will slightly adapt the script to leverage the new audio feature.

A couple of example runs with this script:

This example folder should eventually have two additional scripts, one for Seq2Seq ASR and one for CTC + LM decoding, which are left for future work.

patrickvonplaten added 4 commits

September 16, 2021 17:57

up

9d169c4


          rename

873ae06


          add asr example

ef9b969


          add auto feature extractor

adc9a66

This was referenced Sep 17, 2021

[Trainer] Add nan/inf logging filter #13619

Merged

AutoTokenizer - add from_model_name method #13623

Closed

patrickvonplaten added 5 commits

September 18, 2021 00:18


          some more fixes

769aa5b


          correct layerdrop

c7e4845


          correct for multi-gpu dist

5306cdd


          Merge branch 'master' of https://github.com/huggingface/transformers …

f85401b

…into add_asr_example


          clean up

97936d3

LysandreJik mentioned this pull request

some error when I finetune wav2vec2 by rum_common_voice.py #13651

Closed

patrickvonplaten added 9 commits

September 21, 2021 22:54


          refactor

712fbc7


          Merge branch 'master' of https://github.com/huggingface/transformers …

1fa221c

…into add_asr_example


          refactor

2c40b50


          more fixes

a8b51f3


          more fixes

3c217c2


          Merge branch 'master' of https://github.com/huggingface/transformers …

24d4e27

…into add_asr_example


          clean-up

0e39d2f


          finish

30f2611

up

0c93c7a

patrickvonplaten commented

examples/pytorch/speech-recognition/README.md Outdated


          Apply suggestions from code review

60c1b9c

patrickvonplaten commented

src/transformers/models/hubert/configuration_hubert.py

patrickvonplaten commented

src/transformers/models/wav2vec2/configuration_wav2vec2.py

patrickvonplaten commented

src/transformers/models/hubert/configuration_hubert.py

patrickvonplaten commented

src/transformers/models/wav2vec2/configuration_wav2vec2.py

patrickvonplaten commented

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

+                  # 3. Next, we create the vocabulary of the model by extracting all unique characters from
+                  # the training and evaluation datasets
+                  # We need to make sure that only first rank saves vocabulary
+                  if training_args.world_size == 1 or dist.get_rank() == 0:

Contributor Author

patrickvonplaten Sep 22, 2021

This caused me a headache for 3 days: in distributed training, each process was creating a different ordering of characters in the vocabulary, which essentially meant that each process had different label ids.

By using sorted(...) and making sure that only the first process creates & saves the vocabulary, the problem is solved.

Member

LysandreJik Sep 22, 2021

Nice find!

patrickvonplaten commented

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

patrickvonplaten changed the title ~~[WIP][ASR] Add official ASR CTC example to examples/pytorch/speech-recognition~~ [ASR] Add official ASR CTC example to examples/pytorch/speech-recognition

patrickvonplaten mentioned this pull request

Fine-Tuning Wav2Vec2 with PyTorch DDP #13660

Closed

patrickvonplaten requested review from sgugger, patil-suraj, anton-l, lhoestq and albertvillanova

September 22, 2021 17:37

patrickvonplaten added 2 commits

September 22, 2021 21:58


          update

3f7cf6f


          Merge branch 'add_asr_example' of https://github.com/patrickvonplaten…

a68952a

…/transformers into add_asr_example

sgugger approved these changes

Collaborator

sgugger left a comment

Thanks a lot for adding this example and great job figuring out the problem in a distributed setup!

examples/pytorch/speech-recognition/README.md Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

up

376f1fa

patil-suraj approved these changes

Contributor

patil-suraj left a comment

Looks really good! Thanks for adding this example

examples/pytorch/speech-recognition/requirements.txt Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated


          add note

39fd4e7

patil-suraj reviewed

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated


          apply surajs suggestions

f9ea79a

patrickvonplaten commented

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

patrickvonplaten and others added 3 commits

September 23, 2021 11:19


          Apply suggestions from code review

a1b8bda

Co-authored-by: Suraj Patil <surajp815@gmail.com>


          isort

d84f050


          small change

cd19fb2

anton-l approved these changes

Member

anton-l left a comment

Looks good, thank you very much for figuring out the DDP problem!

The torchaudio loader seems to be the best fit for the example 🙂
Although I think Windows users will be out of luck when they try to load mp3s (soundfile is used as the backend there, and it specifically excludes mp3: http://www.mega-nerd.com/libsndfile/#Features)

P.S. So sorry for the typo spam 😅

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/README.md

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

src/transformers/models/hubert/configuration_hubert.py Outdated

src/transformers/models/hubert/configuration_hubert.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated

patrickvonplaten and others added 4 commits

September 23, 2021 16:51


          Apply suggestions from code review

07bebd9

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>


          Apply suggestions from code review

9c2fabc

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>


          add hubert

32c47b2


          Merge branch 'add_asr_example' of https://github.com/patrickvonplaten…

25fe53a

…/transformers into add_asr_example

patrickvonplaten commented

examples/pytorch/speech-recognition/run_speech_recognition_ctc.py Outdated


          Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py

40f3dc8

patrickvonplaten merged commit 4a320f6 into huggingface:master

patrickvonplaten deleted the add_asr_example branch

September 24, 2021 05:01

stas00 pushed a commit to stas00/transformers that referenced this pull request


          [ASR] Add official ASR CTC example to `examples/pytorch/speech-recogn…

b6d5a14

…ition` (huggingface#13620)

* up

* rename

* add asr example

* add auto feature extractor

* some more fixes

* correct layerdrop

* correct for multi-gpu dist

* clean up

* refactor

* refactor

* more fixes

* more fixes

* clean-up

* finish

* up

* Apply suggestions from code review

* fix isort

* update

* up

* add note

* apply surajs suggestions

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* isort

* small change

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* add hubert

* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request


          [ASR] Add official ASR CTC example to `examples/pytorch/speech-recogn…

42b6abd

…ition` (huggingface#13620)

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request


          [ASR] Add official ASR CTC example to `examples/pytorch/speech-recogn…

50de209

…ition` (huggingface#13620)

Reviewers

patil-suraj approved these changes

LysandreJik left review comments

anton-l approved these changes

sgugger approved these changes

lhoestq: awaiting requested review

albertvillanova: awaiting requested review