Added Feature: Prefix decoding for wav2vec2 models #11606

Closed · wants to merge 5 commits into from

Conversation

@deepang17 deepang17 commented May 6, 2021

What does this PR do?

Added the code for prefix decoding for wav2vec2 based models.

Fixes #11283

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@patrickvonplaten @patil-suraj

@deepang17 (Author) commented May 6, 2021

  • Currently the code supports prefix decoding without an LM (the no-LM baseline is sketched below). I am still working on integrating the KenLM version.

Problem faced currently: I created a custom KenLM build and tried to run the code, but it stops without throwing any error at the line results = self.decoder.decode(emissions_ptr, T, N). I am currently trying to fix it. (RESOLVED)

  • Shall I create a .sh or .txt file with guidance on how to install the flashlight dependencies?
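That no-LM baseline amounts to greedy (best-path) CTC decoding over the wav2vec2 logits. A minimal sketch, assuming the facebook/wav2vec2-base-960h checkpoint used elsewhere in this thread and a dummy waveform in place of real audio:

```python
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Stand-in for a real 16 kHz mono waveform
speech = np.zeros(16_000, dtype=np.float32)
input_values = processor(speech, sampling_rate=16_000, return_tensors="pt").input_values

with torch.no_grad():
    logits = model(input_values).logits  # (batch, time, vocab)

# Greedy (best-path) CTC decoding: argmax per frame, then let the tokenizer
# collapse repeats and strip blank tokens.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
print(transcription)
```

Prefix (beam-search) decoding with an LM instead keeps several candidate prefixes per frame and rescores them with the language model, which is what the flashlight decoders in this PR provide.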

@deepang17 deepang17 changed the title ADDED FEATURE: Prefix decoding for wav2vec2 models [WIP] ADDED FEATURE: Prefix decoding for wav2vec2 models May 6, 2021
@deepang17 (Author) commented

Performance:
Model: facebook/wav2vec2-base-960h
Dataset: timit_asr, clean, test[:5%]
Viterbi decoding: WER 0.115
KenLM decoding: WER 0.098
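These WER values are obtained by comparing the decoder output against the reference transcriptions; a minimal sketch of the metric computation with the datasets library (illustrative strings, not taken from the PR script):

```python
from datasets import load_metric

wer_metric = load_metric("wer")

# Illustrative inputs: in practice, predictions come from the decoder output
# and references from the dataset's ground-truth transcriptions.
predictions = ["a cat sat on the mat"]
references = ["the cat sat on the mat"]

print("wer:", wer_metric.compute(predictions=predictions, references=references))
```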

@deepang17 deepang17 changed the title [WIP] ADDED FEATURE: Prefix decoding for wav2vec2 models ADDED FEATURE: Prefix decoding for wav2vec2 models May 7, 2021
@patrickvonplaten (Contributor) commented May 13, 2021

Wuhuhu! This is an amazing contribution @deepang17 - Super exciting to merge this notebook :-) And yes, it would be great if you could add a section to the README.md that explains how to use your script + maybe with some results (using Prefix decoding vs. not using it on e.g. Timit_asr and/or Librispeech evaluation - kinda like you already did above). I'm also very happy to help you run some evals!

@patrickvonplaten patrickvonplaten left a comment

Longterm we could even think about merging this into src/transformers/models/wav2vec2/ - but for now this is great!

@patrickvonplaten patrickvonplaten changed the title ADDED FEATURE: Prefix decoding for wav2vec2 models Added Feature: Prefix decoding for wav2vec2 models May 13, 2021
@deepang17 (Author) commented May 13, 2021

Thank you for the appreciation. I will make the required changes to README.md and push a commit soon.

@samuelazran commented May 18, 2021

> Problem faced currently: I created a custom KenLM build and tried to run the code, but it stops without throwing any error at the line results = self.decoder.decode(emissions_ptr, T, N). I am currently trying to fix it. (RESOLVED)

@deepang17
Did you push that fix? I've tried your code and it crashes at "self.decoder.decode". What was your fix?

What is the status of this PR?

@deepang17 (Author) commented

> Did you push that fix? I've tried your code and it crashes at "self.decoder.decode". What was your fix?

You can fix it by replacing !cmake .. -DCMAKE_BUILD_TYPE=Release -DKENLM_MAX_ORDER=20 -DCMAKE_POSITION_INDEPENDENT_CODE=ON with !cmake ..

@samuelazran commented

> You can fix it by replacing !cmake .. -DCMAKE_BUILD_TYPE=Release -DKENLM_MAX_ORDER=20 -DCMAKE_POSITION_INDEPENDENT_CODE=ON with !cmake ..

Can you please publish a Google Colab or a bash script to do the installation? I couldn't figure out where to make the change you suggested in the build; I've used the Google Colab example from flashlight.

Review comment on the changed code:

    with torch.no_grad():
        logits = model(input_values).logits

    target_dictionary = [t for t in processor.tokenizer.get_vocab().keys()]

@joaoalvarenga joaoalvarenga commented Jun 4, 2021

Testing W2lViterbiDecoder, I figured out that this list must be ordered by the original token index.

Suggested change:

    -target_dictionary = [t for t in processor.tokenizer.get_vocab().keys()]
    +vocab = processor.tokenizer.get_vocab()
    +target_dictionary = sorted(vocab.keys(), key=lambda k: vocab[k])
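The ordering matters because the emission matrix has one column per token id and the flashlight decoders index their symbol set by position, so the list handed to the decoder must line up with the tokenizer's token ids; iterating over get_vocab().keys() gives no such guarantee.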

@tommy19970714 commented Jun 7, 2021

@deepang17 Thank you for your amazing work!
I made a Google Colab to reproduce this pull request. @samuelazran you can check it here:
https://colab.research.google.com/drive/1HHEBS3I4biQ8ZDyfJDtHi4E4onOtYe46?usp=sharing

Viterbi decoding works well, but KenLM decoding has the following error.

  File "run_wav2vec2_eval_with_lm.py", line 292, in <module>
    main()
  File "run_wav2vec2_eval_with_lm.py", line 281, in main
    results = selected_dataset.map(map_to_result)
  File "/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py", line 1606, in map
    desc=desc,
  File "/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py", line 176, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/datasets/fingerprint.py", line 397, in wrapper
    out = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py", line 1911, in _map_single
    example = apply_function_on_filtered_inputs(example, i, offset=offset)
  File "/usr/local/lib/python3.7/dist-packages/datasets/arrow_dataset.py", line 1826, in apply_function_on_filtered_inputs
    function(*fn_args, effective_indices, **fn_kwargs) if with_indices else function(*fn_args, **fn_kwargs)
  File "run_wav2vec2_eval_with_lm.py", line 265, in map_to_result
    decoder = W2lKenLMDecoder(eval_args, target_dictionary)
  File "run_wav2vec2_eval_with_lm.py", line 201, in __init__
    self.lm = KenLM(args.kenlm_model, self.word_dict)
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. flashlight.lib.text.flashlight_lib_text_decoder.KenLM(path: str, usr_token_dict: fl::lib::text::Dictionary)

Invoked with: None, <flashlight.lib.text.flashlight_lib_text_dictionary.Dictionary object at 0x7fe0ef7294b0>

@deepang17 Do you know this error? The second argument passed to flashlight.lib.text.flashlight_lib_text_decoder.KenLM is exactly the dict obtained from flashlight.lib.text.dictionary.create_word_dict.
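For what it's worth, the "Invoked with: None, ..." line in the traceback indicates that args.kenlm_model ended up as None, so the KenLM constructor receives no model path. A minimal sketch of the expected call, based only on the constructor signature shown in the error message (file names are placeholders, not from the PR):

```python
from flashlight.lib.text.dictionary import create_word_dict, load_words
from flashlight.lib.text.decoder import KenLM

# lexicon.txt maps words to their token spellings; lm.bin is a KenLM binary/arpa model
lexicon = load_words("lexicon.txt")
word_dict = create_word_dict(lexicon)

# KenLM(path: str, usr_token_dict: Dictionary) -- the path must point to an existing LM file
lm = KenLM("lm.bin", word_dict)
```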

@patrickvonplaten (Contributor) commented

@deepang17 - do you have updates regarding the README.md script? :-) I can take over the PR by next week otherwise!

@deepang17 (Author) commented

Hello @patrickvonplaten, sorry for the delay. I was occupied due to some personal issues. I am on the verge of completing the README.md and will commit the updated version soon.

@tommy19970714 commented

@deepang17 Any updates?

@shiva1393 commented

@deepang17 Any updates?

@github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Aug 1, 2021
@patrickvonplaten (Contributor) commented

This PR seems to have been stuck for quite some time now. Is anyone interested in finishing / testing it?

Otherwise it might be better to start fresh with a blog post / Colab that explains how to build a complete end-to-end ASR system - cc @anton-l

@github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Sep 8, 2021
@hbasafa commented Nov 2, 2021

@patrickvonplaten
I'm in! I've searched this topic and it seems there is no official implementation yet; it would be great to add this feature. If it is still in the backlog, I would be happy to contribute. Looking forward to hearing from you!

@patrickvonplaten (Contributor) commented

Hey @hbasafa,

I'm now working on this topic full time.

We will most likely foster a closer collaboration between pyctcdecode and Transformers. Here is a GitHub repo that shows how to use pyctcdecode with Wav2Vec2 for LM-supported decoding. It works quite well with KenLM.
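A rough sketch of the pyctcdecode + Wav2Vec2 approach mentioned above (the checkpoint name, LM path, and alpha/beta values are illustrative, not taken from this PR):

```python
import numpy as np
import torch
from pyctcdecode import build_ctcdecoder
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Labels must be ordered by token id (same point as the review comment above).
vocab = processor.tokenizer.get_vocab()
labels = sorted(vocab.keys(), key=lambda k: vocab[k])

decoder = build_ctcdecoder(
    labels,
    kenlm_model_path="lm.arpa",  # placeholder path to a KenLM arpa/binary model
    alpha=0.5,                   # LM weight (illustrative)
    beta=1.0,                    # word insertion bonus (illustrative)
)

# Stand-in for a real 16 kHz mono waveform
speech = np.zeros(16_000, dtype=np.float32)
input_values = processor(speech, sampling_rate=16_000, return_tensors="pt").input_values
with torch.no_grad():
    logits = model(input_values).logits[0].cpu().numpy()

print(decoder.decode(logits))
```

Compared with the flashlight-based decoder in this PR, pyctcdecode is pure Python and pip-installable, which sidesteps the cmake/flashlight build issues discussed earlier in the thread.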

@deepang17 deepang17 deleted the wav2vec2contribution branch November 4, 2021 17:35
@hbasafa commented Nov 5, 2021

Nice one! I will check it out.

As I was in a hurry, I've already used this code, which can be installed easily via pip.
The code sample is also provided there.
Now I'm also focusing on adding other decoding strategies there.

Thank you for sharing, @patrickvonplaten!

@machakos23 commented

> We will most likely foster a closer collaboration between pyctcdecode and Transformers.

hi @patrickvonplaten - this is great news. Where is the best place to follow your progress?

@patrickvonplaten (Contributor) commented Nov 15, 2021

This PR: #14339

It all depends a bit on how fast we can merge a load_from_hf_hub function into pyctcdecode.

Successfully merging this pull request may close these issues:

Beam search decoding and language model integration for Wav2Vec2ForCTC models