[Speech] Refactor Examples #14040

patrickvonplaten · 2021-10-17T21:01:44Z

What does this PR do?

This PR adapts all Wav2Vec2-like models to use the same examples and makes sure that ...ForCTC and ...ForSequenceClassification have exactly the same structure across models. HuBERT, SEW, SEW-D, and soon UniSpeech and other Wav2Vec2-based models usually don't have model-specific heads; they should simply follow the Wav2Vec2 head design for CTC and the SUPERB head design for sequence classification (speaker verification, ...). We should therefore make it as easy as possible to add such heads to new Wav2Vec2 variants.

This PR makes sure that a simple # Copied from ... comment can be used for such heads, which should let us work faster when adding new speech models while keeping the design unified and correct.
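To illustrate the idea behind such a comment, the sketch below parses the "with A->B, C->D" part and applies the substitutions to a toy piece of source text. The helper names `parse_replacements` and `apply_replacements` and the toy class body are hypothetical, and this is not the repository's actual copy-checking script, only a minimal sketch that assumes plain string substitution.

```python
from typing import List, Tuple


def parse_replacements(comment: str) -> List[Tuple[str, str]]:
    """Extract (old, new) pairs from the 'with A->B, C->D' part of a comment."""
    if " with " not in comment:
        return []
    spec = comment.split(" with ", 1)[1]
    pairs = []
    for item in spec.split(","):
        old, new = item.strip().split("->")
        pairs.append((old.strip(), new.strip()))
    return pairs


def apply_replacements(source: str, pairs: List[Tuple[str, str]]) -> str:
    """Apply each (old, new) substitution in order using plain string replace."""
    for old, new in pairs:
        source = source.replace(old, new)
    return source


# Toy comment and toy source text (hypothetical, for illustration only).
comment = "# Copied from some.module.Wav2Vec2ForCTC with Wav2Vec2->Hubert, wav2vec2->hubert"
original = (
    "class Wav2Vec2ForCTC:\n"
    "    def __init__(self, config):\n"
    "        self.wav2vec2 = Wav2Vec2Model(config)\n"
)
copied = apply_replacements(original, parse_replacements(comment))
print(copied)
```

Under these assumptions, a head written once for Wav2Vec2 can be reproduced for another model purely by renaming, which is why keeping the heads structurally identical matters.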

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

patrickvonplaten · 2021-10-17T21:59:31Z

Wait until #14026 (comment) is fixed

patrickvonplaten · 2021-10-18T09:55:09Z

src/transformers/models/auto/modeling_auto.py

@@ -476,6 +476,8 @@
        # Model for Audio Classification mapping
        ("wav2vec2", "Wav2Vec2ForSequenceClassification"),
        ("hubert", "HubertForSequenceClassification"),
+        ("sew", "SEWForSequenceClassification"),


Similar to how all BERT heads (ForQA, ForSequenceClass, ForMC, ...) are added to all BERT-like models for easy comparison and added functionality, all speech models should have the SUPERB heads.

LysandreJik

This is great! 100% agree that all models should have the SUPERB heads

LysandreJik · 2021-10-18T10:02:47Z

Yes, agreed!
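The mapping touched in this thread pairs a model type string with the name of its classification head class. A minimal sketch of that lookup pattern follows; the names `AUDIO_CLASSIFICATION_HEADS` and `head_class_name` are hypothetical stand-ins, and only the three entries visible in the diff are included.

```python
from collections import OrderedDict

# Toy stand-in for the audio classification mapping (hypothetical name).
# The entries are the three pairs visible in the diff hunk.
AUDIO_CLASSIFICATION_HEADS = OrderedDict(
    [
        ("wav2vec2", "Wav2Vec2ForSequenceClassification"),
        ("hubert", "HubertForSequenceClassification"),
        ("sew", "SEWForSequenceClassification"),
    ]
)


def head_class_name(model_type: str) -> str:
    """Return the registered head class name for a model type, or raise ValueError."""
    try:
        return AUDIO_CLASSIFICATION_HEADS[model_type]
    except KeyError:
        raise ValueError(
            f"No audio classification head registered for model type {model_type!r}"
        ) from None


print(head_class_name("sew"))
```

With this kind of table, supporting a new speech model in the generic entry point amounts to adding one pair, provided the head class itself exists.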

LysandreJik · 2021-10-18T10:03:52Z

src/transformers/models/hubert/modeling_hubert.py

-            output = (logits,) + outputs[1:]
+            output = (logits,) + outputs[_HIDDEN_STATES_START_POSITION:]


This is way simpler to understand! We should do something like that for BERT & friends too
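A small sketch of why the named constant reads better than a bare index: the slice start documents where the hidden states begin in the base model's output tuple. The value 1 is an assumption inferred from the replaced `outputs[1:]`, and the function `build_tuple_output` and the string placeholders are hypothetical.

```python
# Assumed value: it matches the `outputs[1:]` slice that the diff replaces.
_HIDDEN_STATES_START_POSITION = 1


def build_tuple_output(logits, outputs, loss=None):
    """Combine head logits with the base model's trailing outputs (tuple return path)."""
    output = (logits,) + outputs[_HIDDEN_STATES_START_POSITION:]
    # Prepend the loss only when labels were provided and a loss was computed.
    return ((loss,) + output) if loss is not None else output


# String placeholders stand in for tensors in this toy example.
base_outputs = ("last_hidden_state", "hidden_states", "attentions")
result = build_tuple_output("logits", base_outputs)
result_with_loss = build_tuple_output("logits", base_outputs, loss="loss")
print(result)
print(result_with_loss)
```

A model whose base output tuple has extra leading entries would only need a different constant, while the head code stays identical.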

tests/test_modeling_sew.py

sgugger

This is great! Thanks a lot for refactoring this.

anton-l

Looks very clean now 👍

anton-l · 2021-10-18T13:38:58Z

src/transformers/models/hubert/modeling_hubert.py

@@ -1141,8 +1124,8 @@ def forward(
    """,
    HUBERT_START_DOCSTRING,
 )
+# Copied from transformers.models.wav2vec2.modeling_wav2vec2.Wav2Vec2ForSequenceClassification with Wav2Vec2->Hubert, wav2vec2->hubert, WAV_2_VEC_2->HUBERT


Great that it works now!

patrickvonplaten added 5 commits October 17, 2021 22:57

adapt_examples

1a54817

up

c6d8591

up

4e0a5d2

up

6fc9bae

up

d05ed62

add auto models

9626990

patrickvonplaten requested review from anton-l, sgugger and LysandreJik October 18, 2021 09:50

patrickvonplaten commented Oct 18, 2021

LysandreJik approved these changes Oct 18, 2021

sgugger approved these changes Oct 18, 2021

anton-l approved these changes Oct 18, 2021

patrickvonplaten added 2 commits October 18, 2021 17:18

finish

bbfa732

Merge branch 'master' of https://github.com/huggingface/transformers into refactor_speech_heads

5136e12

patrickvonplaten merged commit d5ff69f into huggingface:master Oct 18, 2021

patrickvonplaten deleted the refactor_speech_heads branch October 18, 2021 15:43

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022

[Speech] Refactor Examples (huggingface#14040)

8a165d3

* adapt_examples * up * up * up * up * add auto models * finish
