Add SEW CTC models #14158

anton-l · 2021-10-26T11:57:49Z

What does this PR do?

This adds the conversion steps and bug fixes needed to support fine-tuned SEW and SEW-D checkpoints (https://github.com/asappresearch/sew#asr-model-fine-tuned-on-librispeech-train-clean-100h).

TODO

Add model cards with code examples and WER results
Update unsupervised models' weights

anton-l · 2021-10-26T12:01:32Z

src/transformers/models/sew/modeling_sew.py

+        self.project_features = config.conv_dim[-1] != config.hidden_size
+        if self.project_features:
+            self.feature_projection = nn.Linear(config.conv_dim[-1], config.hidden_size)
+        self.feature_dropout = nn.Dropout(config.feat_proj_dropout)


@patrickvonplaten I was wrong about this previously. Both SEW and SEW-D require a projection depending on the model size (larger models have a different hidden_size), and my previous assumption was based only on the tiny checkpoints.

If this design is OK with you, I'll update the already uploaded unsupervised checkpoints as well.

Yes, this sounds good to me! Just to make sure: some checkpoints have the projection and others don't?

Yes, both SEW and SEW-D have checkpoints with and without the projection, depending on the model size.

Basically, any model with config.conv_dim[-1] != config.hidden_size (see the list below) needs the projection, while the others don't. That's how it's implemented in the original sew codebase.
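The rule described above can be sketched in plain Python. This is only an illustration: `ToyConfig` is a hypothetical stand-in for the real config classes, and the dimensions are made up rather than taken from any released checkpoint.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ToyConfig:
    """Hypothetical stand-in for SEWConfig / SEWDConfig (only the two fields used here)."""

    conv_dim: Tuple[int, ...]
    hidden_size: int


def needs_feature_projection(config: ToyConfig) -> bool:
    # Same predicate as in the diff: project only when the width of the last
    # conv layer differs from the transformer hidden size.
    return config.conv_dim[-1] != config.hidden_size


# Illustrative dimensions only, not taken from any released checkpoint.
matching = ToyConfig(conv_dim=(64, 128, 128), hidden_size=128)
mismatched = ToyConfig(conv_dim=(64, 128, 256), hidden_size=384)

print(needs_feature_projection(matching))    # False
print(needs_feature_projection(mismatched))  # True
```

Deriving the flag from the config, instead of storing it as a separate option, means a checkpoint cannot declare a projection that its dimensions do not call for.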

anton-l · 2021-10-26T12:02:10Z

tests/test_modeling_sew.py

+        model = SEWForCTC.from_pretrained("anton-l/sew-tiny-100k-ft-ls100h").to(torch_device)
+        processor = Wav2Vec2Processor.from_pretrained("anton-l/sew-tiny-100k-ft-ls100h", do_lower_case=True)


TODO: change the checkpoint paths to asapp

feel free to do it right away :-)

patrickvonplaten · 2021-10-26T16:15:01Z

src/transformers/models/sew_d/modeling_sew_d.py

@@ -1383,12 +1364,13 @@ def forward(
        extract_features = extract_features.transpose(1, 2)
        extract_features = self.layer_norm(extract_features)

+        if self.project_features:
+            extract_features = self.feature_projection(extract_features)


Do we need both cases here? Which checkpoints have self.project_features = False and which have self.project_features = True?

project_features = False:

sew-tiny-100k

sew-d-small-100k

sew-d-mid-100k

sew-d-mid-k127-100k

sew-d-mid-400k

sew-d-mid-k127-400k

project_features = True:

sew-small-100k

sew-mid-100k

sew-d-tiny-100k

sew-d-base-100k

sew-d-base-plus-100k

sew-d-base-plus-400k
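As a rough sketch, the two lists above could be encoded as a lookup table and used as a sanity check when converting checkpoints. The checkpoint names and flags are copied from this thread; the helper names (`expected_project_features`, `is_consistent`) and the dimensions in the usage lines are hypothetical and not part of the PR.

```python
from typing import Dict

# Checkpoint names and flags copied from the lists in this thread.
PROJECT_FEATURES: Dict[str, bool] = {
    "sew-tiny-100k": False,
    "sew-d-small-100k": False,
    "sew-d-mid-100k": False,
    "sew-d-mid-k127-100k": False,
    "sew-d-mid-400k": False,
    "sew-d-mid-k127-400k": False,
    "sew-small-100k": True,
    "sew-mid-100k": True,
    "sew-d-tiny-100k": True,
    "sew-d-base-100k": True,
    "sew-d-base-plus-100k": True,
    "sew-d-base-plus-400k": True,
}


def expected_project_features(checkpoint: str) -> bool:
    """Return the flag listed for a checkpoint (raises KeyError if unlisted)."""
    return PROJECT_FEATURES[checkpoint]


def is_consistent(checkpoint: str, conv_dim_last: int, hidden_size: int) -> bool:
    """Check that the given dimensions imply the flag listed for the checkpoint."""
    return (conv_dim_last != hidden_size) == expected_project_features(checkpoint)


# The dimensions below are made up for illustration.
print(expected_project_features("sew-d-tiny-100k"))  # True
print(is_consistent("sew-small-100k", 256, 384))     # True
print(is_consistent("sew-tiny-100k", 256, 384))      # False
```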

patrickvonplaten · 2021-10-26T16:15:33Z

src/transformers/models/sew_d/modeling_sew_d.py

 class SEWDModel(SEWDPreTrainedModel):
    def __init__(self, config: SEWDConfig):
        super().__init__(config)
        self.config = config
        self.feature_extractor = SEWDFeatureExtractor(config)
        self.layer_norm = nn.LayerNorm(config.conv_dim[-1], eps=config.layer_norm_eps)
-        self.feature_projection = nn.Linear(config.conv_dim[-1], config.hidden_size)
+
+        self.project_features = config.conv_dim[-1] != config.hidden_size


We need both cases, I assume?

* Add SEW CTC models * Update paths * Update paths

anton-l added 2 commits October 26, 2021 14:51

Add SEW CTC models

7af110d

Update paths

5aebb17

anton-l requested a review from patrickvonplaten October 26, 2021 11:57

anton-l commented Oct 26, 2021

patrickvonplaten reviewed Oct 26, 2021

patrickvonplaten approved these changes Oct 26, 2021

anton-l added 2 commits October 27, 2021 12:18

Update paths

6558efa

Merge remote-tracking branch 'upstream/master' into add-sew-ctc

a848d8d

anton-l merged commit e1dc5af into huggingface:master Oct 27, 2021

anton-l mentioned this pull request Oct 27, 2021

SEW - Masked Spec errors out in training #14171

Closed

anton-l deleted the add-sew-ctc branch October 27, 2021 17:22

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022

Add SEW CTC models (huggingface#14158)

726f94f

* Add SEW CTC models * Update paths * Update paths
