Add SEW CTC models #14158

Merged: 4 commits into huggingface:master on Oct 27, 2021

Conversation

@anton-l (Member) commented Oct 26, 2021

What does this PR do?

This adds the conversion steps and bugfixes to support finetuned SEW and SEW-D checkpoints (https://github.com/asappresearch/sew#asr-model-fine-tuned-on-librispeech-train-clean-100h)

TODO

  • Add model cards with code examples and WER results
  • Update unsupervised models' weights
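For the model cards, a usage example could look roughly like the following. This is a minimal sketch, not the final model-card snippet: the repo id is taken from the test snippet further down (slated to move to the asapp org), and the use of the hf-internal-testing/librispeech_asr_dummy dataset is an assumption.

import torch
from datasets import load_dataset
from transformers import SEWForCTC, Wav2Vec2Processor

model_id = "anton-l/sew-tiny-100k-ft-ls100h"  # repo id from the test below; expected to move to asapp
model = SEWForCTC.from_pretrained(model_id)
processor = Wav2Vec2Processor.from_pretrained(model_id, do_lower_case=True)

# Load a single 16 kHz speech sample (dummy LibriSpeech split used in tests).
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
inputs = processor(ds[0]["audio"]["array"], sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: argmax per frame; the tokenizer collapses repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])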

Comment on lines +783 to +786
self.project_features = config.conv_dim[-1] != config.hidden_size
if self.project_features:
    self.feature_projection = nn.Linear(config.conv_dim[-1], config.hidden_size)
self.feature_dropout = nn.Dropout(config.feat_proj_dropout)
@anton-l (Member Author) commented Oct 26, 2021:

@patrickvonplaten I was wrong about this previously. Both SEW and SEW-D require the projection depending on the model size (larger models have different hidden_sizes); my previous assumption was based only on the tiny checkpoints.

If this design is OK with you, I'll update the already uploaded unsupervised checkpoints as well.

Contributor:

Yes, this sounds good to me! Just to make sure: some checkpoints have the projection and others don't?

@anton-l (Member Author):

Yes, both SEW and SEW-D have checkpoints with and without the projection (depending on the size).

@anton-l (Member Author) commented Oct 26, 2021:

Basically, any model with config.conv_dim[-1] != config.hidden_size (see the list below) needs the projection, while the others don't. That's how it's implemented in the original sew codebase.
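For illustration, that condition can be checked straight from a checkpoint's config. A minimal sketch, assuming the checkpoint is resolvable via AutoConfig and that the anton-l/sew-tiny-100k repo id (where the unsupervised checkpoints were initially uploaded) is illustrative:

from transformers import AutoConfig

def needs_projection(repo_id: str) -> bool:
    # A checkpoint needs the feature projection exactly when the last
    # conv dimension differs from the transformer hidden size.
    config = AutoConfig.from_pretrained(repo_id)
    return config.conv_dim[-1] != config.hidden_size

print(needs_projection("anton-l/sew-tiny-100k"))  # expected: False, per the list below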

Comment on lines 535 to 536
model = SEWForCTC.from_pretrained("anton-l/sew-tiny-100k-ft-ls100h").to(torch_device)
processor = Wav2Vec2Processor.from_pretrained("anton-l/sew-tiny-100k-ft-ls100h", do_lower_case=True)
@anton-l (Member Author):

TODO: change to asapp

Contributor:

feel free to do it right away :-)
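For reference, the updated test lines would presumably look like this. A sketch, assuming the checkpoint is mirrored under the asapp org with the same name (torch_device comes from the surrounding test setup):

# Hypothetical final repo id, pending the move to the asapp org:
model = SEWForCTC.from_pretrained("asapp/sew-tiny-100k-ft-ls100h").to(torch_device)
processor = Wav2Vec2Processor.from_pretrained("asapp/sew-tiny-100k-ft-ls100h", do_lower_case=True)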

@@ -1383,12 +1364,13 @@ def forward(
        extract_features = extract_features.transpose(1, 2)
        extract_features = self.layer_norm(extract_features)

        if self.project_features:
            extract_features = self.feature_projection(extract_features)
Contributor:

Do we need both cases here? Which checkpoints have self.project_features = False and which have self.project_features = True?

@anton-l (Member Author):

project_features = False:

  • sew-tiny-100k
  • sew-d-small-100k
  • sew-d-mid-100k
  • sew-d-mid-k127-100k
  • sew-d-mid-400k
  • sew-d-mid-k127-400k

project_features = True:

  • sew-small-100k
  • sew-mid-100k
  • sew-d-tiny-100k
  • sew-d-base-100k
  • sew-d-base-plus-100k
  • sew-d-base-plus-400k

class SEWDModel(SEWDPreTrainedModel):
    def __init__(self, config: SEWDConfig):
        super().__init__(config)
        self.config = config
        self.feature_extractor = SEWDFeatureExtractor(config)
        self.layer_norm = nn.LayerNorm(config.conv_dim[-1], eps=config.layer_norm_eps)
        self.feature_projection = nn.Linear(config.conv_dim[-1], config.hidden_size)

        self.project_features = config.conv_dim[-1] != config.hidden_size
Contributor:

We need both cases here, I assume?

@anton-l anton-l merged commit e1dc5af into huggingface:master Oct 27, 2021
@anton-l anton-l deleted the add-sew-ctc branch October 27, 2021 17:22
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Add SEW CTC models

* Update paths

* Update paths