This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Feature/asr #251

Closed
wants to merge 23 commits into from


rafaelvp-db

Initial PR for ASR support

@rafaelvp-db
Author

Hey @SeanNaren, I just made the PR. It would be great if you could take a look and share some insights :)

@SeanNaren
Contributor

Hi @rafaelvp-db, thanks for making the PR!

There are quite a few pieces missing, but hopefully I can help you get the right things implemented!

Firstly, I don't think your current code is entirely correct (unless I've made a mistake). The Wav2Vec model, dataset and tokenizer are completely different from the existing ones and should probably exist as new classes inheriting from TaskTransformer/TransformerDataModule.

I think a good approach would be to implement this blog post as a TaskTransformer and a TransformerDataModule, as you've already outlined: https://huggingface.co/blog/fine-tune-wav2vec2-english

This would involve

Overall, I would expect something like this to work:

import pytorch_lightning as pl

from lightning_transformers.task.audio.speech_recognition import (
    SpeechRecognitionDataConfig,
    SpeechRecognitionDataModule,
    SpeechRecognitionTransformer,
)

if __name__ == "__main__":
    model = SpeechRecognitionTransformer("facebook/wav2vec2-base", ctc_loss_reduction="mean", vocab_file="vocab.json")
    dm = SpeechRecognitionDataModule(
        cfg=SpeechRecognitionDataConfig(
            batch_size=1,
            dataset_name="timit_asr",
        ),
        tokenizer=model.tokenizer,
    )
    trainer = pl.Trainer(accelerator="auto", devices="auto", max_epochs=1)

    trainer.fit(model, dm)
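
As a side note on the `ctc_loss_reduction="mean"` argument above: the `SpeechRecognitionTransformer` class is only proposed here, so how it would handle that argument is an assumption. In the Hugging Face Wav2Vec2 configuration a parameter of the same name chooses how the per-utterance CTC losses are combined. The sketch below restates in plain Python the behaviour PyTorch documents for its CTC loss reduction. The function name `reduce_ctc_losses` is hypothetical and used only for illustration.

```python
def reduce_ctc_losses(losses, target_lengths, reduction="mean"):
    """Combine per-utterance CTC losses (illustrative helper, not a library API).

    "sum"  adds the losses of all utterances in the batch.
    "mean" divides each loss by its target length, then averages over the batch.
    """
    if len(losses) != len(target_lengths):
        raise ValueError("losses and target_lengths must have the same length")
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # Guard against empty targets by treating a length of 0 as 1.
        normalized = [
            loss / max(length, 1)
            for loss, length in zip(losses, target_lengths)
        ]
        return sum(normalized) / len(normalized)
    raise ValueError(f"unsupported reduction: {reduction!r}")


# Two utterances with losses 12.0 and 6.0 and target lengths 4 and 3.
mean_loss = reduce_ctc_losses([12.0, 6.0], [4, 3], reduction="mean")
sum_loss = reduce_ctc_losses([12.0, 6.0], [4, 3], reduction="sum")
```

With "mean", long transcripts do not dominate the batch loss, which is presumably why the linked blog post uses it for fine-tuning.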

@rafaelvp-db
Author

Thanks for the guidelines @SeanNaren! Let me look into that.

@codecov

codecov bot commented May 29, 2022

Codecov Report

Merging #251 (a3d8528) into master (9f25baa) will decrease coverage by 1%.
The diff coverage is 58%.

❗ Current head a3d8528 differs from pull request most recent head 45cbab7. Consider uploading reports for the commit 45cbab7 to get more accurate results

@@          Coverage Diff          @@
##           master   #251   +/-   ##
=====================================
- Coverage      75%    74%   -1%     
=====================================
  Files          73     77    +4     
  Lines        1622   1682   +60     
=====================================
+ Hits         1210   1245   +35     
- Misses        412    437   +25     

@SeanNaren
Contributor

SeanNaren commented May 30, 2022

How's it going @rafaelvp-db?

The code looks muuuch nicer, amazing job! Anything I can assist with? I notice that the example requires a vocab.json; I'm sure we can YOLO it and just use the alphabet.
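
A minimal sketch of what using the alphabet for vocab.json could look like, assuming lowercase English transcripts and the token conventions from the linked blog post ("|" as the word delimiter, plus "[UNK]" and "[PAD]" special tokens). The function names `alphabet_vocab` and `write_vocab` are hypothetical, and the exact character set is an assumption that would need to match the dataset's text normalization.

```python
import json
import string


def alphabet_vocab():
    """Build a character-level vocabulary from the lowercase alphabet."""
    # Letters a-z, the apostrophe, and "|" standing in for the space between words.
    tokens = list(string.ascii_lowercase) + ["'", "|"]
    vocab = {token: index for index, token in enumerate(tokens)}
    # Special tokens appended at the end: unknown character and padding.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def write_vocab(vocab, path="vocab.json"):
    """Write the vocabulary as a JSON file mapping each token to its id."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(vocab, f, indent=2)


if __name__ == "__main__":
    write_vocab(alphabet_vocab())
```

The resulting file could then be passed wherever the example expects `vocab_file="vocab.json"`. Characters outside this set would map to "[UNK]", so transcripts would need to be lowercased and stripped of other punctuation first.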

@stale

stale bot commented Sep 20, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix (This will not be worked on) label on Sep 20, 2022
@stale stale bot closed this on Oct 1, 2022