Merged
4 changes: 1 addition & 3 deletions examples/asr/emformer_rnnt/README.md
@@ -15,7 +15,7 @@ This directory contains sample implementations of training and evaluation pipeli
### Pipeline Demo

[`pipeline_demo.py`](./pipeline_demo.py) demonstrates how to use the `EMFORMER_RNNT_BASE_LIBRISPEECH`
-or `EMFORMER_RNNT_BASE_TEDLIUM3` bundle that wraps a pre-trained Emformer RNN-T produced by the corresponding recipe below to perform streaming and full-context ASR on several audio samples.
+bundle that wraps a pre-trained Emformer RNN-T produced by the LibriSpeech recipe below to perform streaming and full-context ASR on several audio samples.

## Model Types

@@ -67,8 +67,6 @@ The table below contains WER results for dev and test subsets of TED-LIUM releas
| dev | 0.108 |
| test | 0.098 |

-[`tedlium3/eval_pipeline.py`](./tedlium3/eval_pipeline.py) evaluates the pre-trained `EMFORMER_RNNT_BASE_TEDLIUM3` bundle on the dev and test sets of TED-LIUM release 3. Running the script should produce WER results that are identical to those in the above table.

### MuST-C release v2.0

The MuST-C model is configured with a vocabulary size of 500. Consequently, the MuST-C model's last linear layer in the joiner has an output dimension of 501 (500 + 1 to account for the blank symbol). In contrast to those of the datasets for the above two models, MuST-C's transcripts are cased and punctuated; we preserve the casing and punctuation when training the SentencePiece model.
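
As a rough illustration of the arithmetic in the paragraph above (not part of this change), the joiner's output dimension follows directly from the vocabulary size. The helper name `joiner_output_dim` and the constant `MUSTC_VOCAB_SIZE` are hypothetical and exist only for this sketch; they are not torchaudio APIs.

```python
def joiner_output_dim(vocab_size: int, num_blank_symbols: int = 1) -> int:
    """Return the output dimension of the joiner's last linear layer.

    Hypothetical helper for illustration: the layer needs one output per
    vocabulary entry plus one for the blank symbol.
    """
    if vocab_size <= 0:
        raise ValueError("vocab_size must be positive")
    return vocab_size + num_blank_symbols


# Vocabulary size stated in the README for the MuST-C model.
MUSTC_VOCAB_SIZE = 500

print(joiner_output_dim(MUSTC_VOCAB_SIZE))  # 501
```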
15 changes: 1 addition & 14 deletions examples/asr/emformer_rnnt/pipeline_demo.py
@@ -13,13 +13,8 @@

import torch
import torchaudio
-from common import MODEL_TYPE_LIBRISPEECH, MODEL_TYPE_MUSTC, MODEL_TYPE_TEDLIUM3
-from mustc.dataset import MUSTC
+from common import MODEL_TYPE_LIBRISPEECH
from torchaudio.pipelines import EMFORMER_RNNT_BASE_LIBRISPEECH, RNNTBundle
-from torchaudio.prototype.pipelines import (
-    EMFORMER_RNNT_BASE_MUSTC,
-    EMFORMER_RNNT_BASE_TEDLIUM3,
-)

logger = logging.getLogger(__name__)

@@ -35,14 +30,6 @@ class Config:
        partial(torchaudio.datasets.LIBRISPEECH, url="test-clean"),
        EMFORMER_RNNT_BASE_LIBRISPEECH,
    ),
-    MODEL_TYPE_MUSTC: Config(
-        partial(MUSTC, subset="tst-COMMON"),
-        EMFORMER_RNNT_BASE_MUSTC,
-    ),
-    MODEL_TYPE_TEDLIUM3: Config(
-        partial(torchaudio.datasets.TEDLIUM, release="release3", subset="test"),
-        EMFORMER_RNNT_BASE_TEDLIUM3,
-    ),
}


90 changes: 0 additions & 90 deletions examples/asr/emformer_rnnt/tedlium3/eval_pipeline.py

This file was deleted.

49 changes: 0 additions & 49 deletions examples/asr/librispeech_conformer_rnnt/README.md

This file was deleted.

194 changes: 0 additions & 194 deletions examples/asr/librispeech_conformer_rnnt/data_module.py

This file was deleted.
