labspire/RESPIN


Demo

Bhojpuri

Models

The WERs and CERs reported below are computed without the use of any language model.

| Model | Pre-training data | Fine-tuning data | Model links | WER (test-RESPIN) | CER (test-RESPIN) |
| --- | --- | --- | --- | --- | --- |
| data2vec-aqc | --- | Bhojpuri | fairseq | 14.628 | 3.794 |
  • Fine-tuning procedures can be found here.
  • Inference procedures can be found here.
  • Single-file inference procedures can be found here.
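For reference, WER and CER are word- and character-level edit distances normalised by the reference length. The sketch below is illustrative only; the repository's recipes/fairseq_preprocessing/metrics.py presumably has its own implementation (assumption):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, rolling-array version."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # prev holds dp[i-1][j-1]; dp[j] still holds dp[i-1][j]
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution / match
    return dp[-1]

def wer(ref, hyp):
    """Word error rate: word-level edits / reference word count."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)

def cer(ref, hyp):
    """Character error rate: character-level edits / reference character count."""
    ref_chars, hyp_chars = ref.replace(" ", ""), hyp.replace(" ", "")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```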

Directory Structure

RESPIN
├── configs
│   └── finetuned
│       └── data2vec-aqc.yaml
├── data
│   ├── examples
│   └── bh
├── models
│   ├── finetuned
│   │   └── indic_finetuned
│   └── pretrained
├── recipes
│   ├── Training
│   │   └── train.sh
│   ├── Inference
│   │   └── infer.sh
│   └── fairseq_preprocessing
│       ├── data_prep.py
│       ├── metrics.py
│       └── run_data_prep.sh
├── requirements.txt
└── README.md

Requirements and Installation

  • Create a new conda environment:
conda create -n env_name python=3.10
conda activate env_name
  • Python version >= 3.10
  • PyTorch version >= 2.0.0
  • Fairseq version >= 0.12.2
  • CUDA >= 11.8
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install requirements:
pip install -r requirements.txt
  • To install fairseq and develop locally:
git clone https://github.com/Speech-Lab-IITM/data2vec-aqc
cd data2vec-aqc/
pip install --editable ./
  • For faster training, install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
  • To install torchaudio-augmentations:
git clone https://github.com/Speech-Lab-IITM/torchaudio-augmentations
cd torchaudio-augmentations
pip install --editable ./
  • To install flashlight-text and the flashlight sequence library:
pip install flashlight-text

git clone https://github.com/flashlight/sequence && cd sequence
pip install .
  • To install the parse_options.sh helper script:
wget https://raw.githubusercontent.com/kaldi-asr/kaldi/master/egs/wsj/s5/utils/parse_options.sh && sudo mv parse_options.sh /usr/local/bin/
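Before running the recipes, it can be handy to confirm that the environment matches the version requirements listed above. A small convenience sketch (not part of the repo):

```python
import sys

def check_python(version_info=sys.version_info):
    """Return True if the interpreter satisfies the Python >= 3.10 requirement."""
    return tuple(version_info[:2]) >= (3, 10)

def check_torch():
    """Return the installed PyTorch version string, or None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__

if __name__ == "__main__":
    print("Python OK:", check_python())
    print("PyTorch:", check_torch() or "not installed")
```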

Required Step

Add the MUSAN dataset path in:
data2vec-aqc/fairseq/data/audio/raw_audio_dataset.py

path_to_musan_noise_set = 'path_to_musan_dataset'
  • The MUSAN dataset can be downloaded from OpenSLR: Musan
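A wrong MUSAN path typically fails only once training starts, so a quick check that the directory contains MUSAN's published top-level subsets (music/, noise/, speech/) can save a run. This helper is hypothetical, not part of the repo:

```python
from pathlib import Path

# Top-level subset directories in MUSAN's published layout.
MUSAN_SUBSETS = ("music", "noise", "speech")

def missing_musan_subsets(musan_root):
    """Return the MUSAN top-level subsets that are absent under musan_root."""
    root = Path(musan_root)
    return [d for d in MUSAN_SUBSETS if not (root / d).is_dir()]

if __name__ == "__main__":
    path_to_musan_dataset = "/data/musan"  # replace with your local MUSAN path
    print("missing subsets:", missing_musan_subsets(path_to_musan_dataset))
```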

Reference Code

  1. Facebook AI Research Sequence-to-Sequence Toolkit written in Python. fairseq
  2. SPRING-LAB (data2vec_aqc)
  3. OpenSLR musan

About

This repo contains models and recipes developed under project RESPIN.
