This repository contains Kaldi recipes for extracting fMLLR features from the LibriSpeech corpus.
Once Kaldi is installed, replace the files under $KALDI_ROOT/egs/librispeech/s5/ with the files in this repository (especially `run.sh` and `cmd.sh`).
The code and procedures are also published on the fMLLR wiki page. Feel free to use or modify them; bug reports and suggestions for improvement are appreciated. If you have any questions, please contact r07942089@ntu.edu.tw.
- Install Kaldi
- As suggested during the installation, do not forget to add the path of the Kaldi binaries into $HOME/.bashrc. For instance, make sure that .bashrc contains the following paths:
export KALDI_ROOT=/home/mirco/kaldi
PATH=$PATH:$KALDI_ROOT/tools/openfst
PATH=$PATH:$KALDI_ROOT/src/featbin
PATH=$PATH:$KALDI_ROOT/src/gmmbin
PATH=$PATH:$KALDI_ROOT/src/bin
PATH=$PATH:$KALDI_ROOT/src/nnetbin
export PATH
- Remember to set the KALDI_ROOT variable to your own path. As a first test of the installation, open a bash shell, type `copy-feats` or `hmm-info`, and make sure no errors appear.
- Install Kaldiio
- This is for converting Kaldi .ark files to our data format.
- Use the following command to install:
pip3 install kaldiio
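The installation check described above (typing `copy-feats` or `hmm-info` by hand) can also be scripted. The following is a minimal sketch, not part of this repository: the helper name `check_on_path` is hypothetical, and it only tests whether each command can be found on PATH, not whether the binary actually runs correctly.

```shell
# Print the name of every given command that cannot be found on PATH.
# Exit status is 0 when all commands are found, non-zero otherwise.
# (check_on_path is an illustrative helper, not a Kaldi tool.)
check_on_path() (
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)
```

For example, `check_on_path copy-feats hmm-info` prints nothing and succeeds when both binaries are reachable; any name it prints points to a PATH entry missing from .bashrc.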
Below are the basic steps to extract fMLLR features from the open-source speech corpus LibriSpeech. The instructions cover the subsets `train-clean-100`, `train-clean-360`, `dev-clean`, and `test-clean`, but they can easily be extended to the other sets `dev-other`, `test-other`, and `train-other-500`.
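The subset names above use hyphens, while the data directories that appear in the commands later in this guide use underscores (for example `data/dev_clean` and `data/train_clean_100`). The sketch below makes that mapping explicit; the helper name `subset_to_data_dir` is hypothetical and is only meant as an aid when extending the steps to other subsets.

```shell
# Map a LibriSpeech subset name to the data directory name used in the
# later commands: hyphens become underscores, prefixed with data/.
# (subset_to_data_dir is an illustrative helper, not part of the recipe.)
subset_to_data_dir() {
    printf 'data/%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}
```

Under this convention, `subset_to_data_dir train-other-500` yields `data/train_other_500`.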
- If running on a single machine, edit $KALDI_ROOT/egs/librispeech/s5/cmd.sh and replace `queue.pl` with `run.pl`.
- Change the lines to:
export train_cmd="run.pl --mem 2G"
export decode_cmd="run.pl --mem 4G"
export mkgraph_cmd="run.pl --mem 8G"
- Change the `data` path in `run.sh` to your LibriSpeech data path; the directory `LibriSpeech/` should be under that path. For example:
data=/media/andi611/1TBSSD
- Run the Kaldi recipe `run.sh` for LibriSpeech at least until Stage 13 (included). If you are using a Linux machine, make sure that `flac` is installed first:
sudo apt-get install flac
./run.sh # Use the `run.sh` from this repo
- Copy the `exp/tri4b/trans.*` files into `exp/tri4b/decode_tgsmall_train_clean_*/`:
mkdir exp/tri4b/decode_tgsmall_train_clean_100 && cp exp/tri4b/trans.* exp/tri4b/decode_tgsmall_train_clean_100/
- Compute the fMLLR features by running the following script:
./compute_fmllr.sh
- Compute alignments using:
# alignments on dev_clean and test_clean
steps/align_fmllr.sh --nj 10 data/dev_clean data/lang exp/tri4b exp/tri4b_ali_dev_clean
steps/align_fmllr.sh --nj 10 data/test_clean data/lang exp/tri4b exp/tri4b_ali_test_clean
# alignments on train_clean_100 and train_clean_360
steps/align_fmllr.sh --nj 30 data/train_clean_100 data/lang exp/tri4b exp/tri4b_ali_clean_100
steps/align_fmllr.sh --nj 30 data/train_clean_360 data/lang exp/tri4b exp/tri4b_ali_clean_360
- Apply CMVN and dump the fMLLR features to new .ark files:
./dump_fmllr_cmvn.sh
- Convert the Kaldi-generated .ark features to .npy for your own dataloader. An example Python script is provided:
python3 ark2libri.py
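The transform-copy step above shows the command for `train_clean_100` only, while the target is written as `decode_tgsmall_train_clean_*/`. A minimal sketch of a loop over several subsets follows. It assumes the wildcard stands for the training-subset suffixes (`100` and `360` in this guide); the helper name `copy_trans` is hypothetical and not part of the repository.

```shell
# Copy the trans.* files from an experiment directory into one
# decode_tgsmall_train_clean_<suffix> directory per given suffix.
# Usage: copy_trans <exp_dir> <suffix>...
# (copy_trans is an illustrative helper; the suffix list is an assumption.)
copy_trans() (
    exp_dir=$1
    shift
    for suffix in "$@"; do
        dest="$exp_dir/decode_tgsmall_train_clean_$suffix"
        # -p creates the directory if needed and does not fail if it exists
        mkdir -p "$dest" || exit 1
        cp "$exp_dir"/trans.* "$dest"/ || exit 1
    done
)
```

Run from $KALDI_ROOT/egs/librispeech/s5/, `copy_trans exp/tri4b 100 360` would cover both training subsets used in this guide. Unlike the one-line command, `mkdir -p` lets the step be repeated without an error when the directory already exists.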