64 changes: 64 additions & 0 deletions egs/malach/s5/README.txt
@@ -0,0 +1,64 @@
# Copyright 2019 IBM Corp. (Author: Michael Picheny) Adapted AMI recipe to MALACH corpus

This s5 recipe for MALACH data is a modified version of the s5b
recipe for AMI.

You need to download the MALACH data to get started. For information about the MALACH database, see:
USC-SFI MALACH Interviews and Transcripts English - Speech Recognition Edition
https://catalog.ldc.upenn.edu/LDC2019S11

Once the data is downloaded and untarred, you need to run:

run_prepare_shared.sh - prepares most of the data for the system
run.sh - builds the system

Beforehand, you need to edit BOTH scripts to point to
where you downloaded and untarred the data. Find the lines in
run_prepare_shared.sh and run.sh that say:

malach_dir=dummy_directory

Replace "dummy_directory" with the fully-qualified location of the actual data
data. For example, let's say you copied the data distribution tar file to
/user/jdoe/malach and untar-ed it there. That would create a high level directory called
/user/jdoe/malach/malach_eng_speech_recognition. You would then change the above line to read:

malach_dir=/user/jdoe/malach/malach_eng_speech_recognition/data
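
A one-liner such as the following (a sketch, assuming GNU sed and the
example path above) makes the same change to both scripts:

  sed -i 's|^malach_dir=.*|malach_dir=/user/jdoe/malach/malach_eng_speech_recognition/data|' \
      run_prepare_shared.sh run.sh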

Note that the scripts were modified to always use sclite scoring
(rather than the default Kaldi scoring).

Other issues that we ran into while setting up this recipe, and that
may or may not affect you:

On the system on which these scripts were developed, Python 2.7 and a
relatively old version of CUDA are the defaults. We had to modify
path.sh to point to the correct libraries for Python 3 (a number of the
scripts use Python 3) and for the version of CUDA we were using. Please
modify path.sh accordingly.
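
For example, path.sh might gain lines like these (the paths are
hypothetical; substitute the locations of your own Python 3 and CUDA
installations):

  export PATH=/opt/python3/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH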

You may also have to modify line 405 of "configure" in the Kaldi src
directory (in our setup, /speech7/picheny5_nb/forked_kaldi/kaldi/src)
to point to where your version of CUDA lives.
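
Kaldi's src/configure also accepts a --cudatk-dir option, which may
save you from editing the script itself (a sketch, assuming the CUDA
toolkit is installed under /usr/local/cuda):

  cd kaldi/src
  ./configure --cudatk-dir=/usr/local/cuda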

Basic pipeline results summary:

tri2:
%WER 39.1 | 843 12345 | 66.5 25.1 8.3 5.7 39.1 74.0 | -0.230 | exp/tri2/decode_dev_malach.o4g.kn.pr1-9/ascore_13/dev.ctm.filt.sys

tri3.si:
%WER 42.8 | 843 12345 | 63.4 28.0 8.5 6.3 42.8 76.9 | -1.079 | exp/tri3/decode_dev_malach.o4g.kn.pr1-9.si/ascore_12/dev.ctm.filt.sys

tri3:
%WER 34.5 | 843 12345 | 70.7 22.1 7.1 5.2 34.5 69.2 | -0.398 | exp/tri3/decode_dev_malach.o4g.kn.pr1-9/ascore_15/dev.ctm.filt.sys

tri3_cleaned.si:
%WER 43.1 | 843 12345 | 63.6 28.2 8.2 6.7 43.1 79.0 | -1.095 | exp/tri3_cleaned/decode_dev_malach.o4g.kn.pr1-9.si/ascore_12/dev.ctm.filt.sys

tri3_cleaned:
%WER 35.1 | 843 12345 | 71.0 22.6 6.4 6.1 35.1 72.7 | -0.431 | exp/tri3_cleaned/decode_dev_malach.o4g.kn.pr1-9/ascore_13/dev.ctm.filt.sys

Results for the chain model, and for rescoring the chain model with various LSTMs, can be found in s5/local/chain/run_tdnn.sh.


18 changes: 18 additions & 0 deletions egs/malach/s5/cmd.sh
@@ -0,0 +1,18 @@
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances of 'queue.pl' to 'run.pl' (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

export train_cmd="run.pl --mem 1G"
export decode_cmd="run.pl --mem 2G"
# the use of cuda_cmd is deprecated; it is used only in 'nnet1'.
export cuda_cmd="run.pl --gpu 1 --mem 20G"
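
# For example, on a GridEngine cluster you might use the following
# instead (the queue options are illustrative; adjust them to match
# your site's conf/queue.conf):
# export train_cmd="queue.pl --mem 1G"
# export decode_cmd="queue.pl --mem 2G"
# export cuda_cmd="queue.pl --gpu 1 --mem 20G"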


3 changes: 3 additions & 0 deletions egs/malach/s5/conf/decode.conf
@@ -0,0 +1,3 @@
beam=11.0 # beam for decoding. Was 13.0 in the scripts.
first_beam=8.0 # beam for 1st-pass decoding in SAT.

2 changes: 2 additions & 0 deletions egs/malach/s5/conf/mfcc.conf
@@ -0,0 +1,2 @@
--use-energy=false # the only non-default option.
--sample-frequency=16000
10 changes: 10 additions & 0 deletions egs/malach/s5/conf/mfcc_hires.conf
@@ -0,0 +1,10 @@
# config for high-resolution MFCC features, intended for neural network training
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated), which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--num-mel-bins=40 # similar to Google's setup.
--num-ceps=40 # there is no dimensionality reduction.
--low-freq=20 # low cutoff frequency for mel bins... this is high-bandwidth data, so
# there might be some information at the low end.
--high-freq=-400 # high cutoff frequency, relative to Nyquist of 8000 (=7600)
1 change: 1 addition & 0 deletions egs/malach/s5/conf/online_cmvn.conf
@@ -0,0 +1 @@
# configuration file for apply-cmvn-online, used in the script ../local/run_online_decoding.sh
72 changes: 72 additions & 0 deletions egs/malach/s5/local/chain/compare_wer_general.sh
@@ -0,0 +1,72 @@
#!/bin/bash
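
# Compares WER and train/valid objectives across one or more chain systems.
# Usage: local/chain/compare_wer_general.sh <sysdir> [<sysdir> ...]
# where each <sysdir> is a directory name under exp/chain_cleaned/, e.g.
# (the system names below are hypothetical examples):
#   local/chain/compare_wer_general.sh tdnn_1a_sp_bi tdnn_1b_sp_bi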

echo -n "System "
for x in $*; do printf " % 10s" $x; done
echo

#for d in exp/chain_cleaned/tdnn*/decode_*; do grep Sum $d/*sc*/*ys | utils/best_wer.sh; done|grep eval_hires


echo -n "WER on dev "
for x in $*; do
wer=$(grep Sum exp/chain_cleaned/${x}/decode_dev/*sc*/*ys | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo

echo -n "Rescore with lstm 1a "
for x in $*; do
wer=$(grep Sum exp/chain_cleaned/${x}/decode_dev*tdnn_1a/*sc*/*ys | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo

echo -n "Rescore with lstm 1b "
for x in $*; do
wer=$(grep Sum exp/chain_cleaned/${x}/decode_dev*tdnn_1b/*sc*/*ys | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo

echo -n "Rescore with lstm bs_1a "
for x in $*; do
wer=$(grep Sum exp/chain_cleaned/${x}/decode_dev*tdnn_bs_1a/*sc*/*ys | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo

echo -n "Final train prob "
for x in $*; do
if [[ "${x}" != *online* ]]; then
prob=$(grep Overall exp/chain_cleaned/${x}/log/compute_prob_train.final.log | grep -v xent | awk '{print $8}')
printf "% 10s" $prob
fi
done
echo

echo -n "Final valid prob "
for x in $*; do
if [[ "${x}" != *online* ]]; then
prob=$(grep Overall exp/chain_cleaned/${x}/log/compute_prob_valid.final.log | grep -v xent | awk '{print $8}')
printf "% 10s" $prob
fi
done
echo

echo -n "Final train prob (xent) "
for x in $*; do
if [[ "${x}" != *online* ]]; then
prob=$(grep Overall exp/chain_cleaned/${x}/log/compute_prob_train.final.log | grep -w xent | awk '{print $8}')
printf "% 10s" $prob
fi
done
echo

echo -n "Final valid prob (xent) "
for x in $*; do
if [[ "${x}" != *online* ]]; then
prob=$(grep Overall exp/chain_cleaned/${x}/log/compute_prob_valid.final.log | grep -w xent | awk '{print $8}')
printf "% 10s" $prob
fi
done
echo
1 change: 1 addition & 0 deletions egs/malach/s5/local/chain/run_tdnn.sh