Description:

This branch contains the working code for the master's project focusing on language representation and modelling for Swiss German ASR (2019/2020).

The scripts are an implementation of a basic ASR framework based on Kaldi and were originally developed by Spitch AG, with the following functionality:

Neural network acoustic model training.
WFST lingware compilation.
Evaluation.

The Kaldi (version 5.5) recipe egs/wsj/s5 (commit 8cc5c8b32a49f8d963702c6be681dcf5a55eeb2e) was used as a reference.

Main scripts:

run_archimob.sh: acoustic model training

run_archimob.sh <archimob_input_csv> <archimob_wav_files_directory> <am_output_directory> <transcription_type> <pronunciation_lexicon>

compile_and_decode.sh: lingware compilation and validation

compile_and_decode.sh <arpa_lm> <am_output_directory> <archimob_dev_csv> <archimob_wav_files_directory> <lw_output_directory> <transcription_type> <lmwt_params> <flexwer_mapping>

evaluate.sh: test set decoding and evaluation

./evaluate.sh <archimob_test_csv> <archimob_wav_files_directory> <am_output_directory> <lw_output_directory> <eval_output_directory> <lmwt> <transcription_type> <flexwer_mapping>

Configuration:

path.sh: script to specify the Kaldi root directory and to add certain directories to the path.

cmd.sh: script to select the way of running parallel jobs.

Folders:

Framework specific:

archimob: scripts related to processing the Archimob files for word-level modelling.

archimob_char: scripts related to processing the Archimob files for character-level modelling.

uzh: secondary scripts not included in the Kaldi recipe.

manual: manually generated files.

doc: documentation files.

lms: scripts for compiling language models

scripts: small scripts for processing different parts of ArchiMob and Kaldi outputs

experiments: Makefiles containing commands for executing experiments (e.g. training AMs, compiling WFSTs and evaluating)

Kaldi:

conf: configuration files

local: original recipe-specific files from egs/wsj/s5

utils: utilities shared among all the Kaldi recipes

steps: general scripts related to the different steps followed in the Kaldi recipes

Steps for running an experiment on dialectal (Dieth) transcriptions

To generate the original lexicon from a CSV file:

First, extract the Dieth transcription utterances from train.csv (and possibly also from dev.csv/test.csv)

python ./archimob/process_archimob_csv.py \
-i ../data/archimob_r2/train.csv \
-trans orig \
-t ../processed/dieth/dieth_trans.txt

Then create the lexicon by mapping grapheme clusters to phone symbols (according to Fran's original approach)

python ./archimob/create_simple_lexicon.py \
-v ../processed/dieth/dieth_trans.txt \
-c manual/clusters.txt \
-o ../processed/dieth/dieth_lexicon.txt
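
The idea of mapping grapheme clusters to phone symbols can be illustrated with a greedy longest-match segmentation. This is only a minimal sketch and not the logic of create_simple_lexicon.py: the CLUSTERS table, its phone symbols and the function names are hypothetical, and nothing is assumed about the actual format of manual/clusters.txt.

```python
# Hypothetical cluster table: grapheme cluster -> phone symbol.
# The real inventory lives in manual/clusters.txt; these entries are made up.
CLUSTERS = {
    "sch": "S", "ch": "x", "ii": "i:",
    "s": "s", "c": "k", "h": "h", "i": "i", "n": "n", "e": "e",
}


def word_to_phones(word, clusters=CLUSTERS):
    """Segment a word greedily, always taking the longest matching cluster."""
    longest = max(len(c) for c in clusters)
    phones = []
    pos = 0
    while pos < len(word):
        for size in range(min(longest, len(word) - pos), 0, -1):
            chunk = word[pos:pos + size]
            if chunk in clusters:
                phones.append(clusters[chunk])
                pos += size
                break
        else:
            # No cluster starts at this position.
            raise ValueError(f"no cluster covers {word[pos]!r} in {word!r}")
    return phones


def lexicon_line(word, clusters=CLUSTERS):
    """Format one entry as '<word> <phone> <phone> ...', one word per line."""
    return f"{word} {' '.join(word_to_phones(word, clusters))}"
```

With this toy table, "sch" is consumed as a single cluster before the shorter "s", "c" and "h" are considered, which is the point of matching the longest cluster first.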

Train AMs

bash ./run_archimob.sh \
../data/archimob_r2/train.csv \
../data/archimob_r2/chunked_wav_files \
../processed/dieth/am_out \
'orig' \
../processed/dieth/dieth_lexicon.txt

Compile the WFST and decode the validation set to find the best word insertion penalty (WIP) and language model weight (LMWT)

NB. This step assumes a pre-computed LM in .arpa format (as produced by SRILM/MITLM), e.g., ../lms/dieth/mitlm_mkn_3.arpa.
NB. If a mapping of normalised to Dieth word forms is available, include it as the last argument for computing FlexWER.

bash ./compile_and_decode.sh \
../lms/dieth/mitlm_mkn_3.arpa \
../processed/dieth/am_out \
../data/archimob_r2/dev.csv \
../data/archimob_r2/chunked_wav_files \
../processed/dieth/lw_out/ \
orig \
"--min-lmwt 5 --max-lmwt 20" \
../data/archimob_r2/norm2dieth_clean.json
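
Picking the best setting after validation decoding can be automated. The sketch below assumes that scoring leaves standard Kaldi-style score files named wer_<LMWT> or wer_<LMWT>_<WIP> in a decode directory, each containing a line starting with "%WER". Whether compile_and_decode.sh writes its results in exactly this layout is an assumption, and the function name is hypothetical.

```python
import re
from pathlib import Path

WER_LINE = re.compile(r"^%WER\s+([\d.]+)")
WER_FILE = re.compile(r"^wer_(\d+)(?:_([\d.]+))?$")


def best_setting(decode_dir):
    """Return (wer, lmwt, wip) for the lowest WER found in decode_dir.

    wip is None when the file name carries no word insertion penalty.
    """
    results = []
    for path in Path(decode_dir).glob("wer_*"):
        name = WER_FILE.match(path.name)
        if name is None:
            continue  # skip files that do not follow the assumed naming
        for line in path.read_text(encoding="utf-8").splitlines():
            score = WER_LINE.match(line)
            if score:
                results.append(
                    (float(score.group(1)), int(name.group(1)), name.group(2))
                )
                break
    if not results:
        raise FileNotFoundError(f"no wer_* score files in {decode_dir}")
    # Compare on the WER only, so a missing WIP (None) never gets compared.
    return min(results, key=lambda r: r[0])
```

The LMWT returned here is the value that would then be passed explicitly to evaluate.sh.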

Decoding test set and evaluating performance

NB. Explicitly specify the best LMWT found when decoding the validation set (in this example, 11).
NB. If a mapping of normalised to Dieth word forms is available, include it as the last argument for computing FlexWER.

bash ./evaluate.sh \
../data/archimob_r2/test.csv \
../data/archimob_r2/chunked_wav_files \
../processed/dieth/am_out \
../processed/dieth/lw_out/ \
../processed/dieth/eval_out/ \
11 \
orig \
../data/archimob_r2/norm2dieth_clean.json
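
To make the role of the mapping file concrete, here is a minimal sketch of a relaxed word error rate. It assumes that FlexWER counts a hypothesis word as correct when it and the reference word are listed as variants of the same normalised form, and that the JSON maps each normalised form to a list of Dieth variants. Both points are assumptions, the function names are hypothetical, and this is not the scoring code used by evaluate.sh.

```python
def build_variant_index(norm2dieth):
    """Invert {normalised: [Dieth variants]} into {variant: {normalised forms}}."""
    index = {}
    for norm, variants in norm2dieth.items():
        for variant in variants:
            index.setdefault(variant, set()).add(norm)
    return index


def words_match(ref, hyp, index):
    """Exact match, or both words share at least one normalised form."""
    if ref == hyp:
        return True
    return bool(index.get(ref, set()) & index.get(hyp, set()))


def flex_wer(ref_words, hyp_words, index):
    """Word-level edit distance with the relaxed match, over reference length."""
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp_words) + 1))
    for i, ref in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp in enumerate(hyp_words, start=1):
            cost = 0 if words_match(ref, hyp, index) else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word not recognised)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or (relaxed) match
            ))
        prev = curr
    return prev[-1] / len(ref_words)
```

Passing an empty index reduces the function to plain WER, which makes it easy to compare the two figures on the same output.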

Steps for running an experiment on normalised transcriptions

Steps are largely the same as above. The main differences include:

- lexicon generation
- language model training
- for all basic commands, the <transcription_type> argument must be norm, not orig
- no surface-level mapping for FlexWER evaluations

Useful tips for working with normalised transcriptions:

ensure that the csv has been normalised to remove unwanted diacritics (e.g. 'õ', 'ã', etc.)
ensure that input lexicon has been extended to cover as many in-vocabulary words as possible
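
Removing unwanted diacritics can be done with Unicode normalisation. The sketch below assumes that only the tilde is unwanted (as in 'õ' and 'ã') and that umlauts such as 'ä', 'ö' and 'ü' must be preserved. The set of marks and the function name are hypothetical and would need adjusting to the actual data.

```python
import unicodedata

# Assumption: only the combining tilde (U+0303) is unwanted. Umlauts use the
# combining diaeresis (U+0308), which is not listed here and therefore survives.
UNWANTED_MARKS = {"\u0303"}


def strip_unwanted_diacritics(text, unwanted=UNWANTED_MARKS):
    """Decompose the text, drop the listed combining marks, and recompose."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in unwanted)
    return unicodedata.normalize("NFC", kept)
```

Decomposing first (NFD) is what makes it possible to drop one specific mark while leaving the base letter and all other marks untouched.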

Example call for AM training:

bash ./run_archimob.sh \
../data/archimob_r2/train.csv \
../data/archimob_r2/chunked_wav_files \
../norm/am_out \
'norm' \
../processed/norm/extended_lexicon.txt

Updated 25/07/2020
