Transformer-based ASR System for the German Language

This repository contains code to build a German ASR system based on transformers and the Common Voice 9.0 dataset. We use the pre-trained Wav2Vec Transformer as well as the pre-trained Wav2Vec Conformer, both from Facebook.

Hardware Requirements

Warning: The transformer model has around 100 million trainable parameters, and the conformer model is almost 7 times as large. Plan your hardware accordingly.

At least 100 GB of disk space
At least 32 GB of GPU memory if you intend to use a GPU (particularly for the conformer)
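
The disk requirement above can be checked from Python before starting a long run. This is a minimal sketch using only the standard library; the function name `has_enough_disk` and the checked path are illustrative and not part of this repository, and one GB is taken here as 10**9 bytes.

```python
import shutil


def has_enough_disk(path=".", required_gb=100):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


# Hypothetical usage: check the current working directory.
if not has_enough_disk("."):
    print("Warning: less than 100 GB of free disk space available.")
```

For GPU memory, and assuming PyTorch is among the installed requirements, `torch.cuda.get_device_properties(0).total_memory` reports the total memory of the first GPU in bytes.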

Setup

Recommended Python version: 3.9

Create a new conda environment and activate it:
- conda create -n ailab python=3.9
- conda activate ailab
Install all requirements (install CUDA manually if you plan to use a GPU):
- pip install -r requirements.txt

Preprocessing

The preprocessing does the following:

Downloads the dataset via HuggingFace. You need to create an account and get a token to be able to download this particular dataset.
Removes unnecessary columns from the dataset.
Resamples the audio files from their original sampling rate of 48,000 Hz to 16,000 Hz.
Removes special characters.
Takes care of padding the sentences.
Saves the dataset in the directories training_set, validation_set, and test_set.
Creates the model's tokenizer and saves it.
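
The special-character removal and tokenizer vocabulary steps above can be sketched roughly as follows. This is not the repository's code: the set of stripped characters, the lowercasing, and the helper names (`clean_sentence`, `build_vocab`, `save_vocab`) are assumptions made for illustration. The `|`, `[UNK]`, and `[PAD]` entries follow a convention that is common in CTC fine-tuning setups; whether this repository uses the same tokens is also an assumption.

```python
import json
import re

# Assumption: this character set is illustrative; the actual script may strip
# a different set of characters.
SPECIAL_CHARS = re.compile(r'[,?.!;:"“”„%»«…()\[\]]')


def clean_sentence(sentence):
    """Remove special characters, lowercase, and collapse repeated whitespace."""
    cleaned = SPECIAL_CHARS.sub("", sentence).lower()
    return re.sub(r"\s+", " ", cleaned).strip()


def build_vocab(sentences):
    """Map every character found in the cleaned sentences to an integer id."""
    chars = sorted(set("".join(sentences)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Use "|" as a visible word delimiter in place of the space character.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Append the unknown and padding tokens with the next free ids.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def save_vocab(vocab, path="vocab.json"):
    """Write the vocabulary as JSON, keeping umlauts and ß readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(vocab, handle, ensure_ascii=False)


sentences = [clean_sentence("Guten Morgen, Berlin!")]
vocab = build_vocab(sentences)
```

Umlauts and ß are not in the stripped set, so they stay in the transcripts and end up as ordinary entries in the vocabulary.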

The script requires the following arguments: token (a string from Hugging Face) and num_workers (an int). It can be launched via: python prepare_dataset.py --token Token --workers num_workers
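
As an illustration of how these two arguments might be read, here is a minimal sketch using the standard `argparse` module. Only the flag names `--token` and `--workers` come from the command above; the function name, help texts, and the example token are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the preprocessing arguments; both are treated as required."""
    parser = argparse.ArgumentParser(description="Prepare the Common Voice dataset.")
    parser.add_argument("--token", type=str, required=True,
                        help="Hugging Face access token")
    parser.add_argument("--workers", type=int, required=True,
                        help="number of worker processes")
    return parser.parse_args(argv)


# Hypothetical invocation with a placeholder token.
args = parse_args(["--token", "hf_example_token", "--workers", "4"])
print(args.token, args.workers)
```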

Training

There are two training scripts: one for the Transformer model and one for the Conformer model. The scripts take the following arguments:

Number of epochs: an integer.
Percentage of data to use: an integer in [0, 100].
Whether to resume the training or not.
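
One plausible way to turn the percentage argument into a number of examples is shown below. This is a sketch under the assumption that the percentage is applied to the example count and rounded down; the function name `subset_size` is hypothetical and the scripts may select data differently.

```python
def subset_size(total_examples, percentage):
    """Return how many examples correspond to `percentage` percent of the data."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be in [0, 100]")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_examples * percentage // 100


# Hypothetical usage: half of a 1,000-example split.
n_examples = subset_size(1000, 50)
print(n_examples)
```

With a Hugging Face `datasets.Dataset`, the first `n_examples` rows could then be taken with `dataset.select(range(n_examples))`.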

To train the Transformer, for example, one could use the following command:

python transformer_training.py --epochs 10 --data 50 --no-resume_training > output_transformer.txt

Model hyperparameters can be changed in the scripts.
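
The `--no-resume_training` flag in the command above matches the behaviour of `argparse.BooleanOptionalAction`, available since Python 3.9, which generates a `--no-` variant for a boolean option. The sketch below shows how such a command line could be parsed; it is an assumption about the interface, not the scripts' actual code, and the function name and default value are hypothetical.

```python
import argparse


def parse_training_args(argv=None):
    """Parse epochs, data percentage, and the resume flag."""
    parser = argparse.ArgumentParser(description="Train an ASR model.")
    parser.add_argument("--epochs", type=int, required=True,
                        help="number of training epochs")
    parser.add_argument("--data", type=int, required=True,
                        choices=range(0, 101), metavar="[0-100]",
                        help="percentage of the data to use")
    # Accepts both --resume_training and --no-resume_training.
    parser.add_argument("--resume_training",
                        action=argparse.BooleanOptionalAction, default=False,
                        help="resume from a previous run")
    return parser.parse_args(argv)


args = parse_training_args(["--epochs", "10", "--data", "50", "--no-resume_training"])
print(args.epochs, args.data, args.resume_training)
```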

Test

To test the trained model, first copy the vocab.json file created during tokenization into the directory where the model is saved.
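
The copy step can also be done from Python. The sketch below uses temporary directories so that it touches no real files; the helper name `copy_vocab` and the directory name `transformer_model` (taken from the example command in this section) are used for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def copy_vocab(vocab_path, model_dir):
    """Copy the tokenizer vocabulary into the model directory and return the new path."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    return Path(shutil.copy(vocab_path, model_dir / "vocab.json"))


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "vocab.json"
    source.write_text('{"a": 0}', encoding="utf-8")
    model_dir = Path(tmp) / "transformer_model"
    model_dir.mkdir()
    copied = copy_vocab(source, model_dir)
    copied_ok = copied.read_text(encoding="utf-8") == '{"a": 0}'
    print(copied.name)
```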

Then test the model by running the test script. It requires the following arguments:

Model directory: the path to the model you want to test.
Model: transformer or conformer.
Percentage of data to use for testing: an integer in [0, 100].
Whether to print predicted sentences or not.

An example command: python model_testing.py --model transformer --model_dir transformer_model --data 50 --print_examples > output_results.txt

Warning: At the time this repository was created, the conformer model was still in development. Therefore, some compatibility problems may arise.

Note: Feel free to contact me if you have any issues or questions.
