
WARNING: The official repository was moved to https://github.com/bepierre/SpeechVGG!

To make sure you are using the most recent version, please follow the link above.

While most of what follows still holds, the code, data and links may be outdated. Apologies for the inconvenience.

Mikolaj Kegler, 4th May 2020.


SpeechVGG: A deep feature extractor for speech processing

For some context... Here we present the code underlying SpeechVGG (sVGG), a simple yet efficient feature extractor for training deep learning frameworks for speech processing through deep feature losses. In Kegler et al. (2019) we first applied sVGG to improve the performance of a network for recovering missing parts of speech spectrograms (i.e. speech inpainting). In the follow-up paper (Beckmann et al., 2019) we present a systematic analysis of how the sVGG parameters influence the main framework.

To summarize quickly: we showed how a VGG-inspired speech-to-word classifier, once trained, can be used to extract high-level deep feature losses.

The trained network can then be used to train another network on a different task, reusing the representations learned during the word classification task.
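To illustrate the idea, here is a minimal sketch of a deep feature loss built around a trained, frozen sVGG model in Keras; the function and layer names are placeholders chosen for the example, not names taken from this repository:

import keras.backend as K
from keras.models import Model

def build_deep_feature_loss(svgg, layer_names):
    # Freeze the classifier so it only provides fixed feature maps.
    svgg.trainable = False
    extractor = Model(inputs=svgg.input,
                      outputs=[svgg.get_layer(name).output for name in layer_names])

    def loss(y_true, y_pred):
        feats_true = extractor(y_true)
        feats_pred = extractor(y_pred)
        if not isinstance(feats_true, list):
            feats_true, feats_pred = [feats_true], [feats_pred]
        # Sum of mean absolute differences between the activations of each selected layer.
        return sum(K.mean(K.abs(ft - fp))
                   for ft, fp in zip(feats_true, feats_pred))

    return loss

# Hypothetical usage: the layer names depend on how the sVGG model is defined.
# dfl = build_deep_feature_loss(svgg_model, ['block3_conv2', 'block4_conv2'])
# inpainting_model.compile(optimizer='adam', loss=dfl)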

Now we are going to walk you through how to use it!

Train your own...

Requirements

Packages:

  • Python 3.6.8
  • numpy 1.16.4
  • h5py 2.8.0
  • SoundFile 0.10.2
  • SciPy 1.2.1
  • Tensorflow 1.13.1
  • Keras 2.2.4
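
With Python 3.6.8 available, one way to install the listed package versions is with pip (just an example command, not an official requirements file):

pip install numpy==1.16.4 h5py==2.8.0 SoundFile==0.10.2 scipy==1.2.1 tensorflow==1.13.1 keras==2.2.4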

Data:

You should create a folder 'LibriSpeech' with the following folders:

LibriSpeech
    |_ word_labels
    |_ split
        |_ test-clean
        |_ test-other
        |_ dev-clean
        |_ dev-other
        |_ train-clean-100
        |_ train-clean-360
        |_ train-other-500

The word_labels folder should contain the aligned word labels; this folder can be downloaded here.

The split folder should contain the extracted LibriSpeech datasets, which can be downloaded here.

Generate dataset

First, preprocess the data (here, LibriSpeech for example):

python preprocess.py --data ./LibriSpeech --dest_path ./LibriSpeechWords

Then, obtain the mean and standard deviation of the desired dataset (for normalization):

python compute_dataset_props.py --data ./LibriSpeechWords/train-clean-100/ --output_folder ./

The parameters will be saved to a dataset_props_log.h5 file. Here we attach a version obtained from the training part of the LibriSpeech data.
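
For illustration, the saved statistics can then be loaded with h5py and used to standardize log-magnitude spectrograms; the key names 'mean' and 'std' below are assumptions about the file layout, so check compute_dataset_props.py for the actual ones:

import h5py
import numpy as np

# Load the precomputed dataset statistics (the key names are an assumption).
with h5py.File('dataset_props_log.h5', 'r') as f:
    mean = np.array(f['mean'])
    std = np.array(f['std'])

def normalize(log_spectrogram):
    # Standardize a log-magnitude spectrogram with the dataset statistics.
    return (log_spectrogram - mean) / std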

Train

Now you can train the model using the training script:

python train.py --name my_model_name --train ./LibriSpeechWords/train-clean-100/ --test ./LibriSpeechWords/test-clean/ --weight_path ./results/

Finally, the weights of the model will be saved in the specified directory, here './results/'. Subsequently, you can use the trained model, for example, to obtain deep feature losses (as we did in Kegler et al., 2019 and Beckmann et al., 2019).
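
As a rough sketch of that last step, the trained model can be reloaded and used as a feature extractor; this assumes train.py saves a complete Keras model file, and the file name and layer name below are placeholders:

import numpy as np
from keras.models import load_model, Model

# Reload the trained classifier (the exact file name is an assumption).
svgg = load_model('./results/my_model_name.h5')

# Expose an intermediate layer as the deep feature representation
# (the layer name 'block4_conv2' is a placeholder).
feature_extractor = Model(inputs=svgg.input,
                          outputs=svgg.get_layer('block4_conv2').output)

# Dummy batch of normalized log-spectrogram patches matching the model input shape.
dummy_batch = np.zeros((1,) + svgg.input_shape[1:])
deep_features = feature_extractor.predict(dummy_batch)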

Or... use our pre-trained models!

Available here, for all the configurations considered in (Beckmann et al., 2019).

Links:

Original papers:

  1. Kegler, Mikolaj, Pierre Beckmann, and Milos Cernak. "Deep speech inpainting of time-frequency masks." arXiv preprint arXiv:1910.09058 (2019)
  2. Beckmann, Pierre, Mikolaj Kegler, Hugues Saltini, and Milos Cernak. "Speech-VGG: A deep feature extractor for speech processing." arXiv preprint arXiv:1910.09909 (2019)
