
Sonified Latent Data 🎤🧮

Experiments to sonify different latent distributions generated by encoders from multiple datasets.

This repository supports my university thesis on building a WaveNet VAE. After that, the plan is to couple a WaveNet + VAE similar to [Chorowski et al., 2019] with latent distributions generated from multiple different dataset-encoder pairs.

The goal is to discover whether the sonified audio contains meaningful differences and features from the input datasets. Many earlier experiments with the WaveNet decoder relevant to this project can be found in my denoising repository.

How To Use

  • First install all dependencies from requirements.txt (see the example command after this list)
  • Every model has its own notebook containing training and inference instructions
  • Models can be downloaded from here: (to be made)
  • Datasets are listed at the end of this README
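
For example, with a standard pip setup:

pip install -r requirements.txt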

Models

Standard WaveNet VAE

Open In Colab

WaveNet VAE Diagram

The first result can be found in media/FirstResult.wav. It's very noisy, but it is clearly starting to produce speech-like patterns.

This model is very similar to the one described by Chorowski et al. and follows the architecture shown above. I decided to go with a normal VAE and not the quantized variant because it lets me interpolate and experiment with the latent space more easily.
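
As a rough sketch of what that continuous latent space enables, the snippet below linearly interpolates between the latents of two utterances and decodes each intermediate point. Note that model.encode and model.decode are placeholders for illustration, not the actual API of this repository:

import torch

def interpolate_latents(model, x_a, x_b, steps=8):
    # Encode both utterances; assume encode() returns (mean, logvar).
    z_a, _ = model.encode(x_a)
    z_b, _ = model.encode(x_b)
    decoded = []
    for alpha in torch.linspace(0.0, 1.0, steps):
        # Straight-line interpolation between the two posterior means.
        z = (1 - alpha) * z_a + alpha * z_b
        decoded.append(model.decode(z))
    return decoded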

For the actual code I took inspiration from, and sometimes outright copied, the following repositories:

Training and model

My model is downloadable from 'n.b.t.' (to be determined); I trained it on the LJSpeech dataset. You can train your own model using train.py from the WaveNetVAE folder or by using the WaveVaePlayground.ipynb Jupyter notebook.

Example usage of CLI train.py:

python3 train.py -tp "./traindatasetfolder/" -vp "./validationdatasetfolder/" -ep 100

Short Flag  Long Flag          Description
-tp         --train_path       Path of training data
-vp         --validation_path  Path of validation data
-ep         --epochs           Number of epochs to train
-ex         --export_path      Model export location
-bs         --batch_size       Batch size
-lr         --learning_rate    Learning rate
-kla        --kl_anneal        KL multiplier increase per step
-mkl        --max_kl           Maximum KL multiplier
-lpe        --logs_per_epoch   Validation frequency
-d          --device           Train device, e.g. cuda:0, cpu
-mf         --max_files        Maximum number of files in the dataset
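
To illustrate how -kla and -mkl interact, here is a minimal sketch of the kind of schedule those flags describe (the function and variable names are mine, not taken from train.py):

def kl_weight(step, kl_anneal, max_kl):
    # The KL multiplier grows by `kl_anneal` every training step
    # and is capped at `max_kl`.
    return min(step * kl_anneal, max_kl)

# With -kla 1e-4 and -mkl 0.5, the cap is reached after about
# 5000 steps; beyond that point the weight stays at max_kl.
assert kl_weight(10_000, 1e-4, 0.5) == 0.5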

Tybalt WaveNet VAE

Tybalt VAE Diagram

An alteration of the Tybalt VAE model by Way et al. I gave it one extra linear layer to help reduce the data to a smaller latent space.
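
A minimal sketch of that change, with illustrative dimensions (the layer sizes below are assumptions, not taken from this repository): one extra linear layer sits between the input and the layers that produce the latent parameters.

import torch.nn as nn

class TybaltLikeEncoder(nn.Module):
    def __init__(self, n_features=5000, hidden=1000, latent=100):
        super().__init__()
        self.pre = nn.Linear(n_features, hidden)  # the added layer
        self.act = nn.ReLU()
        self.mu = nn.Linear(hidden, latent)       # latent mean
        self.logvar = nn.Linear(hidden, latent)   # latent log-variance

    def forward(self, x):
        h = self.act(self.pre(x))
        return self.mu(h), self.logvar(h)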

Training and model

My model is downloadable from the releases section; it's trained on the TCGA dataset. You can train your own model using train.py from the Tybalt model folder or by using the TybaltPlayground.ipynb Jupyter notebook. The acquisition and preprocessing scripts are available in the original Tybalt GitHub.

Example usage of CLI train.py:

python3 train.py -dp "./traindatasetfolder/" -ep 100

Short Flag  Long Flag         Description
-dp         --data_path       Path of all data
-ep         --epochs          Number of epochs to train
-ex         --export_path     Model export location
-bs         --batch_size      Batch size
-lr         --learning_rate   Learning rate
-kla        --kl_anneal       KL multiplier increase per step
-mkl        --max_kl          Maximum KL multiplier
-lpe        --logs_per_epoch  Validation frequency
-d          --device          Train device, e.g. cuda:0, cpu
-mf         --max_files       Maximum number of files in the dataset
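
A fuller invocation using several of the documented flags (the paths and values here are illustrative, not defaults):

python3 train.py -dp "./tcga_data/" -ep 100 -bs 64 -lr 0.0005 -kla 0.0001 -mkl 0.5 -d cuda:0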

Mocap WaveNet VAE

Datasets