# Wavenet Autoencoder for Unsupervised Speech Representation Learning (after Chorowski, Jan 2019)
This is a PyTorch implementation of "Unsupervised speech representation learning using WaveNet autoencoders" (Chorowski et al., 2019): https://arxiv.org/abs/1901.08810

[Under Construction]

## Update April 14, 2019

Began training on the LibriSpeech dev-clean set (http://www.openslr.org/resources/12/dev-clean.tar.gz); see dat/example_train.log.

## TODO

  1. VAE and VQVAE versions of the bottleneck / training objectives
  2. Inference mode
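The VQ-VAE bottleneck planned in item 1 discretizes each encoder output vector by snapping it to its nearest codebook entry. A minimal NumPy sketch of just that quantization step (ignoring the straight-through gradient estimator and the codebook/commitment losses used during training; the function and variable names here are illustrative, not from this repository):

```python
import numpy as np

def vq_bottleneck(z, codebook):
    """Nearest-neighbor vector quantization (the VQ-VAE discretization step).

    z        -- (n, d) array of encoder output vectors
    codebook -- (k, d) array of learned code embeddings
    Returns the quantized vectors and the chosen code indices.
    """
    # Squared Euclidean distance from every z vector to every code: (n, k)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)   # index of the nearest code per vector
    zq = codebook[idx]        # replace each vector with its nearest code
    return zq, idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2]])
zq, idx = vq_bottleneck(z, codebook)
# idx → [0, 1]; zq contains the selected codebook rows
```

In the full model the gradient of the reconstruction loss is copied straight through the quantizer to the encoder, and the codebook is pulled toward the encoder outputs by a separate loss term.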

## Example training setup

```sh
code_dir=/path/to/ae-wavenet
run_dir=/path/to/my_runs

# Get the data
cd $run_dir
wget http://www.openslr.org/resources/12/dev-clean.tar.gz
tar zxvf dev-clean.tar.gz
$code_dir/scripts/librispeech_to_rdb.sh LibriSpeech/dev-clean > librispeech.dev-clean.rdb

# Train
cd $code_dir
python train.py new -af par/arch.basic.json -tf par/train.basic.json -nb 4 -si 10 \
  -rws 100000 -fpu 1.0 $run_dir/model%.ckpt $run_dir/librispeech.dev-clean.10.r1.rdb
```
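The `%` in `model%.ckpt` looks like a placeholder that the checkpointing code fills in per save step; this is an assumption from the argument's form, not verified against checkpoint.py. A sketch of that kind of substitution (the function name is hypothetical):

```python
def ckpt_path(template, step):
    """Hypothetical helper: substitute the '%' placeholder in a
    checkpoint path template with the current training step."""
    return template.replace('%', str(step))

ckpt_path('/path/to/my_runs/model%.ckpt', 10)
# → '/path/to/my_runs/model10.ckpt'
```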