Implementation of Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming

This is a PyTorch-based implementation of the paper "Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming".

Example implementation of Bayesian Dynamic Programming for general DAG

The GeneralDAG folder contains example implementations for general toy DAGs, time series alignment, and monotonic alignment.

sample.py:

Implementation of Bayesian Dynamic Programming on any general DAG given the distance matrix (W) and a temperature scalar $\alpha$.

The distance matrix W should be an N×N matrix, where N is the number of nodes and W[u,v] $\neq -\infty$ means there is an edge from u to v:

Row W[u,:] gives the children v of node u (the entries with W[u,v] $\neq -\infty$).
Column W[:,v] gives the parents u of node v (the entries with W[u,v] $\neq -\infty$).

The function computes the optimal path by argmax(W). The computational complexity is $O(N)$, where N is the number of nodes, and all computations use NumPy arrays.
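To make the W convention concrete, here is a minimal NumPy sketch. It is not the code in sample.py. It assumes, beyond what the text above states, that nodes are indexed in topological order (every edge u -> v has u < v), that node 0 is the source and node N-1 is the sink, and that the optimal path is the one with the largest total weight. The function name best_path and the toy matrix are hypothetical, and this dense version makes no attempt to match the complexity of sample.py.

```python
import numpy as np

def best_path(W):
    """Max-weight path from node 0 to node N-1 (hypothetical helper).

    Assumption: nodes are topologically ordered and W[u, v] = -inf marks a missing edge.
    """
    N = W.shape[0]
    score = np.full(N, -np.inf)        # score[v]: best total weight of any path 0 -> v
    parent = np.full(N, -1, dtype=int)  # parent[v]: predecessor of v on that best path
    score[0] = 0.0
    for v in range(1, N):
        # Column W[:v, v] holds the weights from every candidate parent u < v.
        cand = score[:v] + W[:v, v]
        u = int(np.argmax(cand))
        if np.isfinite(cand[u]):
            score[v], parent[v] = cand[u], u
    if not np.isfinite(score[N - 1]):
        raise ValueError("sink is not reachable from the source")
    # Backtrack from the sink to the source.
    path = [N - 1]
    while path[-1] != 0:
        path.append(int(parent[path[-1]]))
    return path[::-1], float(score[N - 1])

NEG = -np.inf
W = np.array([
    [NEG, 1.0, 2.0, NEG],   # node 0 has children 1 and 2
    [NEG, NEG, 2.0, 3.0],   # node 1 has children 2 and 3
    [NEG, NEG, NEG, 3.0],   # node 2 has child 3
    [NEG, NEG, NEG, NEG],   # node 3 is the sink
])
path, total = best_path(W)
print(path, total)
```

In this toy graph the three source-to-sink paths have total weights 4, 5, and 6, so the sketch selects the path through all four nodes.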

test.py:

Verifies the correctness of sample.py by comparing it with a brute-force method and with argmin(-W), including:

Check the cumulative probabilities.
Check sampling of the optimal path by drawing 1000 samples (Gibbs and reversed).
Check omega and the expectation of the distribution (Gibbs, brute force, and reversed).
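The idea of checking a dynamic program against brute force can be sketched as follows. This is an illustration only, not the contents of test.py. It assumes a Gibbs distribution over source-to-sink paths of the form p(path) proportional to exp(alpha * total weight); the exact temperature convention used in the repository may differ. The helper names all_paths, path_score, and log_partition_dp are hypothetical, and nodes are again assumed to be topologically ordered.

```python
import numpy as np

def all_paths(W, u=0):
    """Yield every path from node u to the last node by depth-first search."""
    N = W.shape[0]
    if u == N - 1:
        yield [u]
        return
    for v in range(N):
        if np.isfinite(W[u, v]):
            for rest in all_paths(W, v):
                yield [u] + rest

def path_score(W, path):
    """Total edge weight along a path."""
    return sum(W[u, v] for u, v in zip(path[:-1], path[1:]))

def log_partition_dp(W, alpha):
    """log of sum over paths of exp(alpha * score), via a forward recursion."""
    N = W.shape[0]
    log_z = np.full(N, -np.inf)
    log_z[0] = 0.0
    for v in range(1, N):
        terms = log_z[:v] + alpha * W[:v, v]
        m = terms.max()
        if np.isfinite(m):
            # Numerically stable log-sum-exp over the parents of v.
            log_z[v] = m + np.log(np.exp(terms - m).sum())
    return log_z[N - 1]

NEG = -np.inf
W = np.array([
    [NEG, 1.0, 2.0, NEG],
    [NEG, NEG, 2.0, 3.0],
    [NEG, NEG, NEG, 3.0],
    [NEG, NEG, NEG, NEG],
])
alpha = 2.0  # assumed temperature convention: larger alpha concentrates mass on the best path

# Brute force: enumerate all paths and normalise explicitly.
paths = list(all_paths(W))
scores = np.array([path_score(W, p) for p in paths])
log_z_bf = np.log(np.exp(alpha * scores).sum())
probs = np.exp(alpha * scores - log_z_bf)

# Dynamic program: should agree with the brute-force normaliser.
log_z_dp = log_partition_dp(W, alpha)
print(np.isclose(log_z_dp, log_z_bf), paths[int(np.argmax(probs))])
```

Brute-force enumeration is only practical on small graphs like this one, which is why it suits a verification script rather than the main implementation.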

Time series alignment

The time_series_alginment folder contains the implementation of Bayesian DP on the DTW DAG; refer to Section 4.1 of the paper.

dtw.py: a class implementing the full Bayesian DP procedure according to the pseudo-algorithms.
FindPaths.py: given the lengths of two sequences, returns all possible alignment matrices.
dtw_verification.py: a class that verifies the correctness of dtw.py.
MLE.py: treats the distance matrix as parameters and uses MLE to estimate an approximate distance matrix theta from paths sampled from a ground-truth Gibbs distribution.
Gradient_check.py: checks the MLE gradient used in MLE.py.
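As an illustration of what enumerating all alignment matrices involves, here is a small sketch. It is not the code in FindPaths.py. It assumes the standard DTW move set (down, right, diagonal) with a path that starts at cell (0, 0) and ends at cell (n-1, m-1); the function name dtw_alignments is hypothetical.

```python
import numpy as np

def dtw_alignments(n, m):
    """Return every DTW alignment between sequences of length n and m.

    Each alignment is a binary n x m matrix with ones on the visited cells.
    Assumed move set: (1, 0) down, (0, 1) right, (1, 1) diagonal.
    """
    results = []

    def walk(i, j, cells):
        cells = cells + [(i, j)]
        if (i, j) == (n - 1, m - 1):
            A = np.zeros((n, m), dtype=int)
            rows, cols = zip(*cells)
            A[list(rows), list(cols)] = 1
            results.append(A)
            return
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            if i + di < n and j + dj < m:
                walk(i + di, j + dj, cells)

    walk(0, 0, [])
    return results

mats = dtw_alignments(3, 3)
print(len(mats))
print(mats[0])
```

The number of alignments grows quickly with the sequence lengths (it follows the Delannoy numbers under this move set), so exhaustive enumeration is mainly useful for verifying the dynamic program on short sequences.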

Monotonic Alignment

Monotonic alignment uses right and diagonal moves only. Its computational DAG is shown in the figure below:

[Figure: computational DAG for monotonic alignment]

Pseudo-algorithms and a detailed explanation can be found in Section 4.2 and the appendix of the paper. This folder contains:

ma.py: a class implementing the full Bayesian DP procedure on the monotonic alignment DAG.
ma_verify.py: a series of Monte Carlo methods that verify the correctness of ma.py.
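The right-and-diagonal move set can be illustrated with a small max-score dynamic program. This is a sketch under stated assumptions, not the interface of ma.py: scores are assumed to sit on the cells of an I x J matrix S, the path starts at (0, 0) and ends at (I-1, J-1), a right move keeps the row and advances the column, and a diagonal move advances both. The function name best_monotonic_alignment is hypothetical.

```python
import numpy as np

def best_monotonic_alignment(S):
    """Highest-scoring monotonic path through S using right and diagonal moves."""
    I, J = S.shape
    if I > J:
        raise ValueError("need at least as many columns as rows")
    Q = np.full((I, J), -np.inf)   # Q[i, j]: best score of a path ending at (i, j)
    Q[0, 0] = S[0, 0]
    for j in range(1, J):
        for i in range(min(j + 1, I)):   # cells with i > j are unreachable
            right = Q[i, j - 1]                            # arrived by a right move
            diag = Q[i - 1, j - 1] if i > 0 else -np.inf   # arrived by a diagonal move
            Q[i, j] = max(right, diag) + S[i, j]
    # Backtrack from the last cell, one column at a time.
    A = np.zeros((I, J), dtype=int)
    i = I - 1
    for j in range(J - 1, -1, -1):
        A[i, j] = 1
        if j > 0 and i > 0 and Q[i - 1, j - 1] > Q[i, j - 1]:
            i -= 1
    return A, float(Q[I - 1, J - 1])

S = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 3.0],
])
A, score = best_monotonic_alignment(S)
print(A)
print(score)
```

With 3 rows and 4 columns there are only three valid paths, with scores 5, 6, and 7, so the result is easy to check by hand.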

Example: End-to-end Text-to-Speech with Monotonic Alignment

This part contains the implementation of BDPVAE-TTS on the RyanSpeech dataset.

Dependencies

You can install the Python dependencies with

pip3 install -r requirements.txt

Preprocessing

First, run

python3 prepare_align.py config/RyanSpeech/preprocess.yaml

to prepare the data. Then run the preprocessing script:

python3 preprocess.py config/RyanSpeech/preprocess.yaml

Build the monotonic alignment sampling code (Cython):

cd monotonic_align; python setup.py build_ext --inplace

Training

Train your model with DDP:

python3 train_ddp.py -p config/RyanSpeech/preprocess.yaml -m config/RyanSpeech/model.yaml -t config/RyanSpeech/train.yaml

or, without DDP:

python3 train.py -p config/RyanSpeech/preprocess.yaml -m config/RyanSpeech/model.yaml -t config/RyanSpeech/train.yaml

TensorBoard

To launch TensorBoard, run

tensorboard --logdir output/log/RyanSpeech

Single Inference

For single text inference, run

python3 synthesize.py --mode single --text "YOUR_DESIRED_TEXT" --restore_step RESTORE_STEP -p config/RyanSpeech/preprocess.yaml -m config/RyanSpeech/model.yaml -t config/RyanSpeech/train.yaml

The generated utterances will be put in output/result/.

Batch Inference

Batch inference is also supported. Run

python3 synthesize.py --mode batch --source preprocessed_data/RyanSpeech/val.txt --restore_step RESTORE_STEP -p config/RyanSpeech/preprocess.yaml -m config/RyanSpeech/model.yaml -t config/RyanSpeech/train.yaml

to synthesize all utterances in preprocessed_data/RyanSpeech/val.txt.

Pretrained Model

The pretrained BDPVAE-TTS model for the RyanSpeech dataset can be downloaded via this OneDrive link. Please place it in ./output/ckpt/RyanSpeech/

Reference

keonlee's VAENAR-TTS, Glow-TTS, FastSpeech2, DiffSinger
