BenJamesbabala/vc-vawgan

Demo

https://jeremycchsu.github.io/vc-vawgan/

Hyper-parameters and other Experimental Settings

z_dim: 64, the dimension of the phonetic content space
y_dim: 10, the number of speakers
merge_dim: 171, the dimension of the speaker space
clamping: 0.01, the K-Lipschitz scalar

For the CNN architecture, please see the architecture.json file.
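
As a rough illustration of how the settings above might be used, the sketch below stores them in a dictionary and applies the `clamping` value as a weight-clipping bound. It assumes that `clamping` is used in the WGAN style (every discriminator weight clipped to `[-clamping, clamping]` after each update); the repository's own training code is not shown here, and `HPARAMS` and `clamp_weights` are hypothetical names introduced only for this example.

```python
# Hyper-parameters as listed in this README.
HPARAMS = {
    "z_dim": 64,       # dimension of the phonetic content space
    "y_dim": 10,       # number of speakers
    "merge_dim": 171,  # dimension of the speaker space
    "clamping": 0.01,  # K-Lipschitz scalar
}


def clamp_weights(weights, bound):
    """Clip every weight to the interval [-bound, bound].

    Assumption: this mirrors WGAN-style weight clipping; it is not taken
    from the repository's code.
    """
    return [max(-bound, min(bound, w)) for w in weights]


# Hypothetical discriminator weights, for illustration only.
example_weights = [-0.5, 0.005, 0.02]
clipped = clamp_weights(example_weights, HPARAMS["clamping"])
```

In a real training loop the same clipping would be applied to the framework's weight tensors rather than to a Python list.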

Mean Opinion Scores (MOS)

Scales:
5: Excellent
4: Good
3: Fair
2: Poor
1: Bad

In the intra-gender experiment (SF1 to TF2), the evaluators had access to two extra audio files for reference (per pair): a GMM baseline, whose mean MOS was 1.53, and the true target, whose mean MOS was 4.72.

Note:

  • The error bars in Fig. 2 indicate the sample standard deviation, not a confidence interval of the mean.
  • The ANOVA tests on the MOS scores returned very small p-values, which is why we used the word "significant" in our paper. (intra-gender: VAW-GAN against VAE; inter-gender: VAW-GAN against VAE)
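
The two notes above can be made concrete with a minimal sketch. It computes the sample standard deviation (what the error bars show), the standard error of the mean (what a confidence interval would be built from), and the F statistic of a one-way ANOVA between two systems. The scores below are made-up placeholders, not the MOS data from the paper, and the function names are hypothetical.

```python
import math
import statistics


def sample_std(scores):
    """Sample standard deviation (n - 1 in the denominator)."""
    return statistics.stdev(scores)


def standard_error(scores):
    """Standard error of the mean: sample std divided by sqrt(n)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder ratings on the 1-5 scale (NOT the paper's data).
scores_a = [3, 4, 4, 5, 4]
scores_b = [2, 3, 3, 2, 3]

f_stat, df_b, df_w = one_way_anova_f(scores_a, scores_b)
```

The standard error shrinks as more ratings are collected, while the sample standard deviation does not, so error bars based on the latter are wider than a confidence interval of the mean would be. Turning the F statistic into a p-value requires the F distribution; if SciPy is available, `scipy.stats.f_oneway` reports both values directly.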
