AI-Unicamp/TTS (forked from coqui-ai/TTS)

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

 
 

πŸ§‘β€πŸŽ€ Expressive Text-to-Speech

This repository is a fork of Coqui-AI's 🐸TTS, used for research on expressive TTS in our AI-Unicamp-CPQD group. The original code is kept in the "main" branch, which is not the default branch of this fork.

We use the "unicamp" branch as our default branch, while the "main" branch stays in sync with the original repository. The original README.md is kept on the "main" branch.

About the group

We are an expressive TTS research group located at Unicamp and CPQD (Brazil).

🔨 Implementations

Expressive Models

  • Tacotron 2
  • FastPitch

Expressive Datasets

  • EmoV-DB
  • IEMOCAP
  • ESD

Style Encoders

  • Look-Up
  • Reference Encoder (Coarse/Fine-Grained)
  • GST
  • VAE
  • VQ-VAE
  • VAE+Flow
  • Diffusion

Disentanglement Blocks

  • Style Classifier
  • Speaker Classifier + GRL (Gradient Reversal Layer)
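As a rough illustration of the Gradient Reversal Layer listed above, the sketch below shows its two defining behaviours in plain Python: the forward pass is the identity, and the backward pass negates (and optionally scales) the incoming gradient, so that the encoder feeding the speaker classifier is pushed to remove speaker information. The class name `GradientReversal` and the `scale` parameter are hypothetical and are not taken from this repository; in a deep learning framework this would normally be written as a custom autograd function.

```python
class GradientReversal:
    """Illustrative GRL: identity in the forward pass, flipped gradient in the backward pass."""

    def __init__(self, scale=1.0):
        # `scale` (often called lambda) is a hypothetical knob that controls
        # how strongly the reversed gradient is applied.
        self.scale = scale

    def forward(self, features):
        # The features pass through unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient coming from the speaker classifier is negated and scaled
        # before it reaches the style encoder.
        return [-self.scale * g for g in grad_output]


layer = GradientReversal(scale=0.5)
outputs = layer.forward([1.0, -2.0])
grads = layer.backward([0.2, -0.4])
```

With this arrangement the classifier itself still learns to predict the speaker, while the layers before the GRL receive the opposite update.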

Style Reference Features

  • Pitch
  • Energy
  • Mel-Spectrogram

Aggregation Types

  • Sum, Concat or AdaIN
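The three aggregation types above can be sketched in plain Python as follows. This is a minimal illustration, not the code used in this repository: the function names are hypothetical, encoder outputs are assumed to be a list of frames (each a list of channel values), and for AdaIN the per-channel `gamma` and `beta` are assumed to have already been derived from the style embedding (for example by a learned projection).

```python
import math


def aggregate_sum(frames, style):
    # Add the style vector to every frame (dimensions must match).
    return [[x + s for x, s in zip(frame, style)] for frame in frames]


def aggregate_concat(frames, style):
    # Append the style vector to every frame, widening the feature dimension.
    return [list(frame) + list(style) for frame in frames]


def aggregate_adain(frames, gamma, beta, eps=1e-5):
    # Normalize each channel over time, then rescale and shift it with
    # style-dependent parameters (assumed to be given here).
    n_frames = len(frames)
    n_channels = len(frames[0])
    out = [[0.0] * n_channels for _ in range(n_frames)]
    for c in range(n_channels):
        column = [frame[c] for frame in frames]
        mean = sum(column) / n_frames
        var = sum((x - mean) ** 2 for x in column) / n_frames
        std = math.sqrt(var + eps)
        for t in range(n_frames):
            out[t][c] = gamma[c] * (column[t] - mean) / std + beta[c]
    return out


frames = [[1.0, 2.0], [3.0, 6.0]]
style = [0.5, -1.0]
summed = aggregate_sum(frames, style)
concatenated = aggregate_concat(frames, style)
modulated = aggregate_adain(frames, gamma=[2.0, 1.0], beta=[0.0, 10.0])
```

Sum keeps the feature size unchanged, Concat grows it by the style dimension, and AdaIN replaces the per-channel statistics of the content with style-driven ones.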

Enhancing Losses

  • Orthogonal Loss
  • CLIP Loss
  • Cycle consistency Loss(*)
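One common way to write an orthogonal loss between style and speaker embeddings is to penalize their squared dot product, which is zero when the two vectors are orthogonal. The sketch below assumes that formulation; the exact definition used in this repository may differ, and the function name `orthogonal_loss` is hypothetical.

```python
def orthogonal_loss(style_embs, speaker_embs):
    # Mean over the batch of the squared dot product between each
    # style embedding and its paired speaker embedding.
    total = 0.0
    for style, speaker in zip(style_embs, speaker_embs):
        dot = sum(a * b for a, b in zip(style, speaker))
        total += dot ** 2
    return total / len(style_embs)


# Orthogonal pair: no penalty.
loss_orthogonal = orthogonal_loss([[1.0, 0.0]], [[0.0, 1.0]])
# Overlapping pair in the batch: positive penalty.
loss_overlap = orthogonal_loss([[1.0, 2.0], [0.0, 1.0]], [[3.0, 1.0], [1.0, 0.0]])
```

Minimizing this term encourages the style and speaker representations to carry non-overlapping information.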
