uniaudio666/UniAudio

UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Introduction

UniAudio is a universal audio generation model that handles many audio generation tasks with a single model, including text-to-speech (TTS), voice conversion (VC), singing voice synthesis, speech enhancement, speech extraction, text-to-sound, and text-to-music. The sections below introduce UniAudio in more detail:

  • Neural Audio Codec Models
  • Top-level Design
  • Training your own UniAudio for any task with your own dataset

Neural Audio Codec Models

Please refer to the codec folder for the training code of the neural audio codec. We will release the checkpoint of our trained codec after the double-blind review.

Top-level Design

The UniAudio framework is simple and consists of four steps: (1) define your task; (2) prepare the data; (3) tokenize the data and save it as a .pth file; (4) train the model and run inference. Clearer documentation for UniAudio will be released after the double-blind review.
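Until the official documentation is available, steps (1) to (3) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the repository's code: `TaskSpec`, `toy_tokenize`, `save_tokens`, and `load_tokens` are hypothetical names, the uniform quantizer is only a stand-in for the neural audio codec, and the `pickle` fallback exists solely so the sketch runs when PyTorch is not installed. Step (4), training and inference, is left out.

```python
import os
import pickle
import tempfile
from dataclasses import dataclass

try:
    import torch  # .pth files are conventionally written with torch.save
except ImportError:  # assumption: fall back to pickle so the sketch still runs
    torch = None


@dataclass
class TaskSpec:
    """Step 1: a hypothetical task definition (not UniAudio's real format)."""
    name: str           # e.g. "TTS"
    condition_key: str  # what the model is conditioned on, e.g. "text"
    target_key: str     # what the model should generate, e.g. "audio"


def toy_tokenize(samples, num_levels=16):
    """Step 3a: map floats in [-1, 1] to integer ids in [0, num_levels - 1].

    This uniform quantizer is a placeholder; a real pipeline would call the
    trained neural audio codec here.
    """
    ids = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        ids.append(min(num_levels - 1, int((x + 1.0) / 2.0 * num_levels)))
    return ids


def save_tokens(tokens_by_utt, path):
    """Step 3b: store {utterance_id: token_ids} in a single .pth file."""
    if torch is not None:
        torch.save({k: torch.tensor(v) for k, v in tokens_by_utt.items()}, path)
    else:
        with open(path, "wb") as f:
            pickle.dump(tokens_by_utt, f)


def load_tokens(path):
    """Read the file written by save_tokens back into plain Python lists."""
    if torch is not None:
        return {k: v.tolist() for k, v in torch.load(path).items()}
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    task = TaskSpec(name="TTS", condition_key="text", target_key="audio")
    # Step 2: hypothetical prepared data, utterance id -> waveform samples.
    data = {"utt1": [-1.0, 0.0, 1.0], "utt2": [0.25, -0.5]}
    tokens = {utt: toy_tokenize(wav) for utt, wav in data.items()}
    out_path = os.path.join(tempfile.mkdtemp(), task.target_key + ".pth")
    save_tokens(tokens, out_path)
    print(task.name, load_tokens(out_path))
```

Keeping one file per data field (here named after `target_key`) is only one possible layout; the actual file organisation expected by UniAudio is not specified in this README.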

Training UniAudio

Detailed training documentation will be released.

About

The official source code of UniAudio
