GitHub - uniaudio666/UniAudio: The official source code of UniAudio

UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Introduction

UniAudio is a universal audio generation model that handles many audio generation tasks with a single model, including text-to-speech (TTS), voice conversion (VC), singing voice synthesis, speech enhancement, speech extraction, text-to-sound, and text-to-music. The sections below describe UniAudio in more detail.

Neural Audio Codec Models
Top-level Design
Training your own UniAudio for any task with your own dataset

Neural Audio Codec Models

Please refer to the codec folder for the training code of the neural audio codec. We will release the checkpoint of our trained codec after the double-blind review.

Top-level Design

The UniAudio framework is simple and consists of four steps: (1) define your task; (2) prepare the data; (3) tokenize the data and save it as a .pth file; (4) train the model and run inference. More detailed documentation for UniAudio will be released after the double-blind review.
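Until the official documentation is available, the first three steps can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the UniAudio code: the `TaskSpec` class, the `tokenize_text` and `tokenize_audio` helpers, the record layout, and the file name are all hypothetical. The audio tokenizer is a toy uniform quantizer standing in for the neural audio codec, and `pickle` stands in for `torch.save` so the sketch runs without PyTorch. Step 4 (training and inference) is not covered.

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TaskSpec:
    # Step 1: hypothetical task definition; field names are illustrative only.
    name: str
    condition_key: str  # input the model is conditioned on
    target_key: str     # output the model should generate


def tokenize_text(text):
    # Toy text tokenizer: one integer ID (the Unicode code point) per character.
    return [ord(ch) for ch in text]


def tokenize_audio(samples, levels=256):
    # Toy stand-in for a neural codec: uniform quantization of samples in [-1, 1].
    tokens = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        tokens.append(min(levels - 1, int((clipped + 1.0) / 2.0 * levels)))
    return tokens


def build_token_file(task, examples, out_path):
    # Steps 2-3: convert each prepared example to token lists and save one file.
    tokenizers = {"text": tokenize_text, "audio": tokenize_audio}
    records = {}
    for utt_id, example in examples.items():
        records[utt_id] = {
            key: tokenizers[key](example[key])
            for key in (task.condition_key, task.target_key)
        }
    with open(out_path, "wb") as f:
        # Assumption: pickle replaces torch.save here to avoid a PyTorch dependency.
        pickle.dump({"task": task.name, "data": records}, f)
    return records


# Step 1: define the task. Step 2: prepare a (tiny, made-up) dataset.
tts = TaskSpec(name="tts", condition_key="text", target_key="audio")
examples = {"utt1": {"text": "hi", "audio": [-1.0, 0.0, 1.0]}}

# Step 3: tokenize and save, then read the file back to check the round trip.
out_path = Path(tempfile.mkdtemp()) / "tts_tokens.pth"
records = build_token_file(tts, examples, out_path)
with open(out_path, "rb") as f:
    loaded = pickle.load(f)
```

Keeping the task definition separate from the tokenizers means a new task only needs a new `TaskSpec` and, if it uses a new modality, one more entry in the tokenizer table.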

Training UniAudio

Detailed training documentation will be released.

