GSVI : GPT-SoVITS Inference Plugin

Welcome to GSVI, an inference-specialized plugin built on top of GPT-SoVITS to enhance your text-to-speech (TTS) experience with a user-friendly API interface. This plugin enriches the original GPT-SoVITS project, making voice synthesis more accessible and versatile.

Please note that we do not recommend using GSVI for training. It exists to make GPT-SoVITS simpler and more comfortable to use, and to make model sharing easier.

This fork is based mainly on the fast_inference_ branch and uses a lot of PR code contributed by ChasonJiang. Thanks to this great developer. "Dalao NB!"

The Inference folder used by this branch is the main submodule. It comes from https://github.com/X-T-E-R/TTS-for-GPT-soVITS.

Features

High-level abstract interface for easy character and emotion selection
Comprehensive TTS engine support (speaker selection, speed adjustment, volume control)
User-friendly design for everyone
Drop in a shared character model folder and start using it right away
High compatibility and extensibility for various platforms and applications (for example: SillyTavern)

Getting Started

Install manually, or use the prezip for Windows
Put your character model folders in trained
Run a bat file, or run the Python files manually
If you encounter issues, join our community or consult the FAQ. QQ Group: 863760614, Discord (AI Hub):

We look forward to seeing how you use GSVI to bring your creative projects to life!

Prezip : https://huggingface.co/XTer123/GSVI_prezip/tree/main

Usage

Use With Bat Files

You will find a set of bat files in 0 Bat Files/

To update, run bat 0 and 1 (or 999 0 1)
To start with a single gradio file, run bat 3
To start with the backend and frontend, run bat 5 and 6
To manage your models, run 10.bat

Python Files

Start with a single gradio file

Gradio Application: app.py

Model Management

Gradio Model Management Interface: webui/webui.py

API Documentation

For API documentation, visit our Yuque documentation page or see API Doc.md.

Model Folder Format

A character model folder looks like trained/Character1/

Put the pth, ckpt, and wav files in it. The wav file should be named after its prompt text.

For example:

trained
--hutao
----hutao-e75.ckpt
----hutao_e60_s3360.pth
----hutao said something.wav
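
The layout above can be inspected with a few lines of Python. This is only a minimal sketch, not part of GSVI: the function name `describe_character_folder` and the returned dictionary keys are made up for illustration. It lists the ckpt and pth files in one character folder and reads the prompt text from the wav file name, as described above.

```python
from pathlib import Path


def describe_character_folder(folder):
    """Summarize the model files in one character folder (hypothetical helper)."""
    folder = Path(folder)
    files = sorted(p for p in folder.iterdir() if p.is_file())
    return {
        "character": folder.name,
        "ckpt": [p.name for p in files if p.suffix.lower() == ".ckpt"],
        "pth": [p.name for p in files if p.suffix.lower() == ".pth"],
        # The wav file name without its extension doubles as the prompt text.
        "prompt_texts": [p.stem for p in files if p.suffix.lower() == ".wav"],
    }


if __name__ == "__main__":
    trained = Path("trained")
    if trained.is_dir():
        for character in sorted(p for p in trained.iterdir() if p.is_dir()):
            print(describe_character_folder(character))
```

Run from the project root, it prints one summary per folder under trained, which is a quick way to spot a folder that is missing a weight file or a reference wav.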

Add an emotion for your model

To do this, open the Model Manage Tool (10.bat, or webuis/character_manager/webui.py)

It lets you assign a reference audio to each emotion, which is how the emotion options are implemented.
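
Conceptually, this amounts to a lookup from emotion name to reference audio. The sketch below only illustrates that idea and does not show the format the Model Manage Tool actually writes: the `ReferenceAudio` class, the `pick_reference` function, and the `"default"` fallback key are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceAudio:
    wav_name: str      # reference audio file inside the character folder
    prompt_text: str   # text spoken in that audio


def pick_reference(emotions, emotion, default="default"):
    """Return the reference audio for an emotion, else the default entry.

    Assumes the mapping always contains the `default` key (hypothetical).
    """
    return emotions.get(emotion, emotions[default])


emotions = {
    "default": ReferenceAudio("hutao said something.wav", "hutao said something"),
    "happy": ReferenceAudio("hutao laughed.wav", "hutao laughed"),
}
```

Here `pick_reference(emotions, "happy")` returns the happy entry, while an emotion with no assigned audio falls back to the default reference.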

Installation

Install by following the guide below, then download the pretrained models from GPT-SoVITS Models, place them in GPT_SoVITS/pretrained_models, and put your character model folder in trained.

Or just download the pre-packaged distribution for Windows, then put your character model folder in trained.

For the layout of the character model folder, see Model Folder Format above.

Tested Environments

Python 3.9, PyTorch 2.0.1, CUDA 11
Python 3.10.13, PyTorch 2.1.2, CUDA 12.3
Python 3.9, PyTorch 2.3.0.dev20240122, macOS 14.3 (Apple silicon)

Note: numba==0.56.4 requires Python < 3.11
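
If you are unsure whether your interpreter meets this constraint, a small check like the following can be run before installing. It is a sketch only; the function name `numba_pin_supported` is made up for illustration.

```python
import sys


def numba_pin_supported(version_info=None):
    """Return True if the interpreter is older than Python 3.11."""
    major, minor = (version_info or sys.version_info)[:2]
    return (major, minor) < (3, 11)


if __name__ == "__main__":
    if not numba_pin_supported():
        print("This Python is 3.11 or newer; numba==0.56.4 will not install.")
```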

Windows

If you are a Windows user (tested on Windows 10 and later), you can directly download the pre-packaged distribution and double-click go-webui.bat to start GPT-SoVITS-WebUI.

Or run pip install -r requirements.txt, then double-click install.bat.

Linux

conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh

macOS

Note: Models trained with GPUs on Macs are of significantly lower quality than those trained on other devices, so we are temporarily using CPUs instead.

First make sure you have installed FFmpeg by running brew install ffmpeg or conda install ffmpeg, then install using the following commands:

conda create -n GPTSoVits python=3.10
conda activate GPTSoVits

pip install -r requirements.txt
git submodule init
git submodule update --init --recursive

Install FFmpeg (not needed if you use the prezip)

Conda Users

conda install ffmpeg

Ubuntu/Debian Users

sudo apt install ffmpeg
sudo apt install libsox-dev
conda install -c conda-forge 'ffmpeg<7'

Windows Users

Download and place ffmpeg.exe and ffprobe.exe in the GPT-SoVITS root.
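
To confirm that the FFmpeg tools can be found after installing them, a check along these lines may help. It is a minimal sketch, and `missing_tools` is a hypothetical helper name. It relies on `shutil.which`, which searches the PATH, so binaries placed only in the project root may not be reported as found unless you run it from that folder or add the folder to PATH.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of the given tools that cannot be located."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available.")
```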

Pretrained Models (not needed if you use the prezip)

Download pretrained models from GPT-SoVITS Models and place them in GPT_SoVITS/pretrained_models.

Docker

Prepare the following local paths and models before running the commands below.

output: the output directory for wav files
logs: the directory for log files
SoVITS_weights: SoVITS weights
GPT_SoVITS: holds all pretrained models in GPT_SoVITS/pretrained_models, which is large
nltk_data: data for the nltk library; download it with the following command:

python -m nltk.downloader -d ./nltk_data averaged_perceptron_tagger cmudict

trained: trained models (your own, or ones shared by others)

docker build -t gpt-sovits-inference:latest -f Dockerfile .
docker run --rm -it -d --gpus="device=0" --env=is_half=False \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/output:/workspace/output \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/logs:/workspace/logs \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/SoVITS_weights:/workspace/SoVITS_weights \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/GPT_SoVITS/:/workspace/GPT_SoVITS \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/nltk_data:/usr/local/nltk_data \
  --volume=<Replace with the path of your project>/GPT-SoVITS-Inference/trained:/workspace/trained \
  --workdir=/workspace -p 5000:5000 --shm-size="16G" gpt-sovits-inference:latest
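
Since the docker run command repeats the project path six times, it can be convenient to generate the volume flags and check that the local folders exist first. The following is a sketch under that assumption: the mount table is copied from the command above, while `MOUNTS` and `volume_args` are names introduced here for illustration.

```python
from pathlib import Path

# Host subfolder -> container mount point, copied from the docker run command.
MOUNTS = {
    "output": "/workspace/output",
    "logs": "/workspace/logs",
    "SoVITS_weights": "/workspace/SoVITS_weights",
    "GPT_SoVITS": "/workspace/GPT_SoVITS",
    "nltk_data": "/usr/local/nltk_data",
    "trained": "/workspace/trained",
}


def volume_args(project_root):
    """Build the --volume flags and list host folders that do not exist yet."""
    root = Path(project_root)
    missing = [name for name in MOUNTS if not (root / name).is_dir()]
    args = [f"--volume={root / name}:{target}" for name, target in MOUNTS.items()]
    return args, missing


if __name__ == "__main__":
    args, missing = volume_args(Path.cwd())
    if missing:
        print("Create these folders first:", ", ".join(missing))
    print(" \\\n  ".join(args))
```

Run it from the project root and paste the printed flags into the docker run command in place of the hand-written --volume lines.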

Remove pyaudio from requirements.txt!

Credits

This fork is based mainly on the fast_inference_ branch of the GPT-SoVITS project and uses a lot of PR code contributed by ChasonJiang.

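Special thanks to the projects and contributors behind the theory, pretrained models, inference text frontend, and WebUI tools used here. Thanks to all contributors for their efforts.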

