This Python project utilizes the Demucs library for voice separation and Retrieval-based Voice Conversion (RVC) techniques to achieve two primary objectives:
- Voice Separation: It separates vocals from instrumental tracks in Audacity audio clips, using Demucs, a state-of-the-art source separation model, to isolate the human voice from the accompanying instruments.
- Voice Conversion: Once the vocals are extracted, the project uses Retrieval-based Voice Conversion (RVC) to transform the original voice into another voice.
The project seamlessly interfaces with Audacity using mod-script-pipe, allowing for easy processing of audio clips directly within the Audacity environment.
Download hubert_base.pt and place it in the root folder.
pip install -r requirements.txt
mod-script-pipe: This script communicates with Audacity using mod-script-pipe, so you need to enable it. To do that:
- Open Audacity's preferences (Ctrl+P or Edit --> Preferences)
- In the Modules tab, select Enabled for mod-script-pipe
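To check that the module is reachable, a minimal sketch like the following can send one scripting command to a running Audacity. It is not taken from this project's code. The pipe names follow the convention used by Audacity's own pipe test script, and the helper names `pipe_paths` and `do_command` are made up for illustration. Audacity usually needs a restart after the module is enabled.

```python
import os
import sys


def pipe_paths():
    """Return (to_audacity, from_audacity, end_of_line) for this platform."""
    if sys.platform == "win32":
        return r"\\.\pipe\ToSrvPipe", r"\\.\pipe\FromSrvPipe", "\r\n\0"
    uid = str(os.getuid())
    return (
        "/tmp/audacity_script_pipe.to." + uid,
        "/tmp/audacity_script_pipe.from." + uid,
        "\n",
    )


def do_command(command):
    """Send one scripting command to Audacity and return its text reply."""
    to_name, from_name, eol = pipe_paths()
    if not os.path.exists(to_name):
        raise RuntimeError(
            "Pipe not found: is Audacity running with mod-script-pipe enabled?"
        )
    with open(to_name, "w") as to_pipe, open(from_name, "r") as from_pipe:
        to_pipe.write(command + eol)
        to_pipe.flush()
        lines = []
        while True:
            line = from_pipe.readline()
            # An empty line after some content marks the end of the reply.
            if line == "\n" and lines:
                break
            # An empty string means the pipe was closed on the other side.
            if line == "":
                break
            lines.append(line)
        return "".join(lines)
```

With Audacity open, calling `do_command("Help: Command=Help")` should return a description of the Help command. If it raises `RuntimeError`, the module is probably not enabled yet.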
- Voice models should be inside the models/ folder
- Each voice model is a folder, and the folder name is used as the name of the model.
- Each voice model folder should contain a .pth and an .index file
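The layout above can be checked with a short script. This is only a sketch under the assumptions stated in the list (one folder per model, each holding a .pth and an .index file). The function name `find_voice_models` is hypothetical and is not part of this project.

```python
from pathlib import Path


def find_voice_models(models_dir="models"):
    """Map each model name (its folder name) to its .pth and .index files."""
    models = {}
    root = Path(models_dir)
    if not root.is_dir():
        return models
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        pth_files = sorted(folder.glob("*.pth"))
        index_files = sorted(folder.glob("*.index"))
        # Folders missing either file are skipped as incomplete models.
        if pth_files and index_files:
            models[folder.name] = {"pth": pth_files[0], "index": index_files[0]}
    return models
```

Running `find_voice_models()` from the project root lists the models that are complete, which helps spot a folder that is missing one of the two files.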
- Import an audio clip into Audacity (it must be the first clip).
- Select the audio range you want to edit.
- Start the server by running the following command:
python server.py
- Start the client by running the following command:
python index.py
- The index.py file will guide you through the process.
I've separated the processing part of the project from the Audacity communication into two distinct scripts for practical reasons. This makes it possible to:
- Optimize Resource Usage: The processing server can run on a high-performance machine equipped with a GPU, so voice separation and conversion run quickly.
- Decouple Audacity Work: Audacity can run on a less powerful machine, since that side mainly handles communication. Each machine then does only the work it is suited for, so both audio editing and processing stay responsive.
You can offload computation to a dedicated server. Follow these steps:
Server: Install and run this project on your powerful server (with a CUDA-enabled GPU).
Client Configuration: On your Audacity machine, edit the SERVER_URL in utils.py to match the server's IP or hostname.
File Transfer: Note that communication between the client and server involves transferring zipped audio files via FastAPI.
This setup optimizes performance by leveraging high-end resources for tasks like voice separation and conversion while maintaining seamless communication with Audacity on a less powerful system.
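To illustrate the file transfer step, here is a rough sketch of packing audio files into a ZIP archive in memory and unpacking them on the other side, using only the Python standard library. The function names and the file name in the usage note are hypothetical. The project's actual endpoints and payload layout are not described here, so this shows the general idea only.

```python
import io
import zipfile


def zip_audio_files(files):
    """Pack {filename: bytes} into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unzip_audio_files(payload):
    """Unpack ZIP bytes back into {filename: bytes}."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

For example, the client could call `zip_audio_files({"clip.wav": wav_bytes})` and send the result to the address in SERVER_URL, and the server could call `unzip_audio_files` on the request body before processing.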