Convert any text into a custom voice using edge-tts and RVC (Retrieval-based Voice Conversion).
Built with Streamlit for a sleek, interactive web interface.
- 🗣️ 13+ Base Voices: Choose from Turkish, English, German, French, Spanish, Japanese, Korean, and Chinese neural voices via edge-tts
- 🎤 RVC Voice Conversion: Transform the base speech into any custom voice with a trained .pth model
- 🎼 Multiple F0 Methods: rmvpe (best quality), harvest, crepe, pm (fastest)
- 🎵 Pitch Control: Adjust pitch ±24 semitones to match the target model's range
- ⚙️ Fine-Tuning: Index rate, RMS mix rate, and consonant protection sliders
- 🖥️ GPU & CPU Support: Automatically uses CUDA if available, falls back to CPU
- ⬇️ Download: Listen in-browser and download the final WAV file
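
The GPU/CPU fallback listed above can be sketched in a few lines. This is a minimal illustration, not the code in app.py: the helper name `pick_device` and the device strings it returns are assumptions made for the example.

```python
def pick_device() -> str:
    """Return "cuda:0" when a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    try:
        # Imported lazily so the check also works where PyTorch is missing.
        import torch
    except ImportError:
        return "cpu"
    # torch.cuda.is_available() is False on CPU-only builds and on GPU-less machines.
    return "cuda:0" if torch.cuda.is_available() else "cpu"


print(f"Inference device: {pick_device()}")
```

Running this after installing PyTorch is also a quick way to confirm that the intended build (CPU or CUDA) was picked up.
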
git clone https://github.com/ErenBalkis/rvc-tts-studio.git
cd rvc-tts-studio
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
⚠️ You must install the correct PyTorch build before installing the other dependencies.
| Hardware | Command |
|---|---|
| CPU only | pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu |
| CUDA 11.8 | pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 |
| CUDA 12.1 | pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 |
pip install -r requirements.txt

Create a subfolder for each voice inside the models/ directory. Each subfolder should contain a .pth file and, optionally, a .index file:
models/
├── my_voice/
│ ├── my_voice.pth
│ └── my_voice.index ← optional, improves quality
├── another_voice/
│ └── another_voice.pth
└── .gitkeep
💡 Where to find RVC models? Search for pre-trained RVC models on Hugging Face or weights.gg.
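
To illustrate how a layout like the one above could be scanned, here is a hedged sketch of model discovery. The names `VoiceModel` and `discover_models` are hypothetical; the actual logic in rvc_module.py may differ.

```python
from pathlib import Path
from typing import Dict, NamedTuple, Optional


class VoiceModel(NamedTuple):
    pth: Path               # required model weights
    index: Optional[Path]   # optional feature index, None when absent


def discover_models(models_dir: Path) -> Dict[str, VoiceModel]:
    """Map each subfolder name to its .pth file and optional .index file."""
    found: Dict[str, VoiceModel] = {}
    if not models_dir.is_dir():
        return found
    for sub in sorted(p for p in models_dir.iterdir() if p.is_dir()):
        pth_files = sorted(sub.glob("*.pth"))
        if not pth_files:
            continue  # a folder without a .pth file is not a usable model
        index_files = sorted(sub.glob("*.index"))
        found[sub.name] = VoiceModel(
            pth=pth_files[0],
            index=index_files[0] if index_files else None,
        )
    return found
```

Files placed directly in models/ (such as .gitkeep) are ignored because only subfolders are inspected.
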
streamlit run app.py

The app will open at http://localhost:8501 🎉
rvc-tts-studio/
├── models/ # Your RVC model subfolders (.pth & .index)
│ └── .gitkeep # Keeps the empty folder in git
├── temp/ # Temporary audio files (auto-managed)
│ └── .gitkeep
├── app.py # Streamlit UI & application logic
├── rvc_module.py # RVC inference & model discovery module
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
└── README.md # This file
- Select a model from the sidebar dropdown (click 🔄 Refresh Models after adding new files).
- Type or paste your text in the main area.
- Choose a base voice — the edge-tts speaker used before RVC conversion.
- Adjust the pitch slider to match your target model's vocal range (positive = higher, negative = lower).
- Fine-tune quality settings in the sidebar (F0 method, index rate, RMS mix, protect).
- Click 🚀 Generate Voice and wait for the result.
- Listen in-browser or ⬇️ Download the final WAV.
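
The base-voice step in the workflow above can be approximated directly with the edge-tts package. The sketch below is an assumption about how that step might look, not the code in app.py; the function name `synthesize_base_speech`, the example voice, and the output path are illustrative only.

```python
import asyncio
from pathlib import Path


async def synthesize_base_speech(text: str, voice: str, out_path: Path) -> Path:
    """Render `text` with an edge-tts neural voice and save the audio to `out_path`."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # Imported lazily so this module loads even where edge-tts is not installed.
    import edge_tts

    out_path.parent.mkdir(parents=True, exist_ok=True)
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(str(out_path))  # edge-tts streams MP3 audio to the file
    return out_path


# Example call (needs network access, so it is left commented out):
# asyncio.run(synthesize_base_speech("Hello!", "en-US-AriaNeural", Path("temp/base.mp3")))
```

The resulting file is the base speech that RVC then converts into the target voice.
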
| Setting | Description | Recommended |
|---|---|---|
| F0 Method | Pitch extraction algorithm | rmvpe for best quality |
| Index Rate | How much target voice character to apply (0.0–1.0) | 0.75 |
| RMS Mix Rate | Volume envelope mixing (0.0–1.0) | 0.25 |
| Protect | Consonant protection against artifacts (0.0–0.5) | 0.33 |
| Pitch Shift | Semitone adjustment (-24 to +24) | 0 (adjust per model) |
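
As a rough illustration of the ranges in the table above, the sketch below clamps each setting to its documented bounds and converts a semitone shift into a frequency ratio (each semitone scales frequency by 2^(1/12)). The class and field names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass

F0_METHODS = ("rmvpe", "harvest", "crepe", "pm")


def _clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class ConversionSettings:
    f0_method: str = "rmvpe"
    index_rate: float = 0.75
    rms_mix_rate: float = 0.25
    protect: float = 0.33
    pitch_shift: int = 0  # semitones

    def normalized(self) -> "ConversionSettings":
        """Return a copy with every value forced into its documented range."""
        if self.f0_method not in F0_METHODS:
            raise ValueError(f"unknown F0 method: {self.f0_method!r}")
        return ConversionSettings(
            f0_method=self.f0_method,
            index_rate=_clamp(self.index_rate, 0.0, 1.0),
            rms_mix_rate=_clamp(self.rms_mix_rate, 0.0, 1.0),
            protect=_clamp(self.protect, 0.0, 0.5),
            pitch_shift=_clamp(self.pitch_shift, -24, 24),
        )

    def pitch_ratio(self) -> float:
        """Frequency multiplier implied by the semitone shift."""
        return 2.0 ** (self.pitch_shift / 12.0)
```

For example, a shift of +12 semitones doubles the fundamental frequency (one octave up), while -12 halves it.
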
| Problem | Solution |
|---|---|
| numpy version conflict | Make sure numpy < 2.0 is installed: pip install "numpy<2.0" |
| torch not found | Install PyTorch manually using the hardware/command table in the installation instructions above |
| No models in dropdown | Place .pth files inside subfolders in models/ and click Refresh Models |
| CUDA out of memory | Use a smaller model or switch to CPU (torch CPU build) |
| rvc-python import error | Run pip install "rvc-python>=0.1.5" (the quotes stop the shell from treating > as a redirect) |
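
The numpy constraint can also be checked from Python before launching the app. This is a small sketch using only the standard library; the helper names are made up for this example.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Extract the major component from a version string such as "1.26.4"."""
    return int(version.split(".")[0])


def numpy_is_compatible() -> Optional[bool]:
    """True if the installed numpy is older than 2.0, None if numpy is missing."""
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None
    return major_version(installed) < 2


print("numpy < 2.0:", numpy_is_compatible())
```

A result of False means the numpy pin from the table should be applied; None means numpy is not installed in the active environment.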
- Streamlit — Interactive web UI
- edge-tts — Microsoft Edge's online TTS engine
- rvc-python — RVC inference wrapper
- PyTorch — Deep learning backend
- librosa — Audio processing
- NumPy / SciPy — Numerical computing