by asoqwer
Welcome to AutoPitch Trainer v2.0! This utility trains machine learning models to automatically generate pitch bends for USTX files used in OpenUtau. Learn from your manually tuned files and save time applying similar pitches to new projects.
This README provides a quick overview and setup guide. For detailed step-by-step instructions, explanations, and recommendations, please refer to the full tutorial:
➡️ AutoPitch Trainer v2.0 Full Tutorial (Google Docs) ⬅️
- Train a pitch generation model from scratch using your USTX files.
- Fine-tune a pre-trained model on a new dataset.
- Resume interrupted training sessions.
- Process new USTX files to apply pitch bends automatically.
- Convert trained models to ONNX format for use in UtauV.
-
Clone the Repository:
git clone https://github.com/emeraldsingers/AutoPitchTrainer cd autopitch_trainer -
Create a Virtual Environment (Recommended):
python -m venv venv # Windows: venv\Scripts\activate # macOS/Linux: source venv/bin/activate
-
Install PyTorch with CUDA (Highly Recommended): For significantly faster training on compatible NVIDIA GPUs, install PyTorch with CUDA. Visit the official PyTorch website to get the command matching your system. For CUDA 11.8:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
For CPU-only installation (much slower):
pip3 install torch torchvision torchaudio
-
Install Other Dependencies:
pip install -r requirements.txt
- Launch the Trainer:
python main.py
- Select a Tab: The GUI has several tabs for different tasks:
- Train: Build a new model from scratch using a directory of tuned USTX files.
- Fine-Tune: Adapt an existing model (
.pth+phoneme_vocab.yaml) using a new set of USTX files. - Resume Training: Continue a previous training session using the saved checkpoint (
last_autopitch_model.pth) and original data. - Process: Apply pitch bends from a trained model (
.pth+phoneme_vocab.yaml) to one or more USTX files.- Note: The "Best Loss" model option may not function as expected, as the trainer might not save this specific checkpoint correctly. Using "Last" is generally recommended.
- Convert to ONNX: Export a trained model (
last_autopitch_model.pth+phoneme_vocab.yaml) to the.onnxformat needed by UtauV.
- Follow On-Screen Instructions: Each tab has fields for specifying input/output directories, model files, epochs, etc. Use the "Browse" buttons to select paths.
- Need More Details? For specific steps on preparing data, recommended settings, and understanding the process for each tab, consult the Full Tutorial.
- After converting your model to ONNX (
autopitch.onnxandphoneme_vocab.yaml), you need to place these files in the correct UtauV dependency folder (Dependencies/autopitch). - You might also need to configure the
noiseScaleinDependencies/autopitch/config.jsonwithin UtauV to control pitch variation. - See the Full Tutorial for detailed UtauV installation and usage instructions.
- "Error loading AutoPitch" (in UtauV): Ensure
autopitch.onnxandphoneme_vocab.yamlare correctly placed inDependencies/autopitch. - No Pitch Bends Applied (in UtauV): Make sure notes are selected before applying AutoPitch, or verify the quality of your trained model and training data.
- For other issues: Check the Troubleshooting section in the Full Tutorial.
For questions or support, contact asoqwer via:
- Email:
emeraldproject13@gmail.com - Discord:
asoqwer - Telegram: Emerald Project Support Bot
- Emerald Project: emeraldsingers.github.io
- UtauV Releases: Github Releases
- PreTrained AutoPitch models: Mega.NZ
- Emerald Project Telegram: Telegram Channel
- asoqwer on YT: asoqwer - YouTube
This project is licensed under the GNU GPLv3 License - see the LICENSE file for details.