# RAVE

**RAVE** is a variational autoencoder for fast and high-quality neural audio synthesis by Antoine Caillon and Philippe Esling. [Article on arXiv](https://arxiv.org/abs/2111.05011) & [Source code on GitHub](https://github.com/acids-ircam/RAVE)

*Note that this notebook currently supports RAVE up to version 2.2.2, which seems to be the most reliable.*

----

Notebook author: [Martin Heinze](https://github.com/devstermarts)

Last updated: 10.06.2025

## Install Miniconda, RAVE, dependencies

In [None]:
#Install Miniconda
!mkdir -p /kaggle/temp
%cd /kaggle/temp
!curl -L https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-Linux-x86_64.sh -o miniconda.sh
!chmod +x miniconda.sh
!sh miniconda.sh -b -p /kaggle/temp/miniconda

#Upgrade ipython ipykernel, install ffmpeg
!/kaggle/temp/miniconda/bin/pip install --upgrade ipython ipykernel
!/kaggle/temp/miniconda/bin/conda install ffmpeg --yes

In [None]:
#Install RAVE
#Pin to a specific release tag if necessary using 'acids-rave==tag.version'
#Note that this notebook currently supports training RAVE up to version 2.2.2.
!/kaggle/temp/miniconda/bin/pip install acids-rave==2.2.2

In [None]:
#Force-reinstall a torch version that is compatible with RAVE.
!/kaggle/temp/miniconda/bin/pip install --force-reinstall torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118

## Preprocess the dataset
Preprocessing is only necessary once per training dataset. In the section below, the preprocessed data is stored in '/kaggle/working/processed'. Set the input path to the location of your dataset, which is usually '/kaggle/input/your_audio_dataset'.
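
Before preprocessing, it can help to confirm that the input path actually contains audio files. The following is a minimal sketch, not part of RAVE: `count_audio_files` and `AUDIO_EXTENSIONS` are hypothetical names, and the extension list is an assumption for illustration, not the set of formats that `rave preprocess` officially accepts.

```python
from pathlib import Path

# Assumption: illustrative extension list, not RAVE's official list of formats.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".aif", ".aiff"}


def count_audio_files(root, extensions=AUDIO_EXTENSIONS):
    """Recursively count files under `root`, grouped by lowercase audio extension."""
    counts = {}
    for path in Path(root).rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in extensions:
            counts[suffix] = counts.get(suffix, 0) + 1
    return counts
```

Calling `count_audio_files("/kaggle/input/your_audio_dataset")` in a regular Python cell returns a dictionary of counts per extension. An empty dictionary suggests that the input path is wrong or that the dataset has not been attached to the notebook.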

In [None]:
#Preprocess dataset
!mkdir -p /kaggle/working/processed
!/kaggle/temp/miniconda/bin/rave preprocess \
--input_path /kaggle/input/your_audio_dataset/ \
--output_path /kaggle/working/processed/

## Start initial training
During training, the output is stored inside '/kaggle/working/runs' in a folder named 'your_training_name' followed by an underscore and a 10-character string. '/kaggle/working/' also contains a 'status' folder with 'data.mdb' and 'lock.mdb' files.
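
Because the run folder name ends in a generated string, it has to be looked up before it can be used for resuming or exporting. The sketch below follows the naming scheme described above (training name, underscore, 10 characters). `find_run_dirs` is a hypothetical helper, and it assumes the suffix may contain any characters.

```python
import re
from pathlib import Path


def find_run_dirs(runs_root, name, suffix_length=10):
    """Return folders in `runs_root` named '<name>_<suffix>', sorted by name."""
    # Assumption: the generated suffix may contain any characters.
    pattern = re.compile(rf"^{re.escape(name)}_.{{{suffix_length}}}$")
    root = Path(runs_root)
    if not root.is_dir():
        return []
    matches = [p for p in root.iterdir() if p.is_dir() and pattern.match(p.name)]
    return sorted(matches, key=lambda p: p.name)
```

For example, `find_run_dirs("/kaggle/working/runs", "your_training_name")` returns a list of matching `Path` objects, which is empty when no training has been started yet.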

***The section below is enabled for initial training. Disable it when you want to resume your training.***

In [None]:
#Initiate training
#Adjust the config parameters to suit your training.
#The setup below is only an example.
#For configuration options, check https://github.com/acids-ircam/RAVE#training
#Other training parameters can be customized using --override, check train.py
%cd /kaggle/working
!/kaggle/temp/miniconda/bin/rave train \
--config v2.gin \
--db_path /kaggle/working/processed/ \
--name your_training_name \
--override CAPACITY=128 \
--override PHASE_1_DURATION=800000

## Resume training
To conveniently resume training RAVE on Kaggle, you can turn the output data of the first run into a new dataset from the output tab of your notebook. Then add the new dataset to the notebook, disable the **Preprocess the dataset** and **Start initial training** sections above, and adjust the section below to match your initial training configuration.

To continue training after later runs, update the dataset you created from the output by creating a new version from the output of the latest run. Make sure to check for updates on the dataset in your notebook before resuming the training.

***When you want to resume your training, disable the initial training section above before running the notebook.***
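
The `--ckpt` argument in the cell below points at a `version_X/checkpoints/` folder inside the run folder. When a run has been resumed several times, there may be more than one `version_X` folder. The sketch below picks the highest-numbered one. `latest_checkpoint_dir` is a hypothetical helper, and it assumes that `X` is an integer.

```python
import re
from pathlib import Path


def latest_checkpoint_dir(run_dir):
    """Return the 'checkpoints' folder of the highest-numbered 'version_N', or None."""
    best_number, best_dir = -1, None
    for child in Path(run_dir).iterdir():
        # Assumption: version folders are named 'version_<integer>'.
        match = re.fullmatch(r"version_(\d+)", child.name)
        ckpt_dir = child / "checkpoints"
        if match and ckpt_dir.is_dir() and int(match.group(1)) > best_number:
            best_number, best_dir = int(match.group(1)), ckpt_dir
    return best_dir
```

The version numbers are compared as integers, so `version_10` ranks above `version_2`. Version folders without a `checkpoints` subfolder are skipped.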

In [None]:
#Copy contents of earlier training to /kaggle/working folder.
!cp -r /kaggle/input/root_folder_of_your_earlier_training/* /kaggle/working
%cd /kaggle/working/

#Resume training
#Use the same config parameters as in your earlier training.
!/kaggle/temp/miniconda/bin/rave train \
--config v2.gin \
--db_path /kaggle/working/processed/ \
--name your_training_name \
--override CAPACITY=128 \
--override PHASE_1_DURATION=800000 \
--ckpt /kaggle/working/runs/your_training_name_with_a_random_string_in_the_end/version_X/checkpoints/

## Export model
After your training is finished, you can export the model as a TorchScript (.ts) file.

***Disable the initial and resume training sections above when you only want to export your model.*** 
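
The two export flags used in the cell below are optional and independent of each other. As a rough sketch, the command can also be assembled in Python so that each flag is toggled explicitly. `build_export_command` is a hypothetical helper; it only uses the `rave export` options that already appear in this notebook.

```python
def build_export_command(run_dir, streaming=True, stereo=False,
                         rave_bin="/kaggle/temp/miniconda/bin/rave"):
    """Build the argument list for 'rave export' with optional flags."""
    command = [rave_bin, "export", "--run", str(run_dir)]
    if streaming:
        command.append("--streaming")  # export for real-time processing
    if stereo:
        command.append("--stereo")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from a Python cell in place of the `!` shell line.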

In [None]:
#Model export. 
#Use '--streaming' flag to export a model capable of real time processing.
#Use '--stereo' for a model with two decoders.

#Copy contents of earlier training to /kaggle/working folder.
!cp -r /kaggle/input/root_folder_of_your_earlier_training/* /kaggle/working

#Export model
!/kaggle/temp/miniconda/bin/rave export \
--run /kaggle/working/runs/your_training_name_with_a_random_string_in_the_end \
--streaming \
--stereo