# RAVE-Latent Diffusion

**RAVE-Latent Diffusion** is a denoising diffusion model designed to generate new RAVE latent codes with a large context window, faster than real time, while maintaining musical structural coherence. Code by [Moisés Horta Valenzuela](https://github.com/moiseshorta) / 𝔥𝔢𝔵𝔬𝔯𝔠𝔦𝔰𝔪𝔬𝔰

https://github.com/moiseshorta/RAVE-Latent-Diffusion


**RAVE** is a variational autoencoder for fast and high-quality neural audio synthesis by Antoine Caillon and Philippe Esling. [Article on arXiv](https://arxiv.org/abs/2111.05011) & [Source code on GitHub](https://github.com/acids-ircam/RAVE)

*Note that this notebook has been tested with RAVE models up to version 2.2.2. Using models created with RAVE version 2.3 or later might not lead to successful training.*

----

Notebook author: [Martin Heinze](https://github.com/devstermarts)

Last updated: 23.11.2024

## Install Miniconda, RAVE-Latent Diffusion, dependencies

In [None]:
## Install Miniconda
!mkdir -p /kaggle/temp/
%cd /kaggle/temp
!curl -L https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-Linux-x86_64.sh -o miniconda.sh
!chmod +x miniconda.sh
!sh miniconda.sh -b -p /kaggle/temp/miniconda

## Install RAVE-Latent Diffusion and dependencies
!git clone https://github.com/moiseshorta/RAVE-Latent-Diffusion.git
%cd /kaggle/temp/RAVE-Latent-Diffusion
!/kaggle/temp/miniconda/bin/pip install -r requirements.txt

## Preprocess RAVE model and audio dataset

***This section is only needed once, before the initial training. Disable it when resuming training or generating output.***

In [None]:
!mkdir -p /kaggle/working/latents/
!/kaggle/temp/miniconda/bin/python preprocess.py \
--rave_model "/kaggle/input/your_rave_model.ts" \
--audio_folder "/kaggle/input/your_dataset" \
--sample_rate 44100 \
--latent_length 4096 \
--latent_folder "/kaggle/working/latents"
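
To get a feel for how much audio a given `--latent_length` covers, the arithmetic can be sketched as below. This is a rough estimate only: the compression ratio of 2048 audio samples per latent frame is an assumption (it depends on the configuration your RAVE model was trained with), and `latent_seconds` is a hypothetical helper, not part of RAVE-Latent Diffusion.

```python
def latent_seconds(latent_length: int, sample_rate: int = 44100,
                   compression_ratio: int = 2048) -> float:
    """Approximate audio duration (in seconds) covered by one latent window.

    compression_ratio is an ASSUMPTION: the number of audio samples that one
    latent frame represents. Check the value for your own RAVE model.
    """
    return latent_length * compression_ratio / sample_rate


if __name__ == "__main__":
    for length in (2048, 4096, 8192):
        print(f"{length} latent frames ~ {latent_seconds(length):.1f} s")
```

Under that assumption, 4096 latent frames at 44100 Hz come to roughly 190 seconds of audio.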

## Start training

***This section is for the initial training. Disable it when resuming training or generating output.***

For all available parameters, see `train.py`.

In [None]:
!mkdir -p /kaggle/working/checkpoints/
!/kaggle/temp/miniconda/bin/python train.py \
--name your_training_name \
--latent_folder "/kaggle/working/latents" \
--save_out_path "/kaggle/working/checkpoints" \
--save_interval 100 # Default is 50. Latent diffusion checkpoints are large, so a higher value is used to avoid running out of storage.
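
Because checkpoints are large, it can help to check the remaining free space before and during training. The sketch below uses only the Python standard library; `free_gib` is a hypothetical helper name, and the path to check is whatever directory you save checkpoints to.

```python
import shutil


def free_gib(path: str = ".") -> float:
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    # On Kaggle, pass "/kaggle/working" to check the output directory.
    print(f"Free space: {free_gib('.'):.1f} GiB")
```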

## Resume training

***This section is for resuming training. Disable it for the initial training or when generating output. Make sure to use the same configuration as in your initial training.***

For all available parameters, see `train.py`.

In [None]:
# Copy the contents of the earlier training run to the /kaggle/working folder.
!cp -r /kaggle/input/root_folder_of_your_earlier_training/* /kaggle/working

!/kaggle/temp/miniconda/bin/python train.py \
--name your_training_name \
--latent_folder "/kaggle/working/latents" \
--save_out_path "/kaggle/working/checkpoints" \
--checkpoint_path "/kaggle/working/checkpoints" \
--save_interval 100 # Default is 50. Latent diffusion checkpoints are large, so a higher value is used to avoid running out of storage.
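
After copying an earlier run into `/kaggle/working`, it can be useful to confirm which checkpoint files are actually present. The sketch below picks the most recently modified `.pt` file in a folder. `newest_checkpoint` is a hypothetical helper, and selecting by modification time is an assumption made here because the checkpoint naming scheme is not described in this notebook.

```python
from pathlib import Path
from typing import Optional


def newest_checkpoint(folder: str, pattern: str = "*.pt") -> Optional[Path]:
    """Return the most recently modified file matching `pattern`, or None."""
    root = Path(folder)
    if not root.is_dir():
        return None
    candidates = [p for p in root.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Newest by modification time (an assumption, not the repo's own logic).
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(newest_checkpoint("/kaggle/working/checkpoints"))
```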

## Generate output wave files

***This section is for generating output. Disable it for the initial training or when resuming training.***

For all available parameters, see `generate.py`.

In [None]:
!mkdir -p /kaggle/working/output/

# The command below is an example that generates one output file. Change it to your liking, and copy and paste it with different parameters as you see fit.
!/kaggle/temp/miniconda/bin/python generate.py \
--model_path "/kaggle/input/root_folder_of_your_earlier_training/checkpoints/model_with_best_suffix.pt" \
--rave_model "/kaggle/input/your_rave_model.ts" \
--sample_rate 44100 \
--diffusion_steps 100 \
--seed 91827536 \
--temperature=1 \
--latent_length 8192 \
--length_mult 1 \
--lerp=True \
--seed_a=91827536 \
--seed_b=19283574 \
--output_path "/kaggle/working/output" \
--name your_filename
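
As an alternative to copying the command by hand, several outputs with different seeds could be generated in a loop. The sketch below only builds the command lines from flags already used above and prints them by default; `build_generate_commands` and `run_all` are hypothetical helpers, the file names of the form `seed_<n>` are an arbitrary choice, and the paths are the same placeholders as in the cell above.

```python
import subprocess
from typing import List

PYTHON = "/kaggle/temp/miniconda/bin/python"


def build_generate_commands(seeds: List[int], model_path: str, rave_model: str,
                            output_path: str = "/kaggle/working/output") -> List[List[str]]:
    """Build one generate.py command (as an argument list) per seed."""
    commands = []
    for seed in seeds:
        commands.append([
            PYTHON, "generate.py",
            "--model_path", model_path,
            "--rave_model", rave_model,
            "--seed", str(seed),
            "--output_path", output_path,
            "--name", f"seed_{seed}",  # hypothetical naming scheme
        ])
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Must be run from the RAVE-Latent-Diffusion folder.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmds = build_generate_commands(
        [91827536, 19283574],
        model_path="/kaggle/input/root_folder_of_your_earlier_training/checkpoints/model_with_best_suffix.pt",
        rave_model="/kaggle/input/your_rave_model.ts",
    )
    run_all(cmds, dry_run=True)
```

Keeping `dry_run=True` lets you inspect the commands first; set it to `False` to actually run them.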