# 00. Download Audio Datasets, Processed Data, and Trained Models

Because GitHub blocks files larger than 100 MiB, the original audio files, pre-processed datasets, trained models, and checkpoints have all been removed from the repository, which leaves some folders empty. The links and code needed to download all of these files are provided below.

The code in the following blocks downloads the files needed for each step using the [gdown](https://pypi.org/project/gdown/) Python package.

Download only the data you need, depending on where you want to start:
1. Start from Scratch
   - if you want to **go through everything from the start**, download `1. Original Datasets` and follow the code and instructions from `01-dataset-preprocessing.ipynb`
   - alternatively, you can also use your own dataset
2. RAVE - Model Training
   - if you want to **start from model training**, download `2. Preprocessed Datasets` and follow the code and instructions from `02-model-training.ipynb` (skip 01)
3. Audio Generation with Trained Models
   - if you want to jump straight to **audio generation** using the trained models, download `3. Trained Models` and follow the code and instructions from `03-audio-generation.ipynb` (skip 01 and 02)
4. Audio Visualisation with **Dorothy** (Example)
   - if you want to explore **audio reactive drawing using Python**, download `3. Trained Models`, select the desired models and follow the instructions in `04-audio-visualisation.ipynb`.
   - Example visualisation code is included in `./rave-visualisation-example-dorothy/rave-techno-visualisation-example.py`, which is based on a rave drawing template from [Dorothy](https://github.com/Louismac/dorothy/), a Creative Computing Python Library for Interactive Audio Generation and Audio Reactive Drawing, Copyright (c) 2023 Louismac. Part of the drawing code is derived from the 23/24 DSAI STEM course submitted assignment (part 3) and can be modified to create different drawings.
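
Since the archives are several hundred megabytes each, it can help to avoid re-downloading a file that is already on disk. The following is a minimal sketch of such a guard, not part of the original notebook: `download_if_missing` and its `downloader` argument are hypothetical names, and the default path simply forwards to the same `gdown.download(url=..., output=..., fuzzy=True)` call used in the cells of this notebook.

```python
from pathlib import Path


def download_if_missing(url, output, downloader=None):
    """Download `url` to `output` unless a non-empty file already exists.

    `downloader` is a hypothetical hook (mainly for testing). When it is
    omitted, gdown.download is used, as in the notebook cells.
    """
    target = Path(output)
    if target.exists() and target.stat().st_size > 0:
        return str(target)  # already downloaded, nothing to do

    if downloader is None:
        import gdown  # imported lazily so the helper can be defined without gdown

        def downloader(url, output):
            return gdown.download(url=url, output=output, fuzzy=True)

    result = downloader(url, str(target))
    if result is None:  # treat a missing return value as a failed download
        raise RuntimeError(f"Download failed for {url}")
    return result
```

For example, `download_if_missing(piano_url, "all_piano.zip")` would skip the download on a second run as long as `all_piano.zip` has not been deleted yet.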

In [1]:
# You may need to install or upgrade gdown if you are using it for the first time
# gdown 5.1.0, https://pypi.org/project/gdown/, directly download Google Drive files
# this code was derived from the lecture notebook - week 5 GANs
!pip install --upgrade --no-cache-dir gdown



In [2]:
import gdown

### 1. Download Original Datasets

All training audio files are copyright-free recordings obtained online.

The Piano Dataset consists of classic piano music and solo piano pieces (mp3) obtained from musopen.org, chosic.com, soundcloud.com, and pixabay.com.

The Techno Dataset consists of a collection of dark techno/cyberpunk/industrial-type beat/Dark Synth music (mp3) from https://www.youtube.com/@AimToHeadOfficial (Instrumentals/Beats by Aim To Head).

The code in this block will download the training audio files.

Alternatively, download the audio files directly from Google Drive, unzip them, and place them in the target path:
- piano audio files - `./sounds/all_piano`: https://drive.google.com/uc?id=17pevxD5lMmmC-Ld0gtUyZDnWW5BrgvaE
- techno audio files - `./sounds/all_techno`: https://drive.google.com/uc?id=1MCTfucrSucX6uP4z_9hm6nWqDRrNRQjI
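
The links above use the `uc?id=<file ID>` form, while the download cells pass Google Drive sharing URLs (`/file/d/<file ID>/view`) to gdown with `fuzzy=True`; both forms carry the same file ID. The sketch below shows one way to convert between them. It is a simplified illustration using hypothetical helper names (`drive_file_id`, `direct_url`), not gdown's own implementation.

```python
import re

# Two URL shapes that appear in this notebook.
_FILE_ID_PATTERNS = (
    re.compile(r"/file/d/([\w-]+)"),  # .../file/d/<id>/view?usp=sharing
    re.compile(r"[?&]id=([\w-]+)"),   # .../uc?id=<id> or .../download?id=<id>&...
)


def drive_file_id(url):
    """Return the Google Drive file ID embedded in `url`, or None."""
    for pattern in _FILE_ID_PATTERNS:
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


def direct_url(url):
    """Build the `uc?id=` form of a Drive link from any supported form."""
    file_id = drive_file_id(url)
    if file_id is None:
        raise ValueError(f"No Drive file ID found in {url!r}")
    return f"https://drive.google.com/uc?id={file_id}"
```

Calling `direct_url` on the piano sharing URL yields the `uc?id=17pevxD5lMmmC-Ld0gtUyZDnWW5BrgvaE` link listed above.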

In [10]:
# download the piano audio files, using the sharing URL from Google Drive
# the file needs to be public in order for gdown to download
piano_url = "https://drive.google.com/file/d/17pevxD5lMmmC-Ld0gtUyZDnWW5BrgvaE/view?usp=sharing"
output = "all_piano.zip"
gdown.download(url=piano_url, output=output, fuzzy=True)

Downloading...
From (original): https://drive.google.com/uc?id=17pevxD5lMmmC-Ld0gtUyZDnWW5BrgvaE
From (redirected): https://drive.usercontent.google.com/download?id=17pevxD5lMmmC-Ld0gtUyZDnWW5BrgvaE&confirm=t&uuid=d963d4e5-20e8-4655-86fc-15cddc4d8531
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/all_piano.zip
100%|██████████| 544M/544M [00:17<00:00, 30.9MB/s] 


'all_piano.zip'

In [13]:
# download the techno audio files
techno_url = "https://drive.google.com/file/d/1MCTfucrSucX6uP4z_9hm6nWqDRrNRQjI/view?usp=sharing"
output = "all_techno.zip"
gdown.download(url=techno_url, output=output, fuzzy=True)

Downloading...
From (original): https://drive.google.com/uc?id=1MCTfucrSucX6uP4z_9hm6nWqDRrNRQjI
From (redirected): https://drive.usercontent.google.com/download?id=1MCTfucrSucX6uP4z_9hm6nWqDRrNRQjI&confirm=t&uuid=96dfe677-d3f4-45fd-a187-0cd1c3ac17e4
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/all_techno.zip
100%|██████████| 944M/944M [00:30<00:00, 30.7MB/s] 


'all_techno.zip'

In [12]:
# unzip and move dataset - piano (mac and linux)
!unzip all_piano.zip
!mv all_piano ./sounds
!rm all_piano.zip
!rm -rf __MACOSX

Archive:  all_piano.zip
   creating: all_piano/
  inflating: all_piano/Bagatelle No. 25 in A Minor (Ludwig van Beethoven).mp3  
  inflating: __MACOSX/all_piano/._Bagatelle No. 25 in A Minor (Ludwig van Beethoven).mp3  
  inflating: all_piano/Brendan_Kinsella_-_Mozart_-_Sonata_No_13_In_B_Flat_Major_K333_-_II_Andante_Cantabile(chosic.com).mp3  
  inflating: __MACOSX/all_piano/._Brendan_Kinsella_-_Mozart_-_Sonata_No_13_In_B_Flat_Major_K333_-_II_Andante_Cantabile(chosic.com).mp3  
  inflating: all_piano/Piano Sonata No. 10 in G, Op. 14, II. Andante.mp3  
  inflating: __MACOSX/all_piano/._Piano Sonata No. 10 in G, Op. 14, II. Andante.mp3  
  inflating: all_piano/Kimiko_Ishizaka_-_03_-_Variatio_2_a_1_Clav(chosic.com).mp3  
  inflating: __MACOSX/all_piano/._Kimiko_Ishizaka_-_03_-_Variatio_2_a_1_Clav(chosic.com).mp3  
  inflating: all_piano/Piano Sonata no. 16 'Facile', K. 545.mp3  
  inflating: __MACOSX/all_piano/._Piano Sonata no. 16 'Facile', K. 545.mp3  
  inflating: all_piano/Karine_Gilan

In [14]:
# unzip and move dataset - techno (mac and linux)
!unzip all_techno.zip
!mv all_techno ./sounds
!rm all_techno.zip
!rm -rf __MACOSX

Archive:  all_techno.zip
   creating: all_techno/
  inflating: all_techno/Brace.mp3    
  inflating: __MACOSX/all_techno/._Brace.mp3  
  inflating: all_techno/Bones.mp3    
  inflating: __MACOSX/all_techno/._Bones.mp3  
  inflating: all_techno/HEINOUS.mp3  
  inflating: __MACOSX/all_techno/._HEINOUS.mp3  
  inflating: all_techno/INSTINCTS.mp3  
  inflating: __MACOSX/all_techno/._INSTINCTS.mp3  
  inflating: all_techno/Witchtripper.mp3  
  inflating: __MACOSX/all_techno/._Witchtripper.mp3  
  inflating: all_techno/Bonds.mp3    
  inflating: __MACOSX/all_techno/._Bonds.mp3  
  inflating: all_techno/Worthless.mp3  
  inflating: __MACOSX/all_techno/._Worthless.mp3  
  inflating: all_techno/Violet.mp3   
  inflating: __MACOSX/all_techno/._Violet.mp3  
  inflating: all_techno/Wave.mp3     
  inflating: __MACOSX/all_techno/._Wave.mp3  
  inflating: all_techno/Remebrance.mp3  
  inflating: __MACOSX/all_techno/._Remebrance.mp3  
  inflating: all_techno/Module.mp3   
  inflating: __MACOSX/all_te

In [None]:
# # unzip and move dataset - piano (Windows)
# !tar -xf all_piano.zip
# !robocopy all_piano .\sounds\all_piano /E
# !del all_piano.zip
# !rmdir /s /q all_piano
# !rmdir /s /q __MACOSX
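
As an alternative to separate shell commands per operating system, the same unzip step can be sketched in pure Python with the standard library. This is an illustrative sketch rather than the notebook's original method; `extract_dataset` is a hypothetical helper name. It skips the `__MACOSX/` metadata entries visible in the unzip output above, so there is nothing to clean up afterwards.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract `zip_path` into `dest_dir`, skipping macOS metadata entries.

    Returns the sorted list of extracted file names (directories excluded).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        members = [
            name for name in archive.namelist()
            if not name.startswith("__MACOSX/")
        ]
        archive.extractall(dest, members=members)
    return sorted(name for name in members if not name.endswith("/"))
```

For example, `extract_dataset("all_piano.zip", "sounds")` would place the audio under `sounds/all_piano/`, after which the archive can be removed with `Path("all_piano.zip").unlink()`.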

### 2. Download Preprocessed Datasets (Audio Files)

The code in this block will download the preprocessed (merged) audio files, which are the outcomes of `01-dataset-preprocessing.ipynb` and can be used directly for RAVE model training.

Alternatively, download the datasets directly from Google Drive and place them in the target path:
- piano dataset (6:00:35 mp3 file) - `./sounds/preprocessed_audio`: https://drive.usercontent.google.com/download?id=1oxWRdJ5xMOwgiti2n_GvoP0WVA2dPhL0&authuser=0
- techno dataset (6:40:32 mp3 file) - `./sounds/preprocessed_audio`: https://drive.usercontent.google.com/download?id=1ggWXmRKGGl3IhKWYa4zCYr1JOwqeYCSh&authuser=0
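
Note that `mv file ./sounds/preprocessed_audio` only places the file inside that folder if the folder already exists; otherwise the file is renamed to `preprocessed_audio`. In case the folder is missing in your checkout, the move can be done in Python instead. The following is a small sketch with a hypothetical helper name (`move_into`) that creates the target directory first and also works on Windows.

```python
import shutil
from pathlib import Path


def move_into(file_path, target_dir):
    """Move `file_path` into `target_dir`, creating the directory if needed."""
    source = Path(file_path)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    destination = target / source.name  # keep the original file name
    shutil.move(str(source), str(destination))
    return destination
```

For example, `move_into("piano-dataset.mp3", "./sounds/preprocessed_audio")` would leave the file at `sounds/preprocessed_audio/piano-dataset.mp3`.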

In [16]:
# download the piano audio file and move to specified path
piano_url = "https://drive.google.com/file/d/1oxWRdJ5xMOwgiti2n_GvoP0WVA2dPhL0/view?usp=sharing"
output = "piano-dataset.mp3"
gdown.download(url=piano_url, output=output, fuzzy=True)
!mv piano-dataset.mp3 ./sounds/preprocessed_audio

Downloading...
From (original): https://drive.google.com/uc?id=1oxWRdJ5xMOwgiti2n_GvoP0WVA2dPhL0
From (redirected): https://drive.usercontent.google.com/download?id=1oxWRdJ5xMOwgiti2n_GvoP0WVA2dPhL0&confirm=t&uuid=4deb6f84-ade3-45cb-b86e-62b057c4b685
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/piano-dataset.mp3
100%|██████████| 173M/173M [00:05<00:00, 29.2MB/s] 


In [17]:
# download the techno audio file and move to specified path
techno_url = "https://drive.google.com/file/d/1ggWXmRKGGl3IhKWYa4zCYr1JOwqeYCSh/view?usp=sharing"
output = "techno-dataset.mp3"
gdown.download(url=techno_url, output=output, fuzzy=True)
!mv techno-dataset.mp3 ./sounds/preprocessed_audio

Downloading...
From (original): https://drive.google.com/uc?id=1ggWXmRKGGl3IhKWYa4zCYr1JOwqeYCSh
From (redirected): https://drive.usercontent.google.com/download?id=1ggWXmRKGGl3IhKWYa4zCYr1JOwqeYCSh&confirm=t&uuid=ca548a87-0cee-416e-acf1-8fee3296f6fa
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/techno-dataset.mp3
100%|██████████| 211M/211M [00:07<00:00, 27.2MB/s] 


### 3. Download Trained Models (.ts)

The code in this block will download the trained models (.ts), which are the outcomes of `02-model-training.ipynb` and can be used directly for audio generation and visualisation, or live with Max/MSP or PureData.

Each model comes with several versions (different checkpoints) which can be used for comparison.

Alternatively, download the models directly from Google Drive, unzip them, and place them in the target path:
- piano models (epoch109/1011/2011/3013/4069) - `./models`: https://drive.usercontent.google.com/download?id=1Dgf3u0t5maErozNkQgbLfKFK-8SaoqWL&authuser=0
  - 109/1011/2011 are in training stage 1
  - 3013/4069 are in training stage 2; 4069 is at around 2 million steps
- techno models (epoch119/1019/2019/3019/4059) - `./models`: https://drive.google.com/uc?id=1YDA3rG_yxjWDpZ2HghXwsVWymuHKj14-
  - 119/1019/2019 are in training stage 1
  - 3019/4059 are in training stage 2; 4059 is at around 2 million steps
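
To compare checkpoints, it can be convenient to list the downloaded models per dataset in epoch order. The sketch below assumes the `rav_<dataset>_<epoch>.ts` naming seen in the unzip output of this notebook; `group_models` is a hypothetical helper name, and files that do not match the pattern are ignored.

```python
import re
from collections import defaultdict

# Naming convention assumed from the archive contents, e.g. rav_piano_4069.ts
_MODEL_NAME = re.compile(r"^rav_(?P<dataset>[a-z]+)_(?P<epoch>\d+)\.ts$")


def group_models(filenames):
    """Group model file names by dataset, each list sorted by epoch number."""
    grouped = defaultdict(list)
    for name in filenames:
        match = _MODEL_NAME.match(name)
        if match:  # skip anything that is not an exported model
            grouped[match["dataset"]].append((int(match["epoch"]), name))
    return {
        dataset: [name for _, name in sorted(items)]
        for dataset, items in grouped.items()
    }
```

With the models in place, `group_models(p.name for p in Path("models").glob("*.ts"))` (after `from pathlib import Path`) would return one epoch-ordered list per dataset.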

In [3]:
# download the piano models zip file
piano_models = "https://drive.google.com/file/d/1Dgf3u0t5maErozNkQgbLfKFK-8SaoqWL/view?usp=sharing"
output = "rav_piano_models.zip"
gdown.download(url=piano_models, output=output, fuzzy=True)

Downloading...
From (original): https://drive.google.com/uc?id=1Dgf3u0t5maErozNkQgbLfKFK-8SaoqWL
From (redirected): https://drive.usercontent.google.com/download?id=1Dgf3u0t5maErozNkQgbLfKFK-8SaoqWL&confirm=t&uuid=bb1dd15c-b132-492d-9cfb-ee67eae7146b
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/rav_piano_models.zip
100%|██████████| 592M/592M [00:23<00:00, 25.4MB/s] 


'rav_piano_models.zip'

In [4]:
# unzip and move models to models/
!unzip rav_piano_models.zip
!mv rav_piano_models/* models/
!rm rav_piano_models.zip
!rm -rf __MACOSX
!rm -rf rav_piano_models

Archive:  rav_piano_models.zip
   creating: rav_piano_models/
  inflating: rav_piano_models/rav_piano_2001.ts  
  inflating: __MACOSX/rav_piano_models/._rav_piano_2001.ts  
  inflating: rav_piano_models/rav_piano_1011.ts  
  inflating: __MACOSX/rav_piano_models/._rav_piano_1011.ts  
  inflating: rav_piano_models/rav_piano_4069.ts  
  inflating: rav_piano_models/rav_piano_3013.ts  
  inflating: __MACOSX/rav_piano_models/._rav_piano_3013.ts  
  inflating: rav_piano_models/rav_piano_109.ts  
  inflating: __MACOSX/rav_piano_models/._rav_piano_109.ts  


In [5]:
# download the techno models zip file 
techno_models = "https://drive.google.com/file/d/1YDA3rG_yxjWDpZ2HghXwsVWymuHKj14-/view?usp=sharing"
output = "rav_techno_models.zip"
gdown.download(url=techno_models, output=output, fuzzy=True)

Downloading...
From (original): https://drive.google.com/uc?id=1YDA3rG_yxjWDpZ2HghXwsVWymuHKj14-
From (redirected): https://drive.usercontent.google.com/download?id=1YDA3rG_yxjWDpZ2HghXwsVWymuHKj14-&confirm=t&uuid=109c6f27-8491-414b-a964-51ab6486f302
To: /Users/irenex/Documents/GitHub/AI-4-Media-Rave-Audio-Generation/rav_techno_models.zip
100%|██████████| 593M/593M [00:19<00:00, 31.0MB/s] 


'rav_techno_models.zip'

In [6]:
# unzip and move models to models/
!unzip rav_techno_models.zip
!mv rav_techno_models/* models/
!rm rav_techno_models.zip
!rm -rf __MACOSX
!rm -rf rav_techno_models

Archive:  rav_techno_models.zip
   creating: rav_techno_models/
  inflating: rav_techno_models/rav_techno_3019.ts  
  inflating: __MACOSX/rav_techno_models/._rav_techno_3019.ts  
  inflating: rav_techno_models/rav_techno_1019.ts  
  inflating: rav_techno_models/rav_techno_2059.ts  
  inflating: rav_techno_models/rav_techno_4059.ts  
  inflating: __MACOSX/rav_techno_models/._rav_techno_4059.ts  
  inflating: rav_techno_models/rav_techno_119.ts  


### 4. Download Checkpoints for TensorBoard Metrics

Link - https://artslondon-my.sharepoint.com/:u:/g/personal/i_xu0320231_arts_ac_uk/EQs7GpVTkJVHqxhY9bEeN-sBvzvlNijxZ92FILbB0Y45IQ?e=sJ8QaN (10.5 GB) - accessible only with a UAL account

If you need to review all the TensorBoard metrics (scalars), download the checkpoints for the two models from the link above and run TensorBoard in Visual Studio Code (VS Code).
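
If you prefer to start TensorBoard outside VS Code, it can also be launched from Python. This is a minimal sketch that assumes the `tensorboard` package is installed in the current environment; the helper names and the log directory passed to them are hypothetical and should point to wherever you unpacked the checkpoints.

```python
import subprocess
import sys


def tensorboard_command(logdir, port=6006):
    """Build the command line that starts TensorBoard on `logdir`."""
    return [
        sys.executable, "-m", "tensorboard.main",
        "--logdir", str(logdir),
        "--port", str(port),
    ]


def launch_tensorboard(logdir, port=6006):
    """Start TensorBoard as a background process and return its handle."""
    return subprocess.Popen(tensorboard_command(logdir, port))
```

After calling `launch_tensorboard(...)` with your checkpoint folder, the dashboard should be reachable at `http://localhost:6006`; call `.terminate()` on the returned handle to stop it.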