# Introduction ✨



![](https://drive.google.com/uc?export=view&id=1TuesF83uT3BoShpMgIW5NN2itlNIwGX-)
![](https://drive.google.com/uc?export=view&id=11FoXiDS0XcQG5R9zUh6luvjfJxxQ3vYT)
![](https://drive.google.com/uc?export=view&id=145LaNxAZsxPzXoOuFOxJxak_1J90fs5l)

This notebook is a first draft for training conditioned models for the [AIDA-X](https://github.com/AidaDSP/aida-x) plugin. If run inside of Colab, it will automatically use a free Google Cloud GPU.

*** YOU WILL LIKELY NEED TO BUY UNITS TO BE ABLE TO COMPLETE THE TRAINING. ***

At the end, you'll have a custom-trained model that you can download and play directly in the AIDA-X plugin.\
[DEMO VIDEO]() 🔊🔊🔊

---
This notebook relies on the work of the [MOD Audio](https://mod.audio) and [AIDA DSP](https://aidadsp.github.io) teams.\
Some of the code and workflow presented here is inspired by the [NAM](https://github.com/sdatkinson/neural-amp-modeler) training [colab](https://colab.research.google.com/github/sdatkinson/neural-amp-modeler/blob/main/bin/train/easy_colab.ipynb?authuser=1#scrollTo=5CQleTk7GJV8) notebook.

---  


## **Instructions** ([step-by-step video](https://www.youtube.com/watch?v=htpK0QLzeKA))

The goal is to capture your amp or pedal together with the behaviour of two of its controls, for example the Gain knob and the Tone knob. To get there you will have to reamp the `input.wav` signal several times, once per knob setting, to produce the files listed below:

* target_gain0.0_tone1.0.wav
* target_gain0.1_tone1.0.wav
* target_gain0.2_tone1.0.wav
* target_gain0.3_tone1.0.wav
* target_gain0.4_tone1.0.wav
* target_gain0.5_tone0.0.wav
* target_gain0.5_tone0.1.wav
* target_gain0.5_tone0.2.wav
* target_gain0.5_tone0.3.wav
* target_gain0.5_tone0.4.wav
* target_gain0.5_tone0.5.wav
* target_gain0.5_tone0.6.wav
* target_gain0.5_tone0.7.wav
* target_gain0.5_tone0.8.wav
* target_gain0.5_tone0.9.wav
* target_gain0.5_tone1.0.wav
* target_gain0.6_tone1.0.wav
* target_gain0.7_tone1.0.wav
* target_gain0.8_tone1.0.wav
* target_gain0.9_tone1.0.wav
* target_gain1.0_tone1.0.wav

Correct results can be achieved with fewer files, but for now you will need to edit the code in the cells if you want to use a smaller set.
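The list above follows a simple pattern: a gain sweep with tone fixed at 1.0, plus a tone sweep with gain fixed at 0.5. As a minimal sketch (the helpers `expected_target_names` and `missing_targets` are hypothetical and not part of the trainer), the names can be generated and compared with the contents of a folder before uploading:

```python
import os

def expected_target_names():
    """Build the 21 names listed above: two sweeps that share gain 0.5 / tone 1.0."""
    steps = [f"{i / 10:.1f}" for i in range(11)]   # "0.0", "0.1", ..., "1.0"
    pairs = {(gain, "1.0") for gain in steps}       # gain sweep, tone fixed at 1.0
    pairs |= {("0.5", tone) for tone in steps}      # tone sweep, gain fixed at 0.5
    return sorted(f"target_gain{g}_tone{t}.wav" for g, t in pairs)

def missing_targets(folder):
    """Return the expected names that are not present in `folder`."""
    present = set(os.listdir(folder))
    return [name for name in expected_target_names() if name not in present]

names = expected_target_names()
print(len(names))   # 21
print(names[0])     # target_gain0.0_tone1.0.wav
```

Calling `missing_targets("/path/to/your/recordings")` on your own folder (the path is a placeholder) returns an empty list once every take has been recorded.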

Whenever you see `<- RUN CELL (►)`, you need to press the (►) next to it, to run the code that will fulfill that step.  

> The steps in this notebook are pretty straightforward:
0.   Deps 👾
1.   Set-up 👾
2.   Data 📑
3.   Model Training 🏋️‍♂️
4.   Model Evaluation 📈 (optional)
5.   Model Export ✅






# 0. Deps 👾

In [None]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')
# Check PyTorch and CUDA versions
import torch
import re

pytorch_version = torch.__version__
cuda_version = torch.version.cuda

required_pytorch_version = "2.3.1"
required_cuda_version = "12.1"

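def version_higher(version1, version2):
  def extract_numeric_version(version):
    # Drop local build tags such as "+cu121" and treat a missing version (None) as empty
    public = (version or "").split("+", 1)[0]
    return tuple(map(int, re.findall(r'\d+', public)))
  return extract_numeric_version(version1) > extract_numeric_version(version2)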

if version_higher(pytorch_version, required_pytorch_version) or version_higher(cuda_version, required_cuda_version):
  print(f"WARNING: Your environment has PyTorch {pytorch_version} and CUDA {cuda_version}. This environment is not supported.")
  print("Proceeding to install required dependencies...")
  !pip3 uninstall --disable-pip-version-check -y torch torchvision torchaudio
  !pip3 install --disable-pip-version-check --no-cache-dir \
    torch==2.3.1+cu121 \
    torchvision==0.18.1+cu121 \
    torchaudio==2.3.1+cu121 \
    -f https://download.pytorch.org/whl/torch_stable.html
  print("PyTorch and CUDA versions have been set to the required versions. Please restart the runtime.")

# 1. Set-up 👾

In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown This will check for GPU availability, prepare the code for you, and mount your drive.

import torch
import os
import numpy as np
import IPython
from time import sleep
import librosa

print("---")
if 'step' in locals():
  print("Ready! You can now move to step 2: DATA")
else:

  print("Checking GPU availability...", end=" ")
  if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU available! ")
  else:
    device = torch.device("cpu")
    print("GPU unavailable, using CPU instead.")
    print("RECOMMENDED: You can enable GPU through \"Runtime\" -> \"Change runtime type\" -> \"Hardware accelerator:\" GPU -> Save")

  if any(key.startswith("COLAB_") for key in os.environ):
    if not os.path.exists("/content/Aida-x-ConditionedModelsTrainer"):
      print("Getting the code...")
      !git clone https://github.com/pilali/Aida-x-ConditionedModelsTrainer.git &>> /content/log.txt
      assert os.path.exists("/content/Aida-x-ConditionedModelsTrainer"), f"Error getting the code!"

      os.chdir('/content/Aida-x-ConditionedModelsTrainer')
      !git checkout badcat &>> /content/log.txt

#      print("Checking for code updates...")
#      !git submodule update --init --recursive &>> /content/log.txt

      print("Installing dependencies...")
      !pip3 install --disable-pip-version-check --no-cache-dir auraloss==0.4.0 &>> /content/log.txt

      print("Mounting google drive...")
      from google.colab import drive
      drive.mount('/content/drive')
    else:
      print("Code already exists. Skipping Google Drive mounting.")
  else:
    print("Not running on Google Colab. Skipping Colab-specific setup.")

  # Adjust the env
  os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:2"

  from colab_functions import wav2tensor, extract_best_esr_model, create_csv_aidax
  from prep_wav_alt import WavParse
  import plotly.graph_objects as go
  from CoreAudioML.networks import load_model
  import CoreAudioML.miscfuncs as miscfuncs
  if any(key.startswith("COLAB_") for key in os.environ):
    from google.colab import files
  import io
  import shutil
  os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

  step = 0
  print()
  print("Ready! you can now move to step 2: DATA")

# 2. The Data (upload + preprocessing) 📑

### Step 2.1: Download the capture signal
Download the pre-crafted "capture signal" called [input.wav](https://drive.google.com/file/d/1TNpaPPc9tdCu6OA1VETWvufc7wG2nQTJ/view?usp=sharing) from the provided link.

### Step 2.2: Reamp your gear
Use the downloaded capture signal to reamp the gear that you want to model. Record the output once for each knob setting and save each take under the matching name, for example `target_gain0.5_tone1.0.wav`.
For a detailed demonstration of how to reamp your gear using the capture signal, refer to this [video tutorial](https://youtu.be/lrvuODtk9W0?t=70) starting at 1:10 and ending at 3:44.
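Before uploading, it can help to confirm that each reamped take lines up with the capture signal. The sketch below uses only the standard library. It assumes (the trainer does not state this here) that a take should share the sample rate and channel count of `input.wav` and have the same length. `describe_wav`, `check_take` and `write_silence` are hypothetical helpers, and the 48 kHz rate in the demo is taken from the evaluation cell of this notebook. Note that the `wave` module reads PCM files only, so 32-bit float WAVs would need another reader.

```python
import os
import tempfile
import wave

def describe_wav(path):
    """Return (sample_rate, channels, frame_count) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getnframes()

def check_take(input_path, target_path, max_length_diff=0):
    """List the mismatches between a reamped take and the capture signal."""
    in_rate, in_ch, in_frames = describe_wav(input_path)
    out_rate, out_ch, out_frames = describe_wav(target_path)
    problems = []
    if in_rate != out_rate:
        problems.append(f"sample rate {out_rate} != {in_rate}")
    if in_ch != out_ch:
        problems.append(f"channels {out_ch} != {in_ch}")
    if abs(in_frames - out_frames) > max_length_diff:
        problems.append(f"length {out_frames} != {in_frames} frames")
    return problems

def write_silence(path, rate, frames):
    """Write a mono 16-bit silent WAV, used only for the demo below."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * frames)

# Demo on synthetic files: one matching take and one recorded at the wrong rate
with tempfile.TemporaryDirectory() as tmp:
    dry = os.path.join(tmp, "input.wav")
    good = os.path.join(tmp, "good.wav")
    bad = os.path.join(tmp, "bad.wav")
    write_silence(dry, 48000, 4800)
    write_silence(good, 48000, 4800)
    write_silence(bad, 44100, 4800)
    demo_good = check_take(dry, good)
    demo_bad = check_take(dry, bad)

print(demo_good)  # []
print(demo_bad)   # ['sample rate 44100 != 48000']
```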

In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown Step 2.3: Upload
#@markdown ---
#@markdown * In Drive, put the audio files with which you would like to train in a single folder.
#@markdown  * `input.wav` : contains the reference (dry/DI) sound.
#@markdown  * `target_gainX.X_toneX.X.wav` : contains the target sound (amped/with effects), recorded at the two selected settings.
#@markdown You will need 21 files, named as listed below:
#@markdown  * target_gain0.0_tone1.0.wav
#@markdown  * target_gain0.1_tone1.0.wav
#@markdown  * target_gain0.2_tone1.0.wav
#@markdown  * target_gain0.3_tone1.0.wav
#@markdown  * target_gain0.4_tone1.0.wav
#@markdown  * target_gain0.5_tone0.0.wav
#@markdown  * target_gain0.5_tone0.1.wav
#@markdown  * target_gain0.5_tone0.2.wav
#@markdown  * target_gain0.5_tone0.3.wav
#@markdown  * target_gain0.5_tone0.4.wav
#@markdown  * target_gain0.5_tone0.5.wav
#@markdown  * target_gain0.5_tone0.6.wav
#@markdown  * target_gain0.5_tone0.7.wav
#@markdown  * target_gain0.5_tone0.8.wav
#@markdown  * target_gain0.5_tone0.9.wav
#@markdown  * target_gain0.5_tone1.0.wav
#@markdown  * target_gain0.6_tone1.0.wav
#@markdown  * target_gain0.7_tone1.0.wav
#@markdown  * target_gain0.8_tone1.0.wav
#@markdown  * target_gain0.9_tone1.0.wav
#@markdown  * target_gain1.0_tone1.0.wav
#@markdown * Use the file browser in the left panel to find a folder with your audio, right-click **"Copy Path", paste below**, and run the cell.
#@markdown  * ex. `/content/Aida-x-ConditionedModelsTrainer/Recordings`
DATA_DIR = '' #@param {type: "string"}

assert 'step' in locals(), "Please run the \"1. Set-up\" cell first!"
print("---")
assert DATA_DIR != '', "Please input a path for your DATA_DIR"
assert os.path.exists(DATA_DIR), f"Drive folder doesn't exist: {DATA_DIR}"
# assert set(["input.wav", "target.wav"]) <= set([x.lower() for x in os.listdir(DATA_DIR)]), \
#  "Make sure you have \"input.wav\" and \"target_*.wav\" inside your data folder"

# Copy the files to the trainer folder, overwriting any existing copies
destination_dir = "/content/Aida-x-ConditionedModelsTrainer"
input_path = os.path.join(destination_dir, "input.wav")

!cp -f "{os.path.join(DATA_DIR, 'input.wav')}" "{input_path}"
print(f"File copied: {input_path}")

# (level, edge) settings of the target files this cell copies
target_settings = [
  ("1.0", "1.0"), ("1.0", "0.0"), ("0.9", "0.9"), ("0.9", "0.1"),
  ("0.1", "0.1"), ("0.2", "0.8"), ("0.3", "0.3"), ("0.4", "0.7"),
  ("0.0", "0.5"), ("0.5", "0.0"), ("0.6", "0.2"), ("0.5", "0.1"),
  ("0.5", "0.2"), ("0.5", "0.3"), ("0.5", "0.4"), ("0.5", "0.5"),
  ("0.5", "0.6"), ("0.5", "0.7"), ("0.5", "0.8"), ("0.5", "0.9"),
  ("0.5", "1.0"),
]
for level, edge in target_settings:
  target_name = f"target_level{level}_edge{edge}.wav"
  !cp -f "{os.path.join(DATA_DIR, target_name)}" "{os.path.join(destination_dir, target_name)}"

print("Files copied.")

#@markdown Choose the Model type you want to train:\
#@markdown Generally, the heavier the model the more accurate it is, but also the more CPU it consumes.
#@markdown Here's a list of approximate CPU consumption of each model type on a [MOD Dwarf](https://mod.audio/dwarf/):
#@markdown * Lightest: 25% CPU
#@markdown * Light: 30% CPU
#@markdown * Standard: 37% CPU
#@markdown * Heavy: 46% CPU
model_type = "Standard" #@param ["Lightest", "Light", "Standard", "Heavy", "BadCat"]

if model_type == "Lightest":
  config_file = "LSTM-8-1"
elif model_type == "Light":
  config_file = "LSTM-12-1"
elif model_type == "Standard":
  config_file = "LSTM-16-1"
elif model_type == "Heavy":
  config_file = "LSTM-20-1"
elif model_type == "BadCat":
  config_file = "Bad_Cat_CH1"

# Create the CSV and parse the WAV files
create_csv_aidax("/content/Aida-x-ConditionedModelsTrainer/Configs/Csv/modaudioug.csv")
WavParse(load_config=config_file, config_location='/content/Aida-x-ConditionedModelsTrainer/Configs', norm=False, denoise=False)

step = max(step, 1)
print()
print("Data prepared! You can now move to step 3: TRAINING")
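The target file names encode the two control settings, which is what lets a conditioned model learn them. As a rough sketch (this is not the code inside `create_csv_aidax` or `WavParse`, whose internals are not shown in this notebook), the values could be recovered from a name as follows. `parse_target_name` is a hypothetical helper, and the pattern is an assumption based on the names used above.

```python
import re

# Assumed pattern: target_<name><value>_<name><value>.wav, values written as X.X
TARGET_PATTERN = re.compile(r"^target_([a-z]+)(\d+\.\d+)_([a-z]+)(\d+\.\d+)\.wav$")

def parse_target_name(filename):
    """Return a dict such as {'gain': 0.5, 'tone': 1.0} for a target file name."""
    match = TARGET_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected target file name: {filename}")
    first_name, first_value, second_name, second_value = match.groups()
    return {first_name: float(first_value), second_name: float(second_value)}

print(parse_target_name("target_gain0.5_tone1.0.wav"))   # {'gain': 0.5, 'tone': 1.0}
print(parse_target_name("target_level0.9_edge0.1.wav"))  # {'level': 0.9, 'edge': 0.1}
```

Because the pattern does not hard-code the control names, the same sketch handles both the `gain`/`tone` names from the instructions and the `level`/`edge` names copied by the cell above.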

# 3. Model Training 🏋️‍♂️

In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown Training usually takes around 10 minutes,
#@markdown but this can change depending on the duration of
#@markdown the training data that you provided and the model_type
#@markdown you choose.\
#@markdown Note that training doesn't always lead to the same results.
#@markdown You may want to run it a couple of times and compare the results.

#@markdown Some training hyperparameters
#@markdown (Recommended: ignore and continue with default values):
skip_connection = "OFF" #@param ["ON", "OFF"]
epochs = 200 #@param {type:"slider", min:100, max:2000, step:20}
print("---")


if skip_connection == "ON":
  skip_con = 1
else:
  skip_con = 0

assert 'step' in locals(), "Please run the \"1. Set-up\" cell first!"
assert step>=1, "Please execute the \"2. Data\" cell code to prepare the data for the training!"

!python3 dist_model.py -l "$config_file" -lm 0 -sc $skip_con -eps $epochs

sleep(1)
model_dir = f"/content/Aida-x-ConditionedModelsTrainer/Results/MOD-AUDIO-UG"
step = max(step, 2)
print("Training done!\nESR after training: ", extract_best_esr_model(model_dir)[1])
print("You can now move to step 4: EVALUATION or directly to step 5: EXPORT")
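The ESR reported above is the error-to-signal ratio: the energy of the difference between target and prediction, divided by the energy of the target, so lower is better and 0 means a perfect match. A minimal NumPy sketch of that plain definition follows. The trainer's own loss may differ (for example through pre-emphasis filtering or additional loss terms), so treat the `esr` function and the test tone here as illustrative assumptions rather than the trainer's code.

```python
import numpy as np

def esr(target, prediction, eps=1e-12):
    """Error-to-signal ratio: sum((y - y_hat)^2) / sum(y^2)."""
    target = np.asarray(target, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    error_energy = np.sum((target - prediction) ** 2)
    signal_energy = np.sum(target ** 2) + eps   # eps guards against silent targets
    return float(error_energy / signal_energy)

# One second of a 110 Hz sine at 48 kHz as a stand-in for a target recording
t = np.linspace(0.0, 1.0, 48000, endpoint=False)
target = np.sin(2 * np.pi * 110.0 * t)

print(esr(target, target))                  # 0.0, perfect prediction
print(esr(target, 0.9 * target))            # close to 0.01, 10% amplitude error
print(esr(target, np.zeros_like(target)))   # close to 1.0, silent prediction
```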

# 4. Model Evaluation 📈


In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown Here you can visualize and listen to the output of your trained model on the data you provided earlier.

assert 'step' in locals(), "Please run the \"1. Set-up\" cell first!"
assert step>=1, "Please execute the \"2. Data\" cell code to prepare the data for the training!"
assert step>=2, "Please execute the \"3. Model Training\" cell code to train a model for evaluation!"

print("---")
# Find the file with .full_name extension in model_dir
full_name_file = [f for f in os.listdir(model_dir) if f.endswith('.full_name')]
assert len(full_name_file) == 1, "There should be exactly one file with the .full_name extension in the model_dir."

# Remove the .full_name extension to create the model_filename
model_filename = os.path.splitext(full_name_file[0])[0] + '.aidax'

# Extract the best model available from training results
model_path, esr = extract_best_esr_model(model_dir)
model_data = miscfuncs.json_load(model_path)
model = load_model(model_data).to(device)

full_dry = wav2tensor(f"/content/Aida-x-ConditionedModelsTrainer/Data/test/aidadsp-auto-input.wav")
full_amped = wav2tensor(f"/content/Aida-x-ConditionedModelsTrainer/Data/test/aidadsp-auto-target.wav")

samples_viz = 24000
duration_audio = 5
seg_length = int(duration_audio * 48000)
start_sample = np.random.randint(len(full_dry)-duration_audio*48000)
dry = full_dry[start_sample:start_sample+seg_length]
amped = full_amped[start_sample:start_sample+seg_length]
with torch.no_grad():
  modeled = model(dry[:, None, None].to(device)).cpu().flatten().detach().numpy()

print(f"Current model: {model_filename}")
print(f"ESR:", esr)
# Visualization
fig = go.Figure()
fig.add_trace(
  go.Scatter(
    x=list(np.arange(len(dry[:samples_viz]))/48000), y=dry[:samples_viz],
    name="dry", mode='lines'
  )
)
fig.add_trace(
  go.Scatter(
    x=list(np.arange(len(amped[:samples_viz]))/48000), y=amped[:samples_viz],
    name="target", mode='lines'
  )
)
fig.add_trace(
  go.Scatter(
    x=list(np.arange(len(modeled[:samples_viz]))/48000), y=modeled[:samples_viz],
    name="prediction", mode='lines'
  )
)
fig.update_layout(
  title="Dry vs Target vs Predicted signal",
  xaxis_title="Time (s)",
  yaxis_title="Signal Amplitude",
  legend_title="Signal",
)
fig.show()

# Listen
print("DRY Signal:")
IPython.display.display(IPython.display.Audio(data=dry, rate=48000))

print("TARGET Signal:")
IPython.display.display(IPython.display.Audio(data=amped, rate=48000))

print("PREDICTED Signal:")
IPython.display.display(IPython.display.Audio(data=modeled, rate=48000))

print("Difference Signal:")
difference_signal = np.array(amped) - np.array(modeled)
IPython.display.display(IPython.display.Audio(data=difference_signal, rate=48000))

# Cleanup
del dry, amped, modeled, full_dry, full_amped, model
torch.cuda.empty_cache()

step = max(step, 3)

In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown Here you can **upload** your own dry guitar files, and listen to the predicted output of the model.
assert 'step' in locals(), "Please run the \"1. Set-up\" cell first!"
assert step>=1, "Please execute the \"2. Data\" cell code to prepare the data for the training!"
assert step>=2, "Please execute the \"3. Model Training\" cell code to train a model for evaluation!"

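print("---")

# The previous cell deletes `model` during cleanup, so reload the best model here
model_path, esr = extract_best_esr_model(model_dir)
model = load_model(miscfuncs.json_load(model_path)).to(device)

uploaded = {}  # stays empty outside Colab, where files.upload() is unavailable
if any(key.startswith("COLAB_") for key in os.environ):
  uploaded = files.upload()
print()
print("Running predictions:")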

for k, v in uploaded.items():
  print("#####", k)
  dry = wav2tensor(io.BytesIO(v))
  with torch.no_grad():
    modeled = model(dry[:, None, None].to(device)).cpu().flatten().detach().numpy()

  print("DRY Signal:")
  IPython.display.display(IPython.display.Audio(data=dry, rate=48000))

  print("PREDICTED Signal:")
  IPython.display.display(IPython.display.Audio(data=modeled, rate=48000))

  # Cleanup
  del dry, modeled
  torch.cuda.empty_cache()

step = max(step, 3)

# 5. Model Export ✅

In [None]:
#@markdown `<- RUN CELL (►)`

#@markdown Download a .aidax file summarizing the model that you just trained.

#@markdown You can then upload it to AIDA-X model loader plugin and run it in real-time.

assert 'step' in locals(), "Please run the \"1. Set-up\" cell first!"
assert step>=1, "Please execute the \"2. Data\" cell code to prepare the data for the training!"
assert step>=2, "Please execute the \"3. Model Training\" cell code to train a model for export!"

print("---")
# Find the file with .full_name extension in model_dir
full_name_file = [f for f in os.listdir(model_dir) if f.endswith('.full_name')]
assert len(full_name_file) == 1, "There should be exactly one file with the .full_name extension in the model_dir."

# Remove the .full_name extension to create the model_filename
model_filename = os.path.splitext(full_name_file[0])[0] + '.aidax'

print("Generating model file:", model_filename)

# Extract the best model available from training results
model_path, esr = extract_best_esr_model(model_dir)
!python3 modelToRTNeural.py -l "$config_file" -ax

# Define the destination directory
destination_path = os.path.join(DATA_DIR, model_filename)

# Copy the generated file to the destination directory
!cp "{os.path.join(model_dir, 'model_rtneural.aidax')}" "{destination_path}"

if any(key.startswith("COLAB_") for key in os.environ):
  from google.colab import files
  files.download(destination_path)

print()
print("Model file saved to:", destination_path)
step = max(step, 4)