## How do we convert our audio files to image files?

We should now have some augmented audio files in `/workspace/data/augmented_audio`, but our model training notebooks work with images. This notebook will create those image files for us and place them in a folder named `lakota_data`.  
  
The cell below imports the tools we need to create the images. `librosa` is a popular Python library for audio analysis, and we will use it to compute and draw the spectrograms.

In [None]:
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
import os

The function below, `audio_to_melspectrogram_image()`, creates the images for us. It has a few options that you may want to modify, because they influence the image that gets produced.  

In particular, look at the `cmap` parameter in the line  
`librosa.display.specshow(mel_db, sr=sr, fmax=fmax, cmap='viridis')`.  
The `cmap` parameter controls the color map of the images; the options are explained below.  
  
This notebook uses `cmap='viridis'`, but you can use any of the options below to create different spectrograms for your model training.  

🔵 Perceptually Uniform Colormaps (Good for ML and clarity)
 - 'viridis' – (the option used in this notebook) good contrast, colorblind-friendly.  
 - 'plasma' – vibrant and high-contrast.  
 - 'inferno' – dark background, high dynamic range.  
 - 'magma' – smoother than inferno, darker background.  
 - 'cividis' – designed for color vision deficiency.  

🌈 Traditional/Classic Colormaps
 - 'jet' – very colorful, but not perceptually uniform (can mislead perception).  
 - 'hot' – black to red to yellow to white.  
 - 'cool' – cyan to magenta gradient.  
 - 'spring', 'summer', 'autumn', 'winter' – seasonal gradient palettes.  

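⚫ Grayscale and Single-Hue
 - 'gray' – grayscale (black = low, white = high).  
 - 'bone' – grayscale with a blue tint; 'binary', 'gist_yarg' – reversed grayscale (white = low, black = high).  
 - 'Greys', 'Purples', 'Blues', 'Oranges', 'Reds' – single-hue ramps from light (low) to dark (high).  
 - 'copper', 'pink' – tinted sequential ramps; 'coolwarm' – diverging blue-to-red map, better suited to data centered on a midpoint.  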

🔁 Inverted Colormaps

You can invert any colormap by appending `_r`, e.g.:
 - 'viridis_r'  
 - 'inferno_r'  
 - 'gray_r'
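
Before a long conversion run, it can help to check that a colormap name is one matplotlib actually recognizes. The sketch below is only an illustration: `is_valid_cmap` is a hypothetical helper that is not part of this notebook, and it relies on `plt.colormaps()`, which returns the names of the registered colormaps.

```python
import matplotlib.pyplot as plt


def is_valid_cmap(name):
    """Return True if `name` is a colormap registered with matplotlib.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return name in plt.colormaps()


# Check a few candidate names before passing one to specshow
for candidate in ["viridis", "inferno_r", "not_a_cmap"]:
    print(candidate, is_valid_cmap(candidate))
```

A misspelled name would otherwise only surface as an error once the first image is drawn.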

In [None]:
def audio_to_melspectrogram_image(audio_path, output_path, sr=16000, n_mels=128, fmax=8000):
    # Load the audio, resampled to `sr` Hz
    y, sr = librosa.load(audio_path, sr=sr)
    # Compute the mel power spectrogram, then convert it to decibels relative to its peak
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels, fmax=fmax)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    # 2.24 in x 2.24 in saved at 100 dpi targets 224x224 px;
    # bbox_inches='tight' may crop the result to a slightly different size
    plt.figure(figsize=(2.24, 2.24))
    librosa.display.specshow(mel_db, sr=sr, fmax=fmax, cmap='viridis')
    plt.axis('off')
    plt.tight_layout()
    plt.savefig(output_path, dpi=100, bbox_inches='tight', pad_inches=0)
    plt.close()
    print(f"Saved: {output_path}")
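
To make the decibel step less of a black box: `librosa.power_to_db(mel, ref=np.max)` rescales the power values so that the loudest bin sits at 0 dB and quieter bins are negative. The NumPy sketch below mirrors that calculation under the assumption that librosa's defaults (`amin=1e-10`, `top_db=80.0`) are in effect. `power_to_db_sketch` is a hypothetical name used for illustration, not a replacement for the library function.

```python
import numpy as np


def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Approximate librosa.power_to_db(power, ref=np.max) with plain NumPy."""
    power = np.asarray(power, dtype=float)
    ref = power.max()  # ref=np.max: measure everything relative to the peak
    # Floor tiny values at `amin` so log10 never sees zero
    db = 10.0 * np.log10(np.maximum(amin, power))
    db -= 10.0 * np.log10(np.maximum(amin, ref))
    # Clip anything more than `top_db` below the peak
    return np.maximum(db, db.max() - top_db)


# Toy 2x2 "spectrogram" with made-up power values
toy_power = np.array([[1.0, 0.1], [0.01, 1e-12]])
print(power_to_db_sketch(toy_power))
```

Because of the clipping, no image spans more than 80 dB below its own peak.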

The cell below will take our augmented audio files and produce all of the images that we will need for training.

In [None]:
input_dir = "/workspace/data/augmented_audio"
output_dir = "/workspace/data/lakota_data"
os.makedirs(output_dir, exist_ok=True)

input_root = Path(input_dir)
input_files = list(input_root.rglob("*.wav"))
print(f"Found {len(input_files)} audio files in {input_dir}")

for root, _, files in os.walk(input_dir):
    for fname in files:
        if fname.endswith(".wav"):
            rel_path = os.path.relpath(root, input_dir)  # e.g., 'red' or 'yellow'
            input_path = os.path.join(root, fname)

            # Construct output path with same subdir structure
            target_dir = os.path.join(output_dir, rel_path)
            os.makedirs(target_dir, exist_ok=True)

            output_path = os.path.join(target_dir, os.path.splitext(fname)[0] + ".png")
            audio_to_melspectrogram_image(input_path, output_path)
            
print(f"\n✅ Done. Spectrogram images saved to: {output_dir}")
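
The path handling in the loop above can also be expressed with `pathlib`. The sketch below is an alternative illustration, not the notebook's method: `wav_to_png_path` is a hypothetical helper that maps an input `.wav` path to the matching `.png` path while keeping the class subfolder (for example `red` or `yellow`).

```python
from pathlib import Path


def wav_to_png_path(wav_path, input_root, output_root):
    """Mirror wav_path's location under input_root into output_root as a .png."""
    relative = Path(wav_path).relative_to(input_root)  # e.g. red/word_01.wav
    return Path(output_root) / relative.with_suffix(".png")


example = wav_to_png_path(
    "/workspace/data/augmented_audio/red/word_01.wav",  # hypothetical file name
    "/workspace/data/augmented_audio",
    "/workspace/data/lakota_data",
)
print(example)
```

`with_suffix` only swaps the final extension, so a file name that happens to contain `.wav` elsewhere is left intact.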

Finally, the cell below creates a zip file of all the images and saves it as `/workspace/data/lakota_data.zip`. This is useful for backing up and possibly sharing our data.

In [None]:
import shutil

shutil.make_archive("/workspace/data/lakota_data", 'zip', "/workspace/data/lakota_data")
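
After archiving, a quick look inside the zip file can confirm that the subfolders made it in. The sketch below uses the standard-library `zipfile` module. `summarize_zip` is a hypothetical helper, and the demonstration runs on a throwaway temporary folder with made-up file names rather than on the real data.

```python
import shutil
import tempfile
import zipfile
from collections import Counter
from pathlib import Path


def summarize_zip(zip_path):
    """Count archived files per top-level folder (hypothetical helper)."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if not info.is_dir():
                counts[info.filename.split("/")[0]] += 1
    return dict(counts)


# Demonstration on a temporary folder that imitates the class subfolders
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "demo_data"
    for label, n_files in [("red", 2), ("yellow", 1)]:
        (data_dir / label).mkdir(parents=True)
        for i in range(n_files):
            (data_dir / label / f"clip_{i}.png").write_bytes(b"")
    zip_path = shutil.make_archive(str(Path(tmp) / "demo_data"), "zip", str(data_dir))
    summary = summarize_zip(zip_path)
    print(summary)
```

For the real archive, `summarize_zip("/workspace/data/lakota_data.zip")` would report one count per class folder.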