<a href="https://colab.research.google.com/github/dusarp/dance-bits-experiments/blob/main/spectrogram_dataloader.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import os
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

In [None]:
class MelSpectrogramDataset(Dataset):
    def __init__(self, data_dir, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        # Sort so sample indices are stable across runs (os.listdir order is arbitrary)
        self.file_list = sorted(f for f in os.listdir(data_dir) if f.endswith('.npy'))

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        file_path = os.path.join(self.data_dir, self.file_list[idx])
        mel_spectrogram = np.load(file_path)

        # Convert to torch tensor and add channel dimension
        mel_spectrogram = torch.from_numpy(mel_spectrogram).float().unsqueeze(0)

        if self.transform:
            mel_spectrogram = self.transform(mel_spectrogram)

        # Placeholder: a random class label in [0, 10). Replace this with your actual labels.
        label = torch.randint(0, 10, (1,)).item()

        return mel_spectrogram, label

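# Define any transformations you want to apply to your data.
# Normalize computes (x - mean) / std per channel; 0.5 / 0.5 maps values in [0, 1] to [-1, 1].
# These are placeholder values: replace them with statistics from your own spectrograms.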
transform = transforms.Compose([
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Create the dataset
dataset = MelSpectrogramDataset(data_dir='path/to/your/npy/files', transform=transform)

# Create the data loader
batch_size = 32
data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=4)

# Example usage in a training loop
num_epochs = 10  # placeholder; set this to the number of epochs you want to train for
for epoch in range(num_epochs):
    for batch_mels, batch_labels in data_loader:
        # Your training code here
        # batch_mels shape: (batch_size, 1, mel_freq_bins, time_steps)
        # batch_labels shape: (batch_size,)
        pass

Here's a breakdown of the code:

1. We define a custom `MelSpectrogramDataset` class that inherits from `torch.utils.data.Dataset`.

2. The `__init__` method initializes the dataset with the directory containing the .npy files and any transformations to be applied.

3. `__len__` returns the total number of samples in the dataset.

4. `__getitem__` loads a single mel spectrogram from a .npy file, converts it to a PyTorch tensor, adds a channel dimension, applies any transformations, and returns the mel spectrogram along with a label.

5. We create an instance of the dataset with a specified directory and transformations.

6. We create a `DataLoader` that will handle batching, shuffling, and parallel data loading.

7. In the example usage section, we show how you might use this data loader in a training loop.
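
The steps above can be checked end to end without real data. The following is a minimal sketch, not part of the original notebook: it writes a few random arrays to a temporary directory and confirms the batch shapes. The array sizes (`n_mels=64`, `time_steps=100`), the file names, and the condensed copy of the dataset class (no transform) are assumptions made only for this check.

```python
import os
import tempfile

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class MelSpectrogramDataset(Dataset):
    """Condensed copy of the dataset above (no transform, placeholder labels)."""

    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.file_list = sorted(f for f in os.listdir(data_dir) if f.endswith('.npy'))

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        mel = np.load(os.path.join(self.data_dir, self.file_list[idx]))
        mel = torch.from_numpy(mel).float().unsqueeze(0)  # (1, n_mels, time_steps)
        label = torch.randint(0, 10, (1,)).item()  # placeholder label
        return mel, label


# Hypothetical sizes chosen only for this check
n_files, n_mels, time_steps = 6, 64, 100

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write synthetic "spectrograms" so the example needs no real data
    for i in range(n_files):
        fake_mel = np.random.rand(n_mels, time_steps).astype(np.float32)
        np.save(os.path.join(tmp_dir, f"clip_{i}.npy"), fake_mel)

    dataset = MelSpectrogramDataset(tmp_dir)
    # num_workers=0 keeps loading in the main process, which is simpler to debug
    loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
    batch_shapes = [tuple(mels.shape) for mels, labels in loader]
    first_mels, first_labels = next(iter(loader))

print(batch_shapes)  # [(4, 1, 64, 100), (2, 1, 64, 100)]
```

The last batch is smaller because 6 files do not divide evenly into batches of 4; pass `drop_last=True` to the `DataLoader` if your training code needs full batches only.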

To use this data loader with your CNN:

1. Replace `'path/to/your/npy/files'` with the actual path to your directory containing the .npy files.

2. Modify the label generation in `__getitem__` to use your actual labels instead of random ones.

3. Adjust the `batch_size`, `num_workers`, and other parameters of the `DataLoader` as needed for your specific use case.

4. Use the `data_loader` in your training loop to feed batches of mel spectrograms and labels to your CNN.
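
For step 2, one possible way to replace the random labels is to look them up by file name. This is a sketch under assumptions: the class name `LabeledMelSpectrogramDataset` and the `labels` dictionary (file name to integer class id) are hypothetical, and how you build that dictionary (CSV, JSON, folder names) depends on how your data is annotated.

```python
import os
import tempfile

import numpy as np
import torch
from torch.utils.data import Dataset


class LabeledMelSpectrogramDataset(Dataset):
    """Variant of MelSpectrogramDataset that looks labels up by file name.

    `labels` is a hypothetical dict {file name: integer class id}.
    """

    def __init__(self, data_dir, labels, transform=None):
        self.data_dir = data_dir
        self.labels = labels
        self.transform = transform
        # Keep only .npy files that actually have a label
        self.file_list = sorted(
            f for f in os.listdir(data_dir) if f.endswith('.npy') and f in labels
        )

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        file_name = self.file_list[idx]
        mel = np.load(os.path.join(self.data_dir, file_name))
        mel = torch.from_numpy(mel).float().unsqueeze(0)
        if self.transform:
            mel = self.transform(mel)
        return mel, self.labels[file_name]


# Demo with synthetic files; the names and labels are made up for illustration
with tempfile.TemporaryDirectory() as tmp_dir:
    labels = {"a.npy": 0, "b.npy": 1}
    for name in ["a.npy", "b.npy", "unlabeled.npy"]:
        np.save(os.path.join(tmp_dir, name), np.zeros((64, 100), dtype=np.float32))

    labeled_dataset = LabeledMelSpectrogramDataset(tmp_dir, labels)
    n_samples = len(labeled_dataset)  # "unlabeled.npy" is skipped
    mel, label = labeled_dataset[1]   # second file in sorted order: "b.npy"
```

Skipping unlabeled files silently is a design choice; raising an error instead may be preferable if every file is expected to have a label.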

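With `num_workers=4`, the `DataLoader` loads and transforms samples in four worker processes while the model trains, and `shuffle=True` reshuffles the sample order every epoch. Each batch arrives as a float tensor of shape `(batch_size, 1, mel_freq_bins, time_steps)`, ready to pass to your convolutional neural network.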
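
One caveat: the default batching stacks samples into a single tensor, so every spectrogram in a batch must have the same shape. If your clips differ in length, one option is a custom `collate_fn` that pads each batch to its longest item. The sketch below is an assumption-laden illustration, not part of the original notebook: the function name `pad_collate` is hypothetical, and it pads with zeros, which may not correspond to silence if your spectrograms are in decibels.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader


def pad_collate(batch):
    """Zero-pad each (1, n_mels, time) spectrogram on the right to the longest in the batch."""
    mels, labels = zip(*batch)
    max_len = max(m.shape[-1] for m in mels)
    # F.pad takes (left, right) padding amounts for the last dimension
    padded = [F.pad(m, (0, max_len - m.shape[-1])) for m in mels]
    return torch.stack(padded), torch.tensor(labels)


# Toy samples with different time lengths; a plain list works as a map-style dataset
samples = [
    (torch.ones(1, 64, 80), 0),
    (torch.ones(1, 64, 100), 1),
    (torch.ones(1, 64, 90), 2),
]
loader = DataLoader(samples, batch_size=3, collate_fn=pad_collate)
batch_mels, batch_labels = next(iter(loader))
print(batch_mels.shape)  # torch.Size([3, 1, 64, 100])
```

Cropping or padding every clip to a fixed length inside `__getitem__` is an alternative that keeps the default collation.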