# Background

This notebook prepares the [UrbanSound8K dataset](https://urbansounddataset.weebly.com/) as a Hugging Face Dataset. You need to download the dataset yourself, which involves filling out the form and acknowledging their terms.

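The conversion targets the `audiofolder` layout, which expects a `metadata.csv` with a `file_name` column. See https://huggingface.co/docs/datasets/audio_dataset#audiofolder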

Note that the authors of the dataset recommend 10-fold cross-validation when using this set. See the [ReadInstruction examples](https://huggingface.co/docs/datasets/v2.14.5/en/package_reference/builder_classes#datasets.ReadInstruction.example) for how to express such splits.

# Conversion

In [9]:
from pathlib import Path
import pandas as pd

# Input
DATASET_PATH = Path("data/urbansound8k/")
METADATA_PATH = DATASET_PATH / "metadata/UrbanSound8K.csv"

# Output
NEW_METADATA_PATH = DATASET_PATH / "metadata.csv"


with METADATA_PATH.open() as infile:
    lines = infile.readlines()
    with NEW_METADATA_PATH.open('w') as outfile:
        # Write the header line
        header = lines.pop(0)
        header_parts = header.split(',')
        header_parts[0] = "file_name"
        outfile.write(",".join(header_parts))

        # Write the rest of the content, pointing each row at its audio file
        for line in lines:
            parts = line.split(',')
            filename = parts[0]
            fold = parts[5]  # "fold" is the sixth column of UrbanSound8K.csv
            parts[0] = f"audio/fold{fold}/{filename}"
            out_line = ",".join(parts)
            outfile.write(out_line)
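
The cell above splits each line on commas by hand and relies on `fold` being the sixth column. A minimal alternative sketch using the standard-library `csv` module is shown below. It looks columns up by name and handles quoted fields. The function name `convert_metadata` is hypothetical, and the sketch assumes the source header contains `slice_file_name` as its first column and a `fold` column.

```python
import csv
from pathlib import Path


def convert_metadata(src: Path, dst: Path) -> int:
    """Rewrite an UrbanSound8K-style CSV into an audiofolder metadata.csv.

    Assumes the first column is "slice_file_name" and a "fold" column exists.
    Returns the number of data rows written.
    """
    with src.open(newline="") as infile, dst.open("w", newline="") as outfile:
        reader = csv.DictReader(infile)
        # Rename the first column to "file_name"; keep the others unchanged.
        fieldnames = ["file_name"] + list(reader.fieldnames[1:])
        writer = csv.DictWriter(outfile, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        count = 0
        for row in reader:
            # Look the columns up by name rather than by position.
            name = row.pop("slice_file_name")
            row["file_name"] = f"audio/fold{row['fold']}/{name}"
            writer.writerow(row)
            count += 1
    return count
```

With the paths defined earlier, this could be called as `convert_metadata(METADATA_PATH, NEW_METADATA_PATH)`.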

# Verification

Verify that the Hugging Face dataset loads correctly.

In [28]:
from datasets import load_dataset, ReadInstruction

# dataset = load_dataset("audiofolder", data_dir=str(DATASET_PATH), name="UrbanSound8K")
# print([ dataset['train'][i]['class'] for i in range(0, 10) ])

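# 10-fold cross-validation using percent-based slices of the "train" split.
# Note: these slices may not coincide with the dataset's predefined folds.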

trains = load_dataset("audiofolder", split=[
    ReadInstruction('train', to=k, unit='%') + ReadInstruction('train', from_=k+10, unit='%')
    for k in range(0, 100, 10)
], data_dir=str(DATASET_PATH), name="UrbanSound8K")

tests = load_dataset("audiofolder", split=[
    ReadInstruction('train', from_=k, to=k+10, unit='%')
    for k in range(0, 100, 10)
], data_dir=str(DATASET_PATH), name="UrbanSound8K")

print(trains[0][0]['class'])
print(trains[1][0]['class'])



car_horn
dog_bark
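
The percent-based slices above cut the data into ten equal parts, which may not line up with the folds recorded in the `fold` column of the metadata. If the predefined folds are wanted instead, one option is to build index lists from that column. The following is a minimal sketch under that assumption. The helper `predefined_fold_splits` is hypothetical, and the toy list stands in for the real `fold` column.

```python
from typing import Iterator, List, Sequence, Tuple


def predefined_fold_splits(
    folds: Sequence[int], n_folds: int = 10
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (fold, train_indices, test_indices) for each predefined fold."""
    for k in range(1, n_folds + 1):
        # Rows whose fold equals k form the test set; all others are training data.
        test_idx = [i for i, f in enumerate(folds) if f == k]
        train_idx = [i for i, f in enumerate(folds) if f != k]
        yield k, train_idx, test_idx


# Toy fold column standing in for the dataset's real "fold" values.
toy_folds = [1, 2, 3, 1, 2, 3]
splits = list(predefined_fold_splits(toy_folds, n_folds=3))
```

Assuming the `fold` column from `metadata.csv` is exposed as a feature of the loaded dataset, the resulting index lists could be passed to `Dataset.select` to obtain the train and test subsets for each fold.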
