# Feature extraction
The dataset is built with the `Features` class.
Each audio file is loaded into memory and the following features are extracted:
- MFCC (mel-frequency cepstral coefficients)
- chroma
- RMS (root-mean-square energy)

Each feature array is then reduced over time with the following functions:
- min
- max
- median
- mean

The reduced values are concatenated, giving a total of 132 features per audio file (4 summary values for each of 33 feature rows).
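The reduce-and-concatenate step above can be sketched with the standard library alone. This is an illustration, not the code inside `Features`: the random matrices stand in for real extractor output, and the row counts (20 MFCCs, 12 chroma bins, 1 RMS row) are an assumption, chosen because they are one split consistent with the 132 total. The helper names `reduce_rows` and `fake_feature` are hypothetical, and so is the output ordering (four values per row, row by row).

```python
import random
from statistics import fmean, median

# Reductions applied to every feature row, in the order listed above.
REDUCERS = (min, max, median, fmean)


def reduce_rows(matrix):
    """Collapse each row (one coefficient over time) to four summary values."""
    return [fn(row) for row in matrix for fn in REDUCERS]


def fake_feature(n_rows, n_frames, rng):
    """Stand-in for a real extractor: an n_rows x n_frames matrix of random values."""
    return [[rng.random() for _ in range(n_frames)] for _ in range(n_rows)]


rng = random.Random(0)
n_frames = 50  # hypothetical number of analysis frames per clip

# Assumed row counts: 20 MFCCs, 12 chroma bins, 1 RMS row (33 rows in total).
features = [fake_feature(n_rows, n_frames, rng) for n_rows in (20, 12, 1)]

# Concatenate the reduced values of all three feature matrices.
vector = [value for matrix in features for value in reduce_rows(matrix)]
print(len(vector))  # 33 rows * 4 reductions = 132
```

In practice an array library would replace the nested lists, but the length of the final vector depends only on the number of rows and the number of reductions.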

## Structure

Each row of the dataset holds the class label followed by the feature values:
$$\mathit{class}, \; \mathit{feature}_1, \; \dots, \; \mathit{feature}_n$$
where $n = 132$.
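As a small illustration of this layout, the sketch below builds the column header and one placeholder row. The column names `class` and `f_0` to `f_131` match the dataframe shown later in this notebook; the CSV serialization and the all-zero row are assumptions used only to make the example concrete.

```python
import csv
import io

N_FEATURES = 132

# Column names: "class", then f_0 ... f_131.
header = ["class"] + [f"f_{i}" for i in range(N_FEATURES)]

# Placeholder row: class id 3 with all-zero features (not real data).
row = [3] + [0.0] * N_FEATURES

# Write header and row to an in-memory CSV buffer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(row)

lines = buffer.getvalue().splitlines()
print(len(header), lines[0][:20])
```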

## Dask speed up
Dask is used to speed up the computation.
Four workers run in parallel to extract the features, cutting the time for a single fold from about 70 seconds to just under 30, a speed-up of a little over 2x.
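The underlying pattern is one independent task per audio file, distributed over a fixed pool of workers. The sketch below shows that fan-out with the standard library's `concurrent.futures` instead of Dask, so that it runs without extra dependencies; it is a stand-in for the idea, not how `Features` is implemented. `extract_features` and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_features(path):
    """Hypothetical per-file extractor; returns a short placeholder vector."""
    return [float(len(path)), float(sum(map(ord, path)) % 97)]


# Invented file names, for illustration only.
paths = [f"fold1/clip_{i}.wav" for i in range(8)]

# Four workers, mirroring the worker count used in this notebook.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order, so rows line up with `paths`.
    rows = list(pool.map(extract_features, paths))

print(len(rows))
```

Threads keep the sketch portable. For CPU-bound extraction, process-based workers are generally the better fit, since threads in CPython contend for the interpreter lock.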

## Training dataset
The first step is to build the training dataset from folds 1, 2, 3, 4 and 6. The resulting dataset contains 4499 samples.
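The sample count can be cross-checked against the size of each fold. The per-fold clip counts below are the commonly reported UrbanSound8K figures; they are included as an assumption, not something this notebook states, and should be verified against the metadata file.

```python
# Assumed number of clips per fold in UrbanSound8K (verify against the metadata CSV).
FOLD_SIZES = {1: 873, 2: 888, 3: 925, 4: 990, 5: 936,
              6: 823, 7: 838, 8: 806, 9: 816, 10: 837}

TRAIN_FOLDS = [1, 2, 3, 4, 6]
# Every fold not used for training becomes its own test set.
TEST_FOLDS = [fold for fold in FOLD_SIZES if fold not in TRAIN_FOLDS]

n_train = sum(FOLD_SIZES[fold] for fold in TRAIN_FOLDS)
print(n_train)     # 4499
print(TEST_FOLDS)  # [5, 7, 8, 9, 10]
```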


In [1]:
import sys
sys.path.append("..")  # make the parent directory importable from the notebook
from src.data import Features
import pandas as pd
import numpy as np

# Extract features for the training folds using 4 parallel workers
f = Features(metadata_path="../data/raw/metadata/UrbanSound8K.csv",
             audio_files_path="../data/raw/audio",
             save_path="../data/processed",
             save_name="train",
             folds=[1, 2, 3, 4, 6],
             workers=4)

training_dataframe = f.get_dataframe()

In [2]:
training_dataframe

,class,f_0,f_1,f_2,f_3,f_4,f_5,f_6,f_7,f_8,...,f_122,f_123,f_124,f_125,f_126,f_127,f_128,f_129,f_130,f_131
0,3.0,-592.183899,-162.788498,-344.231689,-329.880310,0.000000,229.762238,139.087418,140.868393,-148.199005,...,0.220979,0.297263,0.009965,1.0,0.303738,0.336505,0.000042,0.289709,0.015952,0.047359
1,3.0,-453.945038,-161.955887,-364.052826,-346.856628,82.226860,226.205078,131.974197,141.269028,-144.834259,...,0.203320,0.261760,0.008947,1.0,0.290658,0.342790,0.002271,0.289867,0.011760,0.047650
2,3.0,-448.205292,-176.065674,-373.497986,-350.400360,73.434738,221.718552,126.712044,135.250168,-133.286484,...,0.284472,0.369487,0.006988,1.0,0.268131,0.302224,0.002280,0.272627,0.009693,0.042375
3,3.0,-444.168701,-173.939423,-369.273560,-346.233246,70.035751,220.699463,126.784042,132.986328,-134.645599,...,0.247804,0.303319,0.003098,1.0,0.252041,0.303084,0.002420,0.341308,0.012339,0.052789
4,3.0,-713.720947,-98.949768,-242.909470,-255.045990,0.000000,191.614868,110.675262,111.643570,-165.848145,...,0.400219,0.447586,0.000000,1.0,0.505863,0.526202,0.000000,0.128447,0.027009,0.036163
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4494,9.0,-262.955444,-101.359467,-184.823181,-190.716583,134.389908,200.986298,174.159607,173.429230,-67.813835,...,1.000000,0.782015,0.003078,1.0,0.319475,0.338576,0.030073,0.423040,0.296085,0.263723
4495,9.0,-326.015259,-142.138199,-221.734146,-219.478958,121.905540,205.910110,179.783905,176.296677,-89.379761,...,1.000000,0.823710,0.000525,1.0,0.356601,0.377291,0.025899,0.379670,0.165430,0.161492
4496,9.0,-272.998810,-89.848122,-164.126770,-175.900131,158.483871,230.669327,187.851624,189.519730,-71.425461,...,0.292984,0.395287,0.005395,1.0,0.257566,0.344914,0.025470,0.357341,0.167323,0.160282
4497,9.0,-263.229767,-62.980148,-155.296738,-167.399094,145.786682,220.202484,183.713654,185.057480,-68.326485,...,0.170528,0.275586,0.002127,1.0,0.214822,0.298109,0.029145,0.383787,0.194611,0.175971


In [3]:
f.save_dataframe(training_dataframe)

## Test datasets

After building the training set, a separate test set is extracted from each of the remaining folds (5, 7, 8, 9 and 10).

In [None]:

# Build and save one test dataset per held-out fold
for fold in [5, 7, 8, 9, 10]:
    print(f"Processing fold {fold}")
    f = Features(metadata_path="../data/raw/metadata/UrbanSound8K.csv",
                 audio_files_path="../data/raw/audio",
                 save_path="../data/processed",
                 save_name=f"test_{fold}",
                 folds=[fold],
                 workers=4)

    f.save_dataframe(f.get_dataframe())