# CNN version Laser Machine Listener: Preprocess

Run `download.sh` first to download the dataset.

In [1]:
import sys
sys.path.append('../..')
from lib_data_preprocess import *
%matplotlib inline

conf['test_size'] = 0.2
print(conf)

Using TensorFlow backend.


{'sampling_rate': 16000, 'duration': 1, 'hop_length': 253, 'fmin': 20, 'fmax': 8000, 'n_mels': 64, 'n_fft': 1280, 'audio_split': 'dont_crop', 'labels': ['background', 'cutting_in_focus', 'cutting_not_in_focus', 'marking', 'sleeping', 'waiting'], 'folder': PosixPath('.'), 'n_fold': 1, 'normalize': 'samplewise', 'valid_limit': None, 'random_state': 42, 'test_size': 0.2, 'batch_size': 32, 'learning_rate': 0.0001, 'epochs': 50, 'verbose': 2, 'best_weight_file': 'best_model_weight.h5', 'label2int': {'background': 0, 'cutting_in_focus': 1, 'cutting_not_in_focus': 2, 'marking': 3, 'sleeping': 4, 'waiting': 5}, 'num_classes': 6, 'samples': 16000, 'dims': [64, 64, 1]}


## Make a list of training samples

This dataset is quite small:

- 6 classes
- 12 audio samples

The dataset follows the basic folder structure of a Keras classification project: `classname/filename.wav`.

In [2]:
DATAROOT = Path('laser-machine-listener/data')
data_files = list(DATAROOT.glob('*/*.wav'))
data_files

[PosixPath('laser-machine-listener/data/sleeping/sleeping.wav'),
 PosixPath('laser-machine-listener/data/background/background.wav'),
 PosixPath('laser-machine-listener/data/marking/acrylic_marking.wav'),
 PosixPath('laser-machine-listener/data/marking/paper_marking.wav'),
 PosixPath('laser-machine-listener/data/marking/mdf_marking.wav'),
 PosixPath('laser-machine-listener/data/waiting/waiting.wav'),
 PosixPath('laser-machine-listener/data/cutting_not_in_focus/paper_cutting_not_in_focus.wav'),
 PosixPath('laser-machine-listener/data/cutting_not_in_focus/mdf_cutting_not_in_focus.wav'),
 PosixPath('laser-machine-listener/data/cutting_not_in_focus/acrylic_cutting_not_in_focus.wav'),
 PosixPath('laser-machine-listener/data/cutting_in_focus/acrylic_cutting_in_focus.wav'),
 PosixPath('laser-machine-listener/data/cutting_in_focus/mdf_cutting_in_focus.wav'),
 PosixPath('laser-machine-listener/data/cutting_in_focus/paper_cutting_in_focus.wav')]
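
Because the class label is simply the parent folder name, the files per class can be counted directly from the paths. The snippet below is a minimal sketch using only the standard library; `count_files_per_class` and `example_files` are hypothetical names for illustration, not part of `lib_data_preprocess`.

```python
from collections import Counter
from pathlib import PurePosixPath

def count_files_per_class(paths):
    """Count files per class, taking the parent folder name as the class label."""
    return Counter(PurePosixPath(str(p)).parent.name for p in paths)

# A few of the paths listed above, written as plain strings for illustration.
example_files = [
    'laser-machine-listener/data/sleeping/sleeping.wav',
    'laser-machine-listener/data/marking/acrylic_marking.wav',
    'laser-machine-listener/data/marking/paper_marking.wav',
    'laser-machine-listener/data/marking/mdf_marking.wav',
]
counts = count_files_per_class(example_files)
print(dict(counts))
```

In the notebook, passing `data_files` instead of `example_files` should work the same way, since `PosixPath` objects convert cleanly to strings.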

## Preprocess

The `AudioDataPreprocessor` class reads a `classname/filename.wav` style dataset, preprocesses it, splits it into train/valid sets, and finally writes the results to files.

Preprocessing converts each audio file into a log mel-spectrogram, then generates many training samples from that one long log mel-spectrogram by random cropping. This is done for every original audio file.

In this case there are only 12 original audio samples, from which 480 training and 120 validation samples are created.

- Each sample is 1 second long, as defined by `conf.duration` in `config.py`.
- Each sample is a three-dimensional array of shape [64 n_mels, 64 time hops, 1 channel].
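
The sample shape and the random cropping described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the code inside `AudioDataPreprocessor`: it assumes centered STFT frames (so a clip of `n` audio samples yields `1 + n // hop_length` frames), uses random noise in place of a real log mel-spectrogram, and `random_crops` is a hypothetical helper.

```python
import numpy as np

# Values taken from the conf printed at the top of this notebook.
SAMPLING_RATE = 16000
DURATION = 1
HOP_LENGTH = 253
N_MELS = 64

# Assuming centered frames, a 1-second clip gives 1 + 16000 // 253 = 64 frames.
n_frames = 1 + (SAMPLING_RATE * DURATION) // HOP_LENGTH

def random_crops(spec, width, n_crops, rng):
    """Cut n_crops windows of `width` frames at random offsets from a (n_mels, T) array."""
    max_start = spec.shape[1] - width
    starts = rng.integers(0, max_start + 1, size=n_crops)
    # Add a trailing channel axis so each crop is (n_mels, width, 1).
    return np.stack([spec[:, s:s + width, np.newaxis] for s in starts])

rng = np.random.default_rng(42)
# Stand-in for one long log mel-spectrogram (random values, not real audio).
long_spec = rng.standard_normal((N_MELS, 3668)).astype(np.float32)
crops = random_crops(long_spec, n_frames, n_crops=10, rng=rng)
print(n_frames, crops.shape)  # 64 (10, 64, 64, 1)
```

The number of crops per file here is arbitrary; the real preprocessor decides how many samples to draw from each recording.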

### This preprocessing intentionally leaks data

Normally the train/valid split must be kept clean: validation data is supposed to be unseen.
This dataset is a special case, because every class has only one recording per sound variation.

Some classes have 3 recordings, but each one comes from a different variation (acrylic, MDF, or paper).
An exclusive split therefore has no way to place samples from the same original audio in both the train and valid sets.
The following call shuffles the data to make that happen:

```Python
dm.resuffle_train_valid()
```

Normally, the clean version without the reshuffle would be fine:

```Python
dm = AudioDataPreprocessor(conf)
dm.convert_by_exclusive_split(data_files)
# dm.resuffle_train_valid()
dm.write_all()
```

In [3]:
dm = AudioDataPreprocessor(conf)
dm.convert_by_exclusive_split(data_files)
dm.resuffle_train_valid()
dm.write_all()

[background] has 1 files.
train_valid_split [(64, 3668)] [(64, 917)]
[cutting_in_focus] has 3 files.
train_valid_split [(64, 17622), (64, 6522)] [(64, 7177)]
[cutting_not_in_focus] has 3 files.
train_valid_split [(64, 7152), (64, 6526)] [(64, 17825)]
[marking] has 3 files.
train_valid_split [(64, 7108), (64, 7096)] [(64, 5248)]
[sleeping] has 1 files.
train_valid_split [(64, 3478)] [(64, 869)]
[waiting] has 1 files.
train_valid_split [(64, 2905)] [(64, 726)]
Set labels to config.py ['background', 'cutting_in_focus', 'cutting_not_in_focus', 'marking', 'sleeping', 'waiting']
Re-shuffled train/valid split with test size = 0.2.
Train set 480 samples.
Valid set 120 samples.
Wrote preprocessed training data.
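
The effect of the reshuffle reported in the log above can be sketched as follows: pool the exclusive train and valid sets, permute them, and split again by `test_size`. This is an assumption about the general idea, not the actual implementation of `resuffle_train_valid()`, which may differ (for example by stratifying on labels). `reshuffle_split` and the dummy arrays are hypothetical.

```python
import numpy as np

def reshuffle_split(X_train, y_train, X_valid, y_valid, test_size=0.2, seed=42):
    """Pool both sets, shuffle them together, and re-split by test_size."""
    X = np.concatenate([X_train, X_valid])
    y = np.concatenate([y_train, y_valid])
    order = np.random.default_rng(seed).permutation(len(X))
    n_valid = int(round(len(X) * test_size))
    valid_idx, train_idx = order[:n_valid], order[n_valid:]
    return X[train_idx], y[train_idx], X[valid_idx], y[valid_idx]

# Dummy data with small feature dimensions to keep the example light.
X_a = np.zeros((480, 4, 4, 1), dtype=np.float32)
X_b = np.ones((120, 4, 4, 1), dtype=np.float32)
y_a = np.arange(480) % 6
y_b = np.arange(120) % 6

X_tr, y_tr, X_va, y_va = reshuffle_split(X_a, y_a, X_b, y_b)
print(X_tr.shape, X_va.shape)  # (480, 4, 4, 1) (120, 4, 4, 1)
```

After such a shuffle, crops from the same recording can appear on both sides of the split, which is exactly the intentional leak described in this section.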


### Shapes for all X & y

In [5]:
[a.shape for a in [dm.X_train, dm.X_valid, dm.y_train, dm.y_valid]]

[(480, 64, 64, 1), (120, 64, 64, 1), (480,), (120,)]

### idx_train & idx_valid

These are used for the FSDKaggle2018 dataset and have no effect on this dataset.

In [7]:
dm.y_valid, np.load(dm.f('idx_valid.npy'))

(array([1, 5, 4, 0, 2, 3, 0, 5, 0, 4, 0, 1, 1, 5, 4, 2, 0, 2, 2, 1, 0, 5,
        2, 5, 4, 2, 0, 1, 3, 5, 5, 2, 1, 3, 2, 5, 1, 1, 5, 3, 4, 0, 1, 4,
        0, 2, 5, 1, 0, 0, 3, 3, 1, 3, 5, 4, 2, 2, 1, 2, 0, 2, 4, 2, 3, 0,
        4, 5, 1, 3, 3, 0, 5, 4, 5, 2, 3, 2, 3, 5, 3, 5, 5, 4, 0, 0, 1, 2,
        4, 2, 2, 3, 0, 3, 3, 3, 2, 0, 5, 1, 1, 2, 2, 4, 0, 5, 1, 3, 3, 5,
        2, 5, 4, 5, 4, 4, 4, 1, 0, 5]),
 array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
         13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
         26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
         39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
         52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
         65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
         78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
         91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,
      

In [8]:
dm.y_train

array([1, 0, 4, 2, 2, 1, 4, 3, 5, 5, 1, 5, 2, 1, 5, 2, 0, 1, 3, 0, 0, 2,
       1, 1, 0, 1, 4, 5, 0, 0, 5, 2, 4, 4, 2, 3, 0, 5, 1, 1, 4, 2, 0, 5,
       4, 1, 3, 3, 3, 4, 3, 3, 2, 5, 2, 0, 0, 0, 1, 3, 1, 3, 5, 5, 5, 4,
       4, 5, 4, 3, 0, 1, 0, 2, 5, 3, 5, 4, 1, 1, 1, 5, 5, 2, 2, 3, 3, 3,
       3, 5, 3, 3, 3, 1, 2, 1, 4, 0, 2, 0, 2, 5, 0, 3, 4, 5, 4, 3, 0, 4,
       0, 5, 2, 0, 3, 0, 5, 3, 5, 3, 4, 4, 5, 1, 1, 3, 1, 5, 2, 0, 1, 0,
       2, 3, 0, 0, 4, 2, 0, 3, 3, 3, 4, 2, 3, 2, 3, 1, 2, 4, 4, 3, 0, 0,
       4, 4, 5, 0, 0, 4, 0, 0, 0, 5, 4, 5, 3, 2, 0, 5, 3, 0, 2, 0, 1, 2,
       4, 2, 1, 4, 2, 1, 4, 1, 0, 3, 2, 4, 2, 2, 1, 1, 4, 0, 4, 1, 4, 0,
       5, 4, 1, 4, 5, 0, 2, 4, 4, 4, 5, 0, 4, 0, 1, 3, 3, 3, 0, 1, 4, 4,
       3, 2, 3, 1, 3, 1, 2, 0, 5, 3, 2, 4, 1, 5, 2, 1, 5, 4, 4, 1, 1, 1,
       1, 1, 2, 5, 1, 4, 3, 3, 5, 1, 2, 2, 4, 3, 1, 5, 5, 0, 2, 4, 1, 5,
       2, 4, 0, 2, 1, 3, 4, 5, 3, 4, 4, 5, 4, 5, 3, 1, 0, 5, 1, 2, 5, 5,
       5, 4, 3, 1, 5, 2, 3, 3, 2, 3, 1, 5, 2, 4, 3,