# Overview

## Goals

- Train a deep neural network!
- On a non-toy problem



### Bonus Goals
- Get `> 80%` accuracy, or
- Explain `0.687`

## Warnings

- Not much theory or explanation
- Requires motivation and inquisitiveness
- Reading and running someone else's code (aka Programming)

### Worse

Reading and running a researcher's code

## Tools

+ `docker`, `conda` & `git` - for reproducible setups
    * hopefully, without breaking your computer
+ `python3` - programming language of choice
+ `jupyter` - reproducible workflows
+ `keras` - frontend for deep learning
+ `tensorflow` - backend for ML, esp. deep learning
+ `numpy` - non-ML numerical processing 
+ `librosa` - audio processing
+ [`audioset`](https://research.google.com/audioset/) - for embeddings (more later)

### Programming

Transforming data from one form to another

```
INPUTS ---magic---> OUTPUTS
```

where:

`magic` is a very specific set of instructions

## Audio Segmentation


- `INPUT` - audio
- `OUTPUT` - frame-wise labels
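
To make "frame-wise labels" concrete, here is a minimal sketch of how a sequence of per-frame labels could be collapsed into time segments. The function name `frames_to_segments`, the label names and the 1-second hop are all assumptions made for illustration; they do not come from the workshop code.

```python
from itertools import groupby


def frames_to_segments(labels, hop_seconds):
    """Collapse a frame-wise label sequence into (start, end, label) segments.

    `labels` holds one label per frame; `hop_seconds` is the time between
    consecutive frames (hypothetical parameter name).
    """
    segments = []
    start = 0  # index of the first frame of the current run
    for label, run in groupby(labels):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append((start * hop_seconds, (start + length) * hop_seconds, label))
        start += length
    return segments


# Illustrative labels only; the real label set depends on the dataset.
frame_labels = ["speech", "speech", "music", "music", "music", "speech"]
print(frames_to_segments(frame_labels, hop_seconds=1.0))
# [(0.0, 2.0, 'speech'), (2.0, 5.0, 'music'), (5.0, 6.0, 'speech')]
```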

### `magic`

- Metric based
- Model based (today)

## Model Based Audio Segmentation

- Acquire and Prepare Training Data
    + `TRN`, `VAL`, `TST`
    + Keep `TST` and `VAL` together at ~30% of the total
- Train a suitable model with `TRN`, regularly validating with `VAL`
    + Classifier, Regressor, etc.
    + Algorithm specific preprocessing might be necessary
- Test on `TST` and in your final application scenario
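
The splitting step above can be sketched as follows. This is only an illustration, assuming an equal 15% / 15% division of the ~30% between `VAL` and `TST` and a hypothetical helper name `split_indices`; the actual notebook may split differently.

```python
import numpy as np


def split_indices(n_items, val_fraction=0.15, tst_fraction=0.15, seed=0):
    """Shuffle item indices and split them into TRN, VAL and TST.

    The 15% / 15% defaults are an assumption; the text only says that
    VAL and TST together should be about 30% of the total.
    """
    rng = np.random.default_rng(seed)  # fixed seed for a reproducible split
    order = rng.permutation(n_items)
    n_val = int(round(n_items * val_fraction))
    n_tst = int(round(n_items * tst_fraction))
    val = order[:n_val]
    tst = order[n_val:n_val + n_tst]
    trn = order[n_val + n_tst:]
    return trn, val, tst


trn, val, tst = split_indices(200)
print(len(trn), len(val), len(tst))  # 140 30 30
```

For audio it is usually safer to split by recording rather than by frame, so that frames from one file do not end up in both `TRN` and `TST`.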

## Acquire and Prepare Data

`01-data-acquisition-analysis-splitting.ipynb`

## Preprocessing 

### Feature Extraction

Extract [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset) embeddings for each audio file in the splits and save a separate `tfrecord` file for each split.
+ Load audio as a 16 kHz, normalized, single-channel array
+ Extract log-mel spectrograms on Hann windows of 0.025 s, every 0.010 s
+ Prepare frames of size 0.96 s for VGGish, every 0.12 s
+ Get 128-dimensional embeddings every 0.12 s from VGGish
+ Relevant files: 
    * `02-feature-extraction-vggish_embedding.py`
    * `feat_ext.py` - for reusable functions/classes
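
The framing arithmetic in the steps above can be sketched with plain `numpy`. This is not the code from `feat_ext.py`; the helper `frame_examples`, the constant names, the 64 mel bands and the all-zero placeholder input are assumptions for illustration. Only the 0.010 s, 0.96 s and 0.12 s values come from the text.

```python
import numpy as np

STFT_HOP_SECONDS = 0.010       # one log-mel frame every 0.010 s
EXAMPLE_WINDOW_SECONDS = 0.96  # length of one VGGish input example
EXAMPLE_HOP_SECONDS = 0.12     # step between consecutive examples


def frame_examples(log_mel, window_len, hop_len):
    """Slice a (n_frames, n_mels) array into overlapping examples.

    Returns an array of shape (n_examples, window_len, n_mels); trailing
    frames that do not fill a whole window are dropped.
    """
    n_frames = log_mel.shape[0]
    if n_frames < window_len:
        return np.empty((0, window_len) + log_mel.shape[1:], dtype=log_mel.dtype)
    n_examples = 1 + (n_frames - window_len) // hop_len
    starts = hop_len * np.arange(n_examples)
    return np.stack([log_mel[s:s + window_len] for s in starts])


window_len = int(round(EXAMPLE_WINDOW_SECONDS / STFT_HOP_SECONDS))  # 96 frames
hop_len = int(round(EXAMPLE_HOP_SECONDS / STFT_HOP_SECONDS))        # 12 frames

# Placeholder input: 1000 log-mel frames (about 10 s) with 64 mel bands (assumed).
log_mel = np.zeros((1000, 64), dtype=np.float32)
examples = frame_examples(log_mel, window_len, hop_len)
print(examples.shape)  # (76, 96, 64)
```

Each of the 76 examples would then be passed through VGGish to obtain one 128-dimensional embedding.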

## Supervised Training Workflow

What we will be doing today

[picture](https://goo.gl/images/J4V6mu)

- *Make sure your computer won't run out of power*
- Duplicate the base notebook `03-training-00-keras_2mlp_clsw0.ipynb`
    - Rename to `03-training-<nn>-keras_<model_info>.ipynb`
    - Modify and train a [Multi-Layer Perceptron (MLP)](https://www.google.de/search?q=multilayer+perceptron)
    - Evaluate on the validation split
    - Save learned weights
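
As a rough idea of what the train / evaluate / save loop might look like in `keras`, here is a minimal sketch. It is not the contents of the base notebook: the layer sizes, the two output classes, the random placeholder data and the weights file name are all assumptions. Only the 128-dimensional input matches the embeddings described earlier.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras


def build_mlp(input_dim=128, hidden_units=64, n_classes=2):
    """Small MLP classifier; all sizes here are illustrative assumptions."""
    model = keras.Sequential([
        keras.Input(shape=(input_dim,)),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


# Random placeholder data standing in for the TRN and VAL embeddings.
rng = np.random.default_rng(0)
x_trn = rng.normal(size=(256, 128)).astype("float32")
y_trn = rng.integers(0, 2, size=256)
x_val = rng.normal(size=(64, 128)).astype("float32")
y_val = rng.integers(0, 2, size=64)

model = build_mlp()
model.fit(x_trn, y_trn, validation_data=(x_val, y_val),
          epochs=2, batch_size=32, verbose=0)

# Evaluate on the validation split; returns [loss, accuracy].
val_loss, val_acc = model.evaluate(x_val, y_val, verbose=0)
print(f"validation accuracy: {val_acc:.3f}")

# Save only the learned weights (hypothetical file name).
weights_path = os.path.join(tempfile.gettempdir(), "mlp_sketch.weights.h5")
model.save_weights(weights_path)
```

With random labels the validation accuracy should hover around chance, which makes this a quick check that the pipeline runs before plugging in the real embeddings.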