# Sound Demo 1
This demo illustrates how to train a binary and a multilabel sound classifier using your microphone.

#### run global setup

In [None]:
try:
    with open("../global_setup.py") as setupfile:
        exec(setupfile.read())
except FileNotFoundError:
    print('Setup already completed')

#### run local setup

In [None]:
from notebooks.experiments.src.sound_demos.live_predictions import LivePredictions, run_livepred
from notebooks.experiments.src.sound_demos.multilabel_classifier import Recorder, SoundClassifier
from src.audio.mini_recorder import miniRecorder

# Binary classifier
A lot of existing models for training sound recognition...  
Use a pretrained model to construct a binary sound classifier...  
First we create our own dataset as the basis for training. The cell below starts a recording process in which you first record $n$ examples of the first class, followed by $n$ examples of the second class. As soon as one recording finishes, the next one starts.  
Things to keep in mind:
- You can either make the class 0 sound continuously throughout each class 0 recording, or fit exactly one instance of the sound into each recording. Whichever you choose, do the same for the second class. What do you think will happen if there is a lot of silence in the class 0 recordings, but not in the class 1 recordings?
- How many examples do we need in each class in order to get a good classifier?
- Will the length of the recorded examples affect performance?
- How will background noise affect performance?
- What happens, and what should happen, at test time if sounds from both classes are present in a recording?
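
To explore the silence question above on your own recordings, it can help to measure how much of a clip is actually silent. The following is a minimal sketch, not part of the demo code: the helper `silent_fraction`, the 20 ms frame length, and the RMS threshold of 0.01 are all assumptions chosen for illustration, and the clip is assumed to be a 1-D array of samples scaled to roughly [-1, 1].

```python
import numpy as np


def silent_fraction(clip, sample_rate=16000, frame_ms=20, threshold=0.01):
    """Return the fraction of frames whose RMS amplitude is below `threshold`.

    Hypothetical helper for illustration; the threshold is an assumption and
    depends on your microphone and on how the samples are scaled.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(clip) // frame_len
    if n_frames == 0:
        raise ValueError("clip is shorter than one frame")
    # Drop the incomplete trailing frame and split the clip into frames
    samples = np.asarray(clip, dtype=np.float64)[: n_frames * frame_len]
    frames = samples.reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return float(np.mean(rms < threshold))


# Synthetic example: 1 s of silence followed by 1 s of a 440 Hz tone
sr = 16000
t = np.arange(sr) / sr
clip = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t)])
print(silent_fraction(clip, sample_rate=sr))
```

If the silent fraction differs a lot between the two classes, the model may learn to separate "mostly silence" from "mostly sound" rather than the two sounds themselves.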

### Create dataset

In [None]:
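# wav_dir is an absolute path on the original author's machine; change it to a directory on yours.
recorder = Recorder(n_classes=2, n_files=12, prefix='binary', wav_dir='/Users/nbip/proj/dtu/dtu-bach/dev/DataScienceVM/Audio/nbip_sounds3')
# Record 2-second clips for each class in turn, then build the dataset from the recordings.
recorder.record(seconds=2)
data = recorder.create_dataset()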

Or, if you have already recorded a dataset, skip the recording step and use this instead:

In [None]:
recorder = Recorder(n_classes=2, n_files=12, prefix='binary', wav_dir='/Users/nbip/proj/dtu/dtu-bach/dev/DataScienceVM/Audio/nbip_sounds3')
data = recorder.create_dataset()

### Train the binary classifier

In [None]:
binary_classifier = SoundClassifier(mix=False)
binary_classifier.train(data=data)
binary_classifier.plot_training()

### Test the trained model
Now we have a model that is trained to discriminate between two sounds. Try recording a sound from one of the classes (or something completely different) and see how the model classifies it.

In [None]:
rec = miniRecorder(seconds=1.5)
_ = rec.record()

In [None]:
binary_classifier.predict(sound_clip=rec.sound)

In [None]:
rec.playback()

### Live Predictions
Let us use the live spectrogram to visualize the sound input to the computer microphone continuously and get running predictions from the model.
What happens?
- Does your binary classifier work?
- Does the model predict one of the classes even when there is silence / background noise? Why? Do you have any ideas how to mitigate this?
- What happens if you produce sound from both classes at the same time? What should ideally happen? 

In [None]:
run_livepred(predictor=binary_classifier)

### Between class examples
In order to help the model when sounds from both classes are present we can do some data augmentation:
- Augment the training data by mixing two sounds $x_1$ and $x_2$ from different classes at a random ratio $r \in [0, 1]$, so that $x_{\text{mix}} = r x_1 + (1-r) x_2$
- The label of the mixed sample is no longer one-hot; it becomes $[r, 1-r]$

You can do this by changing `mix=False` to `mix=True` in the `SoundClassifier` definition above and training again.
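
As an illustration of the mixing rule above, here is a minimal NumPy sketch. It is not taken from `SoundClassifier`, whose internals are not shown in this notebook: the function name `mix_between_classes`, the requirement that both clips have the same length, and drawing $r$ uniformly from $[0, 1]$ are assumptions made for this example.

```python
import numpy as np


def mix_between_classes(x1, x2, rng=None):
    """Mix a class 0 clip `x1` with a class 1 clip `x2` at a random ratio r.

    Hypothetical helper for illustration. Returns the mixed clip, the soft
    label [r, 1 - r], and the ratio r itself.
    """
    x1 = np.asarray(x1, dtype=np.float64)
    x2 = np.asarray(x2, dtype=np.float64)
    if x1.shape != x2.shape:
        raise ValueError("both clips must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    r = float(rng.uniform(0.0, 1.0))  # assumption: r ~ Uniform(0, 1)
    mixed = r * x1 + (1.0 - r) * x2
    label = np.array([r, 1.0 - r])
    return mixed, label, r


# Synthetic example: two 1-second tones standing in for the two classes
sr = 16000
t = np.arange(sr) / sr
tone_a = np.sin(2 * np.pi * 440 * t)   # stand-in for a class 0 sound
tone_b = np.sin(2 * np.pi * 880 * t)   # stand-in for a class 1 sound
mixed, label, r = mix_between_classes(tone_a, tone_b, rng=np.random.default_rng(0))
print(label, label.sum())
```

Because the two label entries always sum to 1, the mixed label can be used directly as a target for a softmax output trained with cross-entropy.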

# Multilabel classifier
A recording can of course contain sounds from any number of classes at once. We now train a multilabel classifier to identify which classes are present. Again, you create the dataset yourself, so in the interest of time go for 3 or 4 classes rather than 50.
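
To make the difference from the binary case concrete, the sketch below shows one common way to set up a multilabel problem: a multi-hot target vector with one entry per class, and an independent sigmoid with a threshold per class at prediction time. Whether `SoundClassifier` works this way is not stated in this notebook; the helpers `multi_hot` and `predict_present` and the 0.5 threshold are assumptions for illustration only.

```python
import numpy as np


def multi_hot(present_classes, n_classes):
    """Build a target vector with 1.0 for every class present in a clip."""
    target = np.zeros(n_classes, dtype=np.float32)
    idx = np.array(sorted(present_classes), dtype=int)
    target[idx] = 1.0
    return target


def predict_present(logits, threshold=0.5):
    """Return the indices of classes whose sigmoid probability >= threshold.

    Each class is decided independently, so zero, one, or several classes
    can be reported for the same clip.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=np.float64)))
    return np.flatnonzero(probs >= threshold).tolist()


# A clip containing classes 0 and 2 out of 4
print(multi_hot({0, 2}, n_classes=4).tolist())

# Made-up logits for one clip: classes 0 and 2 score above the threshold
print(predict_present([2.0, -3.0, 0.5, -1.0]))
```

Unlike a softmax, the per-class probabilities here do not have to sum to 1, so the model can also report that no class is present.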

### Create dataset

In [None]:
recorder = Recorder(n_classes=4, n_files=12, prefix='multi', wav_dir='/Users/nbip/proj/dtu/dtu-bach/dev/DataScienceVM/Audio/nbip_sounds3')
recorder.record(seconds=2)
data = recorder.create_dataset()

Or, if you have already recorded a dataset:

In [None]:
recorder = Recorder(n_classes=4, n_files=12, prefix='multi', wav_dir='/Users/nbip/proj/dtu/dtu-bach/dev/DataScienceVM/Audio/nbip_sounds3')
data = recorder.create_dataset()

### Train the multilabel classifier

In [None]:
multi_classifier = SoundClassifier(mix=False)
multi_classifier.train(data=data)
multi_classifier.plot_training()

In [None]:
run_livepred(predictor=multi_classifier)