# M5 Model

This notebook **heavily** references the tutorial: *Speech Command Classification with torchaudio*.
- [Tutorial](https://pytorch.org/tutorials/intermediate/speech_command_classification_with_torchaudio_tutorial.html)

## Derivations
We simplified the tutorial by:
- Wrapping the PyTorch model in PyTorch Lightning, which abstracts away the low-level training and validation loops
- Factoring the M5 model's repeated layers into reusable blocks to avoid redundant code (see `cnn_m5.py`)
- Limiting the number of batches through the PyTorch Lightning `Trainer` so that the notebook runs faster
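
To make the batch limit concrete, here is a minimal sketch of how many batches an epoch would process under a `limit_*_batches` setting. In PyTorch Lightning, an integer limit caps the batch count and a float limit keeps that fraction of the batches. The helper `batches_run` and the split sizes are hypothetical, chosen only for illustration, and the sketch assumes the data loader keeps the last partial batch.

```python
import math


def batches_run(num_samples, batch_size, limit):
    """Batches processed per epoch under a Lightning-style limit.

    An int limit caps the batch count; a float limit keeps that fraction.
    Assumes the DataLoader keeps the last partial batch (drop_last=False).
    """
    total = math.ceil(num_samples / batch_size)
    if isinstance(limit, float):
        return int(total * limit)
    return min(total, limit)


# Hypothetical split sizes, for illustration only.
print(batches_run(100_000, 256, 192))  # 192: the cap applies
print(batches_run(10_000, 256, 192))   # 40: the split is smaller than the cap
print(batches_run(10_000, 256, 0.5))   # 20: half of the 40 batches
```

A cap only shortens an epoch when the split holds more batches than the limit, so it is worth comparing the limit against the actual split size.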

## M5 Model Reference
[Research Paper](https://arxiv.org/pdf/1610.00087.pdf)
Dai, Wei, et al. "Very deep convolutional neural networks for raw waveforms." 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, 2017.
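
As a rough cross-check of the model size, the sketch below counts the trainable parameters and the final time-axis length by hand. The layer layout is an assumption taken from the torchaudio tutorial's M5 (four unpadded `Conv1d` layers, each followed by batch normalization and max pooling of width 4, then one linear layer); it is not read from `cnn_m5.py`. The 35 classes and the 8000-sample input are likewise assumptions based on the tutorial's setup.

```python
# Layer layout assumed from the torchaudio tutorial's M5 (not read from cnn_m5.py):
# (in_channels, out_channels, kernel_size, stride) for each Conv1d,
# each followed by BatchNorm1d and MaxPool1d(4).
CONV_LAYERS = [(1, 32, 80, 16), (32, 32, 3, 1), (32, 64, 3, 1), (64, 64, 3, 1)]
POOL = 4


def count_params(n_classes):
    """Trainable parameters: convolutions, batch norms, and the final linear layer."""
    total = 0
    for c_in, c_out, k, _ in CONV_LAYERS:
        total += c_in * c_out * k + c_out  # conv weights + biases
        total += 2 * c_out                 # batch-norm scale + shift
    total += CONV_LAYERS[-1][1] * n_classes + n_classes  # linear weights + biases
    return total


def output_length(n_samples):
    """Length of the time axis after the last pooling layer."""
    length = n_samples
    for _, _, k, stride in CONV_LAYERS:
        length = (length - k) // stride + 1  # unpadded convolution
        length = length // POOL              # max pooling
    return length


print(count_params(35))     # 26915
print(output_length(8000))  # 1
```

Under these assumptions the count comes to 26,915, which agrees with the 26.9 K reported in the model summary of the training output.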

## Results
With the settings below, the model reaches 77% accuracy after 5 epochs; see TensorBoard for the training history.
The model has not yet started to overfit; training was cut short because it takes a long time.

In [1]:
import pytorch_lightning as pl
import torch

from src.model.cnn_m5 import CNN_M5
from src.model.lit_wrapper import LitWrapper
from src.speech_command_dataset import SpeechCommandDataset

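# Report which backend PyTorch detects. `device` is informational only:
# it is not passed to the Trainer below, which selects its own accelerator.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using {device} Backend!")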

ds = SpeechCommandDataset(batch_size=256, num_workers=4)  #, dl_kwargs={'pin_memory': True})
train_dl, val_dl, test_dl = ds.dls()

model = LitWrapper(CNN_M5(len(ds.classes)), ds.classes, lr=0.01)

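trainer = pl.Trainer(
    default_root_dir="cnn_m5/",  # logs and checkpoints are written here
    max_epochs=5,
    limit_val_batches=192,       # cap on validation batches per epoch
    limit_predict_batches=1,     # predict on a single batch only
    # profiler='simple',
    # fast_dev_run=True,
)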

trainer.fit(model, train_dataloaders=train_dl, val_dataloaders=val_dl)
# pred = trainer.predict(model, dataloaders=test_dl)

Using cuda Backend!


GPU available: True (cuda), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn(
`Trainer(limit_predict_batches=1)` was configured so 1 batch will be used.

  | Name  | Type   | Params
---------------------------------
0 | model | CNN_M5 | 26.9 K
---------------------------------
26.9 K    Trainable params
0         Non-trainable params
26.9 K    Total params
0.108     Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

  rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown...")
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x000001A8866BBDC0>
Traceback (most recent call last):
  File "C:\Users\johnc\anaconda3\envs\snn_voice\lib\site-packages\torch\utils\data\dataloader.py", line 1466, in __del__
    self._shutdown_workers()
  File "C:\Users\johnc\anaconda3\envs\snn_voice\lib\site-packages\torch\utils\data\dataloader.py", line 1424, in _shutdown_workers
    if self._persistent_workers or self._workers_status[worker_id]:
AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute '_workers_status'
