# Summary of the NN model from the `SMD anomaly detection` paper

Implementation is [here](https://github.com/DongYuls/SMD_Anomaly_Detection) and paper is [here](https://www.mdpi.com/1424-8220/18/5/1308).  
The problem being solved is anomaly detection based on environmental noise. In particular, the authors use the sound produced by a Surface-Mounted Device (SMD) machine.  
The authors propose the following approach. Sound is hard to work with in the time domain, and moving to the frequency domain can help. The input to the anomaly detection model is therefore a spectrogram, in other words an image. The authors apply the STFT (short-time Fourier transform) with a window size of 2048 points and a stride of 512 points.  
> Sound time series -> STFT -> Spectrogram -> Image -> CNN-based autoencoder.
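
The first two stages of this pipeline can be sketched with NumPy. This is a minimal illustration rather than the paper's preprocessing code: the window size and stride come from the text above, while the Hann window, the 16 kHz sample rate, the synthetic 440 Hz tone and the function name `stft_magnitude` are assumptions made for the example.

```python
import numpy as np


def stft_magnitude(signal, window_size=2048, stride=512):
    """Return a magnitude spectrogram of shape (frequency bins, frames)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < window_size:
        raise ValueError("signal is shorter than one window")
    window = np.hanning(window_size)  # assumed window function
    n_frames = 1 + (signal.size - window_size) // stride
    # Slice overlapping frames and taper each one with the window.
    frames = np.stack([
        signal[i * stride:i * stride + window_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps the window_size // 2 + 1 non-negative frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
spectrogram = stft_magnitude(np.sin(2 * np.pi * 440.0 * t))
print(spectrogram.shape)  # (frequency bins, frames)
```

With a 2048-point window the transform yields 1025 frequency bins per frame. The notebook output further down lists a model input of `(1024, 32, 1)`; how the spectrogram is cropped or resized to that shape is not covered by this sketch.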

**Model summary**:  
(batch size = 1, FLOPs = approximate number of Multiply-Add operations; values taken from the estimator output at the end of this notebook)
1. Forward:   6.68 gFLOPs  
2. Backward: 13.35 gFLOPs
3. Training: 20.03 gFLOPs
4. Number of parameters: 4,812,288 (~ 19.2 MB in FP32)

The reason for this complexity is the deep CNN autoencoder (some layers are not shown):
![SMD anomaly detection model (some layers are not shown)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982511/bin/sensors-18-01308-g005.jpg)

In [1]:
from nns.nns import (estimate, printable_dataframe)
from nns.models.anomaly_detection import SMDAnomalyDetection

In [2]:
perf = []
estimate(SMDAnomalyDetection(), perf, perf)

Layer not recognized (type=<class 'tensorflow.python.keras.engine.input_layer.InputLayer'>, name=input)


| | name | out_shape | gFLOPs | num_params | num_activations | params_mem (MB) | activations_mem (MB) |
|---|---|---|---|---|---|---|---|
| 0 | input | (1024, 32, 1) | 0.0 | 0 | 32768 | 0.0 | 0.131072 |
| 1 | enc/conv01/conv | (512, 32, 64) | 0.026214 | 1600 | 1048576 | 0.0064 | 4.194304 |
| 2 | enc/conv01/bn | (512, 32, 64) | 0.0 | 128 | 1048576 | 0.000512 | 4.194304 |
| 3 | enc/conv01/relu | (512, 32, 64) | 0.0 | 0 | 1048576 | 0.0 | 4.194304 |
| 4 | enc/conv02/conv | (256, 32, 64) | 0.838861 | 102400 | 524288 | 0.4096 | 2.097152 |
| 5 | enc/conv02/bn | (256, 32, 64) | 0.0 | 128 | 524288 | 0.000512 | 2.097152 |
| 6 | enc/conv02/relu | (256, 32, 64) | 0.0 | 0 | 524288 | 0.0 | 2.097152 |
| 7 | enc/conv03/conv | (128, 32, 96) | 0.629146 | 153600 | 393216 | 0.6144 | 1.572864 |
| 8 | enc/conv03/bn | (128, 32, 96) | 0.0 | 192 | 393216 | 0.000768 | 1.572864 |
| 9 | enc/conv03/relu | (128, 32, 96) | 0.0 | 0 | 393216 | 0.0 | 1.572864 |
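
The convolution rows above can be reproduced with a few lines of arithmetic. The sketch below is not the `nns` estimator itself. It assumes 5x5 kernels without bias terms, which is inferred from the parameter counts (for example 1600 = 5 * 5 * 1 * 64) and is not stated in this notebook, and it uses 4 bytes per FP32 value with 1 MB = 10^6 bytes. The helper name `conv2d_cost` is hypothetical.

```python
def conv2d_cost(out_shape, in_channels, kernel=(5, 5)):
    """Estimate the cost of one bias-free 2D convolution at batch size 1."""
    out_h, out_w, out_channels = out_shape
    # Multiply-adds needed to produce a single output value.
    weights_per_filter = kernel[0] * kernel[1] * in_channels
    num_params = weights_per_filter * out_channels
    num_activations = out_h * out_w * out_channels
    multiply_adds = num_activations * weights_per_filter
    return {
        "gFLOPs": multiply_adds / 1e9,
        "num_params": num_params,
        "num_activations": num_activations,
        "params_mem (MB)": num_params * 4 / 1e6,            # FP32
        "activations_mem (MB)": num_activations * 4 / 1e6,  # FP32
    }


# Output shapes and input channel counts taken from the rows above.
conv01 = conv2d_cost((512, 32, 64), in_channels=1)
conv02 = conv2d_cost((256, 32, 64), in_channels=64)
conv03 = conv2d_cost((128, 32, 96), in_channels=64)

for name, cost in [("conv01", conv01), ("conv02", conv02), ("conv03", conv03)]:
    print(name, cost["num_params"], round(cost["gFLOPs"], 6))
```

Under these assumptions the computed `num_params` and `gFLOPs` agree with the `enc/conv01/conv`, `enc/conv02/conv` and `enc/conv03/conv` rows. The batch-normalization rows list two parameters per channel (128 = 2 * 64), and both `bn` and `relu` are counted as zero multiply-adds.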


In [3]:
printable_dataframe(perf, ignore_phase=False)

| | Model | Phase | Input shape | #Parameters | Model size (MB) FP32 | GFLOPs (multiply-add) | Activation size (MB) FP32 |
|---|---|---|---|---|---|---|---|
| 0 | SMDAnomalyDetection | inference | (1024, 32, 1) | 4812288 | 19.249152 | 6.676476 | 56.425216 |
| 1 | SMDAnomalyDetection | training | (1024, 32, 1) | 4812288 | 19.249152 | 20.029428 | 112.850432 |
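
The totals in this output can be cross-checked by hand. The sketch below only restates the numbers printed above and assumes FP32 storage (4 bytes per value). The factor of three for training follows the forward/backward split used in the model summary (backward is about twice the forward cost). The factor of two for training activations is what the table shows; reading it as one stored gradient per activation is an assumption.

```python
num_params = 4_812_288
bytes_per_value = 4  # FP32

# The same byte count expressed in decimal megabytes and binary mebibytes.
model_size_mb = num_params * bytes_per_value / 1e6
model_size_mib = num_params * bytes_per_value / 2**20

# Training = forward + backward, with backward taken as 2x forward.
inference_gflops = 6.676476
training_gflops = 3 * inference_gflops

# The training row reports twice the inference activation memory.
inference_activations_mb = 56.425216
training_activations_mb = 2 * inference_activations_mb

print(f"model size: {model_size_mb:.6f} MB ({model_size_mib:.2f} MiB)")
print(f"training:   {training_gflops:.6f} gFLOPs")
print(f"activations (training): {training_activations_mb:.6f} MB")
```

The decimal figure matches the `Model size (MB) FP32` column, while the binary figure is slightly smaller because a mebibyte is 2^20 bytes.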
