<a href="https://githubtocolab.com/Eunseob/purdue_me597/blob/main/lab/lab9/L9_Colab1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 10.1 Machine Learning 2 - Machine Learning Implementation to Edge Device

# Understanding and Analyzing Data

## Learning Goals

Students will be able to:

1. Implement a data collection process for TinyML
2. Deploy a TinyML model using TensorFlow
3. Build a monitoring system using TinyML



## 1.1 Introduction 

In this last lab, we will deploy a machine learning model on a Raspberry Pi as an application of an IIoT smart monitoring system that predicts the running condition of the axial flow fan (AFF). This lab is broken down into three main sections: **1) Understanding and analyzing data**, **2) Machine learning implementation on Raspberry Pi**, and **3) Building up the entire monitoring system.** The entire monitoring system and the data pipeline are illustrated in Figure 1. 

<img src="https://github.com/Eunseob/purdue_me597/blob/main/lab/img/lab9_fig1.png?raw=true" width="100%">

*Figure 1 Schematic of entire monitoring system and data pipeline*

## 1.2 Acceleration of the AFF


In this section, we will go over the algorithm of the smart monitoring system and determine some of its variables. The goal of the monitoring system for the AFF is to visualize in real time 1) the vibration of the AFF, 2) the execution state (‘ACTIVE’ or ‘STOPPED’), and 3) the predicted running condition (‘NORMAL’ or ‘ABNORMAL’). The machine learning model should predict the running condition only when the AFF is in the ‘ACTIVE’ state; if the AFF is not running, an anomaly prediction is meaningless. 
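The gating rule described above (predict only while the fan is running) can be sketched as follows. This is a minimal illustration, not the lab's reference code: `monitor_step` and `predict_condition` are hypothetical names, and `predict_condition` stands in for whatever model call returns ‘NORMAL’ or ‘ABNORMAL’.

```python
def monitor_step(is_active, predict_condition):
    """Return (execution, condition) for one pass of the monitoring loop.

    is_active: bool, result of the running check (assumed computed elsewhere).
    predict_condition: hypothetical zero-argument callable returning
        'NORMAL' or 'ABNORMAL'; it is only called while the fan is ACTIVE.
    """
    if is_active:
        return "ACTIVE", predict_condition()
    # A stopped fan has no meaningful running condition, so skip the model.
    return "STOPPED", None
```

Passing the prediction as a callable means the model is never evaluated while the fan is stopped, which also saves computation on the Raspberry Pi.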

The algorithm and the flowchart of the AFF monitoring system are illustrated in Figure 2. This data flow is repeated in a loop in your program. In Lab8, we developed an anomaly detection model (autoencoder) for the second decision (“Is AFF normal?”), but we do not have a model for the first decision (“Is AFF running?”). A model to determine the running state (‘ACTIVE’ or ‘STOPPED’) could be developed with a machine learning approach, as we did in Lab8. However, let’s instead use the simplest rule based on the measured vibration data. The idea is that when the AFF is running, the acceleration on each axis of the sensor increases. Therefore, we can determine whether the AFF is running by comparing the acceleration of a chosen axis against a specific value, the running threshold: if the acceleration is higher than the threshold, the AFF is running, i.e., the ‘Execution’ of the AFF is ‘ACTIVE’; otherwise it is ‘STOPPED’. 
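The threshold rule can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the acceleration of one axis is summarized by the RMS of a window of samples, and the function names `rms` and `execution_state` are hypothetical, not taken from the lab scripts.

```python
import numpy as np

def rms(samples):
    """Root mean square of a window of acceleration samples [m/s^2]."""
    samples = np.asarray(samples, dtype=float)
    return float(np.sqrt(np.mean(samples ** 2)))

def execution_state(samples, rms_threshold):
    """Return 'ACTIVE' if the window RMS exceeds the threshold, else 'STOPPED'."""
    if rms(samples) > rms_threshold:
        return "ACTIVE"
    else:
        return "STOPPED"
```

For example, `execution_state([3.0, -3.0, 3.0, -3.0], 1.0)` gives ‘ACTIVE’ because the window RMS is 3.0. The threshold itself has to come from your own measurements.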

<img src="https://github.com/Eunseob/purdue_me597/blob/main/lab/img/lab9_fig2.png?raw=true" width="70%">

*Figure 2 Flow chart of AFF monitoring system*

The sample plots for the various conditions of the AFF are in the [APPENDIX](https://colab.research.google.com/drive/1DyKTOPtkSiUWEnzS9-gDQo_zrrfp1U7v#scrollTo=LfQ-rS3gD6Z3&line=1&uniqifier=1) at the end of this manual. The time-domain plots show the RMS (root mean square) value for each axis. When the machine is stopped (Figure A.2), the RMS accelerations on the x- and y-axes are at their minimum. The sample data suggest that it is reasonable to set the execution RMS threshold of the x-axis to 1 m/s². However, the appropriate threshold may differ depending on the AFF and the sensor configuration. 

**TASK 1**

First, **mount the ADXL345 sensor on the AFF in the same way as in Lab9**. To determine the RMS threshold for the execution state of the AFF, perform TASK 1 below. 



### Task 1.1


Run ‘lab10_sample1.py’ on the Raspberry Pi to check the RMS values on each axis for each execution state (‘STOPPED’ and ‘ACTIVE’) of the AFF. 


### Task 1.2 
Determine the RMS threshold value of a specific axis. Which axis and which RMS value do you use to determine whether the AFF is running? 


In [None]:
T2 = '' #@param {type:"string", display-mode:"form"}

### Task 1.3

Modify ‘lab10_sample1.py’ so that you can see the execution (running state) of the AFF. 

  a. Use ‘if’ and ‘else’ statements. 

  b. An example of the result is shown in Figure 3. 

  c. By turning the AFF on and off repeatedly, confirm that your logic and the RMS threshold are effective. 

  d. Attach a capture of the Terminal or Thonny Shell to the report. 

   ---

  Place your screenshot here.

  ---

![picture](https://github.com/hewp84/tinyml/blob/main/img/L10_Figure3.png?raw=true)

*Figure 3 Check the rms values and the execution of AFF*




## 1.3 Finding Minimum and Maximum Value of your Training Data Set

In Lab9, you developed your own autoencoder model for anomaly detection on the AFF. The model was trained and validated on a normalized data set. Therefore, when you deploy the model on the Raspberry Pi, you need the minimum and maximum values of the training data to normalize the extracted features in the same way, in addition to the threshold on the MAE (mean absolute error) loss that the autoencoder uses for anomaly detection. The real-time data flow for predicting the running condition of the AFF is illustrated in Figure 4. 
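How the stored minimum, maximum, and MAE threshold come together at inference time can be sketched as below. This is a hedged illustration rather than the lab's implementation: the function names are hypothetical, the normalization is assumed to be min-max scaling with a single scalar minimum and maximum, and a loss above the threshold is assumed to mean ‘ABNORMAL’, which is the usual convention for autoencoder-based anomaly detection.

```python
import numpy as np

def normalize(features, min_val, max_val):
    """Min-max scale features using the min/max saved from training."""
    features = np.asarray(features, dtype=np.float32)
    return (features - min_val) / (max_val - min_val)

def mae_loss(original, reconstructed):
    """Mean absolute error between the model input and its reconstruction."""
    original = np.asarray(original, dtype=np.float32)
    reconstructed = np.asarray(reconstructed, dtype=np.float32)
    return float(np.mean(np.abs(original - reconstructed)))

def classify(loss, mae_threshold):
    """Assumption: reconstruction loss above the threshold means ABNORMAL."""
    return "ABNORMAL" if loss > mae_threshold else "NORMAL"
```

The important point is that `min_val` and `max_val` must be the values obtained from the training data, not values recomputed from each new measurement; otherwise the model sees inputs on a different scale than it was trained on.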

To get the minimum and maximum values from the collected data, perform Task 1.4. The script ‘*lab10_sample2.py*’ is based on ‘*lab9_ML_sample.ipynb*’ and reports the minimum and maximum values of the input features. 

![picture](https://github.com/hewp84/tinyml/blob/main/img/L10_Figure4.png?raw=true)

*Figure 4 Real-time data flow for machine learning model to determine running conditions of AFF*


### Task 1.4

1. Run ‘lab10_sample2.py’ on your laptop to check the minimum and maximum values of the extracted input feature of the model. 

  a. The collected data sets must be in the same directory as the Python script. 

2. The script is incomplete as is, so you need to complete it. 

  a. Variables you need to complete: 
    * normal_data_file (line 43) 
    * abnormal_data_file (line 44) 
    * DIMENSION (line 57) 
    * input_feature (line 79) 
 


In [None]:
# lab10_sample2.py

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from scipy import stats, fft

# Time-domain data
def timeFeatures(data):
    feature = [] # initialize feature list
    for i in range(len(data)):
        mean = np.mean(data[i]) # mean
        std = np.std(data[i]) # standard deviation
        rms = np.sqrt(np.mean(data[i] ** 2)) # root mean square
        peak = np.max(abs(data[i])) # peak
        skew = stats.skew(data[i]) # skewness
        kurt = stats.kurtosis(data[i]) # kurtosis
        cf = peak/rms # crest factor
        # number of feature of each measurement = 7
        feature.append(np.array([mean,std,rms,peak,skew,kurt,cf], dtype=float))
    feature = np.array(feature)
    return feature # feature list, each element is numpy array with datatype float

# DFT magnitude data
def freqFeatures(data):
    feature = []
    for i in range(len(data)):
        N = len(data[i]) # number of data
        yf = 2/N*np.abs(fft.fft(data[i])[:N//2]) # yf is DFT signal magnitude
        feature.append(np.array(yf))
    feature = np.array(feature)
    return feature

def tensorNormalization(data): # data as numpy array
    min_val = tf.reduce_min(data) # get min val
    max_val = tf.reduce_max(data) # get max val    
    data_normal = (data - min_val) / (max_val - min_val) # min-max normalized data (tensor)
    minVal = min_val.numpy() # convert minimum tensor to numpy
    maxVal = max_val.numpy() # convert maximum tensor to numpy
    return tf.cast(data_normal, tf.float32), minVal, maxVal # tensorarray, float 32 datatype, min, max

## data loading
# All files should be in the same directory (folder)
normal_data_file = "" # normal condition filename: You must change this!
abnormal_data_file = "" # abnormal condition filename: You must change this!

df_normal = pd.read_csv(normal_data_file) # normal dataframe
df_abnormal = pd.read_csv(abnormal_data_file) # abnormal dataframe

frames = [df_normal, df_abnormal] # frame list to merge two dataframes into one

df = pd.concat(frames) # new concatenated dataframe

## Data Transformation
# X-axis: 'Xacc array [m/s2]'
# Y-axis: 'Yacc array [m/s2]'
# Z-axis: 'Zacc array [m/s2]'
DIMENSION = # Select one (for your model) of above axes

# Exploding the values contained in selected column and converting the string values into float values
df = pd.concat([df['Condition'],df[DIMENSION].str.split(' ', expand=True).astype(float)], axis=1) # transform space delimited array to each value
ds = df.copy() # make ds by copying df

#Converting the Classifier into binary values
ds.loc[df['Condition'] == 'Normal', 'Status'] = 1 # if Condition column is 'Normal', Give 'Status' Column 1
ds.loc[df['Condition'] == 'Abnormal', 'Status'] = 0 # if Condition column is 'Abnormal', Give 'Status' Column 0
ds.drop('Condition', axis=1, inplace=True) # drop 'Condition' column (the first column)

data = ds.values
# Define Raw data W/O signal processing
raw_data = data[:,:-1]
# Labels: The last column
labels = data[:,-1]

time_data = timeFeatures(raw_data) # define time domain feature
freq_data = freqFeatures(raw_data) # define frequency domain feature (DFT)

## Data (feature) selection and Split training and validation dataset
# Feature selection
input_feature = # raw_data, time_data, or freq_data

## finally print out the min value and the max values of the input feature
_, min_val, max_val = tensorNormalization(input_feature) # normalize once and keep min and max
print("The minimum is {} and the maximum is {}.".format(min_val, max_val))


In [None]:
#@title Task 1.5 Answer the questions below regarding Task 1.4 above about lab10_sample2.py. {display-mode:"form"} 

#@markdown #### 1. What is the input feature? (Axis and type of feature among raw, time feature, frequency feature).  
T5_1 = '' #@param {type:"string"}

#@markdown #### 2. What is the minimum value of the input feature? 
T5_2 = '' #@param {type:"string"}

#@markdown #### 3. What is the maximum value of the input feature? 
T5_3 = '' #@param {type:"string"}

#@markdown #### 4. What is the threshold (MAE loss) value from Lab9 for the trained model?
T5_4 = '' #@param {type:"string"}

Please continue to [Lab 10.2 here](L10_Colab2.ipynb).