<a href="https://colab.research.google.com/github/hewp84/tinyml/blob/main/FA22_Lab9.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 9: Machine Learning 1 – Artificial Neural Network, Classification 

## Introduction

In this lab, we will create a machine learning (ML) model that uses accelerometer (ADXL345) signals to predict the running condition of the axial flow fan (AFF) we used in Lab3. The autoencoder we practiced with in Prelab9 will be used to determine whether the AFF is in a normal or abnormal condition. Lab9 is broken down into two main sections: 1) data collection and 2) ML model training. In the data collection section, we will use a Raspberry Pi to collect accelerometer data while the machine is in normal and abnormal conditions. The abnormal condition is created by adding a mass to one blade of the AFF to increase the eccentric force, as in Lab3. In the ML model training section, we will use the scripts given in Prelab9 to train an autoencoder. In addition, we will perform feature engineering through signal processing to see how the input features affect the performance of the ML model. We will then save the selected, trained model to local disk for use in the next lab. The schematic of Lab9 is illustrated in Figure 1.

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure1.png?raw=true)

*Figure 1 Lab9: Schematic of anomaly detection model training for axial flow fan*

## Data Collection for Machine Learning

### Part 1: Data collection practice

First, wire the ADXL345 sensor to the Raspberry Pi. If you have trouble with the connection, refer to the instructions in the Lab3 manual. Once the connection is made, you are ready to collect data. Before attaching the sensor to its target location, let’s collect some data and understand the output data format.

The sample Python script (‘Lab9_data_collector.py’) below, which collects ADXL345 accelerometer data, is available on Brightspace. In the script, pay attention to the ‘condition_identifier’ and ‘duration’ variables in the middle of the script.

---

**Python - Python3 (Lab9_data_collector.py)**

```
import time
import board
import busio
import adafruit_adxl34x
from micropython import const
import csv
import datetime

i2c = busio.I2C(board.SCL, board.SDA) # I2C interface on the SCL/SDA GPIO pins, defined with the busio and board modules

acc = adafruit_adxl34x.ADXL345(i2c) # ADXL345 sensor object from the Adafruit ADXL34x library

acc.data_rate = const(0b1111) # set the output data rate to 3200 Hz (rate code 15)

# ratedict=output rate dictionary
# See Table5 of Lab3 manual key=rate code (decimal), value=output data rate (Hz)
ratedict = {15:3200,14:1600,13:800,12:400,11:200,10:100,9:50,8:25,7:12.5,6:6.25,5:3.13,4:1.56,3:0.78,2:0.39,1:0.2,0:0.1}

print("Output data rate is {} Hz".format(ratedict[acc.data_rate])) # printing out data rate

def getData(sensor:object, N:int): # sensor: ADXL sensor object, N: number of samples to read per call
    data_x = [] # initialize data_x to contain x-axis acceleration
    data_y = [] # initialize data_y to contain y-axis acceleration
    data_z = [] # initialize data_z to contain z-axis acceleration
    for i in range(N):
        x_acc, y_acc, z_acc = sensor.acceleration
        data_x.append(str(x_acc))
        data_y.append(str(y_acc))
        data_z.append(str(z_acc))
    x_data = ' '.join(data_x)
    y_data = ' '.join(data_y)
    z_data = ' '.join(data_z)
    return x_data, y_data, z_data # each return value is a space-delimited string of acceleration measurements

condition_identifier = "Test" # condition identifier
duration = 10 # data collection duration in second unit

filename = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")+"_"+condition_identifier+"_lab9_data.csv"
header = ["Condition", "Xacc array [m/s2]", "Yacc array [m/s2]", "Zacc array [m/s2]"]
start = time.time()

print("== Data collection for {} measurements started. ==".format(duration))

with open(filename, 'w', newline='') as f: # create and open the output file; newline='' is recommended for the csv module
    write = csv.writer(f) # CSV writer for the created file
    write.writerow(header) # write the first row (header)
    for j in range(duration): # one measurement (row) per iteration
        x, y, z = getData(acc, 1000) # get x-, y-, z-axis acceleration arrays of 1000 data points each (about 1 second)
        print('======= {}th of {} collection ======='.format(j+1, duration)) # print the progress
        write.writerow([condition_identifier, x, y, z])
# the with block closes the file automatically, so no explicit f.close() is needed

print("== Data saving is done. == takes", time.time() - start)

```

---

Let’s collect data as practice. Set ‘condition_identifier’ to “Test” and ‘duration’ to 10. When you run the script, your Raspberry Pi will collect acceleration data for 10 seconds. Depending on the specifications of your Raspberry Pi, the total time may be longer than the ‘duration’ you set. The output filename will be “YYYYMMDD_HHmmSS_Test_lab9_data.csv”. The date-time at the start of the filename is the time at which the script started running, and “Test” is the ‘condition_identifier’. If you open the saved CSV file, you will see the collected data as in Table 1. The first column (Condition) contains the ‘condition_identifier’. The second, third, and last columns contain the measured accelerations for 1 second (1000 data points) for the X, Y, and Z axes, respectively. Because the sampling frequency is 1000 Hz, each row represents 1 second of data. Each array is a space-delimited array of floats. To practice data loading and transformation, perform TASK 1.

<br></br>
*Table 1 Example of collected data*

<table width="658">
<tbody>
<tr>
<td width="164">
<p>Condition</p>
</td>
<td width="165">
<p>X acc array [m/s2]</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>Y acc array [m/s2]</p>
</td>
<td width="165">
<p>Z acc array [m/s2]</p>
</td>
</tr>
<tr>
<td width="164">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
<td width="30">
<p>&hellip;</p>
</td>
<td width="135">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
<td width="30">
<p>&hellip;</p>
</td>
<td width="135">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>Test</p>
</td>
<td width="165">
<p>x1 x2 &hellip; x999 x1000</p>
</td>
<td width="30">
<p>&nbsp;</p>
</td>
<td width="135">
<p>y1 y2 &hellip; y999 y1000</p>
</td>
<td width="165">
<p>z1 z2 &hellip; z999 z1000</p>
</td>
</tr>
<tr>
<td width="164">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
<td width="30">
<p>&hellip;</p>
</td>
<td width="135">
<p>&nbsp;</p>
</td>
<td width="165">
<p>&nbsp;</p>
</td>
</tr>
</tbody>
</table>
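
As a minimal sketch of reading this format back, the snippet below converts one row of the CSV file into a label and three lists of floats. It assumes the column layout written by ‘Lab9_data_collector.py’; the helper name `parse_row` and the small in-memory sample (3 values per axis instead of 1000) are illustrative only and are not part of the lab scripts.

```python
import csv
import io


def parse_row(row):
    """Convert one CSV row (condition, x, y, z) into a label and three float lists."""
    condition = row[0]
    # Each axis cell is a space-delimited string of floats.
    x, y, z = ([float(v) for v in cell.split()] for cell in row[1:4])
    return condition, x, y, z


# Tiny in-memory stand-in for a collected file (3 samples per row instead of 1000).
sample = io.StringIO(
    "Condition,Xacc array [m/s2],Yacc array [m/s2],Zacc array [m/s2]\n"
    "Test,0.1 0.2 0.3,1.0 1.1 1.2,9.7 9.8 9.9\n"
)
reader = csv.reader(sample)
header = next(reader)  # skip the header row
rows = [parse_row(r) for r in reader]
condition, x, y, z = rows[0]
print(condition, len(x), sum(z) / len(z))
```

For a real file, the `io.StringIO` object would be replaced by `open(filename, newline='')` with the name of your collected CSV file.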


#### TASK 1

After running ‘Lab9_data_collector.py’ with ‘condition_identifier’ = “Test” and ‘duration’ = 10, plot the data for each axis in both the time domain and the frequency domain, as in Figure 2.

1. Capture the plot and attach it to the report.

  a. You must load the CSV file in a Python script.

  b. Add the ‘condition_identifier’ and your name at the end of the title of each plot (e.g., ‘Time domain, Test, John Doe’).

  c. Your plots must include axis labels and units.

2. Upload your entire Python script and the CSV file to Brightspace.

  a. Name the Python script “Lab9_TASK1_yourname.py”.

  b. Keep the CSV file name as generated.

* Refer to Lab3 and Prelab9; you have done this work before.
* Your data must have 10 rows, excluding the header row, because you collected data for 10 seconds. Select one of the data rows at random.
* You can use either the Raspberry Pi or a laptop to plot the data.

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure2.jpg?raw=true)

*Figure 2 Data plot: Time domain (left) and Frequency domain (right)*
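
The frequency-domain view in Figure 2 can be obtained with a fast Fourier transform. The following is a minimal sketch using NumPy, not the method prescribed by Lab3 or Prelab9. The function name `amplitude_spectrum` is hypothetical, the 1000 Hz sampling frequency is taken from the description of the data format above (the actual rate on your Raspberry Pi may differ), and the demo signal is synthetic rather than measured.

```python
import numpy as np


def amplitude_spectrum(signal, fs):
    """Return (frequencies in Hz, single-sided amplitude) for a real-valued signal."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Remove the mean so a constant offset (e.g. gravity) does not dominate the 0 Hz bin.
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amplitude = 2.0 * np.abs(spectrum) / n
    return freqs, amplitude


fs = 1000  # assumed sampling frequency in Hz (1000 samples per row)
t = np.arange(1000) / fs
demo = 0.5 * np.sin(2 * np.pi * 30 * t) + 9.8  # synthetic 30 Hz vibration plus offset
freqs, amplitude = amplitude_spectrum(demo, fs)
peak_hz = freqs[np.argmax(amplitude)]
print(peak_hz)
```

The returned `freqs` and `amplitude` arrays can be passed directly to a plotting call such as matplotlib's `plt.plot(freqs, amplitude)`, with the x-axis labeled in Hz and the y-axis in m/s2.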


### Part 2: Data collection in normal condition

Before we deploy the sensor on the AFF, let’s check the hardware and the speed controller. The hardware configuration and the speed controller are shown in Figure 3. For safety reasons, the base is fixed to the table with tape. Do not remove the tape, because the AFF may move when it is running. To turn on the fan, rotate the knob of the speed controller clockwise. The relationship between the knob pointer position and the actual rotational speed of the AFF is shown in Table 2. For example, when you rotate the pointer of the knob to L (Speed 1 in Figure 3 (right)), the AFF rotates at around 1800 rpm. By adjusting the control knob, you can increase or decrease the rotational speed of the fan. Unlike the vacuum pump case in Prelab9, the rotational speed changes here, which makes ML model development harder. On the other hand, it may also make it more interesting. For instance, if you attach an unbalanced mass to a blade of the fan at the lowest rotational speed, will the vibration amplitude be larger than at the maximum rotational speed without any attached mass? We don’t know, and the answer may be no. Turn the fan on and adjust the knob to check that the rotational speed changes as expected.

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure3.png?raw=true)

*Figure 3 Configuration of axial flow fan (left) and speed controller (right)*
<br></br>
*Table 2 Controller indicator vs. rotational speed of AFF*

<table width="347">
<tbody>
<tr>
<td width="173">
<p>Speed indicator</p>
</td>
<td width="173">
<p>Rotational speed</p>
</td>
</tr>
<tr>
<td width="173">
<p>1 (L, Low)</p>
</td>
<td width="173">
<p>1800 rpm</p>
</td>
</tr>
<tr>
<td width="173">
<p>2</p>
</td>
<td width="173">
<p>2150 rpm</p>
</td>
</tr>
<tr>
<td width="173">
<p>3 (M, Medium)</p>
</td>
<td width="173">
<p>2500 rpm</p>
</td>
</tr>
<tr>
<td width="173">
<p>4</p>
</td>
<td width="173">
<p>2750 rpm</p>
</td>
</tr>
<tr>
<td width="173">
<p>5 (H, High)</p>
</td>
<td width="173">
<p>3000 rpm</p>
</td>
</tr>
</tbody>
</table>
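
When reading frequency-domain plots, it can help to express the speeds in Table 2 as rotation frequencies, since an unbalanced rotor typically vibrates most strongly at once per revolution. The short sketch below does that conversion; the names `SPEED_RPM` and `rotation_frequency_hz` are illustrative, and the rpm values are the nominal ones from Table 2, so the real fan may deviate from them.

```python
# Speed indicator -> nominal rotational speed in rpm, copied from Table 2.
SPEED_RPM = {1: 1800, 2: 2150, 3: 2500, 4: 2750, 5: 3000}


def rotation_frequency_hz(rpm):
    """Convert revolutions per minute to revolutions per second (Hz)."""
    return rpm / 60.0


for indicator, rpm in SPEED_RPM.items():
    print("Speed {}: {} rpm = {:.1f} Hz".format(indicator, rpm, rotation_frequency_hz(rpm)))
```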

Mount the accelerometer (ADXL345) on top of the fan as shown in Figure 4. Remember the sensor placement and the axis configuration, because you will need them again in the next lab. If you are not sure how to mount the accelerometer on the AFF, refer to Part 7 of the Lab3 manual. To collect acceleration data for 5 minutes while the machine is in normal condition, perform TASK 2. While you collect the data, change the rotational speed. For example, set Speed 1 (1800 rpm) for the first minute, then Speed 2 (2150 rpm) between 1 and 2 minutes from the start, and so on. An example experimental table is shown in Table 3. You can follow the progress of your data collector in the Shell of Thonny or the Terminal window of the Raspberry Pi, as shown in Figure 5. Because the table and the AFF are shared, you must collect data at the same time as the other members who share them.

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure4.jpg?raw=true)

*Figure 4 Sensor placement on top of the AFF*
<br></br>
*Table 3 Experimental table for data collection of AFF*

<table width="671">
<tbody>
<tr>
<td width="224">
<p>Measurement</p>
</td>
<td width="224">
<p>Time (Approximated)</p>
</td>
<td width="224">
<p>Speed</p>
</td>
</tr>
<tr>
<td width="224">
<p>1 &ndash; 60</p>
</td>
<td width="224">
<p>0 &ndash; 60 seconds</p>
</td>
<td width="224">
<p>1 (L, 1800 rpm)</p>
</td>
</tr>
<tr>
<td width="224">
<p>61 &ndash; 120</p>
</td>
<td width="224">
<p>61 &ndash; 120 seconds</p>
</td>
<td width="224">
<p>2 (2150 rpm)</p>
</td>
</tr>
<tr>
<td width="224">
<p>121 &ndash; 180</p>
</td>
<td width="224">
<p>121 &ndash; 180 seconds</p>
</td>
<td width="224">
<p>3 (M, 2500 rpm)</p>
</td>
</tr>
<tr>
<td width="224">
<p>181 &ndash; 240</p>
</td>
<td width="224">
<p>181 &ndash; 240 seconds</p>
</td>
<td width="224">
<p>4 (2750 rpm)</p>
</td>
</tr>
<tr>
<td width="224">
<p>241 &ndash; 300</p>
</td>
<td width="224">
<p>241 &ndash; 300 seconds</p>
</td>
<td width="224">
<p>5 (H, 3000 rpm)</p>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure5.png?raw=true)

*Figure 5 Data collection progress check: Thonny (left) and Terminal (right)*

#### TASK 2

Run ‘Lab9_data_collector.py’ with ‘condition_identifier’ = “Normal” and ‘duration’ = 300 while the AFF is running, then plot the data for each axis in both the time domain and the frequency domain, as in Figure 2.

1. Capture the plots for each rotational speed and attach them to the report.

  a. You must load the CSV file in a Python script.

  b. Add the ‘condition_identifier’, the rotational speed, and your name at the end of the title of each plot (e.g., ‘Time domain, Normal, 1800 rpm, John Doe’).

  c. Your plots must include axis labels and units.

  d. You need 5 plots in total (one for each of the 5 rotational speeds).

2. Upload the CSV file to Brightspace.

  a. You don’t need to upload the Python script.

  b. Keep the CSV file name as generated.

* Your data must have 300 rows, excluding the header row, because you collected data for 300 seconds. Select one data row at random within the range for each speed. For example, to plot Speed 2, select a data row between 61 and 120.
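
The row selection described above can be sketched as follows. This assumes the schedule of Table 3, with 60 one-second rows per speed setting; if your timing differed, the boundaries must be adjusted. The names `ROWS_PER_SPEED`, `row_range_for_speed`, and `pick_row` are hypothetical helpers, not part of the lab scripts.

```python
import random

ROWS_PER_SPEED = 60  # assumed: 60 one-second rows per speed setting, as in Table 3


def row_range_for_speed(speed):
    """Return the first and last 1-based data-row numbers for a speed indicator (1-5)."""
    first = (speed - 1) * ROWS_PER_SPEED + 1
    last = speed * ROWS_PER_SPEED
    return first, last


def pick_row(speed, rng=random):
    """Pick one data-row number at random within the range for the given speed."""
    first, last = row_range_for_speed(speed)
    return rng.randint(first, last)  # randint includes both end points


print(row_range_for_speed(2), pick_row(2))
```

Rows close to a range boundary may have been recorded while the knob was being turned, so a row from the middle of a range is likely to be more representative of that speed.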


### Part 3: Data collection in abnormal condition

The next step is to collect abnormal-condition data for the ML model by adding a mass to a fan blade. There will be adhesive putty on top of the AFF. Use the putty to unbalance the fan as shown in Figure 6. If you are not sure how to set up the abnormal condition by adding mass, follow the steps in Part 8 of the Lab3 manual. Disassemble and reassemble the AFF safely. If you are not sure about your setup, ask a TA. After setting up the abnormal condition of the AFF, perform TASK 3.

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure6.png?raw=true)

*Figure 6 Unbalanced fan blade for abnormal condition of AFF*

#### TASK 3 (Repeat TASK 2 in abnormal condition)

Run ‘Lab9_data_collector.py’ with ‘condition_identifier’ = “Abnormal” and ‘duration’ = 300 while the AFF is running, then plot the data for each axis in both the time domain and the frequency domain, as in Figure 2.

1. Capture the plots for each rotational speed and attach them to the report.

  a. You must load the CSV file in a Python script.

  b. Add the ‘condition_identifier’, the rotational speed, and your name at the end of the title of each plot (e.g., ‘Time domain, Abnormal, 1800 rpm, John Doe’).

  c. Your plots must include axis labels and units.

  d. You need 5 plots in total (one for each of the 5 rotational speeds).

2. Upload the CSV file to Brightspace.

  a. You don’t need to upload the Python script.

  b. Keep the CSV file name as generated.

* Your data must have 300 rows, excluding the header row, because you collected data for 300 seconds. Select one data row at random within the range for each speed. For example, to plot Speed 2, select a data row between 61 and 120.


## Training ML model for anomaly detection

### Part 4: Run Autoencoder sample

Now you are ready to train the ML model. Download the sample Jupyter Notebook file (‘Lab9_ML_sample.ipynb’) from Brightspace to your laptop. Follow the steps below to run the sample file.

1. Create a working folder for this lab.

2. Copy the files below into the folder. Make sure that the collected data and the Jupyter Notebook file are in the same directory.

  a.	‘Lab9_ML_sample.ipynb’ 

  b.	‘DATE_TIME_Normal_lab9_data.csv’ 

  c.	‘DATE_TIME_Abnormal_lab9_data.csv’ 
    
    ![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Image1.png?raw=true)
  
3. Open the sample Jupyter Notebook file (Lab9_ML_sample.ipynb) in any IDE. In this example, it was opened in Jupyter Notebook, as in the capture below.
4. Run each code cell of the sample.

  a. The basic structure of ML training is the same as in Prelab9.

  b. Read the comments on each line.

  c. The sample code is set up for X-axis acceleration data with raw data as the input feature.

  ![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Image2.png?raw=true)


#### TASK 4

Run ‘Lab9_ML_sample.ipynb’ with the conditions below.

  a. Acceleration axis: X-axis

  b. Input feature: Raw data (without signal processing, as in Prelab9)

  c. Embedding size = 16

  d. Note any hyper-parameters or variables that you change.
<br></br>
After running the sample code, answer the questions below.
1. What is the model’s performance on the test data? Evaluate and analyze your model, including the items below.

  a. ROC curve

  b. Training reconstruction histogram

  c. Threshold, Accuracy, Precision, Recall

  d. Reconstruction error plots

2. Do you think your model is good enough for anomaly detection of the AFF?
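
To make the relationship between the threshold and the accuracy, precision, and recall values concrete, here is a minimal sketch that computes them from reconstruction errors. It is not the code used in the sample notebook: the function name `classification_metrics` is hypothetical, the convention that abnormal is the positive class (label 1) is an assumption, and the error values are made up for illustration.

```python
def classification_metrics(errors, labels, threshold):
    """Compute accuracy, precision, and recall for threshold-based anomaly detection.

    errors: reconstruction error per sample
    labels: ground truth, 1 for abnormal and 0 for normal (assumed convention)
    A sample is predicted abnormal when its error exceeds the threshold.
    """
    predictions = [1 if e > threshold else 0 for e in errors]
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)  # abnormal, detected
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)  # normal, passed
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)  # normal, flagged
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)  # abnormal, missed
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up reconstruction errors: three normal samples followed by three abnormal ones.
errors = [0.02, 0.03, 0.08, 0.20, 0.25, 0.04]
labels = [0, 0, 0, 1, 1, 1]
accuracy, precision, recall = classification_metrics(errors, labels, threshold=0.05)
print(accuracy, precision, recall)
```

Raising the threshold generally reduces false alarms at the cost of missing more abnormal samples, which is the trade-off the ROC curve visualizes.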

### Part 5: Save ML model

After you train a model, you should save the model and its assets for future use. As practice, follow the steps below to save the model you created in TASK 4.
1. After you finish TASK 4, add a Code Cell at the end of the sample code.
2. Write the code below in the cell.

---

**Jupyter - Save model and assets**

```
## Save the model
import os        # needed below; harmless if the sample notebook has already imported it
import datetime  # needed below; harmless if the sample notebook has already imported it

model_folder = "model" # directory name to save the model
os.makedirs(model_folder, exist_ok=True) # create the directory if it does not exist

t = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") # timestamp of when the model is saved
model_identifier = "_lab9_anomaly_TASK4" # model identifier; change this to tell your models apart
export_path = os.path.join(".", model_folder, t + model_identifier) # export path
autoencoder.save(export_path) # the model is saved in "export_path"


```

---

3. If you see no errors, as in the capture below, you are done!

  ![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Image3.png?raw=true)

4. You can see the saved model and asset files in the ‘model’ folder, as in the capture below.
 
 ![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Image4.png?raw=true)

The last step of Lab9 is to load the saved model. Perform TASK 5 to load and check the loaded model. 

#### TASK 5

1. After you save the model, add a Code Cell at the end of the sample code.
2. Write the code below in the cell.

---

**Jupyter - Load and check the model**

```
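## Reload your model
import tensorflow as tf # harmless if the sample notebook has already imported it

reloaded = tf.keras.models.load_model(export_path) # export_path is defined in the save cell
## Test your data!
# tensorNormalization, predict, input_feature, and threshold are expected to be defined earlier in the sample notebook
check_data = tensorNormalization(input_feature)
predict(reloaded, check_data, threshold)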

```

---

3. If you performed all the steps correctly, you will see an output cell like Figure 7.

After performing the steps above, attach your output cell, as in Figure 7, to the report. Explain the output cell. What do the outputs mean? Are they as expected?

![picture](https://github.com/hewp84/tinyml/blob/main/img/L9_Figure7.png?raw=true)

*Figure 7 Output after loading and checking saved model*

Based on the sample code given, try different input features, hyper-parameters, and variables to create the best model! 
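
As one possible starting point for trying different input features, the sketch below computes a few common time-domain features from a single row of acceleration data. Which features the course intends is not specified here, so the choice of RMS, peak, and crest factor is an assumption, as is the function name `time_domain_features`; the demo signal is synthetic.

```python
import math


def time_domain_features(signal):
    """Return RMS, peak, and crest factor of a signal after removing its mean."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the constant offset (e.g. gravity)
    rms = math.sqrt(sum(v * v for v in centered) / n)
    peak = max(abs(v) for v in centered)
    crest_factor = peak / rms if rms > 0 else 0.0
    return {"rms": rms, "peak": peak, "crest_factor": crest_factor}


# Synthetic 30 Hz vibration of amplitude 1 m/s2 around 9.8 m/s2, sampled at an assumed 1000 Hz.
demo = [9.8 + math.sin(2 * math.pi * 30 * k / 1000) for k in range(1000)]
features = time_domain_features(demo)
print(features)
```

A feature vector like this is much shorter than the 1000-point raw row, so the input size of the autoencoder would need to be adjusted accordingly.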


## Deliverable

1. Perform all tasks and submit your Lab9 report on Brightspace before Lab 10.
2. Summarize what you performed and learned in Lab 9.
  * Use photos, figures, tables, and equations as needed.
3. Which acceleration axis gives the best data for anomaly detection?

  a. Justify your decision based on the performance of the model and from a mechanical perspective.
4. Which feature (raw, time domain, or frequency domain) is the best for anomaly detection?

  a. Justify your decision based on the performance of the model and from a mechanical perspective.
5. How can the model be improved?

  a. Try changing the embedding size and other hyper-parameters, and then explain the results.
6. Save the final selected model and evaluate it.

  a. Note the created folder and the threshold of your final model; you will use them in Lab10 on the Raspberry Pi.
7. Further discussion: Explain how data from all axes could be used in a single model.
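
For item 6, one way to keep the model folder and the threshold together is to write them to a small JSON file next to the saved model. This is only a sketch of the idea: the function name `save_model_info`, the file name `model_info.json`, and the path and threshold values in the demonstration are all placeholders, not values from the lab.

```python
import json
import os
import tempfile


def save_model_info(folder, export_path, threshold):
    """Write the model location and threshold to a JSON file and return the file path."""
    info = {"export_path": export_path, "threshold": float(threshold)}
    info_path = os.path.join(folder, "model_info.json")
    with open(info_path, "w") as f:
        json.dump(info, f, indent=2)
    return info_path


# Demonstration with placeholder values in a temporary directory.
demo_dir = tempfile.mkdtemp()
info_path = save_model_info(demo_dir, "./model/EXAMPLE_lab9_anomaly_TASK4", 0.05)
with open(info_path) as f:
    loaded = json.load(f)
print(loaded)
```

In the notebook, `folder` would be the ‘model’ directory, and `export_path` and `threshold` would be the variables already used in the save and load cells.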
