# Report for 2D Project Physical World and Digital World

Cohort: #5

Team No.: #8

Members:
* Loh Jian An Lionell (Student ID)
* Celine Yee Se Lin (Student ID)
* Claire Tan 
* Zhao Wen Qiang 
* Cheow Yi Jian


# Introduction

Temperature sensors need to reach quasi-thermal equilibrium with their environment in order to measure its temperature accurately. Reaching this equilibrium, however, can take a long time. For instance, a sensor initially at room temperature takes more than 70 seconds before its reading reflects the temperature of a water bath at 60 °C.

To speed up this process, we augment the hardware with a predictive program. More specifically, we use machine learning algorithms to build a predictive model from training data. Using this model, we take the raw sensor data from the first 10 seconds and give an accurate temperature of the water bath, with an average standard deviation of 1%.
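
To illustrate the idea only, the sketch below fits a one-feature least-squares line that maps the reading after 10 seconds to the final bath temperature. It uses synthetic data generated from an assumed first-order lag model; the time constant `TAU_S`, the start temperature `T_START`, and the helper names are made-up values for illustration, not measurements, and this is not the model used in this project.

```python
import math

TAU_S = 15.0      # assumed sensor time constant in seconds (illustrative)
T_START = 25.0    # assumed starting (room) temperature in deg C


def reading_at(t_s, bath_temp):
    """Simulated sensor reading under an assumed first-order lag model."""
    return bath_temp + (T_START - bath_temp) * math.exp(-t_s / TAU_S)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic training set: bath temperature vs. the simulated reading at 10 s.
bath_temps = [40.0, 45.0, 50.0, 55.0, 60.0]
early = [reading_at(10.0, t) for t in bath_temps]
slope, intercept = fit_line(early, bath_temps)

# Predict an unseen bath temperature from its simulated 10 s reading.
estimate = slope * reading_at(10.0, 52.0) + intercept
print(round(estimate, 2))
```

Under this idealised model the final temperature is exactly linear in the early reading, so the line recovers it; real sensor data is noisier and would likely need more features.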



# Description of Data from Experiment

## Data Collection

### Concerns about Time and Space
The data was collected in the classroom, with the temperature sensor attached to a Raspberry Pi.
The choice of location is not trivial: each location has its own environmental variables, and these must be kept as consistent as possible between data collection and prediction.

For example, the ambient temperature of the classroom is 0.7 degrees cooler than the temperature beside the pantry. Even within the classroom, the temperature at night is 0.4 degrees warmer than in the day, because the air conditioners are switched off at night.

Since the conditions for data collection can be controlled, our group made special arrangements to collect all data in the day, in the classroom, to keep external variables as consistent as possible. Although the deviations in these variables may be small, if they are not kept constant we would have to include them as "features" when training our model, which complicates the process.

### Collection Method 
We printed the temperature readings to the Python 3 shell and saved the shell output as a text file for later processing. We opted for this method because writing the values to a CSV file lengthened the interval between temperature readings by 0.08 seconds. This could be due to the additional processing overhead of writing data to a file, which may be especially significant given the limited RAM of the Raspberry Pi 3.
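
Another way to keep file I/O out of the sampling loop is to buffer the readings in memory and write them to disk once, after sampling has finished. The sketch below assumes a hypothetical `read_temperature()` callable standing in for the actual sensor call, which this report does not show; the function name `collect_readings`, the column names, and the output path are likewise illustrative.

```python
import csv
import time


def collect_readings(read_temperature, n_samples, out_path, interval_s=0.0):
    """Sample the sensor n_samples times, then write all rows in one go.

    read_temperature is a hypothetical zero-argument callable that returns
    the current sensor reading in degrees Celsius.
    """
    rows = []
    start = time.monotonic()
    for _ in range(n_samples):
        elapsed = time.monotonic() - start
        rows.append((elapsed, read_temperature()))
        if interval_s > 0:
            time.sleep(interval_s)

    # File I/O happens only after the sampling loop has finished.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time_s", "reading_c"])
        writer.writerows(rows)
    return rows
```

Whether this removes the slowdown on the Raspberry Pi would need to be measured; it only moves the file write out of the timing-critical loop.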






## Data Preparation





In [4]:
import matplotlib.pyplot as plt

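# Plot the first 30 readings from one data file against the sample index.
time_s = []
temp = []
with open("./Data_54.8.txt", "r") as f:
    for count, line in enumerate(f, start=1):
        if count > 30:
            break
        # The temperature is the first space-separated field on each line.
        temp.append(float(line.split(" ")[0]))
        time_s.append(count)

plt.scatter(time_s, temp)
plt.show()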


FileNotFoundError: [Errno 2] No such file or directory: './Data_54.8.txt'

## Data Format

Describe your data and its features. Include any codes or visualization of data.

# Training Model

Describe how you train your model. Include any code and output

# Verification and Accuracy

Describe how you check the accuracy of your model and its result. State any analysis you have and the steps you have taken to improve its accuracy.

# Testing Using Instructor's Data

Instruction:

* Store your trained model into a pickle object which can be loaded. 
* Read an Excel file with the following format:
```
time (s)	reading
0.00	    25.812
0.90	    28.562
1.79	    31.875
2.68	    35.062
3.55	    37.937
4.43	    40.687
5.30	    43.25
```
where the first column indicates the time in seconds and the second column indicates the sensor reading in Celsius. 
* The instructors' data can have any number of rows. If your code requires a minimum number of rows, it must handle the case where fewer rows are provided and exit safely.
* Write code to prepare the data for prediction.
* Write code to predict the final temperature.
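
The minimum-row requirement above could be handled along the following lines. This is only a sketch: `MIN_ROWS` and `extract_features` are hypothetical names, the value of 10 is a placeholder, and using the first few raw readings as features is an assumption rather than this project's actual feature set.

```python
MIN_ROWS = 10  # hypothetical minimum; set to whatever the trained model needs


def extract_features(times, readings, min_rows=MIN_ROWS):
    """Return the first min_rows readings as a feature list, or None.

    None signals that too little data was supplied, so the caller can
    skip prediction instead of crashing.
    """
    if len(times) != len(readings):
        raise ValueError("times and readings must have the same length")
    if len(readings) < min_rows:
        return None
    return [float(r) for r in readings[:min_rows]]


# Three rows are fewer than MIN_ROWS, so no features are returned.
features = extract_features([0.00, 0.90, 1.79], [25.812, 28.562, 31.875])
if features is None:
    print("Not enough rows to predict; skipping this file.")
```

With a DataFrame in the format above, the two columns could be passed as `df.iloc[:, 0].tolist()` and `df.iloc[:, 1].tolist()`.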



In [None]:
# write code to load your trained model from a pickle object
import pickle
filename = '' # enter your pickle file name containing the model
with open(filename,'rb') as f:
    model = pickle.load(f)


In [None]:
# write code to read an Excel file
import pandas as pd
num_test = 9
filename = 'temp_' 
filekey = [] # instructors will key in this
dataframe = {} # this is to store the data for different temperature, the keys are in filekey
for idx in range(num_test):
    dataframe[filekey[idx]] = pd.read_excel(filename+filekey[idx]+'.xlsx')


In [None]:
# write code to prepare the data for prediction
def preprocess(df):
    # use this function to extract the features from the data frame
    pass

data_test = {}
for key in filekey:
    data_test[key]=preprocess(dataframe[key])

In [None]:
# write code to predict the final temperature
# store the predicted temperature in a variable called "predicted"
# predicted is a dictionary where the keys are listed in filekey

predicted = {}
for key in filekey:
    predicted[key]=model.predict(data_test[key])

In [None]:
# checking accuracy

# first instructor will load the actual temp from a pickle object
import pickle
error_d = {}
accuracy_percent_d = {}

for test in range(num_test):
    filename = 'data_'+filekey[test]+'.pickle'
    with open(filename,'rb') as f:
        final_temp, worst_temp = pickle.load(f)

    # then calculate the error
    error_final = abs(final_temp-predicted[filekey[test]])
    accuracy_final_percent = 100-error_final/final_temp*100
    error_worst = abs(worst_temp-predicted[filekey[test]])
    accuracy_worst_percent = 100-error_worst/worst_temp*100
    
    error_d[filekey[test]] = (error_final, error_worst)
    accuracy_percent_d[filekey[test]] = (accuracy_final_percent, accuracy_worst_percent)

    # displaying the error
    print('===================================')
    print('Testing: {}'.format(filekey[test]))
    print('Predicted Temp: {:.2f}'.format(predicted[filekey[test]]))
    print('Final Sensor Temp: {:.2f}, Alcohol Temp: {:.2f}'.format(final_temp, worst_temp))
    print('Error w.r.t Final Sensor Temp: {:.2f} deg, {:.2f}% accuracy'.format(error_final, accuracy_final_percent))
    print('Error w.r.t Alcohol Temp: {:.2f} deg, {:.2f}% accuracy'.format(error_worst, accuracy_worst_percent))
    
avg_final = sum([ final for final, worst in accuracy_percent_d.values()])/len(error_d.values())
avg_worst = sum([ worst for final, worst in accuracy_percent_d.values()])/len(error_d.values())
print('==============================')
print('Average accuracy for final sensor temp: {:.2f}'.format(avg_final))
print('Average accuracy for alcohol temp: {:.2f}'.format(avg_worst))
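
The accuracy figure printed by the cell above is 100 minus the absolute error expressed as a percentage of the reference temperature. A small standalone sketch of that calculation follows; the helper name `accuracy_percent` is hypothetical and the numbers are illustrative, not measured data.

```python
def accuracy_percent(actual, predicted):
    """Mirror the checking cell: 100 - |actual - predicted| / actual * 100."""
    error = abs(actual - predicted)
    return 100 - error / actual * 100


# Illustrative values only: a 0.6 degree error on a 60 degree reference.
print('{:.2f}'.format(accuracy_percent(60.0, 59.4)))  # prints 99.00
```

Because the error is divided by the reference temperature in Celsius, the same absolute error gives a lower accuracy figure at lower bath temperatures.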
