# **HW1: Regression**
In *assignment 1*, you need to finish:

1.  Basic Part: Implement two regression models to predict the Systolic blood pressure (SBP) of a patient. You will need to implement **both Matrix Inversion and Gradient Descent**.


> *   Step 1: Split Data
> *   Step 2: Preprocess Data
> *   Step 3: Implement Regression
> *   Step 4: Make Prediction
> *   Step 5: Train Model and Generate Result

2.  Advanced Part: Implement one regression model to predict the SBP of multiple patients in a different way than the basic part. You can choose **either** of the two methods for this part.

# **1. Basic Part (55%)**
In the first part, you need to implement regression to predict SBP from the given diastolic blood pressure (DBP).


## 1.1 Matrix Inversion Method (25%)


*   Save the prediction result in a csv file **hw1_basic_mi.csv**
*   Print your coefficient


### *Import Packages*

> Note: You **cannot** import any other package

In [1]:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import csv
import math
import random

### *Global attributes*
Define the global attributes

In [2]:
training_dataroot = 'hw1_basic_training.csv' # Training data file, named 'hw1_basic_training.csv'
testing_dataroot = 'hw1_basic_testing.csv'   # Testing data file, named 'hw1_basic_testing.csv'
output_dataroot = 'hw1_basic_mi.csv' # Output file will be named 'hw1_basic_mi.csv'

training_datalist =  [] # Training datalist, saved as numpy array
testing_datalist =  [] # Testing datalist, saved as numpy array

output_datalist =  [] # Your prediction, should be 20 * 3 matrix and saved as numpy array
                      # The format of each row should be ['subject_id', 'charttime', 'sbp']

You can add your own global attributes here


In [3]:
mape = [] # Mean absolute percentage error for each subject

### *Load the Input File*
First, load the basic input files **hw1_basic_training.csv** and **hw1_basic_testing.csv**.

The input data is stored in *training_datalist* and *testing_datalist*.

In [4]:
# Read input csv to datalist
with open(training_dataroot, newline='') as csvfile:
  training_datalist = np.array(list(csv.reader(csvfile)))

with open(testing_dataroot, newline='') as csvfile:
  testing_datalist = np.array(list(csv.reader(csvfile)))

### *Implement the Regression Model*

> Note: We recommend using the functions defined here, but you may also define your own functions


#### Step 1: Split Data
Split data in *training_datalist* into training dataset and validation dataset
* Validation dataset is used to validate your own model without the testing data
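
As a rough illustration of the split described above, the sketch below skips the header row and holds out the last 20% of the remaining rows for validation. The helper name `split_rows`, the `train_fraction` parameter, and the toy column names and values are assumptions made for this example; they are not part of the assignment template.

```python
import numpy as np

def split_rows(rows, train_fraction=0.8):
    """Split a 2-D array whose first row is a header into training and validation parts."""
    data = rows[1:]                              # drop the header row
    split_idx = int(len(data) * train_fraction)  # index of the first validation row
    return data[:split_idx], data[split_idx:]

# Toy data standing in for the CSV contents (values are made up)
rows = np.array([["dbp", "sbp"]] + [[str(60 + i), str(100 + i)] for i in range(10)])
train_part, valid_part = split_rows(rows)
print(train_part.shape, valid_part.shape)
```

Keeping the validation rows out of training is what lets the validation error act as an estimate of how the model behaves on unseen data.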



In [5]:
def SplitData():

    splitidx = int(len(training_datalist)*0.8)
    train_dataset = training_datalist[1:splitidx,0]
    valid_dataset = training_datalist[1:splitidx,1]
    testing1 = training_datalist[splitidx:,0]
    answer1 = training_datalist[splitidx:,1]

    return train_dataset, valid_dataset, testing1, answer1


#### Step 2: Preprocess Data
Handle the unreasonable data
> Hint: Outlier and missing data can be handled by removing the data or adding the values with the help of statistics  
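
One statistical way to follow this hint is a z-score filter. The sketch below is a minimal example under assumed values: the helper name `zscore_mask`, the threshold of 2, and the toy readings are all hypothetical.

```python
import numpy as np

def zscore_mask(values, threshold=2.0):
    """Return a boolean mask that is True for values within `threshold` standard deviations of the mean."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.abs(z) <= threshold

# Made-up readings with one implausible value at the end
sbp = np.array([118, 122, 120, 119, 121, 117, 123, 120, 300], dtype=float)
keep = zscore_mask(sbp)
print(sbp[keep])
```

The same mask has to be applied to the input feature and the target so that the two arrays stay aligned.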

In [6]:
def PreprocessData(train_dataset, valid_dataset):
    # Remove samples whose target (valid_dataset) lies more than 2 standard deviations from its mean
    valid_mean = np.mean(valid_dataset.astype(float))
    valid_std = np.std(valid_dataset.astype(float))
    zscore = (valid_dataset[:].astype(float) - valid_mean) / valid_std
    outlier_mask = np.abs(zscore) > 2

    train_dataset = train_dataset[~outlier_mask]
    valid_dataset = valid_dataset[~outlier_mask]

    # Remove samples whose input feature (train_dataset) is greater than 120
    train_mask = train_dataset[:].astype(float) <= 120

    train_dataset = train_dataset[train_mask]
    valid_dataset = valid_dataset[train_mask]

    # Remove samples whose input feature (train_dataset) is less than 60
    train_mask = train_dataset[:].astype(float) >= 60

    train_dataset = train_dataset[train_mask]
    valid_dataset = valid_dataset[train_mask]

    # Remove samples whose target (valid_dataset) is greater than 160
    valid_mask = valid_dataset[:].astype(float) <= 160
    train_dataset = train_dataset[valid_mask]
    valid_dataset = valid_dataset[valid_mask]

    return train_dataset, valid_dataset







#### Step 3: Implement Regression
> use Matrix Inversion to finish this part
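
For reference, the matrix inversion method solves the least-squares problem in closed form as w = (XᵀX)⁻¹Xᵀy, where X is the input with a leading column of ones for the intercept. The sketch below is a minimal example on made-up, noise-free data; the helper name `fit_normal_equation` and the toy values are assumptions.

```python
import numpy as np

def fit_normal_equation(x, y):
    """Fit y = w0 + w1 * x by the normal equation and return [[w0], [w1]]."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    X = np.hstack((np.ones_like(x), x))       # design matrix: bias column, then the feature
    return np.linalg.inv(X.T @ X) @ X.T @ y   # w = (X^T X)^-1 X^T y

# Toy data generated from y = 1.5 * x + 30 (made-up numbers)
x = np.array([60, 70, 80, 90])
y = 1.5 * x + 30
w = fit_normal_equation(x, y)
print(w.ravel())  # intercept first, then slope
```

On noise-free data the recovered coefficients should match the generating line up to floating-point error, which makes this a convenient sanity check before running on the real files.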




In [7]:
def MatrixInversion(train_dataset, valid_dataset):

    x = train_dataset
    x = x.astype(float) # convert data type to float
    x = x.reshape((len(x), 1))
    ones = np.ones((len(x), 1)) # create an array of 1s with the same number of rows as x
    x = np.hstack((ones, x))


    y = valid_dataset
    y = y.astype(float) # convert data type to float
    y = y.reshape((len(y), 1))

    x_trans = np.transpose(x)
    mult_x = np.dot(x_trans, x)
    inv_mult_x = np.linalg.inv(mult_x)
    final = np.dot(np.dot(inv_mult_x, x_trans), y) # coefficients: [intercept, slope]


    return final,x,y




#### Step 4: Make Prediction
Make predictions for the testing dataset and store the values in *output_datalist*.

The final *output_datalist* should look something like this:
> [ [100], [80], ... , [90] ] where each row contains the predicted SBP

In [8]:

def MakePrediction(data, coeffs):
    prediction = np.dot(data, coeffs)
    return prediction


#### Step 5: Train Model and Generate Result

> Notice: **Remember to output the coefficients of the model here**, otherwise 5 points would be deducted
* If your regression model is *3x^2 + 2x^1 + 1*, your output would be:
```
3 2 1
```
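
A small sketch of this output format, assuming the coefficients are stored intercept-first as `[w0, w1, ..., wn]`. The helper name `format_coefficients` is hypothetical.

```python
def format_coefficients(weights):
    """Join coefficients with spaces, highest-degree term first.

    `weights` is assumed to be ordered [w0, w1, ..., wn] (intercept first).
    """
    return " ".join(str(w) for w in reversed(weights))

# 3x^2 + 2x^1 + 1 is stored as [1, 2, 3] under the assumed ordering
print(format_coefficients([1, 2, 3]))
```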





In [26]:
train, valid, soal, answer = SplitData()
train, valid = PreprocessData(train, valid)
coeffs, X, Y = MatrixInversion(train, valid)
# newpred=MakePrediction(X, coeffs) #Test Mape

#Testing Data (Output Datalist)
test_col = testing_datalist[:, 0]
test_col = test_col[1:].astype(float)
test_col = test_col.reshape((len(test_col), 1))
ones = np.ones((test_col.shape[0], 1)) # create an array of 1s with the same number of rows as x
test_col = np.hstack((ones,test_col))
output_datalist = MakePrediction(test_col, coeffs) #Final Result

######Mape Calculation##########################################################################################
# cekTrain , cekValid = SplitData()
cekTrain = soal.astype(float)
cekValid = answer.astype(float)
cekTrain = cekTrain.reshape((len(cekTrain), 1))
ones = np.ones((cekTrain.shape[0], 1)) # create an array of 1s with the same number of rows as x
cekTrain = np.hstack((ones,cekTrain))
newpred=MakePrediction(cekTrain, coeffs)
mape=[]
for i in range(len(newpred)):
    mape.append(abs((cekValid[i]-newpred[i]))/abs(cekValid[i]))

hasil = np.mean(mape) * 100

print("MAPE :"+ str((hasil))+"%")
print("Coefficients: "+ str(round(coeffs[1][0]))+" "+str(round(coeffs[0][0])))
##############################################################################################################


### *Write the Output File*
Write the prediction to output csv
> Format: 'sbp'




In [17]:
with open(output_dataroot, 'w', newline='', encoding="utf-8") as csvfile:
  writer = csv.writer(csvfile)
  for row in output_datalist:
    writer.writerow(row)

## 1.2 Gradient Descent Method (30%)


*   Save the prediction result in a csv file **hw1_basic_gd.csv**
*   Output your coefficient update in a csv file **hw1_basic_coefficient.csv**
*   Print your coefficient





### *Global attributes*

In [18]:
output_dataroot = 'hw1_basic_gd.csv' # Output file will be named 'hw1_basic_gd.csv'
coefficient_output_dataroot = 'hw1_basic_coefficient.csv'

training_datalist =  [] # Training datalist, saved as numpy array
testing_datalist =  [] # Testing datalist, saved as numpy array

output_datalist =  [] # Your prediction, should be 20 * 3 matrix and saved as numpy array
                      # The format of each row should be ['subject_id', 'charttime', 'sbp']

coefficient_output = [] # Your coefficient update during gradient descent
                   # Should be a (number of iterations * number_of coefficient) matrix
                   # The format of each row should be ['w0', 'w1', ...., 'wn']

### *Load the Input File*
First, load the basic input files **hw1_basic_training.csv** and **hw1_basic_testing.csv**.

The input data is stored in *training_datalist* and *testing_datalist*.

In [19]:
# Read input csv to datalist
with open(training_dataroot, newline='') as csvfile:
  training_datalist = np.array(list(csv.reader(csvfile)))

with open(testing_dataroot, newline='') as csvfile:
  testing_datalist = np.array(list(csv.reader(csvfile)))

Your own global attributes

In [20]:
mapes=[]
coefficient_output = []


### *Implement the Regression Model*


#### Step 1: Split Data

In [21]:
def SplitData():


    splitidx = int(len(training_datalist)*0.8)
    train_dataset = training_datalist[1:splitidx,0]
    valid_dataset = training_datalist[1:splitidx,1]
    testing1 = training_datalist[splitidx:,0]
    answer1 = training_datalist[splitidx:,1]

    return train_dataset, valid_dataset, testing1, answer1


#### Step 2: Preprocess Data

In [22]:
def PreprocessData(train_dataset, valid_dataset):
    # Remove samples whose target (valid_dataset) lies more than 2 standard deviations from its mean
    valid_mean = np.mean(valid_dataset.astype(float))
    valid_std = np.std(valid_dataset.astype(float))
    zscore = (valid_dataset[:].astype(float) - valid_mean) / valid_std
    outlier_mask = np.abs(zscore) > 2

    train_dataset = train_dataset[~outlier_mask]
    valid_dataset = valid_dataset[~outlier_mask]

    # Remove samples whose input feature (train_dataset) is greater than 90
    train_mask = train_dataset[:].astype(float) <= 90

    train_dataset = train_dataset[train_mask]
    valid_dataset = valid_dataset[train_mask]

    # Remove samples whose target (valid_dataset) is greater than 140
    valid_mask = valid_dataset[:].astype(float) <= 140
    train_dataset = train_dataset[valid_mask]
    valid_dataset = valid_dataset[valid_mask]

    return train_dataset, valid_dataset


#### Step 3: Implement Regression
> use Gradient Descent to finish this part
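
For a line y = m·x + c fitted by minimizing the mean squared error, the gradients are ∂L/∂m = (−2/n) Σ xᵢ(yᵢ − ŷᵢ) and ∂L/∂c = (−2/n) Σ (yᵢ − ŷᵢ). The sketch below applies these updates to made-up, noise-free data. The function name `gradient_descent`, the learning rate, and the iteration count are assumptions chosen for the toy data, not recommended settings for the assignment.

```python
import numpy as np

def gradient_descent(x, y, alpha, iterations):
    """Fit y = m * x + c by batch gradient descent on the mean squared error."""
    m, c = 0.0, 0.0
    n = len(x)
    history = []                                    # one (m, c) pair per iteration
    for _ in range(iterations):
        residual = y - (m * x + c)
        grad_m = (-2.0 / n) * np.sum(x * residual)  # dL/dm
        grad_c = (-2.0 / n) * np.sum(residual)      # dL/dc
        m -= alpha * grad_m
        c -= alpha * grad_c
        history.append((m, c))
    return m, c, history

# Toy data generated from y = 2x + 1 (made-up numbers)
x = np.arange(5, dtype=float)
y = 2.0 * x + 1.0
m, c, history = gradient_descent(x, y, alpha=0.05, iterations=5000)
print(m, c)
```

With larger, unscaled inputs the learning rate usually has to be much smaller for the updates to stay stable, which in turn requires more iterations.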

In [23]:

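def GradientDescent(X, y, alpha, iterations, n , m , c):
    # Fit y = m*X + c by minimizing the mean squared error over n samples
    for i in range(iterations):
        newpred = m*X + c                        # current predictions
        d_m = (-2/n) * sum(X * (y - newpred))    # gradient with respect to the slope m
        d_c = (-2/n) * sum(y - newpred)          # gradient with respect to the intercept c
        m = m - alpha * d_m
        c = c - alpha * d_c
        coefficient_output.append([m,c])         # record this update for the coefficient csv

    return m,c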




#### Step 4: Make Prediction

Make predictions for the testing dataset and store the values in *output_datalist*.

The final *output_datalist* should look something like this:
> [ [100], [80], ... , [90] ] where each row contains the predicted SBP

Remember to also store your coefficient updates in *coefficient_output*.

The final *coefficient_output* should look something like this:
> [ [1, 0, 3, 5], ... , [0.1, 0.3, 0.2, 0.5] ] where each row contains the [w0, w1, ..., wn] of your coefficient





In [24]:
def MakePrediction(m, X, c):
    prediction = m*X+c
    return prediction


#### Step 5: Train Model and Generate Result

> Notice: **Remember to output the coefficients of the model here**, otherwise 5 points would be deducted
* If your regression model is *3x^2 + 2x^1 + 1*, your output would be:
```
3 2 1
```



In [27]:
coefficient_output = []
train, valid, cekTrain, cekValid = SplitData()
X, Y = PreprocessData(train, valid)
X = X.astype(float) # convert data type to float
Y= Y.astype(float) # convert data type to float
length = float(len(X))
coeff,bias = GradientDescent(X, Y, 0.0001, 200000, length, 0, 0)

cekTrain = cekTrain.astype(float) # convert data type to float
cekValid = cekValid.astype(float) # convert data type to float

newpred = MakePrediction(coeff, cekTrain, bias) #Test Mape


test_col = testing_datalist[:, 0]
test_col = test_col[1:].astype(float)

output_datalist = MakePrediction(coeff, test_col, bias) #Final Result
output_datalist = output_datalist.reshape((len(output_datalist),1))

######Mape Calculation##########################################################################################
mapes=[]
for i in range(len(newpred)):
    mapes.append(abs((cekValid[i]-newpred[i]))/abs(cekValid[i]))

hasil = np.mean(mapes) * 100

print("MAPE :"+ str((hasil))+"%")
print("Coefficients: "+ str(round(coeff))+" "+str(round(bias)))
##############################################################################################################

MAPE :5.447731385014828%
Coefficients: 1 34
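
The MAPE loop in the cell above can also be written in vectorized form. The sketch below is one way to do it; the helper name `mape_percent` and the toy numbers are assumptions. Both inputs are flattened first, because subtracting an `(n, 1)` array from an `(n,)` array would otherwise broadcast to an `(n, n)` matrix.

```python
import numpy as np

def mape_percent(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float).ravel()        # flatten to avoid accidental broadcasting
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Made-up values: both predictions are off by 10% of the true value
print(mape_percent([100, 200], [[110], [180]]))
```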


### *Write the Output File*

Write the prediction to output csv
> Format: 'sbp'

**Write the coefficient update to csv**
> Format: 'w0', 'w1', ..., 'wn'
>*   The number of columns is based on your number of coefficient
>*   The number of row is based on your number of iterations
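
As a sketch of this layout, the example below writes one row per iteration and one column per coefficient. It writes to an in-memory buffer instead of a file so that it runs anywhere; the helper name `write_rows` and the coefficient values are made up for illustration.

```python
import csv
import io

def write_rows(stream, rows):
    """Write each inner sequence of `rows` as one CSV line."""
    writer = csv.writer(stream)
    for row in rows:
        writer.writerow(row)

# Hypothetical coefficient history: three iterations, two coefficients each
history = [[0.0, 0.1], [0.05, 0.18], [0.09, 0.25]]
buffer = io.StringIO()
write_rows(buffer, history)
print(buffer.getvalue())
```

Passing a file opened with `newline=''` instead of the buffer gives the same rows on disk.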

In [28]:
with open(output_dataroot, 'w', newline='', encoding="utf-8") as csvfile:
  writer = csv.writer(csvfile)
  for row in output_datalist:
    writer.writerow(row)

with open(coefficient_output_dataroot, 'w', newline='', encoding="utf-8") as csvfile:
  writer = csv.writer(csvfile)
  for row in coefficient_output:
    writer.writerow(row)

# **2. Advanced Part (40%)**
In the second part, you need to implement the regression in a different way than the basic part to help predict the SBP of multiple patients.

You can choose **either** Matrix Inversion or Gradient Descent method.

The training data will be in **hw1_advanced_training.csv** and the testing data will be in **hw1_advanced_testing.csv**.

Output your prediction in **hw1_advanced.csv**

Notice:
> You cannot import any package other than those given



### Input the training and testing dataset

In [29]:
training_dataroot = 'hw1_advanced_training.csv' # Training data file, named 'hw1_advanced_training.csv'
testing_dataroot = 'hw1_advanced_testing.csv'   # Testing data file, named 'hw1_advanced_testing.csv'
output_dataroot = 'hw1_advanced.csv' # Output file will be named 'hw1_advanced.csv'

training_datalist =  [] # Training datalist, saved as numpy array
testing_datalist =  [] # Testing datalist, saved as numpy array

output_datalist =  [] # Your prediction, should be 220 * 1 matrix and saved as numpy array
                      # The format of each row should be ['sbp']

### Your Implementation

In [30]:
mape =[] # Mean absolute percentage error for each subject
tabungan =[]

Load Input

In [31]:
with open(training_dataroot, newline='') as csvfile:
  training_datalist = pd.read_csv(csvfile)

with open(testing_dataroot, newline='') as csvfile:
  testing_datalist = pd.read_csv(csvfile)

In [32]:
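def encode():
    # Replace each subject_id with a 1-based integer code, using the same code in both datasets.
    # The result is assigned back to the column instead of using inplace=True on a column
    # selection, which newer pandas versions warn about and may not apply to the DataFrame.
    for i,encodedMap in enumerate(training_datalist['subject_id'].unique()):
        training_datalist['subject_id'] = training_datalist['subject_id'].replace(encodedMap,i+1)
        testing_datalist['subject_id'] = testing_datalist['subject_id'].replace(encodedMap,i+1)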



In [33]:
def iqr(datasetku): #IQR
    q1 = np.percentile(datasetku, 25)
    q3 = np.percentile(datasetku, 75)
    iqr = q3 - q1
    lower = q1 -(1.5 * iqr)
    upper = q3 +(1.5 * iqr)
    return upper, lower
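
The `iqr` helper above computes the usual 1.5 × IQR fences. As a quick check of what those fences do, here is a standalone sketch on made-up numbers. The name `iqr_bounds` and the toy values are assumptions, and unlike `iqr` it returns the lower bound first.

```python
import numpy as np

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) outlier fences based on the interquartile range."""
    q1, q3 = np.percentile(values, [25, 75])
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

# Made-up values with one obvious outlier
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100], dtype=float)
lower, upper = iqr_bounds(values)
kept = values[(values >= lower) & (values <= upper)]
print(lower, upper, kept)
```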

In [34]:
def SplitPreProData():
    encode() #Encode the subject_id
    dataset = training_datalist[['subject_id','temperature', 'heartrate', 'resprate', 'o2sat', 'sbp']]
    test_dataset = testing_datalist[['subject_id','temperature', 'heartrate', 'resprate', 'o2sat']]
    dataset = dataset.dropna(subset=['temperature', 'heartrate', 'resprate', 'o2sat'])

    upper_resp, lower_resp = iqr(dataset['resprate'])
    upper_hr, lower_hr = iqr(dataset['heartrate'])
    upper_o2, lower_o2 = iqr(dataset['o2sat'])
    upper_temp, lower_temp = iqr(dataset['temperature'])
    dataset = dataset[(dataset['resprate'] <= upper_resp) & (dataset['resprate'] >= lower_resp)]
    dataset = dataset[(dataset['heartrate'] <= upper_hr) & (dataset['heartrate'] >= lower_hr)]
    dataset = dataset[(dataset['o2sat'] >= lower_o2) & (dataset['o2sat'] <= upper_o2)]
    dataset = dataset[(dataset['temperature'] >= lower_temp ) & (dataset['temperature'] <= upper_temp)]
    train_dataset = dataset[['subject_id','temperature', 'heartrate', 'resprate', 'o2sat']]
    valid_dataset = dataset[['sbp']]

    train_dataset.insert(0, 'bias', 1)
    test_dataset.insert(0, 'bias', 1)


    train_data ={}
    valid_data = {}
    test_data = {}
    cekTrain ={}
    cekValid = {}
   

    for subject_id in training_datalist['subject_id'].unique():
        
        subject_data = train_dataset[train_dataset['subject_id'] == subject_id].drop(columns=['subject_id'])
        subject_valid = valid_dataset[train_dataset['subject_id'] == subject_id]
        splitidx = int(len(subject_data)*0.8)

        train_data[subject_id] = subject_data[:splitidx].values
        valid_data[subject_id] = subject_valid[:splitidx].values
        cekTrain[subject_id] = subject_data[splitidx:].values #Dataset for Testing MAPE 20%
        cekValid[subject_id] = subject_valid[splitidx:].values #Dataset for Testing MAPE 20%

    for subject_id in testing_datalist['subject_id'].unique():
        test_sub = test_dataset[test_dataset['subject_id'] == subject_id].drop(columns=['subject_id'])
        test_data[subject_id] = test_sub.values

    return train_data, valid_data, test_data, cekTrain, cekValid


In [35]:
def MatrixInversion(train_dataset, valid_dataset):

    x = train_dataset
    y = valid_dataset

    x_trans = np.transpose(x)
    mult_x = np.dot(x_trans, x)
    inv_mult_x = np.linalg.inv(mult_x)
    final = np.dot(np.dot(inv_mult_x, x_trans), y) #coeefficents

    return final,x,y




In [36]:

def MakePrediction(data, coeffs):
    prediction = np.dot(data, coeffs)
    return prediction


In [21]:
train, valid, test, cekTrain, cekValid = SplitPreProData() # Split the data after preprocessing (outliers removed)
model_coeffs = {}
Y_valid = {}
new_result = {}
tabungan = []
output_datalist = []

for subject_id in training_datalist['subject_id'].unique():
    train_ku = train[subject_id].astype(float)
    valid_ku = valid[subject_id].astype(float)
    model_coeffs[subject_id],X_train,Y_valid[subject_id] = MatrixInversion(train_ku, valid_ku)

#Testing Datalist (Output Datalist)
for subject_id in testing_datalist['subject_id'].unique():
    tabungan.append(MakePrediction(test[subject_id], model_coeffs[subject_id])) #Final Result

for i in range(len(tabungan)):
    for j in range(len(tabungan[i])):
        output_datalist.append(tabungan[i][j])
############################################################################################################


# Test on the held-out 20% of the data (MAPE)
for subject_id in training_datalist['subject_id'].unique():
    new_result[subject_id]=MakePrediction(cekTrain[subject_id], model_coeffs[subject_id]) #Test Mape

rata2 = 0
for subject_id in training_datalist['subject_id'].unique():
    mape = []
    for i in range(len(new_result[subject_id])):
        mape.append(abs((cekValid[subject_id][i]-new_result[subject_id][i]))/abs(cekValid[subject_id][i]))

    hasil = np.mean(mape) * 100
    rata2 += hasil
    print("Mape ", subject_id, ":", hasil, "%")

rata2 = rata2/len(training_datalist['subject_id'].unique())
print("Average MAPE :", rata2, "%")
############################################################################################################

Mape  1 : 10.941775306321471 %
Mape  2 : 9.480207500298791 %
Mape  3 : 15.847571816471028 %
Mape  4 : 11.615722886697426 %
Mape  5 : 7.957093790008299 %
Mape  6 : 15.231450541720209 %
Mape  7 : 8.720614405590746 %
Mape  8 : 13.839379865620751 %
Mape  9 : 13.767378930905746 %
Mape  10 : 10.316288730324652 %
Mape  11 : 11.286465485305417 %
Average MAPE : 11.727631750842233 %


Gradient Descent

In [37]:
def Gradient_Descent(x , y , iterations , Alpha, m) :
    theta = np.zeros((x.shape[1],1))
    for i in range(iterations) :
        Prediction = np.dot(x , theta)
        Error = Prediction - y
        gradient = (1/m) * np.dot(x.T , Error)
        theta = theta - Alpha * gradient
        
    return theta
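
As a sanity check for a multi-feature gradient descent of this form, the sketch below runs the same update rule on a small made-up design matrix and compares the result with NumPy's least-squares solver. The function name `batch_gradient_descent`, the toy data, and the learning rate are assumptions chosen so that the example converges quickly.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha, iterations):
    """Minimize the squared error of X @ theta against y by batch gradient descent."""
    m = X.shape[0]
    theta = np.zeros((X.shape[1], 1))
    for _ in range(iterations):
        error = X @ theta - y                      # prediction error, shape (m, 1)
        theta = theta - alpha * (X.T @ error) / m  # same update rule as Gradient_Descent above
    return theta

# Made-up design matrix: bias column followed by two small-valued features
X = np.array([[1, 0, 1], [1, 1, 0], [1, 1, 1], [1, 2, 1], [1, 0, 2]], dtype=float)
true_theta = np.array([[1.0], [2.0], [3.0]])
y = X @ true_theta
theta = batch_gradient_descent(X, y, alpha=0.1, iterations=20000)
reference, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(theta, reference, atol=1e-6))
```

When the features have very different scales, as raw vital signs typically do, a much smaller learning rate is needed for stability and convergence becomes correspondingly slower.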

In [40]:

def MakePrediction(data, coeffs):
    prediction = np.dot(data, coeffs)
    return prediction



In [41]:
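train, valid, test, cekTrain, cekValid = SplitPreProData() # Split the data after preprocessing (outliers removed)
alpha = 0.000001
myTheta ={}
new_result = {}
tabungan = []         # reset so predictions from an earlier cell are not appended to again
output_datalist = []  # reset so the output file only contains the Gradient Descent predictions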
for subject_id in training_datalist['subject_id'].unique():
    m = len(train[subject_id])
    myTheta[subject_id] = Gradient_Descent(train[subject_id], valid[subject_id],500000, alpha, m)

#Testing Datalist (Output Datalist)
for subject_id in testing_datalist['subject_id'].unique():
    tabungan.append(MakePrediction(test[subject_id], myTheta[subject_id])) #Final Result

for i in range(len(tabungan)):
    for j in range(len(tabungan[i])):
        output_datalist.append(tabungan[i][j])


#Make Prediction (MAPE Calculation)
############################################################################################################

for subject_id in training_datalist['subject_id'].unique():
    new_result[subject_id]=MakePrediction(cekTrain[subject_id], myTheta[subject_id]) #Test Mape

rata2 = 0
for subject_id in training_datalist['subject_id'].unique():
    mape = []
    hasil = 0
    for i in range(len(new_result[subject_id])):
        mape.append(abs((cekValid[subject_id][i]-new_result[subject_id][i]))/abs(cekValid[subject_id][i]))

    hasil = np.mean(mape) * 100
    rata2 += hasil
    print("Mape ", subject_id, ":", hasil, "%")

rata2 = rata2/len(training_datalist['subject_id'].unique())
print("Average MAPE :", rata2, "%")
############################################################################################################

Mape  1 : 11.02448313717266 %
Mape  2 : 9.664178182621333 %
Mape  3 : 16.141628275984218 %
Mape  4 : 11.751044630760315 %
Mape  5 : 7.795393528074119 %
Mape  6 : 14.979934518836993 %
Mape  7 : 8.533093063216443 %
Mape  8 : 13.200718902650504 %
Mape  9 : 13.744032411460392 %
Mape  10 : 10.320047059151815 %
Mape  11 : 11.41715892428236 %
Average MAPE : 11.688337512201015 %


### Output your Prediction

> your filename should be **hw1_advanced.csv**

In [42]:
with open(output_dataroot, 'w', newline='', encoding="utf-8") as csvfile:
  writer = csv.writer(csvfile)
  for row in output_datalist:
    writer.writerow(row)

# Report *(5%)*

Report should be submitted as a pdf file **hw1_report.pdf**

*   Briefly describe the difficulty you encountered
*   Summarize your work and your reflections
*   No more than one page






# Save the Code File
Please save your code and submit it as an ipynb file! (**hw1.ipynb**)