Name: **Luong Nguyen**  
Student ID: **1504210**  

## Introduction to Deep Learning 

### Session01: Linear model for grading
____

**Import the packages needed for this assignment**

In [93]:
# import packages
import numpy as np
import keras

**Load data**

In [94]:
with np.load("grading.npz") as data:
    x = data[ 'x' ]
    y = data[ 'y' ]

**Explore data**

In [95]:
print("x shape: " + str(x.shape))
print("y shape: " + str(y.shape))
print("An example of points from the course: " + str(x[5]))
print("And the corresponding grade: " + str(y[5]))
print()

mean_x = np.mean(x, axis=0)
std_x = np.std(x, axis=0)
max_x = np.max(x, axis=0)
min_x = np.min(x, axis=0)
print("Mean of x: " + str(mean_x))
print("Min of x: " + str(min_x))
print("Max of x: " + str(max_x))
print()

mean_y = np.mean(y)
max_y = np.max(y)
min_y = np.min(y)
print("Mean of y: " + str(mean_y))
print("Min of y: " + str(min_y))
print("Max of y: " + str(max_y))

x shape: (4000, 3)
y shape: (4000,)
An example of points from the course: [66. 27. 48.]
And the corresponding grade: 2.0

Mean of x: [74.01725 55.40175 61.82125]
Min of x: [ 7.  0. 25.]
Max of x: [100. 100.  98.]

Mean of y: 3.03275
Min of y: 0.0
Max of y: 5.0


**Standardize data**

In [96]:
x -= mean_x
x /= std_x  # divide by one standard deviation (the course example used x_train /= (2 * std))

print("A standardized row of data: " + str(x[5]))

A standardized row of data: [-0.44488305 -1.448874   -0.63900849]
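
The cell above computes `mean_x` and `std_x` over all 4000 samples, before the train/test split. A common variation, not what this notebook does, is to compute the statistics on the training rows only and reuse them for the test rows. The sketch below illustrates that idea on synthetic data; the helper name `standardize_with_train_stats` and the random stand-in array are assumptions made for illustration, not part of the assignment.

```python
import numpy as np

def standardize_with_train_stats(train, test):
    """Scale both sets with the mean and std of the training set only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (train - mean) / std, (test - mean) / std, mean, std

# Synthetic stand-in for the course points (NOT the real grading.npz data).
rng = np.random.default_rng(0)
raw = rng.uniform(0, 100, size=(4000, 3))
train_raw, test_raw = raw[:3000], raw[3000:]

train_std, test_std, mean, std = standardize_with_train_stats(train_raw, test_raw)
print("Train mean after scaling:", train_std.mean(axis=0))
print("Train std after scaling: ", train_std.std(axis=0))
print("Test mean after scaling: ", test_std.mean(axis=0))
```

The training columns end up with mean 0 and standard deviation 1, while the test columns are only approximately standardized, which is expected because they were scaled with statistics they did not contribute to.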


**Split data into train and test sets**

In [97]:
# https://stackoverflow.com/questions/3674409/how-to-split-partition-a-dataset-into-training-and-test-datasets-for-e-g-cros
indices = np.random.permutation(x.shape[0])
m_train = 3000
m_test = 1000

train_idx, test_idx = indices[:m_train], indices[m_train:]
train_x, test_x = x[train_idx,:], x[test_idx,:]
train_y, test_y = y[train_idx], y[test_idx]

print ("train_x shape: " + str(train_x.shape))
print ("train_y shape: " + str(train_y.shape))
print ("test_x shape: " + str(test_x.shape))
print ("test_y shape: " + str(test_y.shape))

train_x shape: (3000, 3)
train_y shape: (3000,)
test_x shape: (1000, 3)
test_y shape: (1000,)


The 4000 samples are randomly split into a training set of 3000 samples and a test set of 1000 samples.
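
Because `np.random.permutation` is called without a seed, the split changes on every run, which makes results harder to compare. A minimal sketch of a reproducible split is shown below; the helper `split_indices` and the seed value are hypothetical choices, not part of the original notebook.

```python
import numpy as np

def split_indices(n_samples, n_train, seed=0):
    """Return shuffled train and test index arrays that are identical for a given seed."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(4000, 3000, seed=42)
print(len(train_idx), len(test_idx))

# The same seed gives the same split again.
train_again, test_again = split_indices(4000, 3000, seed=42)
print(np.array_equal(train_idx, train_again))
```

The resulting index arrays can be used exactly like `train_idx` and `test_idx` in the cell above.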

**Build model**

In [98]:
model = keras.models.Sequential()
model.add(keras.layers.Dense(1, input_shape=(3,)))

model.compile(loss='mse', optimizer='sgd')
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 1)                 4         
Total params: 4
Trainable params: 4
Non-trainable params: 0
_________________________________________________________________


**Train model**

In [99]:
def predict(epochs, batch_size):
    """
    Helper function: takes epochs and batch_size, trains the model, and prints the train/test accuracy and the model weights.
    """
    print("#############################################################################################")
    print("Training model with epochs = {0}, batch_size = {1}".format(epochs, batch_size))
    hist = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size)

    prediction_test = model.predict(test_x).reshape(m_test,)
    prediction_train = model.predict(train_x).reshape(m_train,)

    print("\tTrain accuracy: {} %".format(np.sum(np.round(prediction_train) == train_y) / m_train * 100))
    print("\tTest accuracy: {} %".format(np.sum(np.round(prediction_test) == test_y) / m_test * 100))

    print("\tWeight: ")
    print(model.get_weights())
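
The accuracy above treats a prediction as correct when the rounded model output equals the grade. The sketch below isolates that computation in a small NumPy helper. The name `grade_accuracy` is hypothetical, and the clipping to the 0 to 5 grade range is an added assumption: the original code rounds without clipping.

```python
import numpy as np

def grade_accuracy(predictions, targets, min_grade=0, max_grade=5):
    """Percentage of predictions whose rounded, clipped value equals the target grade."""
    predictions = np.asarray(predictions, dtype=float).reshape(-1)
    targets = np.asarray(targets, dtype=float).reshape(-1)
    # Round to the nearest grade, then keep the result inside the valid range.
    rounded = np.clip(np.round(predictions), min_grade, max_grade)
    return float(np.mean(rounded == targets) * 100)

# Made-up predictions and grades, for illustration only.
example_pred = np.array([2.4, 3.6, 5.7, -0.6])
example_true = np.array([2.0, 3.0, 5.0, 0.0])
print(grade_accuracy(example_pred, example_true))  # 75.0
```

In the example, 3.6 rounds to 4 and misses the true grade 3, while 5.7 and -0.6 are counted as correct only because of the clipping.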

In [102]:
# Does training with more epochs help?
epochs_list = [5, 10, 25, 30]
batch_size = 20

for epochs in epochs_list:
    predict(epochs, batch_size)

#############################################################################################
Training model with epochs = 5, batch_size = 20
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
	Train accuracy: 78.53333333333333 %
	Test accuracy: 77.10000000000001 %
	Weight: 
[array([[1.314534  ],
       [0.31233355],
       [0.35144585]], dtype=float32), array([3.015653], dtype=float32)]
#############################################################################################
Training model with epochs = 10, batch_size = 20
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
	Train accuracy: 78.83333333333333 %
	Test accuracy: 76.8 %
	Weight: 
[array([[1.318964 ],
       [0.3190328],
       [0.3460534]], dtype=float32), array([3.026141], dtype=float32)]
#############################################################################################
Training model with epochs = 25, batch_size = 20
Epoch 1/25
Epoch 2/25
Epoch 3/2

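The model reaches an accuracy of around **78%** on the test set.   
The effect of the number of epochs on the accuracy is not clear in this example. On the first run, more epochs appeared to increase the accuracy, but rerunning the code block gave different results. One likely reason is that `predict` keeps training the same `model` object, so each call starts from the weights left by the previous call rather than from a fresh initialization. The weights printed after 5 and after 10 epochs are already almost identical, which suggests the model had converged before this comparison started.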
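
One way to make such a comparison fairer is to start every run from fresh weights and to repeat each setting with several random seeds. The sketch below does this with a small NumPy implementation of mini-batch SGD on synthetic data. It is not the Keras model from this notebook: the function names, the learning rate of 0.01, and the generated data are all assumptions made for illustration.

```python
import numpy as np

def train_linear_sgd(x, y, epochs, batch_size, lr=0.01, seed=0):
    """Mini-batch SGD on mean squared error, always starting from fresh weights."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=x.shape[1])
    b = 0.0
    n = x.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            error = x[batch] @ w + b - y[batch]
            # Gradient of the mean squared error over the batch.
            w -= lr * 2 * x[batch].T @ error / len(batch)
            b -= lr * 2 * error.mean()
    return w, b

def rounded_accuracy(x, y, w, b):
    """Percentage of rounded predictions that equal the target."""
    return float(np.mean(np.round(x @ w + b) == y) * 100)

# Synthetic stand-in data (NOT the real grading.npz data).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(1000, 3))
noise = rng.normal(scale=0.3, size=1000)
y_demo = np.clip(np.round(x_demo @ np.array([1.3, 0.3, 0.35]) + 3.0 + noise), 0, 5)
x_tr, y_tr, x_te, y_te = x_demo[:750], y_demo[:750], x_demo[750:], y_demo[750:]

for epochs in [5, 10, 25]:
    scores = []
    for seed in range(5):
        w, b = train_linear_sgd(x_tr, y_tr, epochs, batch_size=20, seed=seed)
        scores.append(rounded_accuracy(x_te, y_te, w, b))
    print(f"epochs={epochs:>2}: mean={np.mean(scores):.1f} %, std={np.std(scores):.1f}")
```

Reporting the mean and spread over several seeds shows whether a difference between two settings is larger than the run-to-run noise.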

In [101]:
# Does batch_size have an effect?
epochs = 25
batch_size_list = [5, 10, 25, 50, 100]

for batch_size in batch_size_list:
    predict(epochs, batch_size)

#############################################################################################
Training model with epochs = 25, batch_size = 5
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
	Train accuracy: 77.26666666666667 %
	Test accuracy: 75.3 %
	Weight: 
[array([[1.3582518 ],
       [0.31986377],
       [0.35444602]], dtype=float32), array([3.016034], dtype=float32)]
#############################################################################################
Training model with epochs = 25, batch_size = 10
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21

Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
	Train accuracy: 79.86666666666666 %
	Test accuracy: 78.7 %
	Weight: 
[array([[1.3047019 ],
       [0.32717434],
       [0.35809228]], dtype=float32), array([3.035168], dtype=float32)]
#############################################################################################
Training model with epochs = 25, batch_size = 100
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
	Train accuracy: 79.86666666666666 %
	Test accuracy: 78.7 %
	Weight: 
[array([[1.3034441 ],
       [0.32521862],
       [0.3573995 ]], dtype=float32), array([3.034707], dtype=float32)]


The smaller the batch_size, the longer the model took to train, because a smaller batch means more weight updates per epoch. Beyond that, the effect of batch_size on the accuracy is not clear, because the results changed every time the code was rerun.
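
The training-time difference follows from simple arithmetic: each epoch processes all 3000 training samples, so the number of weight updates per epoch is the training-set size divided by the batch size, rounded up. The helper name `updates_per_epoch` below is a hypothetical one used only for this illustration.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches, and therefore weight updates, in one epoch."""
    return math.ceil(n_samples / batch_size)

m_train = 3000  # training-set size used in this notebook
for batch_size in [5, 10, 25, 50, 100]:
    updates = updates_per_epoch(m_train, batch_size)
    print(f"batch_size={batch_size:>3}: {updates} updates per epoch")
```

With batch_size = 5 there are 600 updates per epoch, compared with 30 for batch_size = 100, which is consistent with the smaller batch sizes being slower.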

*Do the weights you get make sense?*

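Because the features are standardized, the weights can be compared directly. The 1st feature has by far the largest weight (about 1.3, versus roughly 0.3 for the other two), so it has the most effect on the grade; maybe it is the exam grade. The bias of about 3.03 is also plausible: it is very close to the mean grade (3.03), which is what a linear model fitted on centered features is expected to give.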
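
A linear model trained with mean squared error can also be fitted in closed form, which gives a reference to compare SGD weights against. The sketch below uses `np.linalg.lstsq` on synthetic data and checks that, with centered features, the fitted bias equals the mean of the targets. The data, the coefficients used to generate it, and the helper `fit_least_squares` are assumptions for illustration, not the assignment's data or method.

```python
import numpy as np

def fit_least_squares(x, y):
    """Closed-form least-squares fit; returns (weights, bias)."""
    design = np.column_stack([x, np.ones(len(x))])  # last column models the bias
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

# Synthetic stand-in data (NOT the real grading.npz data).
rng = np.random.default_rng(0)
raw = rng.uniform(0, 100, size=(500, 3))
y_demo = (0.05 * raw[:, 0] + 0.01 * raw[:, 1] + 0.01 * raw[:, 2]
          + rng.normal(scale=0.2, size=500))

# Standardize the features as in the notebook: zero mean, unit std per column.
x_demo = (raw - raw.mean(axis=0)) / raw.std(axis=0)

weights, bias = fit_least_squares(x_demo, y_demo)
print("weights:  ", weights)
print("bias:     ", bias)
print("mean of y:", y_demo.mean())
```

On the standardized features the largest weight belongs to the feature with the strongest influence on the target, and the bias matches the mean of the targets, which mirrors the pattern seen in the trained weights above.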