## Task1 


Logistic regression is one of the simplest machine learning algorithms: it is easy to implement,
needs little computing power, and trains far faster than more complex algorithms. Its main
advantage over models that return only the final classification is that logistic regression also
outputs well-calibrated probabilities along with the classification results, and the model can
easily be updated to reflect new data, unlike a decision tree. For example, if the model gives one
example a 90% probability for a class and another example a 70% probability for the same class,
we can infer that it is more confident about the first prediction than the second.
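
To make the point about probabilities concrete, here is a minimal sketch of how a logistic regression model turns a linear score into a probability and then into a class label. The weights, bias and inputs (`example_weight`, `example_bias`, `samples`) are made-up illustrative values, not taken from the assignment data.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters of an already-trained model (illustrative values only)
example_weight = np.array([1.5, -0.5])
example_bias = -0.25

# Two hypothetical inputs: the first lies further from the decision boundary
samples = np.array([[2.0, 0.5],
                    [0.8, 0.4]])

# Probability of the positive class for each input
probabilities = sigmoid(samples @ example_weight + example_bias)

# Thresholding at 0.5 gives the class label; the probability itself
# tells us how confident the model is in that label
labels = (probabilities >= 0.5).astype(int)
```

Both inputs receive the same label here, but the model is noticeably more confident about the first one, which is information a label-only classifier would not provide.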

In [2]:
# Import libraries
import numpy as np
import pandas as pd
#import sklearn
import sklearn.datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


In [89]:
#Activation functions used : Done by Anshul

def sigmoid(x):
    return 1/(1+np.exp(-x))

# Derivative of sigmoid function :
def sigmoid_der(x):
    return sigmoid(x)*(1-sigmoid(x))

# Inverse tangent (arctan) function
def arctan(x):
    return np.arctan(x)

#Relu
def relu(Z):  
    return np.maximum(Z, 0)
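
Only the sigmoid has a derivative defined above, yet backpropagation through an arctan or ReLU layer needs the derivative of that activation. As a sketch, the two missing derivatives could look as follows; `arctan_der`, `relu_der` and `numerical_der` are hypothetical helper names, and the finite-difference comparison is included only as a sanity check.

```python
import numpy as np

def arctan_der(x):
    """Derivative of arctan(x): 1 / (1 + x^2)."""
    return 1.0 / (1.0 + np.square(x))

def relu_der(z):
    """Derivative of ReLU: 1 where z > 0, else 0 (0 at z == 0 by convention)."""
    return (np.asarray(z) > 0).astype(float)

def numerical_der(f, x, eps=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# Compare the analytic derivatives with the numerical ones away from z == 0
points = np.array([-2.0, -0.5, 0.5, 3.0])
arctan_matches = np.allclose(arctan_der(points),
                             numerical_der(np.arctan, points), atol=1e-6)
relu_matches = np.allclose(relu_der(points),
                           numerical_der(lambda v: np.maximum(v, 0), points), atol=1e-6)
```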

In [3]:
# Optimization with gradient descent : Done by Kashika
def optimize(x, y, learning_rate, iterations, weight, bias):
    size = x.shape[0]
    for i in range(iterations):
        # Forward pass: predicted probability for each sample
        output = sigmoid(np.dot(x, weight) + bias)
        # Binary cross-entropy loss (for monitoring only; not used in the update)
        loss = -1/size * np.sum(y * np.log(output) + (1 - y) * np.log(1 - output))
        # Gradients of the loss with respect to the weights and the bias
        dW = 1/size * np.dot(x.T, (output - y))
        db = 1/size * np.sum(output - y)
        weight -= learning_rate * dW
        bias -= learning_rate * db
    return weight, bias

# Function to train and optimize the model.
# It reads the global `weight` and `bias`, which must be initialised before the call.
def train(x, y, learning_rate,iterations):
    weight_new,bias_new = optimize(x, y, learning_rate, iterations ,weight,bias)
    return weight_new,bias_new
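
The cross-entropy loss that `optimize` monitors takes the logarithm of the predicted probabilities, which is undefined when a prediction saturates at exactly 0 or 1. A minimal sketch of a numerically safer version follows; the function name `binary_cross_entropy` and the `eps` value are illustrative choices, not part of the assignment code.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob)
                    + (1.0 - y_true) * np.log(1.0 - y_prob))

# Confident and correct predictions give a small loss
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
# Confident but wrong predictions give a much larger loss
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
# Fully saturated wrong predictions stay finite thanks to the clipping
saturated = binary_cross_entropy([1, 0], [0.0, 1.0])
```

Recording this value once per iteration would make it possible to check that training is converging for a given learning rate.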

In [5]:
#Used the code created in Machine Learning Assignment 2 

# Find the min and max values for each column
def minmax_finder(data):
    minmax = list()
    # Iterate over the columns of `data` itself rather than the global DataFrame `x`
    for i in range(len(data[0])):
        col_vals = [row[i] for row in data]
        min_val = min(col_vals)
        max_val = max(col_vals)
        minmax.append([min_val, max_val])
    return minmax
 
# Rescale dataset columns to the range 0-1
def normalize_data(data, minmax):
    for row in data:
        for i in range(len(row)):
            row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])
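
The same rescaling can be written without explicit Python loops. The sketch below is one possible vectorised variant, not the assignment's method: `fit_minmax` and `apply_minmax` are hypothetical names, the statistics are computed on the training rows only so they can be reused on validation and test data, and constant columns are guarded against division by zero.

```python
import numpy as np

def fit_minmax(train):
    """Return per-column minimum and range, computed on the training data only."""
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # constant columns: avoid division by zero
    return col_min, col_range

def apply_minmax(data, col_min, col_range):
    """Rescale columns with previously fitted statistics; returns a new array."""
    return (data - col_min) / col_range

# Small illustrative array; the third column is constant
train = np.array([[1.0, 10.0, 5.0],
                  [3.0, 30.0, 5.0],
                  [2.0, 20.0, 5.0]])
col_min, col_range = fit_minmax(train)
scaled = apply_minmax(train, col_min, col_range)
```

Unlike `normalize_data`, this version leaves the input array unchanged and returns a rescaled copy.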

## Task2

After running the above logistic regression model on both datasets, the accuracy on 'blobs250' is 100% on both the validation and test sets, as its classes are linearly separable. On 'moons400' the accuracy is about 90% on the validation set and 82% on the test set, as its classes are not linearly separable.

Dataset 1 : blobs250 

In [4]:
df = pd.read_csv("blobs250.csv")
#separating target features from other features 
y = df['Class'].values
x = df.drop('Class', axis =1)
x_1 = x.values.astype(float)
#  Normalize columns
#minmax = minmax_finder(x_1)
#normalize_data(x_1, minmax)
#x = pd.DataFrame(x_1)
df.head()

Unnamed: 0,X0,X1,X2,Class
0,0.9614,5.677191,11.40702,0
1,2.372228,5.335292,9.460564,0
2,2.022249,7.501127,9.072816,0
3,4.464773,7.819388,9.183951,0
4,1.191087,5.880269,10.119531,0


In [8]:
#Using train_test_split to split the data into training, validation and testing sets
x_dum, X_test, y_dum, y_test = train_test_split(x,y,test_size=0.15,train_size=0.85)
X_train, X_val, y_train, y_val = train_test_split(x_dum,y_dum,test_size = 0.15,train_size =0.85)

#Initialize weights and bias
weight = np.zeros(x.shape[1])
bias = 0
#call train function
weight,bias = train(X_train, y_train, learning_rate = 0.02, iterations = 1000)

# Predict the validation dataset using the trained model
output_values = np.dot(X_val, weight) + bias
predictions = sigmoid(output_values) >= 1/2
accuracy_score(predictions, y_val)

1.0
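
The split above draws a new random partition on every run, so the reported accuracies can vary between runs. As an illustration only, the same two-stage scheme (15% test, then 15% of the remainder for validation) can be written with a fixed seed. `three_way_split` and its parameters are hypothetical names, the demo arrays are made up, and the rounding of subset sizes may differ slightly from `train_test_split`.

```python
import numpy as np

def three_way_split(features, targets, test_frac=0.15, val_frac=0.15, seed=0):
    """Shuffle once with a fixed seed, then cut into train / validation / test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(test_frac * len(features)))
    # The validation fraction is taken from what remains after the test set
    n_val = int(round(val_frac * (len(features) - n_test)))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return (features[train_idx], targets[train_idx],
            features[val_idx], targets[val_idx],
            features[test_idx], targets[test_idx])

# Made-up data: 100 samples with 2 features each
demo_x = np.arange(200, dtype=float).reshape(100, 2)
demo_y = np.arange(100) % 2
X_tr, y_tr, X_va, y_va, X_te, y_te = three_way_split(demo_x, demo_y, seed=42)
```

With `train_test_split` itself, passing the same `random_state` to both calls gives the same reproducibility.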

In [9]:
# Predict the test dataset using the trained model
output_values = np.dot(X_test, weight) + bias
predictions = sigmoid(output_values) >= 1/2
accuracy_score(predictions, y_test)

1.0

Dataset 2 : moons400

In [12]:
df = pd.read_csv("moons400.csv")
#separating target features from other features 
y = df['Class'].values
x = df.drop('Class', axis =1)
x_1 = x.values.astype(float) # returns a numpy array of type float
#  Normalize columns
minmax = minmax_finder(x_1)
normalize_data(x_1, minmax)
x = pd.DataFrame(x_1)

In [13]:
x_dum, X_test, y_dum, y_test = train_test_split(x,y,test_size=0.15,train_size=0.85)
X_train, X_val, y_train, y_val = train_test_split(x_dum,y_dum,test_size = 0.15,train_size =0.85)

#Initialize weights and bias
weight = np.zeros(x.shape[1])
bias = 0

weight,bias = train(X_train, y_train, learning_rate = 0.02, iterations = 1000)

# Predict the validation dataset using the trained model
output_values = np.dot(X_val, weight) + bias
predictions = sigmoid(output_values) >= 1/2
accuracy_score(predictions, y_val)

0.9019607843137255

In [14]:
# Predict the test dataset using the trained model
output_values = np.dot(X_test, weight) + bias
predictions = sigmoid(output_values) >= 1/2
accuracy_score(predictions, y_test)

0.8166666666666667

## Task 3 - Shallow Neural Network

The shallow neural network below has one hidden layer (sigmoid activation) and an output layer (arctan activation, as implemented in `arctan` above).


##### Results:
Blobs250 : 53% Accuracy

Moons400 : 52% Accuracy

In [80]:
#Function to calculate and update the weights with 1 Hidden layer : Done by Anshul
def NN_train(x,y,weight_layer1,weight_output,bias_layer1,bias_out,lr,iterations):
    for epoch in range(iterations):

     # Input and Output for Layer1
        input_layer1 = np.dot(x, weight_layer1)
        output_layer1 = sigmoid(input_layer1)
     # Input and output for output layer :
        input_output = np.dot(output_layer1, weight_output)
        output_output = arctan(input_output)

     # Calculate the Mean Squared Error :
        MSE = ((1 / 2) * (np.power((output_output - y), 2)))

     # Derivatives
        error_ou = output_output - y   
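        # The output activation is arctan, whose derivative is 1/(1 + z^2);
        # the sigmoid derivative is used here instead.
        der_1 = sigmoid_der(input_output)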
        ino_wo = output_layer1
        error_out = np.dot(ino_wo.T, error_ou * der_1)
        error_ino = error_ou * der_1
        wt_out = weight_output
        error_outh = np.dot(error_ino , wt_out)
        outh_inh = sigmoid_der(input_layer1) 
        wt_in = x
        error_in = np.dot(wt_in.T, outh_inh * error_outh)
    # Update the weights
        weight_layer1 = weight_layer1 - lr * error_in
        weight_output -= lr * error_out 
    #Update the biases    
        z_delta = error_ou*der_1
        for num in z_delta:
            bias_out -= lr*num
        z_delta = error_ou*outh_inh
        for num in z_delta:
            bias_layer1 -= lr*num

    return weight_layer1,weight_output,bias_layer1,bias_out

#Function to predict the values 
def predictions(X,weight_hidden,weight_output,bias_layer1,bias_out,cutoff):
    # Predictions :
    part1 = sigmoid(np.dot(X, weight_hidden)+bias_layer1)
    part2 = arctan(np.dot(part1,weight_output) + bias_out)  >= cutoff
    return part2
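
For comparison, the sketch below shows one way the same architecture (sigmoid hidden layer, arctan output, squared-error loss) could be trained with the biases included in the forward pass and the arctan derivative used in the backward pass. It is an illustrative variant under stated assumptions, not the code that produced the results in this task: the function names, the hidden-layer width `n_hidden`, the initialisation, the learning rate and the toy data are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_shallow_nn(x, y, n_hidden=4, lr=0.5, iterations=2000, seed=0):
    """One hidden layer (sigmoid) and an arctan output unit, trained on MSE."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = x.shape
    w1 = rng.normal(scale=0.5, size=(n_features, n_hidden))
    b1 = np.zeros((1, n_hidden))
    w2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros((1, 1))
    losses = []
    for _ in range(iterations):
        # Forward pass, including both biases
        z1 = x @ w1 + b1
        a1 = sigmoid(z1)
        z2 = a1 @ w2 + b2
        out = np.arctan(z2)
        losses.append(0.5 * np.mean((out - y) ** 2))
        # Backward pass
        delta2 = (out - y) / (1.0 + z2 ** 2)        # d arctan(z)/dz = 1/(1 + z^2)
        delta1 = (delta2 @ w2.T) * a1 * (1.0 - a1)  # sigmoid'(z1) = a1 * (1 - a1)
        # Gradient step, averaged over the samples
        w2 -= lr * (a1.T @ delta2) / n_samples
        b2 -= lr * delta2.mean(axis=0, keepdims=True)
        w1 -= lr * (x.T @ delta1) / n_samples
        b1 -= lr * delta1.mean(axis=0, keepdims=True)
    return (w1, b1, w2, b2), losses

def predict_shallow_nn(x, params, cutoff=0.5):
    """Threshold the network output to obtain 0/1 class labels."""
    w1, b1, w2, b2 = params
    out = np.arctan(sigmoid(x @ w1 + b1) @ w2 + b2)
    return (out >= cutoff).astype(int).ravel()

# Toy data with two well-separated groups (illustrative only)
toy_x = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                  [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]])
toy_y = np.array([[0], [0], [0], [1], [1], [1]], dtype=float)
params, losses = train_shallow_nn(toy_x, toy_y)
toy_pred = predict_shallow_nn(toy_x, params)
```

Tracking `losses` makes it easy to see whether the chosen learning rate and number of iterations actually reduce the error.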

Dataset : Blobs250

In [81]:
df = pd.read_csv("blobs250.csv")
y = df['Class'].values
x = df.drop('Class', axis =1)
x_1 = x.values.astype(float)
minmax = minmax_finder(x_1)
normalize_data(x_1, minmax)
x = pd.DataFrame(x_1)

In [82]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=101)
#input_features = X_train
# Define target output :
target_output = y_train
y_train = y_train.reshape(len(target_output),1)
weight_layer1 = np.random.rand(X_train.shape[1],1)
weight_output = 0
bias_out =0
bias_layer1 = 0

In [83]:
weight_hidden,weight_output,bias_layer1,bias_out = NN_train(X_train,y_train,weight_layer1,weight_output,bias_layer1,bias_out,lr=0.01,iterations=500)
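# `weight_layer1` still holds the initial random weights; NN_train returned the trained ones as `weight_hidden`
prediction = predictions(X_test,weight_layer1,weight_output,bias_layer1,bias_out,1/2)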
accuracy_score(prediction, y_test)

0.5301204819277109


Dataset : Moons400

In [84]:
df = pd.read_csv("moons400.csv")
y = df['Class'].values
x = df.drop('Class', axis =1)
x_1 = x.values.astype(float) # returns a numpy array of type float
minmax = minmax_finder(x_1)
normalize_data(x_1, minmax)
x = pd.DataFrame(x_1)

In [85]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=101)
# Define target output :
target_output = y_train
y_train = y_train.reshape(len(target_output),1)
weight_layer1 = np.random.rand(X_train.shape[1],1)
weight_output = 0
bias_out =0
bias_layer1 = 0

In [86]:
weight_hidden,weight_output,bias_layer1,bias_out = NN_train(X_train,y_train,weight_layer1,weight_output,bias_layer1,bias_out,lr=0.01,iterations=5000)
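# As for Blobs250, `weight_layer1` is the untrained matrix; the trained weights are in `weight_hidden`
prediction = predictions(X_test,weight_layer1,weight_output,bias_layer1,bias_out,1/2)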
accuracy_score(prediction, y_test)

0.5151515151515151

## Task 4 - Image Recognition

##### Classes : Automobile and Deer

The model is trained with 0.01 learning rate and 500 epochs.
##### Results: 
Validation Accuracy :52%

Testing Accuracy    :50%

In [87]:
# The code below for loading the dataset was already provided
# This function is taken from the CIFAR website
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        # Named `batch` to avoid shadowing the built-in `dict`
        batch = pickle.load(fo, encoding='bytes')
    return batch
def loadbatch(batchname):
    folder = 'cifar-10-batches-py'
    batch = unpickle(folder+"/"+batchname)
    return batch
def loadlabelnames():
    folder = 'cifar-10-batches-py'
    meta = unpickle(folder+"/"+'batches.meta')
    return meta[b'label_names']
batch1 = loadbatch('data_batch_1')
print("Number of items in the batch is", len(batch1))

# Display all keys, so we can see the ones we want
print('All keys in the batch:', batch1.keys())
data = batch1[b'data']
labels = batch1[b'labels']
print ("size of data in this batch:", len(data), ", size of labels:", len(labels))
print (type(data))
print(data.shape)

names = loadlabelnames()

Number of items in the batch is 4
All keys in the batch: dict_keys([b'batch_label', b'labels', b'data', b'filenames'])
size of data in this batch: 10000 , size of labels: 10000
<class 'numpy.ndarray'>
(10000, 3072)


In [25]:
#Filter the classes 1- Automobile, 4-Deer : Done by Kashika
index = []
for i,label in enumerate(labels):
    if label==1 or label ==4:
        index.append(i)
data_train = data[index]
label_train =np.array(labels)
label_final = label_train[index]

# Keep only the first colour channel (red): the first 1024 of the 3072 values per image
X = np.array(data_train[:,:1024])
from sklearn import preprocessing
X_norm = preprocessing.normalize(X)
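
After filtering, `label_final` contains the raw CIFAR-10 class indices 1 and 4, whereas the prediction functions return booleans that compare as 0 and 1. A label of 4 can therefore never match a prediction, which may contribute to the near-chance accuracies reported for this task. A minimal sketch of remapping the labels to 0/1 before training, using a hypothetical helper `to_binary_labels` and made-up sample labels:

```python
import numpy as np

def to_binary_labels(raw_labels, positive_class):
    """Return 1 where the raw label equals positive_class, 0 elsewhere."""
    return (np.asarray(raw_labels) == positive_class).astype(int)

# Hypothetical excerpt of labels after filtering to classes 1 and 4
raw = np.array([1, 4, 4, 1, 1, 4])

# Treat class 4 (deer) as the positive class and class 1 (automobile) as 0
binary = to_binary_labels(raw, positive_class=4)
```

Which class is mapped to 1 is an arbitrary choice; what matters is that the targets and the thresholded predictions share the same 0/1 encoding.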

In [88]:
#X_train, X_test, y_train, y_test = train_test_split(X_norm, label_final, test_size=0.33, random_state=101) : Done by Kashika
x_dum, X_test, y_dum, y_test = train_test_split(X_norm,label_final,test_size=0.15,train_size=0.85)
X_train, X_val, y_train, y_val = train_test_split(x_dum,y_dum,test_size = 0.15,train_size =0.85)

input_features = X_train
# Define target output :
target_output = y_train
# Reshaping our target output into vector :
target_output = target_output.reshape(len(target_output),1)
weight_hidden = np.random.rand(X_train.shape[1],1)
weight_output = 0
bias_layer1 = 0
bias_out = 0

In [78]:
weight_hidden,weight_output,bias_layer1,bias_out = NN_train(X_train,target_output,weight_hidden,weight_output,bias_layer1,bias_out,lr=0.01,iterations=500)
weight_hidden,weight_output,bias_layer1,bias_out

(array([[0.26939114],
        [0.02918426],
        [0.90231431],
        ...,
        [0.93140208],
        [0.09968117],
        [0.86308507]]),
 array([[9.60621588]]),
 array([0.01164695]),
 array([9.60623341]))

In [60]:
#Validation
prediction = predictions(X_val,weight_hidden,weight_output,bias_layer1,bias_out, 1/2)  
accuracy_score(prediction, y_val)

0.5198412698412699

In [61]:
#Testing
prediction = predictions(X_test,weight_hidden,weight_output,bias_layer1,bias_out, 1/2)  
accuracy_score(prediction, y_test)

0.4966216216216216

## Task5 - Enhancement 1 - Change the Activation function in Neural Network

##### Results: 
Validation Accuracy :52%

Testing Accuracy    :50%

In [62]:
def NN_train_En1(x,y,weight_layer1,weight_output,bias_layer1,bias_out,lr,iterations):
    
    for epoch in range(iterations):

     # Input and Output for Layer1
        input_layer1 = np.dot(x, weight_layer1)
        output_layer1 = sigmoid(input_layer1)
     # Input and output for output layer :
        input_output = np.dot(output_layer1, weight_output)
        output_output = relu(input_output)
     # Calculate the Mean Squared Error :
        MSE = ((1 / 2) * (np.power((output_output - y), 2)))

     # Derivatives
        error_ou = output_output - y   
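        # The output activation is ReLU, whose derivative is 1 for z > 0 and 0 otherwise;
        # the sigmoid derivative is used here instead.
        der_1 = sigmoid_der(input_output)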
        ino_wo = output_layer1
        error_out = np.dot(ino_wo.T, error_ou * der_1)
        error_ino = error_ou * der_1
        wt_out = weight_output
        error_outh = np.dot(error_ino , wt_out)
        outh_inh = sigmoid_der(input_layer1) 
        wt_in = x
        error_in = np.dot(wt_in.T, outh_inh * error_outh)
    # Update the weights
        weight_layer1 = weight_layer1 - lr * error_in
        weight_output -= lr * error_out 
    #Update the biases    
        z_delta = error_ou*der_1
        for num in z_delta:
            bias_out -= lr*num
        z_delta = error_ou*outh_inh
        for num in z_delta:
            bias_layer1 -= lr*num

    return weight_layer1,weight_output,bias_layer1,bias_out


In [63]:
def predictions_En1(X_test,weight_hidden,weight_output,bias_layer1,bias_out,cut_off):
    # Predictions :
    part1 = sigmoid(np.dot(X_test, weight_hidden)+bias_layer1)
    part2 = relu(np.dot(part1,weight_output)+bias_out) >= cut_off
    return part2

In [75]:
weight_hidden = np.random.rand(X_train.shape[1],1)
weight_output = 0
bias_layer1 = 0
bias_out = 0

In [76]:
weight_hidden,weight_output,bias_layer1,bias_out = NN_train_En1(X_train,target_output,weight_hidden,weight_output,
                                                                bias_layer1,bias_out,lr=0.01,iterations=500)
weight_hidden,weight_output,bias_layer1,bias_out

(array([[0.36242696],
        [0.17065629],
        [0.24237598],
        ...,
        [0.68994841],
        [0.58921243],
        [0.90836143]]),
 array([[2.53473968]]),
 array([-0.01596263]),
 array([2.53451942]))

In [66]:
#Validation
prediction = predictions_En1(X_val,weight_hidden,weight_output,bias_layer1,bias_out, 1/2)  
accuracy_score(prediction, y_val)

0.5198412698412699

In [67]:
#Testing
prediction = predictions_En1(X_test,weight_hidden,weight_output,bias_layer1,bias_out, 1/2) 
accuracy_score(prediction, y_test)

0.4966216216216216

## Task5 - Enhancement 2 - L2 Regularization

L2 regularization reduces overfitting by penalising large weights, which shrinks their values during training. For this dataset it does not change the accuracy of the model, although the learned weights and biases are different.
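
As a sketch of the mechanism: adding the penalty (λ / 2m)·‖W‖² to the loss adds (λ / m)·W to the gradient, so every update also pulls the weights towards zero. The helper `l2_step` below is a hypothetical name and the numbers are illustrative; it applies the same update form as the `weight_layer1` update in `NN_train_En2`.

```python
import numpy as np

def l2_step(weights, grad, lr, l2_lambda, n_samples):
    """One gradient step on loss + (l2_lambda / (2 * n_samples)) * ||weights||^2."""
    return weights - lr * (grad + (l2_lambda / n_samples) * weights)

w = np.array([[2.0], [-4.0]])
zero_grad = np.zeros_like(w)

# With a zero data gradient, the penalty alone shrinks every weight
# by the factor 1 - lr * l2_lambda / n_samples
shrunk = l2_step(w, zero_grad, lr=0.1, l2_lambda=0.7, n_samples=7)
```

The penalty is commonly applied to every weight matrix but not to the biases; which parameters to regularise is a design choice.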

##### Results: 
Validation Accuracy :52%

Testing Accuracy    :50%

In [68]:
def NN_train_En2(x,y,weight_layer1,weight_output,bias_layer1,bias_out,lr,iterations):
    
    for epoch in range(iterations):

     # Input and Output for Layer1
        input_layer1 = np.dot(x, weight_layer1)
        output_layer1 = sigmoid(input_layer1)
     # Input and output for output layer :
        input_output = np.dot(output_layer1, weight_output)
        output_output = arctan(input_output)
     # Calculate the Mean Squared Error :
        MSE = ((1 / 2) * (np.power((output_output - y), 2)))

     # Derivatives
        error_ou = output_output - y   
        der_1 = sigmoid_der(input_output)
        ino_wo = output_layer1
        error_out = np.dot(ino_wo.T, error_ou * der_1)
        error_ino = error_ou * der_1
        wt_out = weight_output
        error_outh = np.dot(error_ino , wt_out)
        outh_inh = sigmoid_der(input_layer1) 
        wt_in = x
        error_in = np.dot(wt_in.T, outh_inh * error_outh)
        l2_lambda = 0.7
        m = x.shape[0]
    # Update the weights
        weight_layer1 = weight_layer1 - lr * (error_in + ((l2_lambda/m) * weight_layer1))
        weight_output -= lr * error_out 
    #Update the biases    
        z_delta = error_ou*der_1
        for num in z_delta:
            bias_out -= lr*num
        z_delta = error_ou*outh_inh
        for num in z_delta:
            bias_layer1 -= lr*num
    return weight_layer1,weight_output,bias_layer1,bias_out


In [69]:
def predictions_En2(X_test,weight_hidden,weight_output,bias_layer1,bias_out,cut_off):
    # Predictions :
    part1 = sigmoid(np.dot(X_test, weight_hidden)+bias_layer1)
    part2 = arctan(np.dot(part1,weight_output)+bias_out) >= cut_off
    return part2

In [70]:
weight_hidden = np.random.rand(X_train.shape[1],1)
weight_output = 0
bias_layer1 = 0
bias_out = 0

In [71]:
# Arguments passed in the order of the NN_train_En2 signature
weight_hidden,weight_output,bias_layer1,bias_out = NN_train_En2(input_features,target_output,weight_hidden,weight_output,
                                                                bias_layer1,bias_out,lr=0.01,iterations=500)
weight_hidden,weight_output,bias_layer1,bias_out

In [72]:
#Validation
prediction = predictions_En2(X_val,weight_hidden,weight_output,bias_layer1,bias_out, 1/2)  
accuracy_score(prediction, y_val)

0.5198412698412699

In [73]:
#Testing
prediction = predictions_En2(X_test,weight_hidden,weight_output,bias_layer1,bias_out, 1/2) 
accuracy_score(prediction, y_test)

0.4966216216216216

## References :

[1] https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html#relu

[2] https://towardsdatascience.com/building-a-neural-network-with-a-single-hidden-layer-using-numpy-923be1180dbf

[3] https://visualstudiomagazine.com/articles/2017/09/01/neural-network-l2.aspx

[4] Week3 - Deep learning Lecture slides