# Assignment-4 (Perceptron, SVM, and Neural Networks)

This part of the assignment requires you to code the perceptron classifier from the ground up. The perceptron classifier computes a weighted sum of the input features and applies a threshold to that sum to separate two classes. Usually these classes are -1 and 1, and the threshold is 0.
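As a minimal sketch of that decision rule (the function name, weights, and bias below are made up for illustration and are not part of the assignment resources), a two-class perceptron prediction could look like this:

```python
def perceptron_predict(features, weights, bias, threshold=0.0):
    """Return +1 if the weighted sum exceeds the threshold, otherwise -1."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > threshold else -1


# Hypothetical weights and bias for a two-feature problem
example_weights = [0.5, -1.0]
example_bias = 0.25

# 0.25 + 0.5 * 2.0 - 1.0 * 0.5 = 0.75, which is above the threshold
label_a = perceptron_predict([2.0, 0.5], example_weights, example_bias)
# 0.25 + 0.5 * 0.0 - 1.0 * 1.0 = -0.75, which is below the threshold
label_b = perceptron_predict([0.0, 1.0], example_weights, example_bias)
```

Training then amounts to adjusting `weights` and `bias` whenever a prediction disagrees with the true label.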

In [193]:
# imports
import pandas as pd
import numpy as np
import math
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

%matplotlib inline

# 1. Prepare the dataset

In [194]:
# import the assigned dataset
features = ['RI', 'Na','Mg', 'Al','Si','K','Ca','Ba']
df = pd.read_csv('glass.csv')

In [195]:
# Preprocess the dataset
#we only need to scale

In [196]:
# Make a train-test split with 80% to training and 20% to testing.
X_train, X_test, y_train, y_test = train_test_split(df[features], df['targets'], test_size=0.2)

In [197]:
# Normalize numerical features and encode the categorical features (if any)
scaler = StandardScaler()

# Fit the scaler on the training split only, then reuse it for the test split
train_X = scaler.fit_transform(X_train)
test_X = scaler.transform(X_test)

As this is a classification problem, the target variable must be categorical. Using the feature and target variable information, preprocess the dataset so that it uses only the given features.

This step might require you to scale the numerical features and one-hot encode any categorical features.



For feature scaling and one-hot encoding, go through:

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
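To make the two preprocessing steps concrete, here is a small library-free sketch of what standardization and one-hot encoding compute. The helper names and toy values are assumptions for illustration; in the notebook itself the scikit-learn classes linked above would normally be used. Note that the statistics are estimated on the training rows only and then reused for the test rows.

```python
import math


def fit_standardizer(rows):
    """Compute the per-column mean and population standard deviation."""
    n_cols = len(rows[0])
    means = [sum(r[c] for r in rows) / len(rows) for c in range(n_cols)]
    stds = []
    for c in range(n_cols):
        variance = sum((r[c] - means[c]) ** 2 for r in rows) / len(rows)
        stds.append(math.sqrt(variance) or 1.0)  # constant columns keep scale 1
    return means, stds


def apply_standardizer(rows, means, stds):
    """Scale rows with statistics that were fitted elsewhere."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


toy_train = [[1.0, 10.0], [3.0, 30.0]]
toy_test = [[2.0, 40.0]]
means, stds = fit_standardizer(toy_train)      # statistics from training data only
scaled_train = apply_standardizer(toy_train, means, stds)
scaled_test = apply_standardizer(toy_test, means, stds)
encoded = one_hot(["b", "a", "b"])
```

Reusing the training statistics for the test rows keeps information about the test set from leaking into the preprocessing.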

# 2: Creating a perceptron model for the processed dataset

The perceptron.py file in the resources provides functions to code the perceptron model from scratch.

Using the file as reference, write the functions:

1. cross_validation_split
2. accuracy_metric
3. evaluate_algorithm
4. predict
5. train_weights
6. perceptron

This step is aimed at providing a comprehensive understanding of the internal functioning of a perceptron model.

In [186]:
from random import randrange


# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
    dataset_split = list()
    dataset_copy = list(dataset)
    fold_size = int(len(dataset) / n_folds)
    for i in range(n_folds):
        fold = list()
        while len(fold) < fold_size:
            index = randrange(len(dataset_copy))
            fold.append(dataset_copy.pop(index))
        dataset_split.append(fold)
    return dataset_split

# Calculate accuracy percentage
def accuracy_metric(actual, predicted):
    correct = 0
    for i in range(len(actual)):
        if actual[i] == predicted[i]:
            correct += 1
    return correct / float(len(actual)) * 100.0

# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm, n_folds, *args):
    folds = cross_validation_split(dataset, n_folds)
    scores = list()
    for i, fold in enumerate(folds):
        train_set = [f for j, f in enumerate(folds) if j != i]
        train_set = sum(train_set, [])
        test_set = list()
        for row in fold:
            row_copy = list(row)
            test_set.append(row_copy)
            row_copy[-1] = None
        predicted = algorithm(train_set, test_set, *args)
        actual = [row[-1] for row in fold]
        accuracy = accuracy_metric(actual, predicted)
        scores.append(accuracy)
    return scores

# Make a prediction with weights
def predict(row, weights):
    # Calculate the dot product of the input features and the weights
    # for each class.
    dot_products = [0.0 for i in range(len(weights))]
    for i in range(len(row)-1):
        for j in range(len(weights)):
            dot_products[j] += weights[j][i] * row[i]

    # Find the index of the maximum dot product, which corresponds
    # to the predicted class.
    prediction = dot_products.index(max(dot_products))
    return prediction + 1

# Estimate Perceptron weights using stochastic gradient descent
def train_weights(train, l_rate, n_epoch):
    # Initialize weights for each class with zeros.
    num_classes = len(set([row[-1] for row in train]))
    num_features = len(train[0]) - 1
    weights = [[0.0 for i in range(num_features)] for j in range(num_classes)]

    for epoch in range(n_epoch):
        for row in train:
            prediction = predict(row, weights)
            error = row[-1] - prediction
            # Update the weights for the correct class, if the target value is within the valid range.
            # Decrease the target value by 1 to get a valid index for the weights list.
            if 1 <= row[-1] <= num_classes:
                weights[int(row[-1]) - 1][0] = weights[int(row[-1]) - 1][0] + l_rate * error
                for i in range(len(row)-1):
                    # Update the weight for the correct class and feature index.
                    # Decrease the target value by 1 to get a valid index for the weights list.
                    weights[int(row[-1]) - 1][i] = weights[int(row[-1]) - 1][i] + l_rate * error * row[i]
    return weights

# Perceptron Algorithm With Stochastic Gradient Descent
def perceptron(train, test, l_rate, n_epoch):
    predictions = list()
    weights = train_weights(train, l_rate, n_epoch)
    for row in test:
        prediction = predict(row, weights)
        predictions.append(prediction)
    print(predictions)
    return(predictions)




n_folds = 3
l_rate = 0.0005
n_epoch = 5000


# Append the training labels (y_train) as the last column of the scaled features
scores = evaluate_algorithm(np.column_stack((train_X, y_train)), perceptron, n_folds, l_rate, n_epoch)
scores

[4, 2, 2, 1, 2, 4, 5, 2, 2, 2, 2, 2, 2, 4, 2, 2, 6, 1, 2, 2, 4, 1, 2, 2, 2, 2, 4, 1, 4, 2, 1, 2, 2, 2, 1, 6, 5, 2, 5, 4, 4, 2, 1, 2, 1, 6, 1, 2, 1, 2, 2, 5, 2, 1, 1, 2, 2]
[3, 4, 2, 1, 1, 2, 2, 2, 2, 2, 4, 2, 3, 1, 6, 2, 6, 2, 2, 4, 2, 2, 2, 1, 2, 2, 4, 2, 1, 5, 2, 5, 2, 5, 2, 2, 5, 6, 2, 2, 2, 1, 5, 2, 2, 2, 6, 4, 4, 1, 5, 2, 2, 2, 5, 4, 5]
[2, 2, 2, 2, 2, 2, 4, 6, 1, 1, 2, 1, 2, 1, 1, 6, 1, 2, 2, 5, 1, 1, 4, 2, 2, 6, 2, 2, 6, 2, 2, 2, 2, 2, 1, 4, 2, 2, 1, 6, 2, 6, 1, 2, 4, 2, 6, 2, 5, 2, 1, 2, 2, 6, 4, 2, 2]


[35.08771929824561, 38.59649122807017, 26.31578947368421]
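The `predict` and `train_weights` functions above assume that the class labels run contiguously from 1 to the number of classes, and they use `weights[...][0]` both as a bias and as the weight of the first feature. The following is a hedged sketch of one common alternative, the multiclass perceptron update: it maps arbitrary labels to indices, keeps a separate bias per class, and on a mistake moves the true class towards the sample and the wrongly predicted class away from it. The function names and toy data are illustrative, not taken from perceptron.py.

```python
def best_class(features, weights):
    """Index of the class whose linear score is highest."""
    scores = [w[0] + sum(wi * x for wi, x in zip(w[1:], features)) for w in weights]
    return scores.index(max(scores))


def train_multiclass_perceptron(rows, l_rate, n_epoch):
    """rows: feature values followed by the class label in the last position."""
    labels = sorted(set(row[-1] for row in rows))
    index_of = {label: k for k, label in enumerate(labels)}
    n_features = len(rows[0]) - 1
    # One weight vector per class; position 0 holds that class's bias.
    weights = [[0.0] * (n_features + 1) for _ in labels]
    for _ in range(n_epoch):
        for row in rows:
            features = row[:-1]
            target = index_of[row[-1]]
            guess = best_class(features, weights)
            if guess != target:
                weights[target][0] += l_rate
                weights[guess][0] -= l_rate
                for i, x in enumerate(features):
                    weights[target][i + 1] += l_rate * x
                    weights[guess][i + 1] -= l_rate * x
    return weights, labels


def predict_label(features, weights, labels):
    """Translate the winning class index back to the original label."""
    return labels[best_class(features, weights)]


# Toy, linearly separable data with non-contiguous labels (1, 2, 7)
toy_rows = [[0.0, 0.0, 1], [0.1, 0.2, 1], [5.0, 0.0, 2],
            [5.2, 0.3, 2], [0.0, 5.0, 7], [0.3, 5.1, 7]]
weights, labels = train_multiclass_perceptron(toy_rows, l_rate=0.1, n_epoch=200)
toy_predictions = [predict_label(row[:-1], weights, labels) for row in toy_rows]
```

Because predictions are mapped back through `labels`, the model can only output labels that actually occur in the training data.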

# (Bonus) Perceptron model with relaxation

Using relaxation (the descent theorem), compare the performance of the perceptron model with and without relaxation (refer to the class lectures and slides for details on relaxation).

Make modifications to the loss function in the perceptron model.

In [None]:
# code for relaxation for the perceptron model
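One common formulation is the single-sample relaxation rule with margin: whenever a sample falls short of the margin, the weight vector is moved towards it by a step proportional to the shortfall. The sketch below assumes that formulation for a two-class problem with labels +1 and -1; the version presented in the class lectures may differ, and the names `train_relaxation`, `margin`, and `eta` are illustrative.

```python
def train_relaxation(samples, labels, margin=1.0, eta=1.5, n_epoch=100):
    """samples: lists of feature values; labels: +1 or -1 for each sample."""
    # Prepend a constant 1 for the bias and negate the negative-class samples,
    # so that a solution vector a satisfies dot(a, y) >= margin for every y.
    ys = [[lab * v for v in [1.0] + list(x)] for x, lab in zip(samples, labels)]
    a = [0.0] * len(ys[0])
    for _ in range(n_epoch):
        updated = False
        for y in ys:
            score = sum(ai * yi for ai, yi in zip(a, y))
            if score < margin:
                # Step size grows with the distance to the margin (0 < eta < 2).
                step = eta * (margin - score) / sum(yi * yi for yi in y)
                a = [ai + step * yi for ai, yi in zip(a, y)]
                updated = True
        if not updated:
            break  # every sample already meets the margin
    return a


def relaxation_predict(x, a):
    """Classify with the learned vector; a[0] is the bias term."""
    score = a[0] + sum(ai * xi for ai, xi in zip(a[1:], x))
    return 1 if score > 0 else -1


toy_samples = [[2.0, 1.0], [3.0, 2.0], [-2.0, -1.0], [-3.0, -1.5]]
toy_labels = [1, 1, -1, -1]
a = train_relaxation(toy_samples, toy_labels)
toy_predictions = [relaxation_predict(x, a) for x in toy_samples]
```

Unlike the fixed-increment perceptron rule, the update here depends on how far the sample is from the margin, which is the behaviour to compare against the plain perceptron.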

# 3: Batch size for perceptron model

Experiment with different batch sizes in the perceptron model (e.g., 1, 4, 8).

Report (with figures) the difference in performance when using different batch sizes. Inferences without plots might not be awarded points.

Report the accuracies for various combinations of batch sizes.
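As a starting point, the sketch below shows one way to add a batch size to a perceptron: the updates from misclassified rows are accumulated over a batch and applied once, averaged over the batch. It assumes a two-class problem with labels +1 and -1 and uses toy data, so it would need the multiclass handling from step 2 before being run on the assigned dataset. All names are illustrative.

```python
def train_minibatch_perceptron(rows, l_rate, n_epoch, batch_size):
    """rows: feature values followed by a +1 / -1 label in the last position."""
    n_features = len(rows[0]) - 1
    weights = [0.0] * (n_features + 1)  # weights[0] is the bias
    for _ in range(n_epoch):
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            update = [0.0] * len(weights)
            for row in batch:
                x, label = row[:-1], row[-1]
                score = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
                if label * score <= 0:  # misclassified or on the boundary
                    update[0] += label
                    for i, v in enumerate(x):
                        update[i + 1] += label * v
            # Apply the averaged update once per batch.
            weights = [w + l_rate * u / len(batch) for w, u in zip(weights, update)]
    return weights


def accuracy(rows, weights):
    """Percentage of rows whose predicted sign matches the label."""
    correct = 0
    for row in rows:
        score = weights[0] + sum(w * v for w, v in zip(weights[1:], row[:-1]))
        predicted = 1 if score > 0 else -1
        if predicted == row[-1]:
            correct += 1
    return 100.0 * correct / len(rows)


toy_rows = [[2.0, 1.0, 1], [-1.0, -2.0, -1], [3.0, 0.5, 1], [-2.0, -0.5, -1],
            [1.5, 2.0, 1], [-3.0, -1.0, -1], [2.5, 2.5, 1], [-0.5, -3.0, -1]]
results = {}
for batch_size in (1, 4, 8):
    trained = train_minibatch_perceptron(toy_rows, 0.1, 20, batch_size)
    results[batch_size] = accuracy(toy_rows, trained)
```

The `results` dictionary maps each batch size to an accuracy, which can be passed directly to a matplotlib bar or line plot for the required figures.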

In [None]:
# code for step 3

In [None]:
# code for step 3

In [None]:
# code for step 3

# 4. SVMs

### **Note : You are allowed to use sklearn's SVC classifier for steps 4.1 through 4.3**

https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
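The three models in this section differ only in the kernel, that is, the similarity function the SVM evaluates between pairs of points. The sketch below writes the linear, polynomial, and Gaussian (RBF) kernels as plain Python functions, following the formulas given in the scikit-learn documentation. The default parameter values here are chosen for illustration and are not SVC's defaults.

```python
import math


def linear_kernel(x, z):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=0.0):
    """(gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """exp(-gamma * ||x - z||^2); equals 1 when x and z coincide."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


x, z = [1.0, 2.0], [3.0, 1.0]
k_linear = linear_kernel(x, z)               # 1*3 + 2*1
k_poly = polynomial_kernel(x, z, degree=2)   # (1*3 + 2*1) ** 2
k_rbf_same = rbf_kernel(x, x)
k_rbf = rbf_kernel(x, z, gamma=0.5)          # exp(-0.5 * 5)
```

Large polynomial degrees or large `gamma` values make the decision boundary more flexible, which is worth keeping in mind when comparing the three models in section 4.4.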

---

# 4.1 Linear SVM


---


# Step 1: Implement a linear SVM model to classify the data points. (Look into the 'kernel' parameter).

In [198]:
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# linear SVM
svm = SVC(kernel='linear')

# Step 2: Train the model

In [199]:
# training - linear SVM
svm.fit(X_train, y_train)

# Step 3: Predict for the test points using the model trained in the previous step

In [204]:
# predict - linear SVM
y_pred = svm.predict(X_test)
y_pred

array([2, 2, 1, 1, 2, 1, 5, 2, 5, 5, 1, 2, 1, 7, 2, 2, 1, 1, 2, 2, 2, 2,
       7, 7, 1, 1, 1, 2, 2, 2, 7, 2, 2, 2, 1, 2, 1, 6, 2, 2, 2, 1, 2])

# 4.2 Kernel SVM - Polynomial kernel

---



# Step 1: Implement a kernel SVM model with a polynomial kernel to classify the data points.

In [223]:
# kernel SVM - polynomial kernel
clf = SVC(kernel='poly', degree=30)

# Step 2: Train the model

In [224]:
# training - kernel SVM
clf.fit(X_train, y_train)

# Step 3: Predict for the test points using the model trained in the previous step

In [225]:
# predict - kernel SVM
clf.predict(X_test)

array([2, 2, 1, 1, 2, 1, 2, 2, 2, 5, 1, 2, 1, 7, 2, 2, 1, 1, 1, 2, 1, 2,
       7, 7, 1, 1, 1, 2, 2, 2, 7, 2, 2, 6, 1, 2, 1, 6, 2, 2, 2, 1, 2])

# 4.3 Kernel SVM - Gaussian kernel

---



# Step 1: Implement a kernel SVM model with a Gaussian (Radial Basis Function) kernel to classify the data points.

In [236]:
# kernel SVM - gaussian kernel
clf = SVC(kernel='rbf', C=1000)

# Step 2: Train the model

In [237]:
# training - kernel SVM
clf.fit(X_train, y_train)

# Step 3: Predict for the test points using the model trained in the previous step

In [238]:
# predict - kernel SVM
clf.predict(X_test)

array([2, 2, 1, 1, 2, 1, 5, 2, 5, 2, 1, 2, 1, 7, 2, 2, 1, 1, 2, 2, 2, 2,
       7, 7, 1, 1, 1, 2, 2, 2, 7, 2, 2, 2, 1, 2, 1, 6, 2, 2, 2, 1, 2])

# 4.4 - Evaluation


---


# Take the results from each predict step under sections 4.1, 4.2 and 4.3. Consider accuracy as the evaluation metric. Print the accuracies for each of the 3 SVM models.

Note: Do not use functions from sklearn.metrics

In [239]:
def calculate_accuracy(predicted, actual):
    assert len(predicted) == len(actual)
    count = 0
    # Use distinct loop names so the `actual` argument is not shadowed
    for pred, true in zip(predicted, actual):
        if pred == true:
            count += 1
    return count / len(predicted)

calculate_accuracy(y_pred, y_test)

0.6511627906976745

In [242]:
# space for any imports for the following steps

#libraries used for neural networks
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

ModuleNotFoundError: No module named 'tensorflow'

# 5. Neural Networks



---



# 5.1 Single layer neural network

You can use either PyTorch or TensorFlow for the implementation.

---

# Step 1: Implement a single layer neural network to classify the data points

In [241]:
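# The assignment does not restrict which TensorFlow APIs may be used, so this uses Keras

# Define the model: a single dense layer with one softmax output per class index.
# Assumes the integer labels in df['targets'] can serve directly as class indices.
n_inputs = len(features)
n_classes = int(df['targets'].max()) + 1
model = Sequential()
model.add(Dense(n_classes, input_dim=n_inputs, activation='softmax'))

# Compile the model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])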

NameError: name 'Sequential' is not defined

# Step 2: Train the model

In [None]:
# Train the model (the epoch count and batch size are placeholder values; tune as needed)
model.fit(train_X, y_train, epochs=100, batch_size=8)

# Step 3: Predict for the test points using the model trained in the previous step

In [240]:
# Make predictions on new data
predictions = model.predict(test_X)

predictions

NameError: name 'model' is not defined

# 5.2 Multi-layer neural network

---

# Step 1: Implement a multi-layer neural network to classify the data points

Additional note: use methods to avoid overfitting appropriately
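One widely used way to limit overfitting is early stopping: training halts once the validation loss stops improving. The sketch below is framework-independent and only illustrates the bookkeeping. The class name, `patience` parameter, and validation losses are made up, and both PyTorch and TensorFlow workflows would normally wrap this logic around their own training loop or use a built-in callback.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = None
        self.stale_epochs = 0

    def should_stop(self, epoch, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Made-up validation losses: improvement stalls after epoch 2
val_losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.67, 0.71]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    # A real loop would train for one epoch here and then compute `loss`.
    if stopper.should_stop(epoch, loss):
        stopped_at = epoch
        break
```

After stopping, the weights saved at `stopper.best_epoch` are typically the ones to keep. Dropout and weight regularization are other options that can be combined with this.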

# Step 2: Train the model

# Step 3: Predict for the test points using the model trained in the previous step

# 5.3 Multi-layer neural network

---

# Step 1: Implement a multi-layer neural network to classify the data points

**Note :** This must have a different network architecture from the model under section 5.2

**Additional note:** use methods to avoid overfitting appropriately

# Step 2: Train the model

# Step 3: Predict for the test points using the model trained in the previous step

# 5.4 - Evaluation


---


# Take the results from each predict step under sections 5.1, 5.2 and 5.3. Consider accuracy as the evaluation metric. Print the accuracies for each of the 3 neural network architectures.

Note: Do not use functions from sklearn.metrics
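Neural network classifiers usually output one score or probability per class rather than a label, so accuracy has to be computed after picking the highest-scoring class for each row. A minimal sketch without sklearn.metrics follows, assuming the network's outputs have been converted to plain Python lists. The helper names and toy numbers are illustrative.

```python
def argmax_labels(probabilities, class_labels):
    """For each row of class probabilities, return the label with the highest value."""
    return [class_labels[row.index(max(row))] for row in probabilities]


def accuracy_percent(predicted, actual):
    """Share of matching predictions, as a percentage."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * matches / len(actual)


# Toy outputs for a three-class model whose output columns correspond to labels 1, 2, 7
toy_probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
toy_class_labels = [1, 2, 7]
toy_actual = [1, 7, 2, 2]

toy_pred = argmax_labels(toy_probs, toy_class_labels)
toy_acc = accuracy_percent(toy_pred, toy_actual)
```

The same two helpers can be applied to the outputs of all three architectures so that the reported accuracies are directly comparable.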

# 6 - Paragraph questions

---



# Q1 : Briefly explain the methods you used to prevent overfitting for the models under sections 5.2 and 5.3

***space for Q1's answer***

# Q2: Compare the performance of the models under sections 2, 3, 4 and 5, namely the perceptron, the descent procedure, SVMs and neural networks. List a few points on why you think certain models performed better than others

***space for Q2's answer***