# Welcome to ECE421: Introduction to Machine Learning

This is your first assignment of ECE421. In this assignment, you will
* Familiarize yourself with Google Colab, NumPy, and scikit-learn
* Implement a simple Perceptron
* Implement linear regression

This file is a Jupyter Notebook. You can double-click on section headers to show code and run each section with Shift+Enter.

## Setup

**IMPORTANT:** You will need to make a copy of this notebook in your Google Drive before you can edit the homework files. You can do so with **File &rarr; Save a copy in Drive**.

In [1]:
#@title mount your Google Drive
#@markdown Your work will be stored in a folder called `ece421_f2024` by default to prevent Colab instance timeouts from deleting your edits.

import os
from google.colab import drive
from importlib import reload
drive.mount('/content/gdrive', force_remount=True)


In [None]:
#@title set up mount symlink

DRIVE_PATH = '/content/gdrive/MyDrive/ece421_f2024'
DRIVE_PYTHON_PATH = DRIVE_PATH.replace('\\', '')
if not os.path.exists(DRIVE_PYTHON_PATH):
  %mkdir $DRIVE_PATH

## make a symlink
SYM_PATH = '/content/ece421_f2024'
if not os.path.exists(SYM_PATH):
  !ln -s $DRIVE_PATH $SYM_PATH
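
The same folder-and-symlink setup can also be written with Python's standard library instead of shell magics. The sketch below is only an illustration: it runs in a temporary directory rather than on the real Drive mount, and the helper name `ensure_symlink` is hypothetical, not part of the course code.

```python
import os
import tempfile

def ensure_symlink(target_dir, link_path):
    """Create target_dir if needed and point link_path at it (hypothetical helper)."""
    os.makedirs(target_dir, exist_ok=True)   # no error if the folder already exists
    if not os.path.lexists(link_path):       # lexists also detects dangling links
        os.symlink(target_dir, link_path)
    return os.path.realpath(link_path)

# Demonstration in a throwaway directory, not on the real Drive mount.
base = tempfile.mkdtemp()
target = os.path.join(base, "drive", "ece421_f2024")
link = os.path.join(base, "ece421_f2024")
resolved = ensure_symlink(target, link)
print(resolved)
```

Calling `ensure_symlink` a second time with the same arguments leaves the existing link untouched, which mirrors the `os.path.exists` guards in the cell above.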

In [None]:
#@title apt install requirements

!apt update
!apt install -y --no-install-recommends \
        build-essential \
        curl \
        git \
        gnupg2 \
        make \
        cmake \
        ffmpeg \
        swig \
        libz-dev \
        unzip \
        zlib1g-dev \
        libglfw3 \
        libglfw3-dev \
        libxrandr2 \
        libxinerama-dev \
        libxi6 \
        libxcursor-dev \
        libgl1-mesa-dev \
        libgl1-mesa-glx \
        libglew-dev \
        libosmesa6-dev \
        lsb-release \
        ack-grep \
        patchelf \
        wget \
        xpra \
        xserver-xorg-dev
!apt-get install python-opengl -y
!apt install xvfb -y

In [None]:
#@title clone homework repo

%cd $SYM_PATH
!git clone https://github.com/erfanmeskar/ece421fall24_assignments.git
ASSIGNMENT_PATH = '/content/gdrive/MyDrive/ece421_f2024/ece421fall24_assignments/A1'
%cd $ASSIGNMENT_PATH
%pip install -r requirements_colab.txt

## A little bit of practice with scikit-learn and NumPy

In [None]:
ASSIGNMENT_PATH = '/content/gdrive/MyDrive/ece421_f2024/ece421fall24_assignments/A1'
%cd $ASSIGNMENT_PATH

import os
from IPython.display import display, Markdown
from importlib import reload

from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import numpy as np

import PerceptronImp
import LinearRegressionImp

/content/gdrive/MyDrive/ece421_f2024/ece421fall24_assignments/A1


In [None]:
# using `sklearn`, we load the *Iris dataset* and split it into a train set and
# a test set.
X_train, y_train = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X_train[50:], y_train[50:],
                                                    test_size=0.2,
                                                    random_state=42)

# Rows 50 onward contain only classes 1 and 2. We relabel class 2 as -1 and
# keep class 1 as +1, so that the labels suit binary classification.
y_train[y_train != 1] = -1
y_test[y_test != 1] = -1
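
As an aside, the same relabeling can be written without in-place masking. The sketch below uses a made-up label vector (not the Iris data) and `np.where` to map class 1 to +1 and every other class to -1.

```python
import numpy as np

# Toy labels standing in for Iris classes 1 and 2 (illustrative values only).
labels = np.array([1, 2, 2, 1, 2])

# np.where builds a new array: +1 where the class is 1, -1 everywhere else.
signed = np.where(labels == 1, 1, -1)
print(signed)  # [ 1 -1 -1  1 -1]
```

Because `np.where` returns a new array, the original `labels` stay unchanged, which can be convenient when the raw class indices are still needed later.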

### Investigate the dataset

Let's investigate the dataset by taking a look at the shape of the training set and its first datapoint.

The training set should contain 80 datapoints in 4-dimensional space.

In [None]:
print(f"X_train is of type {type(X_train)}, with shape {X_train.shape}")
print(f"y_train is of type: {type(y_train)}, with shape {y_train.shape}")
display(Markdown(rf'Hence, $N={X_train.shape[0]}$ and $d={X_train.shape[1]}$'))

print("\nThe first datapoint:")
display(Markdown(rf'$(\underline{{x}}_1, y_1) = ({X_train[0,:]}, {y_train[0]})$'))


X_train is of type <class 'numpy.ndarray'>, with shape (80, 4)
y_train is of type: <class 'numpy.ndarray'>, with shape (80,)


Hence, $N=80$ and $d=4$


The first datapoint:


$(\underline{x}_1, y_1) = ([7.6 3.  6.6 2.1], -1)$

# Part 1.1: Implementing Pocket Algorithm Using `NumPy`

## Looking into the `pred` function

A Perceptron decision rule is specified by a weight vector of size $(d+1)$, i.e., $h_{\underline{w}}(\underline{x})=\text{sign}(\underline{w}^T\underline{x})$, where

$\begin{align}
\underline{w}&=(w_0, w_1, \ldots, w_d)\\
\underline{x}&=(x_0=1, x_1, \ldots, x_d).
\end{align}$

In what follows, we first generate four random datapoints with $d$ coordinates. We then augment each datapoint by adding one more coordinate, which is set to 1. Next, we generate a random weight vector and use the `PerceptronImp.pred` function to see the predicted label for each datapoint.

In [None]:
N, d = 4, 2
np.random.seed(42)

X = np.random.normal(size=(N, d))
print(f"input datapoint = \n{X}")
y = np.array([-1, -1, 1, 1])
print(f"and their true labels = \n{y}")

X_aug = np.hstack((np.ones(shape=(N, 1)), X))
print(f"\ninput datapoint after augmenting = \n{X_aug}")

w = np.random.normal(size=(d+1,))
print(f"\nweight vector = \n{w}")

print("\nPredicted labels:")
for i in range(N):
  display(Markdown(rf'$\hat{{y}}_{i} = {PerceptronImp.pred(X_aug[i, :], w)}$'))

input datapoint = 
[[ 0.49671415 -0.1382643 ]
 [ 0.64768854  1.52302986]
 [-0.23415337 -0.23413696]
 [ 1.57921282  0.76743473]]
and their true labels = 
[-1 -1  1  1]

input datapoint after augmenting = 
[[ 1.          0.49671415 -0.1382643 ]
 [ 1.          0.64768854  1.52302986]
 [ 1.         -0.23415337 -0.23413696]
 [ 1.          1.57921282  0.76743473]]

weight vector = 
[-0.46947439  0.54256004 -0.46341769]

Predicted labels:


$\hat{y}_0 = -1$

$\hat{y}_1 = -1$

$\hat{y}_2 = -1$

$\hat{y}_3 = 1$
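
To see where these labels come from, the decision rule can be evaluated by hand. The sketch below copies the augmented datapoints and the weight vector printed above and applies $\text{sign}(\underline{w}^T\underline{x})$ to each row. The name `pred_sketch` is hypothetical, and mapping a score of exactly zero to -1 is an assumption; the provided `PerceptronImp.pred` may treat that case differently.

```python
import numpy as np

def pred_sketch(x_aug, w):
    """Return +1 if w^T x > 0, else -1 (the zero case is an assumed convention)."""
    return 1 if float(np.dot(w, x_aug)) > 0 else -1

# Values copied from the printed output of the cell above.
X_aug = np.array([[1.0,  0.49671415, -0.1382643 ],
                  [1.0,  0.64768854,  1.52302986],
                  [1.0, -0.23415337, -0.23413696],
                  [1.0,  1.57921282,  0.76743473]])
w = np.array([-0.46947439, 0.54256004, -0.46341769])

print(X_aug @ w)                            # raw scores w^T x for each datapoint
preds = [pred_sketch(x, w) for x in X_aug]
print(preds)                                # [-1, -1, -1, 1]
```

Only the last datapoint has a positive score, so it is the only one labeled +1.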

## $E_{\text{in}}(\underline{w})$

Now it is your turn to implement the `errorPer` function in the file `PerceptronImp.py` to find the in-sample error, *i.e.*, the fraction of points that are misclassified.

\\
**TODO:**

functions to edit:
* `errorPer` in `PerceptronImp.py`

\\
**NOTE:** Don't forget to consider the case of a datapoint lying exactly on the hyperplane. In this case, you should count it as a misclassification.
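
One possible way to express this error, shown here only as a sketch, relies on the fact that for labels in $\{-1, +1\}$ a point is correctly classified exactly when $y_n\,\underline{w}^T\underline{x}_n > 0$. Counting the points with $y_n\,\underline{w}^T\underline{x}_n \le 0$ therefore covers both wrong predictions and points on the hyperplane. The function name `error_sketch` is hypothetical, the data are small illustrative values, and the input is assumed to already contain the leading column of ones.

```python
import numpy as np

def error_sketch(X_aug, y, w):
    """Fraction of points with y * (w^T x) <= 0; boundary points count as errors."""
    margins = y * (X_aug @ w)
    return float(np.mean(margins <= 0))

X_aug = np.array([[1, 2],    # score -2 + 2 = 0: exactly on the hyperplane
                  [1, 3]])   # score -2 + 3 = 1: predicted +1
w = np.array([-2, 1])

err_a = error_sketch(X_aug, np.array([1, -1]), w)
err_b = error_sketch(X_aug, np.array([-1, -1]), w)
print(err_a, err_b)  # 1.0 1.0
```

The first point counts as an error whatever its label, and the second is an error because its label is -1 in both calls.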

In [None]:
#@title test 1
#@markdown for the example above, your errorPer function output must be 0.25

# reloading the PerceptronImp module to pick up your changes to the file
reload(PerceptronImp)

# for the example above, your errorPer function output must be 0.25
if PerceptronImp.errorPer(X_aug, y, w) == 0.25:
  print("test 1 result: Good Job!")
else:
  print("test 1 result: Incorrect")

test 1 result: Good Job!


In [None]:
#@title test 2
#@markdown this cell tests if you could successfully handle the case in which a point is on the hyperplane.

# reloading the PerceptronImp module to pick up your changes to the file
reload(PerceptronImp)

X = np.array([[1, 2],
              [1, 3]])

y = np.array([1, -1])
yp = np.array([-1, -1])

w = np.array([-2, 1])

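# for this example, your errorPer function output must be 1. Note that the first
# point is exactly on the hyperplane. Thus, this point must be considered as a
# misclassification, regardless of its true label. The second point is predicted
# as +1 while its label is -1 in both cases, so it is misclassified as well.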
if PerceptronImp.errorPer(X, y, w) == 1 and PerceptronImp.errorPer(X, yp, w) == 1:
  print("test 2 result: Good Job!")
else:
  print("test 2 result: Incorrect")

test 2 result: Good Job!


## Fit your Perceptron

**TODO:**

functions to edit:
* `fit_perceptron` in `PerceptronImp.py`
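
The pocket algorithm runs ordinary perceptron updates while keeping "in its pocket" the best weight vector seen so far, so the returned weights are never worse (in terms of in-sample error) than any intermediate iterate. The following is a minimal sketch under several assumptions that the assignment handout may specify differently: weights start at zero, the first misclassified point is always chosen for the update, the iteration cap `max_iter` is 1000, and the input `X` is not yet augmented. The names `pocket_sketch` and `in_sample_error` are hypothetical.

```python
import numpy as np

def in_sample_error(X_aug, y, w):
    # Points on the hyperplane (margin 0) are counted as misclassified.
    return float(np.mean(y * (X_aug @ w) <= 0))

def pocket_sketch(X, y, max_iter=1000):
    N = X.shape[0]
    X_aug = np.hstack((np.ones((N, 1)), X))      # prepend x_0 = 1
    w = np.zeros(X_aug.shape[1])
    best_w, best_err = w.copy(), in_sample_error(X_aug, y, w)
    for _ in range(max_iter):
        wrong = np.flatnonzero(y * (X_aug @ w) <= 0)
        if wrong.size == 0:                      # everything classified correctly
            break
        i = wrong[0]                             # simple deterministic choice
        w = w + y[i] * X_aug[i]                  # perceptron update
        err = in_sample_error(X_aug, y, w)
        if err < best_err:                       # keep the best weights seen so far
            best_w, best_err = w.copy(), err
    return best_w

X = np.array([[2.0], [3.0]])
y = np.array([1, -1])
w = pocket_sketch(X, y)
print(w, -w[0] / w[1])   # the decision boundary should fall between 2 and 3
```

On linearly separable data such as this toy example the loop stops as soon as no point is misclassified; on non-separable data it runs until `max_iter` and returns the pocketed weights.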


In [None]:
#@title test 3
#@markdown this cell tests if your Perceptron can be trained on a simple dataset.

reload(PerceptronImp)

X = np.array([[2],
              [3]])
y = np.array([1, -1])

w = PerceptronImp.fit_perceptron(X, y)

if -w[0]/w[1] < X[1, 0] and -w[0]/w[1] > X[0, 0]:
  print("test 3 result: Good Job!")
else:
  print("test 3 result: Incorrect")


test 3 result: Good Job!


## Confusion Matrix

**TODO:**

functions to edit:
* `confMatrix` in `PerceptronImp.py`
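
A confusion matrix for binary labels counts how often each true class is predicted as each class. The sketch below assumes a particular layout that the handout may define differently: rows are true labels, columns are predicted labels, both in the order (-1, +1), which is also the convention of scikit-learn's `confusion_matrix`. It further assumes that `X` is not augmented and that a score of zero is predicted as -1. The name `conf_matrix_sketch` is hypothetical.

```python
import numpy as np

def conf_matrix_sketch(X, y, w):
    X_aug = np.hstack((np.ones((X.shape[0], 1)), X))   # prepend x_0 = 1
    y_hat = np.where(X_aug @ w > 0, 1, -1)             # assumed tie rule: 0 -> -1
    labels = (-1, 1)
    conf = np.zeros((2, 2), dtype=int)
    for r, true_label in enumerate(labels):
        for c, pred_label in enumerate(labels):
            conf[r, c] = np.sum((y == true_label) & (y_hat == pred_label))
    return conf

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
y = np.array([1, 1, -1, -1])
conf = conf_matrix_sketch(X, y, np.array([0, 0, 1]))   # classifier looks at x_2 only
print(conf)
```

With these weights the classifier uses only the second coordinate while the labels depend on the first, so each of the four cells receives exactly one point.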

In [None]:
#@title test 4
#@markdown this cell runs a simple test of your confMatrix.

reload(PerceptronImp)

X = np.array([[1, 1],
              [1, -1],
              [-1, 1],
              [-1, -1]])
y = np.array([1, 1, -1, -1])

conf = PerceptronImp.confMatrix(X, y, np.array([0, 0, 1]))

if np.sum(conf == np.ones(2)) == 4:
  print("test 4 result: Good Job!")
else:
  print("test 4 result: Incorrect")

test 4 result: Good Job!


# Part 1.2: Pocket Algorithm Using `scikit-learn`

In this part, you will use the `scikit-learn` library to train the binary linear classification model.

**TODO:**

functions to edit:
* `test_SciKit` in `PerceptronImp.py`
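
As a rough sketch of what such a function could look like, the code below fits `sklearn.linear_model.Perceptron` on the training data and summarizes its test predictions with `sklearn.metrics.confusion_matrix`. The function name `sklearn_perceptron_sketch` and the choice `random_state=0` are assumptions made for illustration; the hyperparameters expected by the assignment may differ.

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.metrics import confusion_matrix

def sklearn_perceptron_sketch(X_train, X_test, y_train, y_test):
    clf = Perceptron(random_state=0)          # fixed seed for repeatable shuffling
    clf.fit(X_train, y_train)
    # Rows are true labels, columns are predicted labels, sorted as (-1, +1).
    return confusion_matrix(y_test, clf.predict(X_test))

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
y = np.array([1, 1, -1, -1])
conf = sklearn_perceptron_sketch(X, X, y, y)
print(conf)
```

Since this toy dataset is linearly separable, all four points should end up on the diagonal of the matrix.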

In [None]:
#@title test 5
#@markdown this cell tests if your scikit Perceptron works for a simple linearly separable dataset.

reload(PerceptronImp)

X = np.array([[1, 1],
              [1, -1],
              [-1, 1],
              [-1, -1]])
y = np.array([1, 1, -1, -1])

conf = PerceptronImp.test_SciKit(X, X, y, y)

if np.sum(conf == 2*np.eye(2)) == 4:
  print("test 5 result: Good Job!")
else:
  print("test 5 result: Incorrect")

test 5 result: Good Job!


# Comparing Your Pocket Algorithm with `scikit-learn`

Let's see how your model and the one from `scikit-learn` perform on the Iris dataset.

In [None]:
reload(PerceptronImp)
# Pocket algorithm using Numpy
w = PerceptronImp.fit_perceptron(X_train, y_train)
my_conf_mat = PerceptronImp.confMatrix(X_test, y_test, w)

# Pocket algorithm using scikit-learn
scikit_conf_mat = PerceptronImp.test_SciKit(X_train, X_test, y_train, y_test)

# Print the result
print(f"{12*'-'}Test Result{12*'-'}")
print("Confusion Matrix from Part 1a is: \n", my_conf_mat)
print("\nConfusion Matrix from Part 1b is: \n", scikit_conf_mat)

------------Test Result------------
Confusion Matrix from Part 1a is: 
 [[8. 0.]
 [3. 9.]]

Confusion Matrix from Part 1b is: 
 [[ 8  0]
 [ 2 10]]
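
One way to summarize each confusion matrix as a single number is the test accuracy, i.e., the number of correct predictions divided by the number of test points. Correct predictions sit on the diagonal regardless of whether rows hold true or predicted labels, as long as both axes list the classes in the same order. The sketch below applies this to the two matrices printed above; the helper name `accuracy_from_confusion` is hypothetical.

```python
import numpy as np

def accuracy_from_confusion(conf):
    conf = np.asarray(conf, dtype=float)
    return float(np.trace(conf) / conf.sum())   # diagonal = correct predictions

acc_numpy = accuracy_from_confusion([[8, 0], [3, 9]])     # 17 of 20 correct
acc_sklearn = accuracy_from_confusion([[8, 0], [2, 10]])  # 18 of 20 correct
print(acc_numpy, acc_sklearn)
```

In this run the two models differ by a single test point.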


# Part 2.1: Linear Regression Using `NumPy`

## Mean Squared Error (MSE)

**TODO:** edit the function `mse` to find $E_{\text{in}}(\underline{w})=\frac{1}{N}||\underline{y}-\underline{\hat{y}}||^2$. You may find the `pred` function in `LinearRegressionImp.py` useful.

functions to edit:
* `mse` in `LinearRegressionImp.py`
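
The formula above can be checked on a tiny example. In the sketch below, the name `mse_sketch` is hypothetical, and it is assumed that `X` is passed without the column of ones while `w` has $d+1$ entries, so the function augments `X` itself.

```python
import numpy as np

def mse_sketch(X, y, w):
    X_aug = np.hstack((np.ones((X.shape[0], 1)), X))   # prepend x_0 = 1
    residual = y - X_aug @ w                            # y - y_hat
    return float(residual @ residual / X.shape[0])      # (1/N) * ||y - y_hat||^2

X = np.array([[1.0], [2.0]])
y = np.array([1.0, 4.0])
w = np.array([0.0, 1.0])      # predicts y_hat = x
err = mse_sketch(X, y, w)
print(err)  # 2.0
```

Here the residuals are 0 and 2, so the error is $(0^2 + 2^2)/2 = 2$.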

## Fit Your Model

**TODO:** Modify the function `fit_LinRegr` to compute the exact least-squares solution for linear regression using NumPy library functions.

functions to edit:
* `fit_LinRegr` in `LinearRegressionImp.py`
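
For reference, the least-squares weights minimize $E_{\text{in}}(\underline{w})$ and satisfy the normal equations $X^TX\,\underline{w}=X^T\underline{y}$, where $X$ here denotes the augmented data matrix. When $X^TX$ is invertible this gives $\underline{w}=(X^TX)^{-1}X^T\underline{y}$. The sketch below is one possible realization, not necessarily the intended one: it uses `np.linalg.pinv` in place of an explicit inverse so that it also returns a solution when $X^TX$ is singular. The name `fit_linreg_sketch` is hypothetical.

```python
import numpy as np

def fit_linreg_sketch(X, y):
    X_aug = np.hstack((np.ones((X.shape[0], 1)), X))   # prepend x_0 = 1
    # Solve the normal equations with the pseudo-inverse of X^T X.
    return np.linalg.pinv(X_aug.T @ X_aug) @ X_aug.T @ y

# Noise-free data generated from y = 1 + 2x, so the fit should recover (1, 2).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 1.0 + 2.0 * X[:, 0]
w = fit_linreg_sketch(X, y)
print(np.round(w, 6))
```

For well-conditioned problems the result matches what an explicit inverse would give, up to floating-point round-off.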

In [None]:
#@title test 6
#@markdown When we input a singular matrix, the function linalg.inv often raises an error.

#@markdown In this example, we constructed a simple but troublesome $X$. With this X, in your fit_LinRegr(X, y) implementation, is your input to the function linalg.inv a singular
#@markdown matrix? Why?

#@markdown Replacing the function `linalg.inv` with `linalg.pinv`, you should get the model's weights and the "NO
#@markdown ERROR" message. Explain the difference between `linalg.inv` and `linalg.pinv`.

reload(LinearRegressionImp)

X = np.asarray([[1, 2],
                [2, 4],
                [3, 6],
                [4, 8]])
y = np.asarray([1, 2, 3, 4])

try:
  w = LinearRegressionImp.fit_LinRegr(X, y)
  print("weights: ", w)
  print("NO ERROR")
except Exception:
  print("ERROR")


LinearRegressionImp.subtestFn()

weights:  [1.04360964e-14 2.00000000e-01 4.00000000e-01]
NO ERROR
weights:  [1.04360964e-14 2.00000000e-01 4.00000000e-01]
NO ERROR
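
The structure of this example can also be inspected numerically. In the sketch below, which assumes the model matrix is `X` with a prepended column of ones, the second feature is exactly twice the first, so the three columns span only a two-dimensional space and $X^TX$ cannot have full rank. `np.linalg.pinv` still returns a solution in that case, namely the least-squares solution with the smallest norm.

```python
import numpy as np

X = np.array([[1, 2], [2, 4], [3, 6], [4, 8]], dtype=float)
y = np.array([1, 2, 3, 4], dtype=float)
X_aug = np.hstack((np.ones((X.shape[0], 1)), X))   # prepend x_0 = 1
gram = X_aug.T @ X_aug                              # the 3x3 matrix X^T X

rank = np.linalg.matrix_rank(gram)
print(gram.shape[0], rank)                          # 3 columns, but rank 2

w = np.linalg.pinv(gram) @ X_aug.T @ y              # minimum-norm solution
print(np.round(w, 6))
```

Many weight vectors fit these data exactly (any $w_1 + 2w_2 = 1$ with $w_0 = 0$); the pseudo-inverse picks the one of smallest Euclidean norm, which agrees with the weights printed above.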


# Part 2.2: Linear Regression Using `scikit-learn`

In this part, you will use the `scikit-learn` library to train the linear regression model.

**TODO:**

functions to edit:
* `test_SciKit` in `LinearRegressionImp.py`
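
A possible shape for such a function, given only as a sketch, is to fit `sklearn.linear_model.LinearRegression` on the training split and report `sklearn.metrics.mean_squared_error` on the test split. The function name `sklearn_linreg_sketch` and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

def sklearn_linreg_sketch(X_train, X_test, y_train, y_test):
    model = LinearRegression().fit(X_train, y_train)   # fits the intercept by default
    return mean_squared_error(y_test, model.predict(X_test))

# Noise-free synthetic data from y = 1 + 3x, so the test error should be near zero.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 1.0 + 3.0 * X[:, 0]
err = sklearn_linreg_sketch(X[:8], X[8:], y[:8], y[8:])
print(err)
```

`LinearRegression` handles the intercept internally, so the inputs are passed without a column of ones.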

# Comparing Your Linear Regression Implementation with `scikit-learn`

Let's see how your model and the one from `scikit-learn` perform on the diabetes dataset.

In [None]:
reload(LinearRegressionImp)

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

X_train, y_train = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.2)

w = LinearRegressionImp.fit_LinRegr(X_train, y_train)

#Testing Part 2a
e = LinearRegressionImp.mse(X_test, y_test, w)

#Testing Part 2b
scikit = LinearRegressionImp.test_SciKit(X_train, X_test, y_train, y_test)

print(f"Mean squared error from Part 2a is {e}")
print(f"Mean squared error from Part 2b is {scikit}")

Mean squared error from Part 2a is 3210.193407339104
Mean squared error from Part 2b is 3210.193407339092
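
The two errors differ only in their last printed digits, a relative difference on the order of $10^{-15}$, which is most likely floating-point round-off from the two solvers taking different numerical routes. Values like these are therefore better compared with a tolerance than with `==`, as this short sketch using the numbers printed above shows.

```python
import numpy as np

mse_numpy = 3210.193407339104     # printed result of Part 2a
mse_sklearn = 3210.193407339092   # printed result of Part 2b

exactly_equal = mse_numpy == mse_sklearn
close_enough = bool(np.isclose(mse_numpy, mse_sklearn))
rel_diff = abs(mse_numpy - mse_sklearn) / mse_sklearn

print(exactly_equal)   # False
print(close_enough)    # True
print(rel_diff)
```

`np.isclose` uses a default relative tolerance of `1e-05`, far looser than the observed difference.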
