In [None]:
import pandas as pd
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
import numpy as np
from sklearn.metrics import accuracy_score

In [None]:
# Load the dataset; the context manager closes the file handle afterwards.
with open("hw2_p3.pkl", "rb") as pickleFile:
  data = pd.read_pickle(pickleFile)
#data

In [None]:
x_train = data['x_train']
x_test = data['x_test']

y_train = data['y_train']
y_test = data['y_test']

In [None]:
# Fit the scaler on the training set only, then reuse its statistics for the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(x_train)
X_test_scaled = scaler.transform(x_test)

##Choice of Basis (Part-1)

For the given data, we can choose a polynomial basis function of degree $2$, i.e., if $x = [x_{1}, x_{2}]$, then the basis function is $\phi(x) = [1, x_{1}, x_{2}, x_{1}^2, x_{2}^2, x_{1} x_{2}]$.

Therefore, the function to fit (the input to the sigmoid) is:
$y = c_0 + c_1 x_{1} + c_2 x_{2} + c_3 x_{1}^2 + c_4 x_{2}^2 + c_5 x_{1} x_{2}$
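
As a minimal sketch of the basis above, the expansion can be written by hand with NumPy. The helper name `poly2_basis` and the sample point are hypothetical and used only for illustration. Note that scikit-learn's `PolynomialFeatures(degree=2)` produces the same six terms but places $x_{1} x_{2}$ before $x_{2}^2$, so the column order differs from the one written here.

```python
import numpy as np

def poly2_basis(X):
    """Map rows [x1, x2] to [1, x1, x2, x1^2, x2^2, x1*x2]."""
    X = np.asarray(X, dtype=float)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1**2, x2**2, x1 * x2])

# Hypothetical sample point, for illustration only.
phi = poly2_basis([[2.0, 3.0]])
print(phi)  # [[1. 2. 3. 4. 9. 6.]]
```

The column order has no effect on the fitted model; it only changes which coefficient $c_i$ multiplies which term.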

The reasons for choosing a polynomial basis function for nonlinear logistic regression are as follows:
1. Polynomial basis functions allow logistic regression to establish non-linear decision boundaries, making it possible to deal with datasets that have a non-linear structure and where classes are not linearly separable.

2. Polynomial transformations generate interaction terms and higher-order terms of the original features, enabling the model to capture more complex relationships and interactions between different features, which can be critical in understanding the underlying data patterns.

3. Introducing polynomial features increases the flexibility of the logistic regression model, enabling it to adapt better to the underlying patterns in the data, potentially leading to better classification accuracy.


In [None]:
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train_scaled)
X_test_poly = poly.transform(X_test_scaled)

##Choosing Epochs and Learning Rate; Regression helper functions

In [None]:
epochs = 150
lr = 0.1

In [None]:
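def sigmoid(x):
  # Apply the numerically stable scalar sigmoid to each element of x.
  return np.array([sigmoid_function(value) for value in x])

def sigmoid_function(x):
  # Branch on the sign so np.exp is only called on a non-positive argument,
  # which avoids overflow for inputs of large magnitude.
  if x >= 0:
    z = np.exp(-x)
    return 1 / (1 + z)
  else:
    z = np.exp(x)
    return z / (1 + z)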

In [None]:
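def compute_gradients(x, y_true, y_pred):
  # Log-loss gradient w.r.t. the weights: X^T (y_pred - y_true).
  diff =  y_pred - y_true
  gradients_w = np.matmul(x.transpose(), diff)
  # With 1-D y_true and y_pred each entry is already a scalar, so the mean
  # leaves it unchanged and the gradient stays summed (not averaged) over samples.
  gradients_w = np.array([np.mean(grad) for grad in gradients_w])
  return gradients_w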

#Binary Classification (Part-2)

In [None]:
def binaryFit(x, y, x_test):
  weights = np.zeros(x.shape[1])
  for i in range(epochs):
    pred = sigmoid(np.matmul(weights, x.transpose()))
    error_w = compute_gradients(x, y, pred)
    weights -= lr * error_w
  probabilities = sigmoid(np.matmul(x_test, weights.transpose()))
  return [1 if p >= 0.5 else 0 for p in probabilities]

In [None]:
y_pred = binaryFit(X_train_poly, y_train, X_test_poly)
accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy for binary classification is",accuracy)

Test accuracy for binary classification is 0.3333333333333333


#Multi-Class Classification (Part-3)

To adapt the binary logistic regression to a multi-class classification problem, we can use the One-vs-Rest (OvR) strategy. In this approach, for $k$ classes, we train $k$ binary classifiers. For each classifier, one class is treated as the positive class ($1$), and all other classes are treated as the negative class ($0$). When making predictions, we choose the class that corresponds to the classifier with the highest output probability.
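
The two OvR steps described above (relabeling per class, then picking the most confident classifier) can be sketched on toy values. The labels and the probability matrix below are hypothetical stand-ins for the outputs of $k = 3$ trained classifiers, not results from the homework data.

```python
import numpy as np

# Hypothetical labels, for illustration only.
y = np.array([0, 2, 1, 2])
classes = np.unique(y)

# One binary target vector per class: 1 for that class, 0 for the rest.
binary_targets = {int(cls): np.where(y == cls, 1, 0) for cls in classes}

# Pretend outputs of the k classifiers; rows are samples, columns follow `classes`.
probabilities = np.array([
    [0.9, 0.2, 0.1],
    [0.1, 0.3, 0.8],
    [0.2, 0.7, 0.4],
    [0.3, 0.1, 0.6],
])

# Pick the classifier with the highest probability and map back to its label.
predictions = classes[np.argmax(probabilities, axis=1)]
print(predictions)  # [0 2 1 2]
```

Indexing `classes` with the argmax keeps the prediction correct even when the labels are not the consecutive integers $0, \dots, k-1$.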

In [None]:
def multiFit(x, y, x_test):
  classes = np.unique(y)
  weights_list = []
  for cls in classes:
    binary_y = np.where(y == cls, 1, 0)
    weights = np.zeros(x.shape[1])
    for i in range(epochs):
      pred = sigmoid(np.matmul(weights, x.transpose()))
      error_w = compute_gradients(x, binary_y, pred)
      weights -= lr * error_w
    weights_list.append(weights)
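  # Column j holds the probability from the classifier trained for classes[j].
  probabilities_list = [sigmoid(np.matmul(x_test, weights.transpose())) for weights in weights_list]
  probabilities = np.vstack(probabilities_list).T
  # Map the winning column index back to its class label.
  predictions = classes[np.argmax(probabilities, axis=1)]
  return predictions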

In [None]:
y_pred = multiFit(X_train_poly, y_train, X_test_poly)
accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy for multi-class classification is",accuracy)

Test accuracy for multi-class classification is 0.9933333333333333


#Final Test Accuracy (Part-4)

As we can see, the final test accuracy for multi-class classification is about $99.3$%. The accuracy was much lower, about $33.3$%, for binary classification because we attempted to fit data with more than two classes using a binary model, which can only ever predict the labels $0$ and $1$.
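
To illustrate why a binary model is capped on multi-class labels, here is a small sketch under a stated assumption: three balanced classes labeled $0$, $1$, $2$. The source does not state the class balance, so the label vector below is hypothetical and is not the homework data.

```python
import numpy as np

# Hypothetical balanced three-class test labels (an assumption, not the real data).
y_true = np.repeat([0, 1, 2], 50)

# A binary model can only output 0 or 1, so every class-2 sample is missed.
best_binary = np.where(y_true == 2, 1, y_true)  # perfect on classes 0 and 1
constant = np.ones_like(y_true)                 # always predicts class 1

ceiling = np.mean(best_binary == y_true)
constant_acc = np.mean(constant == y_true)
print(ceiling, constant_acc)
```

Under this assumption, the best possible binary predictor reaches two thirds accuracy, and one that collapses to a single label reaches one third, which is in line with the binary result reported above.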
