

# Multi-class Classification
### Introduction

In the second part of this assignment, you will implement, train and evaluate several single-layer neural networks that predict the category of a previously unseen example.
Machine learning models can be trained on financial indicators such as company earnings, macroeconomic factors and market trends to predict stock price movements, which investors can use to inform decisions about buying or selling stocks. We will use the stock price movements dataset for this task. The dataset contains 150 instances of the price movement of a specific stock: 50 each of upwards, downwards and sideways movement. The goal is to predict the movement for unseen examples from the 4 input features derived from company earnings and macroeconomic factors.


Before starting, we first import the pandas, NumPy and PyTorch libraries.

In [1]:
import pandas as pd
import torch
import numpy as np

### Only for Google Colab users

If you are using Google Colab, please upload the notebook and the dataset file to the DS405B folder on your Google Drive. Then, open the notebook with Google Colab by right-clicking the *.ipynb file. The code in the following cell mounts your Google Drive directories in your Colab environment.

In [2]:
## only for google colab users
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Load Dataset

The code for reading the `stock_price.csv` file is already implemented for you in the following code cell. The same cell maps the textual categories (upwards, downwards and sideways) to numerical class labels (0, 1 or 2) and splits the dataset into training (80%) and test (20%) sets. The input features are also normalised to have zero mean and unit standard deviation, using the mean and standard deviation of the training set.

In [3]:
df = pd.read_csv('/content/stock_price.csv', index_col=None, header=None) # change the path to your own path
df.columns = ['x1', 'x2', 'x3', 'x4', 'y']

d = {'upwards': 1,
     'downwards': 2,
     'sideways': 0,
}

df['y'] = df['y'].map(d)

# Assign features and target

X = torch.tensor(df[['x1', 'x2', 'x3', 'x4']].values, dtype=torch.float)
y = torch.tensor(df['y'].values, dtype=torch.int)

# Shuffling & train/test split

torch.manual_seed(123)
shuffle_idx = torch.randperm(y.size(0), dtype=torch.long)

X, y = X[shuffle_idx], y[shuffle_idx]

percent80 = int(shuffle_idx.size(0)*0.8)

X_train, X_test = X[shuffle_idx[:percent80]], X[shuffle_idx[percent80:]]
y_train, y_test = y[shuffle_idx[:percent80]], y[shuffle_idx[percent80:]]

# Normalize (mean zero, unit variance)

mu, sigma = X_train.mean(dim=0), X_train.std(dim=0)
X_train = (X_train - mu) / sigma
X_test = (X_test - mu) / sigma
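
The normalisation step above can be checked on synthetic data. The following is a minimal sketch, assuming a random stand-in tensor with the same layout (150 rows, 4 features) and an 80/20 split; the names `X_demo`, `X_tr` and `X_te` are hypothetical and not part of the assignment code. It illustrates that the statistics are computed on the training rows only and then reused for the test rows.

```python
import torch

torch.manual_seed(0)

# Synthetic stand-in for the real features (assumed layout: 150 rows, 4 columns)
X_demo = torch.randn(150, 4) * 3.0 + 5.0
X_tr, X_te = X_demo[:120], X_demo[120:]

# Statistics come from the training rows only and are reused for the test rows
mu_demo, sigma_demo = X_tr.mean(dim=0), X_tr.std(dim=0)
X_tr_norm = (X_tr - mu_demo) / sigma_demo
X_te_norm = (X_te - mu_demo) / sigma_demo

print(X_tr_norm.mean(dim=0))  # close to 0 for every feature
print(X_tr_norm.std(dim=0))   # close to 1 for every feature
print(X_te_norm.mean(dim=0))  # near 0, but generally not exactly 0
```

The test set is only approximately standardised, which is expected: its own mean and standard deviation are never used, so no information from the test set enters the preprocessing.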

**Task 1. Logistic Regression for Multiclass Classification**

In [9]:
import torch.nn as nn
import torch.optim as optim

input_size = X.shape[1]
output_size = len(torch.unique(y))

class LogisticRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

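    def forward(self, x):
        logits = self.linear(x)
        # Note: nn.CrossEntropyLoss applies log-softmax itself and expects raw logits;
        # the sigmoid here squashes its inputs into (0, 1) before that step
        return torch.sigmoid(logits)

logistic_model = LogisticRegression(input_size, output_size)

criterion = nn.CrossEntropyLoss()  # cross-entropy loss for multi-class classification
optimizer = optim.Adam(logistic_model.parameters(), lr=0.01)  # Adam optimizer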

# Training loop
num_epochs = 1000
batch_size = 50
for epoch in range(num_epochs):
    for i in range(0, len(X_train), batch_size):
        batch_X = X_train[i:i+batch_size]
        batch_y = y_train[i:i+batch_size]

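        outputs = logistic_model(batch_X)  # forward pass
        loss = criterion(outputs, batch_y.long())  # targets must be int64 class indices

        optimizer.zero_grad()  # clear gradients left over from the previous step
        loss.backward()  # backward pass
        optimizer.step()  # update the parameters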

with torch.no_grad():
    logistic_model.eval()
    outputs = logistic_model(X_test)
    _, predicted = torch.max(outputs, 1)
    correct = (predicted == y_test).sum().item()
    total = y_test.size(0)
    accuracy = correct / total
    print(f'Accuracy on test set: {accuracy:.2%}')

Accuracy on test set: 90.00%
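
`nn.CrossEntropyLoss` combines log-softmax and negative log-likelihood, so it is meant to receive raw logits. The sketch below uses two hand-picked example rows (hypothetical values, not taken from the dataset) to compare the loss on raw logits with the loss on sigmoid-squashed values, as in the `forward` method above. Because sigmoid outputs lie in (0, 1), the softmax over three of them can never assign more than e/(e+2) ≈ 0.58 to any class, so the loss cannot fall below about 0.55 however well the model fits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two confidently correct examples (hypothetical logits for 3 classes)
logits = torch.tensor([[4.0, -2.0, 0.5],
                       [-1.0, 3.0, 0.0]])
targets = torch.tensor([0, 1])

criterion = nn.CrossEntropyLoss()

loss_logits = criterion(logits, targets)                  # intended usage
loss_sigmoid = criterion(torch.sigmoid(logits), targets)  # sigmoid applied first

# CrossEntropyLoss on raw logits equals log-softmax followed by NLL
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(f'raw logits:     {loss_logits.item():.4f}')
print(f'manual check:   {manual.item():.4f}')
print(f'after sigmoid:  {loss_sigmoid.item():.4f}')
```

Training still works with the sigmoid in place, since the ordering of the outputs is preserved, but the gradients are computed for a loss that is bounded away from zero.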


**Task 2. Softmax Regression with custom implementation of Cross Entropy Loss**

In [11]:
class SoftmaxRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(SoftmaxRegression, self).__init__()
        self.fc = nn.Linear(input_size, num_classes)
    def forward(self, x):
        x = self.fc(x)
        return x

def one_hot_encoding(labels, num_classes):
    one_hot = torch.zeros(labels.size(0), num_classes)  # start from an all-zero (N, num_classes) tensor
    for i in range(len(labels)):
        one_hot[i, labels[i]] = 1  # set the column of the true class to 1
    return one_hot

def softmax(logits):
    exp_logits = torch.exp(logits)  # exponentiate each logit
    return exp_logits / torch.sum(exp_logits, dim=1, keepdim=True)  # normalise each row to sum to 1

def cross_entropy_loss(outputs, targets):
    log_probs = torch.log(outputs)  # log of the softmax probabilities
    return -torch.mean(torch.sum(targets * log_probs, dim=1))  # mean negative log-likelihood of the true class

#params
input_size = X_train.shape[1]
num_classes = len(df['y'].unique())
learning_rate = 0.07
num_epochs = 30

model = SoftmaxRegression(input_size, num_classes)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(num_epochs):
    logits = model(X_train)  # forward pass
    outputs = softmax(logits)  # convert logits to class probabilities
    targets = one_hot_encoding(y_train, num_classes)  # one-hot encode the labels
    loss = cross_entropy_loss(outputs, targets)  # custom cross-entropy loss

    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss.backward()  # backward pass
    optimizer.step()  # update the parameters

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')


with torch.no_grad():
    logits = model(X_test)
    outputs = softmax(logits)
    _, predicted = torch.max(outputs, 1)
    accuracy = (predicted == y_test).sum().item() / len(y_test)
    print(f'Accuracy: {accuracy * 100:.2f}%')


Epoch [1/30], Loss: 1.0931
Epoch [2/30], Loss: 0.9104
Epoch [3/30], Loss: 0.7713
Epoch [4/30], Loss: 0.6709
Epoch [5/30], Loss: 0.5978
Epoch [6/30], Loss: 0.5415
Epoch [7/30], Loss: 0.4962
Epoch [8/30], Loss: 0.4588
Epoch [9/30], Loss: 0.4276
Epoch [10/30], Loss: 0.4016
Epoch [11/30], Loss: 0.3799
Epoch [12/30], Loss: 0.3616
Epoch [13/30], Loss: 0.3459
Epoch [14/30], Loss: 0.3323
Epoch [15/30], Loss: 0.3201
Epoch [16/30], Loss: 0.3091
Epoch [17/30], Loss: 0.2989
Epoch [18/30], Loss: 0.2895
Epoch [19/30], Loss: 0.2806
Epoch [20/30], Loss: 0.2723
Epoch [21/30], Loss: 0.2644
Epoch [22/30], Loss: 0.2568
Epoch [23/30], Loss: 0.2495
Epoch [24/30], Loss: 0.2426
Epoch [25/30], Loss: 0.2360
Epoch [26/30], Loss: 0.2298
Epoch [27/30], Loss: 0.2240
Epoch [28/30], Loss: 0.2186
Epoch [29/30], Loss: 0.2135
Epoch [30/30], Loss: 0.2088
Accuracy: 96.67%
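
The custom functions in this task can be checked against PyTorch's built-in loss. The sketch below is one possible variant, not the assignment's reference solution: `stable_softmax`, `one_hot_vectorised` and `cross_entropy_from_probs` are hypothetical names, and the inputs are random stand-ins. It subtracts the row-wise maximum before exponentiating, which leaves the softmax unchanged mathematically but avoids overflow in `torch.exp`, and it builds the one-hot matrix without a Python loop.

```python
import torch
import torch.nn.functional as F

def stable_softmax(logits):
    # Subtracting the row-wise maximum does not change the result
    # but keeps torch.exp from overflowing for large logits
    shifted = logits - logits.max(dim=1, keepdim=True).values
    exp_shifted = torch.exp(shifted)
    return exp_shifted / exp_shifted.sum(dim=1, keepdim=True)

def one_hot_vectorised(labels, num_classes):
    # Compare each label with the class indices 0..num_classes-1
    return (labels.unsqueeze(1) == torch.arange(num_classes)).float()

def cross_entropy_from_probs(probs, targets_onehot):
    # Mean negative log-probability of the true class
    return -torch.mean(torch.sum(targets_onehot * torch.log(probs), dim=1))

torch.manual_seed(0)
demo_logits = torch.randn(6, 3)
demo_labels = torch.tensor([0, 2, 1, 1, 0, 2])

custom = cross_entropy_from_probs(stable_softmax(demo_logits),
                                  one_hot_vectorised(demo_labels, 3))
builtin = F.cross_entropy(demo_logits, demo_labels)
print(custom.item(), builtin.item())  # the two values agree up to rounding

# Large logits overflow a plain exp but not the shifted version
big = torch.tensor([[1000.0, 0.0, -1000.0]])
print(torch.exp(big))        # contains inf
print(stable_softmax(big))   # finite probabilities
```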


**Task 3. Softmax Regression with torch.nn.functional.nll_loss**

In [14]:
import torch.nn.functional as F
model = SoftmaxRegression(input_size, num_classes)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(num_epochs):
    logits = model(X_train)  # compute predicted logits
    outputs = F.log_softmax(logits, dim=1)  # apply log(softmax) to logits
    loss = F.nll_loss(outputs, y_train.long())  # negative log-likelihood; targets must be int64

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  #update

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')  # Print the loss for each epoch

with torch.no_grad():
    logits = model(X_test)
    outputs = F.log_softmax(logits, dim=1)
    _, predicted = torch.max(outputs, 1)
    accuracy = (predicted == y_test).sum().item() / len(y_test)
    print(f'Accuracy: {accuracy * 100:.2f}%')

Epoch [1/30], Loss: 1.8783
Epoch [2/30], Loss: 1.5530
Epoch [3/30], Loss: 1.2752
Epoch [4/30], Loss: 1.0521
Epoch [5/30], Loss: 0.8822
Epoch [6/30], Loss: 0.7556
Epoch [7/30], Loss: 0.6600
Epoch [8/30], Loss: 0.5863
Epoch [9/30], Loss: 0.5290
Epoch [10/30], Loss: 0.4845
Epoch [11/30], Loss: 0.4504
Epoch [12/30], Loss: 0.4240
Epoch [13/30], Loss: 0.4034
Epoch [14/30], Loss: 0.3868
Epoch [15/30], Loss: 0.3731
Epoch [16/30], Loss: 0.3614
Epoch [17/30], Loss: 0.3513
Epoch [18/30], Loss: 0.3423
Epoch [19/30], Loss: 0.3343
Epoch [20/30], Loss: 0.3270
Epoch [21/30], Loss: 0.3204
Epoch [22/30], Loss: 0.3142
Epoch [23/30], Loss: 0.3085
Epoch [24/30], Loss: 0.3029
Epoch [25/30], Loss: 0.2975
Epoch [26/30], Loss: 0.2921
Epoch [27/30], Loss: 0.2866
Epoch [28/30], Loss: 0.2812
Epoch [29/30], Loss: 0.2757
Epoch [30/30], Loss: 0.2702
Accuracy: 86.67%
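
The two-step loss used here, `F.log_softmax` followed by `F.nll_loss`, computes the same quantity as a single call to `F.cross_entropy` on the raw logits. The sketch below checks this on random stand-in values (the names `demo_logits` and `demo_labels` are hypothetical). It also shows that taking the arg-max of the log-probabilities gives the same predictions as taking the arg-max of the logits, since `log_softmax` only subtracts a per-row constant.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
demo_logits = torch.randn(8, 3)
demo_labels = torch.randint(0, 3, (8,))  # int64 class indices

# Two-step version: log-softmax, then negative log-likelihood
two_step = F.nll_loss(F.log_softmax(demo_logits, dim=1), demo_labels)

# One-step version: cross_entropy takes the raw logits directly
one_step = F.cross_entropy(demo_logits, demo_labels)

print(two_step.item(), one_step.item())

# Predictions do not depend on whether log_softmax is applied first
same_pred = torch.equal(demo_logits.argmax(dim=1),
                        F.log_softmax(demo_logits, dim=1).argmax(dim=1))
print(same_pred)
```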


**Task 4. Softmax Regression with torch.nn.functional.cross_entropy**

In [13]:
# This cell does not create a new model or optimizer, so training continues
# from whichever `model` and `optimizer` are currently defined
for epoch in range(num_epochs):
    logits = model(X_train)  # compute predicted logits
    loss = F.cross_entropy(logits, y_train.long())  # takes raw logits; applies log-softmax internally

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # update the parameters

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

    # Evaluate on the test set after every epoch
    with torch.no_grad():
        logits = model(X_test)
        outputs = F.log_softmax(logits, dim=1)
        _, predicted = torch.max(outputs, 1)
        accuracy = (predicted == y_test).sum().item() / len(y_test)
        print(f'Accuracy: {accuracy * 100:.2f}%')

Epoch [1/30], Loss: 0.2811
Accuracy: 83.33%
Epoch [2/30], Loss: 0.2755
Accuracy: 83.33%
Epoch [3/30], Loss: 0.2701
Accuracy: 86.67%
Epoch [4/30], Loss: 0.2648
Accuracy: 93.33%
Epoch [5/30], Loss: 0.2595
Accuracy: 93.33%
Epoch [6/30], Loss: 0.2543
Accuracy: 90.00%
Epoch [7/30], Loss: 0.2492
Accuracy: 90.00%
Epoch [8/30], Loss: 0.2442
Accuracy: 90.00%
Epoch [9/30], Loss: 0.2395
Accuracy: 90.00%
Epoch [10/30], Loss: 0.2349
Accuracy: 90.00%
Epoch [11/30], Loss: 0.2307
Accuracy: 93.33%
Epoch [12/30], Loss: 0.2267
Accuracy: 93.33%
Epoch [13/30], Loss: 0.2230
Accuracy: 96.67%
Epoch [14/30], Loss: 0.2196
Accuracy: 96.67%
Epoch [15/30], Loss: 0.2165
Accuracy: 96.67%
Epoch [16/30], Loss: 0.2135
Accuracy: 96.67%
Epoch [17/30], Loss: 0.2107
Accuracy: 96.67%
Epoch [18/30], Loss: 0.2080
Accuracy: 96.67%
Epoch [19/30], Loss: 0.2055
Accuracy: 96.67%
Epoch [20/30], Loss: 0.2031
Accuracy: 96.67%
Epoch [21/30], Loss: 0.2008
Accuracy: 96.67%
Epoch [22/30], Loss: 0.1986
Accuracy: 96.67%
Epoch [23/30], Loss
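
The Task 4 output starts at a loss of 0.2811, well below the first-epoch losses in Tasks 2 and 3, which is consistent with training continuing from an already trained model. For a like-for-like comparison between loss functions, each run would need a freshly initialised model and optimizer. The sketch below shows one way to arrange that; `train_fresh` is a hypothetical helper, the data are random stand-ins, and the hyperparameters simply mirror those used earlier in the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

def train_fresh(X_tr, y_tr, loss_fn, num_classes=3, num_epochs=30, lr=0.07, seed=0):
    """Hypothetical helper: build a new linear model and optimizer, then train."""
    torch.manual_seed(seed)  # same initial weights on every call
    model = nn.Linear(X_tr.shape[1], num_classes)
    optimizer = optim.Adam(model.parameters(), lr=lr)
    for _ in range(num_epochs):
        loss = loss_fn(model(X_tr), y_tr)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model, loss.item()

def nll_on_log_softmax(logits, target):
    # Task 3 style loss: log-softmax followed by NLL
    return F.nll_loss(F.log_softmax(logits, dim=1), target)

# Synthetic stand-in data (assumed layout: 120 rows, 4 features, 3 classes)
torch.manual_seed(1)
X_demo = torch.randn(120, 4)
y_demo = torch.randint(0, 3, (120,))

_, loss_ce = train_fresh(X_demo, y_demo, F.cross_entropy)
_, loss_nll = train_fresh(X_demo, y_demo, nll_on_log_softmax)
print(loss_ce, loss_nll)  # equal up to rounding when both runs start from the same weights
```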

**Task 5. Softmax Regression with Mean Squared Error Loss**

In [24]:
class MulticlassClassifier(nn.Module):
    def __init__(self, input_size, num_classes):
        super(MulticlassClassifier, self).__init__()
        self.fc1 = nn.Linear(input_size, num_classes)

    def forward(self, x):
        x = self.fc1(x)
        return x

input_size = 4
num_classes = 3
model = MulticlassClassifier(input_size, num_classes)
optimizer = optim.Adam(model.parameters(), lr=0.06)

num_epochs = 20
for epoch in range(num_epochs):
    outputs = model(X_train)
    # Apply softmax activation
    probs = F.softmax(outputs, dim=1)
    y_train_onehot = F.one_hot(y_train.long(), num_classes).float()  # one-hot targets for the MSE loss (F.one_hot needs int64 labels)
    loss = F.mse_loss(probs, y_train_onehot)  # mean squared error between probabilities and targets

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

with torch.no_grad():
    outputs = model(X_test)
    probs = F.softmax(outputs, dim=1)
    _, predicted = torch.max(probs, 1)
    correct = (predicted == y_test).sum().item()
    total = y_test.size(0)
    accuracy = correct / total
    print(f'Accuracy: {accuracy * 100:.2f}%')


Epoch [1/20], Loss: 0.2144
Epoch [2/20], Loss: 0.1806
Epoch [3/20], Loss: 0.1496
Epoch [4/20], Loss: 0.1237
Epoch [5/20], Loss: 0.1053
Epoch [6/20], Loss: 0.0942
Epoch [7/20], Loss: 0.0883
Epoch [8/20], Loss: 0.0851
Epoch [9/20], Loss: 0.0830
Epoch [10/20], Loss: 0.0813
Epoch [11/20], Loss: 0.0796
Epoch [12/20], Loss: 0.0779
Epoch [13/20], Loss: 0.0762
Epoch [14/20], Loss: 0.0746
Epoch [15/20], Loss: 0.0731
Epoch [16/20], Loss: 0.0716
Epoch [17/20], Loss: 0.0703
Epoch [18/20], Loss: 0.0691
Epoch [19/20], Loss: 0.0679
Epoch [20/20], Loss: 0.0668
Accuracy: 76.67%
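
One property that may contribute to the weaker result with MSE is how its gradient behaves when the softmax is saturated. The sketch below takes a single hypothetical example whose true class is 0 while the logits strongly favour class 2, and compares the gradient with respect to the logits under cross-entropy and under MSE on the softmax probabilities. This is an illustration on hand-picked values, not an analysis of the run above.

```python
import torch
import torch.nn.functional as F

# One example: the true class is 0, but the logits strongly favour class 2
target = torch.tensor([0])
target_onehot = F.one_hot(target, num_classes=3).float()

logits_ce = torch.tensor([[-6.0, 0.0, 6.0]], requires_grad=True)
logits_mse = logits_ce.detach().clone().requires_grad_(True)

# Cross-entropy on the raw logits
F.cross_entropy(logits_ce, target).backward()

# MSE between softmax probabilities and the one-hot target
F.mse_loss(F.softmax(logits_mse, dim=1), target_onehot).backward()

print('cross-entropy gradient:', logits_ce.grad)
print('MSE gradient:          ', logits_mse.grad)
```

For cross-entropy the gradient with respect to the logits is the difference between the predicted probabilities and the one-hot target, so it stays large while the prediction is wrong. For MSE the gradient is additionally multiplied by the derivative of the softmax, which is close to zero when one probability is near 1, so a confidently wrong prediction is corrected only slowly.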


**Task 6. Linear Regression with Mean Squared Error Loss**

In [40]:
class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        return self.linear(x)

# input and output sizes
input_size = X_train.shape[1]
output_size = 1  #regressing to a continuous value

model = LinearRegression(input_size, output_size)
criterion = nn.MSELoss()

optimizer = optim.Adam(model.parameters(), lr=0.07)

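num_epochs = 20
for epoch in range(num_epochs):
    outputs = model(X_train)  # forward pass; shape (N, 1)
    # Note: y_train has shape (N,), so this loss broadcasts (N, 1) against (N,) to (N, N)
    loss = criterion(outputs, y_train.float())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    model.eval()
    outputs = model(X_test)
    predicted = torch.round(outputs)  # round the continuous output to the nearest class label
    # Note: this comparison also broadcasts (N, 1) against (N,) to (N, N), so `correct`
    # counts matches between every prediction and every label, not element-wise matches
    correct = (predicted == y_test.float()).sum().item()
    total = y_test.size(0)
    accuracy = correct / total
    # The reported accuracy is unreliable because of this shape mismatch; squeezing the
    # model output to shape (N,), e.g. with outputs.squeeze(1), makes both steps element-wise
    print(f'Accuracy on test set: {accuracy:.2%}')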


Accuracy on test set: 30.00%
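
The shape issue in this task can be reproduced in isolation. An `nn.Linear` layer with one output unit returns a tensor of shape (N, 1), while the labels have shape (N,); comparing or subtracting the two broadcasts to an (N, N) matrix. The sketch below uses small hypothetical tensors to show the effect and how `squeeze(1)` restores an element-wise comparison.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
outputs_2d = torch.randn(5, 1)                    # shape (N, 1), like the output of nn.Linear(4, 1)
labels = torch.tensor([0.0, 1.0, 2.0, 1.0, 0.0])  # shape (N,)

# (N, 1) against (N,) broadcasts to (N, N):
# every prediction is compared with every label
mismatch = torch.round(outputs_2d) == labels
print(mismatch.shape)  # torch.Size([5, 5])

# Dropping the trailing dimension gives one comparison per example
outputs_1d = outputs_2d.squeeze(1)
matched = torch.round(outputs_1d) == labels
print(matched.shape)   # torch.Size([5])

# The same applies to the loss: both arguments should have shape (N,)
loss = F.mse_loss(outputs_1d, labels)
print(loss.item())
```

Even with matching shapes, regressing onto the class indices 0, 1 and 2 imposes an ordering on the classes that the labels do not have, so a lower accuracy than the softmax models would not be surprising.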
