# PyTorch - homework 1


Please run the whole notebook with your code and submit the `.ipynb` file that includes your answers. 

In [1]:
from termcolor import colored

student_number = "1004295"
student_name = "Koh Jun Hao"

print(colored("Homework by "  + student_name + ', number: ' + student_number,'red'))

Homework by Koh Jun Hao, number: 1004295


 ## Question 1 -- matrix multiplication

Implement the following mathematical operation on both the CPU and GPU (use Google Colab or another cloud service if you don't have a GPU in your computer). Print:

a) which type of GPU card you have and 

b) show the computation time for both CPU and GPU (using PyTorch). 

c) By what percentage is the GPU faster than the CPU? 

 The operation to implement is the matrix product $C = B A^T$

 whereby $A$ is a random matrix of size $20,000 \times 1,000$ and $B$ is a random matrix of size $2,000 \times 1,000$. In addition to the required information asked above:
 
 d) Please also print the two resulting $C$ matrices (they should be the same, up to floating-point rounding). 
 



In [2]:
# implement solution here
import torch
import numpy as np
from tqdm import tqdm

## Part A
print("Type of GPU card:")
print(torch.cuda.get_device_name())
print("\n")

## Part B
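import time

def dot_product(A_T, B, device):
  # Move the operands first so the host-to-device copy is not timed
  A_T = A_T.to(device)
  B = B.to(device)

  # CUDA kernels launch asynchronously, so synchronise before reading the clock
  if device.type == "cuda":
    torch.cuda.synchronize()
  start = time.perf_counter()
  C = B @ A_T
  if device.type == "cuda":
    torch.cuda.synchronize()

  # Wall-clock time in milliseconds, measured the same way on both devices
  time_taken = (time.perf_counter() - start) * 1000
  return time_taken, C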

A = torch.rand(20_000, 1_000)
A_T = torch.transpose(A, 0, 1)
B = torch.rand(2_000, 1_000)
CPU = torch.device("cpu")
GPU = torch.device("cuda")

CPU_time, CPU_output = dot_product(A_T, B, CPU)
GPU_time, GPU_output = dot_product(A_T, B, GPU)

print(f"Time taken for CPU: {CPU_time} ms")
print(f"Time taken for GPU: {GPU_time} ms")
print("\n")

## Part C
speedup = (CPU_time-GPU_time)/GPU_time * 100
print(f"GPU is faster by {speedup}%")
print("\n")

## Part D
print(f"CPU output is {CPU_output}\n")
print(f"GPU output is {GPU_output.cpu()}\n")
# print(f"Are CPU and GPU outputs the same: {torch.eq(CPU_output, GPU_output.cpu())}")





Type of GPU card:
Tesla T4


Time taken for CPU: 2539.18212890625 ms
Time taken for GPU: 97.07884979248047 ms


GPU is faster by 2515.5873646361742%


CPU output is tensor([[262.9389, 262.1331, 258.2029,  ..., 247.3846, 255.4586, 260.5882],
        [255.5988, 255.1808, 252.4708,  ..., 242.8665, 251.2313, 255.8275],
        [254.6067, 250.8993, 247.8918,  ..., 235.8824, 247.9836, 249.4794],
        ...,
        [255.2737, 255.6243, 245.6241,  ..., 237.2198, 255.6026, 252.3542],
        [249.9145, 253.2180, 248.7569,  ..., 239.2458, 248.9812, 255.1312],
        [260.3334, 258.7924, 251.2981,  ..., 243.6195, 261.1603, 259.9942]])

GPU output is tensor([[262.9389, 262.1331, 258.2027,  ..., 247.3848, 255.4585, 260.5882],
        [255.5988, 255.1809, 252.4709,  ..., 242.8665, 251.2314, 255.8275],
        [254.6066, 250.8993, 247.8918,  ..., 235.8824, 247.9835, 249.4793],
        ...,
        [255.2737, 255.6240, 245.6240,  ..., 237.2198, 255.6026, 252.3545],
        [249.9145, 253.2181, 248.
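
The two printed matrices above agree only to about four decimal places (for example `258.2029` versus `258.2027`), which is expected when float32 sums are accumulated in a different order on each device. A minimal sketch of a tolerance-based comparison and of the speed-up arithmetic follows. The small matrices, the float64 stand-in for a second device, the tolerances, and the helper name `percent_faster` are all assumptions made for illustration, not part of the original solution.

```python
import torch

def percent_faster(slow_ms: float, fast_ms: float) -> float:
    """Hypothetical helper: how much faster the second timing is, in percent of it."""
    return (slow_ms - fast_ms) / fast_ms * 100.0

# Timings copied from the printed output above (milliseconds)
cpu_ms, gpu_ms = 2539.18212890625, 97.07884979248047
speedup_percent = percent_faster(cpu_ms, gpu_ms)
speedup_ratio = cpu_ms / gpu_ms
print(f"GPU is faster by {speedup_percent:.1f}% ({speedup_ratio:.1f}x)")

# Small stand-in matrices so the sketch also runs without a GPU
torch.manual_seed(0)
A = torch.rand(200, 100)
B = torch.rand(20, 100)

C_ref = B @ A.T
# Assumption: a float64 product plays the role of the "other device" result
C_other = (B.double() @ A.double().T).float()

# Exact equality is too strict for float32; compare within a tolerance instead
same = torch.allclose(C_ref, C_other, rtol=1e-4, atol=1e-4)
print(f"Results match within tolerance: {same}")
```

With real CPU and GPU results, the same check would be `torch.allclose(CPU_output, GPU_output.cpu(), rtol=1e-4, atol=1e-4)`, where the tolerances are a judgment call rather than fixed values.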

## Question 2 - grad


Find the gradient (partial derivatives) of the function $g(w)$ below. 

Let  $w=[w_1,w_2]^T$

Consider  $g(w)=2w_1w_2+w_2\cos(w_1)$

a) In PyTorch, compute:   $\nabla g(w)$ 

 and verify that $\nabla g([\pi,1])=[2,2\pi-1]^T$ using the grad function, whereby the first position is the partial for $w_1$ and the second position is the partial for $w_2$. 

b) You can also write a function that computes these partial derivatives by hand. You can review the relevant differentiation rules [here](https://www.wolframalpha.com/input/?i=derivative+y+cos%28x%29) and implement this as a second function below to verify that it gives the same result. 
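
For reference, a worked derivation using the product rule and $\frac{d}{dx}\cos x = -\sin x$, with the same notation as the question above:

$$
\frac{\partial g}{\partial w_1} = 2w_2 - w_2\sin(w_1),
\qquad
\frac{\partial g}{\partial w_2} = 2w_1 + \cos(w_1)
$$

Evaluating at $w = [\pi, 1]^T$, where $\sin(\pi) = 0$ and $\cos(\pi) = -1$:

$$
\frac{\partial g}{\partial w_1}\bigg|_{[\pi,1]} = 2\cdot 1 - 1\cdot 0 = 2,
\qquad
\frac{\partial g}{\partial w_2}\bigg|_{[\pi,1]} = 2\pi - 1 \approx 5.2832
$$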


In [3]:
# write your solution here

## Part A
def delta_g(w):
  w1 = torch.tensor(w[0], requires_grad=True)
  w2 = torch.tensor(w[1], requires_grad=True)
  ans = 2*w1*w2 + w2*torch.cos(w1)
  ans.backward()
  return w1.grad, w2.grad


w = [torch.pi, 1.0]
print(f"Partial derivative calculated using torch: {delta_g(w)}\n")

## Part B
def delta_w1(w):
  # dg/dw1 = 2*w2 - w2*sin(w1); no autograd is needed for the manual formula
  w1 = torch.tensor(w[0])
  w2 = torch.tensor(w[1])
  d_w1 = 2*w2 - w2*torch.sin(w1)
  return d_w1.item()

def delta_w2(w):
  # dg/dw2 = 2*w1 + cos(w1); w2 does not appear in this partial derivative
  w1 = torch.tensor(w[0])
  d_w2 = 2*w1 + torch.cos(w1)
  return d_w2.item()

print(f"Partial derivative of w1: {delta_w1(w)}")
print(f"Partial derivative of w2: {delta_w2(w)}")

Partial derivative calculated using torch: (tensor(2.), tensor(5.2832))

Partial derivative of w1: 2.0
Partial derivative of w2: 5.2831854820251465
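
The solution above obtains the gradient through `backward()` on two separate scalars. Since the question mentions "the grad function", a possible alternative is `torch.autograd.grad` on a single vector $w$. This is only a sketch: the function name `g` and the choice of float64 are assumptions made here for illustration.

```python
import math
import torch

def g(w):
    # g(w) = 2*w1*w2 + w2*cos(w1), with w = [w1, w2]
    return 2 * w[0] * w[1] + w[1] * torch.cos(w[0])

# float64 keeps the rounding error of sin(pi) far below the comparison tolerance
w = torch.tensor([math.pi, 1.0], dtype=torch.float64, requires_grad=True)

# torch.autograd.grad returns a tuple with one gradient per input tensor
(grad_w,) = torch.autograd.grad(g(w), w)

expected = torch.tensor([2.0, 2 * math.pi - 1], dtype=torch.float64)
print(grad_w)
print(f"Matches [2, 2*pi - 1]: {torch.allclose(grad_w, expected)}")
```

Unlike `backward()`, this call returns the gradient directly instead of accumulating it in `w.grad`.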


## Question 3 - dance hit song prediction

Implement logistic regression in PyTorch for the following dance hit song prediction training dataset: 
https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

 * Input variables: a number of audio features (most already standardized so don't worry about that)
 * Target variable: Topclass1030: 
   * 1 means it was a top 10 hit song; 
   * 0 means it never went above top 30 position.

This dataset is derived from my paper on dance hit song prediction; for a full description of the features, see https://arxiv.org/abs/1905.08076. 

Print the evolution of the loss every few epochs and train the model until it converges. 
 
 After training the logistic regression model, calculate the prediction accuracy on the test set: 
 https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv
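
As background for the task above: logistic regression passes a linear score through a sigmoid and is usually trained with binary cross-entropy. The sketch below uses synthetic data (8 made-up examples with 5 features, not the hit-song dataset) to show that `nn.BCELoss` on sigmoid outputs, `nn.BCEWithLogitsLoss` on raw scores, and the formula written out by hand agree. All variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 5)                    # synthetic inputs (made-up sizes)
y = torch.randint(0, 2, (8,)).float()    # synthetic binary targets

linear = nn.Linear(5, 1)
logits = linear(x).flatten()             # raw scores, one per example
probs = torch.sigmoid(logits)            # probabilities in (0, 1)

# Option 1: sigmoid in the model, BCELoss on probabilities
loss_probs = nn.BCELoss()(probs, y)

# Option 2: no sigmoid in the model, BCEWithLogitsLoss on raw scores
loss_logits = nn.BCEWithLogitsLoss()(logits, y)

# Binary cross-entropy written out, averaged over the batch
loss_manual = -(y * torch.log(probs) + (1 - y) * torch.log(1 - probs)).mean()

print(loss_probs.item(), loss_logits.item(), loss_manual.item())
```

Either option can be used as long as the sigmoid is applied exactly once, in the model or inside the loss.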








In [4]:
# Your code here
import torch.nn as nn
import torch.nn.functional as F
import pandas as pd 
from torch.utils.data import Dataset, DataLoader

# load data

!wget "https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv"

csv_file = "herremans_hit_1030training.csv"

csv = pd.read_csv(csv_file)

--2022-06-16 19:43:23--  https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv
Resolving dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)... 52.219.64.38
Connecting to dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)|52.219.64.38|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 147372 (144K) [text/csv]
Saving to: ‘herremans_hit_1030training.csv’


2022-06-16 19:43:25 (206 KB/s) - ‘herremans_hit_1030training.csv’ saved [147372/147372]



In [5]:
# load data

csv_file = "herremans_hit_1030training.csv"

class danceSongData(Dataset):
  def __init__(self, csv_file):
    self.csv_file = pd.read_csv(csv_file)
    self.x = torch.tensor(self.csv_file.iloc[:,:-1].values, dtype=torch.float32)
    self.y = torch.tensor(self.csv_file.iloc[:,-1].values, dtype=torch.float32)


  def __len__(self):
    return self.y.shape[0]


  def __getitem__(self, idx):
    return self.x[idx], self.y[idx]


  def num_feature(self):
    return self.x.shape[1]


class Trainer():
  def __init__(self, model, loss_fn, optimiser, config):
    self.model = model
    self.loss_fn = loss_fn
    self.optimiser = optimiser
    self.loss = 0
    self.epochs = config["EPOCHS"]
    self.batch_size = config["BATCH_SIZE"]
  

  def train(self, train_dataloader):
    for epoch in range(self.epochs):
      self._epoch_train(train_dataloader)
      print(f"Epoch: {epoch+1}/{self.epochs}, Loss={self.loss}")


  def _epoch_train(self, dataloader):
    # Running sum of the per-batch mean losses for this epoch
    self.loss = 0
    for x, y in dataloader:
      self.optimiser.zero_grad()
      prediction = self.model(x).flatten()

      # The loss function averages over the batch by default,
      # so no per-sample loop is needed
      loss = self.loss_fn(prediction, y)

      loss.backward()
      self.optimiser.step()

      # .item() stores a plain float so the autograd graph is not kept alive
      self.loss += loss.item()


# define logistic regression model

class LogisticRegression(nn.Module):
  # input_size: Dimensionality of input feature vector.
  # num_classes: The number of classes in the classification problem.
  def __init__(self, input_size, num_classes):
    # Always call the superclass (nn.Module) constructor first!
    super(LogisticRegression, self).__init__()
    # Set up the linear transform
    self.linear = nn.Linear(input_size, num_classes)
    # The sigmoid is applied in forward() because the model is trained
    # with nn.BCELoss, which expects probabilities rather than raw scores.

  # Forward's sole argument is the input.
  # input is of shape (batch_size, input_size)
  def forward(self, x):
    # Apply the linear transform.
    # out is of shape (batch_size, num_classes). 
    out = self.linear(x)
    # Squash each score into a probability in (0, 1).
    out = torch.sigmoid(out)
    return out

# train model
config = {
    "DEVICE": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    "BATCH_SIZE": 64,
    "EPOCHS": 500,
    "LR": 1e-3,
}

train_dataset = danceSongData(csv_file)

input_size = train_dataset.num_feature()
print(f"input_size={input_size}")
num_classes = 1
model = LogisticRegression(input_size, num_classes)

loss_fn = torch.nn.BCELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=config["LR"])

train_dataloader = DataLoader(
    dataset=train_dataset,
    batch_size=config["BATCH_SIZE"],
    shuffle=True,
    drop_last=True,
    num_workers=4
  )

trainer = Trainer(
    model=model,
    loss_fn=loss_fn,
    optimiser=optimiser,
    config=config
)

trainer.train(train_dataloader)



input_size=49




Epoch: 1/500, Loss=3.4753434658050537
Epoch: 2/500, Loss=3.436222553253174
Epoch: 3/500, Loss=3.393145799636841
Epoch: 4/500, Loss=3.360952377319336
Epoch: 5/500, Loss=3.329829692840576
Epoch: 6/500, Loss=3.2979483604431152
Epoch: 7/500, Loss=3.2622547149658203
Epoch: 8/500, Loss=3.2338969707489014
Epoch: 9/500, Loss=3.2153375148773193
Epoch: 10/500, Loss=3.198021173477173
Epoch: 11/500, Loss=3.1746726036071777
Epoch: 12/500, Loss=3.1563801765441895
Epoch: 13/500, Loss=3.131575107574463
Epoch: 14/500, Loss=3.114480495452881
Epoch: 15/500, Loss=3.0878734588623047
Epoch: 16/500, Loss=3.087918996810913
Epoch: 17/500, Loss=3.072754383087158
Epoch: 18/500, Loss=3.0463550090789795
Epoch: 19/500, Loss=3.0456206798553467
Epoch: 20/500, Loss=3.023674964904785
Epoch: 21/500, Loss=3.0187270641326904
Epoch: 22/500, Loss=3.0100486278533936
Epoch: 23/500, Loss=2.989758253097534
Epoch: 24/500, Loss=2.9904823303222656
Epoch: 25/500, Loss=2.9774436950683594
Epoch: 26/500, Loss=2.9561262130737305
Epoch:

Run the code below to test the accuracy of your model on the test set: 

In [6]:
!wget "https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv"

--2022-06-16 19:45:42--  https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv
Resolving dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)... 52.219.62.103
Connecting to dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)|52.219.62.103|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36712 (36K) [text/csv]
Saving to: ‘herremans_hit_1030test.csv’


2022-06-16 19:45:43 (154 KB/s) - ‘herremans_hit_1030test.csv’ saved [36712/36712]



In [7]:
import pandas as pd 

test = pd.read_csv('/content/herremans_hit_1030test.csv')
labels = test.iloc[:,-1]
test = test.drop('Topclass1030', axis=1)
testdata = torch.Tensor(test.values)
testlabels = torch.Tensor(labels.values).view(-1,1)

TP = 0
TN = 0
FN = 0
FP = 0

for i in range(0, testdata.size()[0]): 
  # print(testdata[i].size())
  Xtest = torch.Tensor(testdata[i])
  y_hat = model(Xtest)
  
  if y_hat > 0.5:
    prediction = 1
  else: 
    prediction = 0

  if (prediction == testlabels[i]):
    if (prediction == 1):
      TP += 1
    else: 
      TN += 1

  else:
    if (prediction == 1):
      FP += 1
    else: 
      FN += 1

print("True Positives: {0}, True Negatives: {1}".format(TP, TN))
print("False Positives: {0}, False Negatives: {1}".format(FP, FN))
rate = TP/(FN+TP)
print("Class specific accuracy of correctly predicting a hit song is {0}".format(rate))

True Positives: 43, True Negatives: 17
False Positives: 12, False Negatives: 7
Class specific accuracy of correctly predicting a hit song is 0.86
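
The value printed above is the fraction of actual hits that were predicted as hits (TP / (TP + FN)), often called recall. The question also asks for the overall prediction accuracy on the test set, which can be derived from the same four counts. A minimal sketch, with the counts copied from the output above and the variable names chosen here for illustration:

```python
# Confusion-matrix counts copied from the printed output above
TP, TN, FP, FN = 43, 17, 12, 7

total = TP + TN + FP + FN          # 79 test songs
accuracy = (TP + TN) / total       # 60 / 79, share of all songs classified correctly
recall = TP / (TP + FN)            # 43 / 50, share of actual hits that were found
precision = TP / (TP + FP)         # 43 / 55, share of predicted hits that were hits

print(f"Accuracy:  {accuracy:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"Precision: {precision:.4f}")
```

Reporting accuracy alongside recall matters here because the test set is unbalanced: 50 of the 79 songs are hits.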
