# PyTorch - homework 1


Please run the whole notebook with your code and submit the `.ipynb` file that includes your answers. 

In [None]:
from termcolor import colored

student_number="1004570"
student_name="Zhihan Yap"

print(colored("Homework by "  + student_name + ', number: ' + student_number,'red'))

Homework by Zhihan Yap, number: 1004570


## Question 1 - matrix multiplication

Implement the following mathematical operation on both the CPU and GPU (use Google Colab or another cloud service if you don't have a GPU in your computer). Print:

a) which type of GPU card you have and 

b) show the computation time for both CPU and GPU (using PyTorch). 

c) By what percentage is the GPU faster than the CPU? 

The operation to implement is the matrix product $C = B A^T$,

where $A$ is a random matrix of size $20{,}000 \times 1{,}000$ and $B$ is a random matrix of size $2{,}000 \times 1{,}000$. In addition to the information requested above:
 
 d) please also print the resulting two $C$ matrices (they should be the same btw). 
 



In [None]:
# implement solution here

import torch
import numpy as np
import time

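# A^T is created directly with shape (1000, 20000), i.e. the transpose of a 20000 x 1000 matrix
A_t = torch.rand(1000, 20000)
# B has shape (2000, 1000), so B @ A_t has shape (2000, 20000)
B = torch.rand(2000, 1000)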



In [None]:
# CPU

start_time = time.monotonic()

C = B @ A_t

end_time = time.monotonic()

print(C)
print(f"CPU: time elapsed = {end_time - start_time}s")

tensor([[260.2056, 246.7230, 252.6091,  ..., 254.3226, 249.2874, 248.9626],
        [259.4456, 242.5540, 252.3417,  ..., 250.8234, 256.4850, 249.8223],
        [253.2127, 239.0423, 241.5245,  ..., 247.8385, 248.0563, 249.0806],
        ...,
        [255.0454, 244.5129, 240.5248,  ..., 247.6823, 252.1554, 246.8359],
        [257.3079, 248.2373, 254.8701,  ..., 254.1974, 257.1751, 258.9467],
        [258.8436, 245.8529, 249.1706,  ..., 246.5582, 248.1608, 249.1446]])
CPU: time elapsed = 0.985332881999966


In [None]:
# GPU
if torch.cuda.is_available():
  print(f"GPU: {torch.cuda.get_device_name()}")

  A_t_gpu = A_t.cuda()
  B_gpu = B.cuda()

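  # Wait for the host-to-device copies to finish before starting the clock
  torch.cuda.synchronize()
  start_time = time.monotonic()

  # Multiply the copies that live on the GPU
  C_gpu = B_gpu @ A_t_gpu
  # CUDA kernels run asynchronously, so wait for the result before stopping the clock
  torch.cuda.synchronize()

  end_time = time.monotonic()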

  print(C_gpu)
  print(f"GPU: time elapsed = {end_time - start_time}s")

GPU: Tesla T4
tensor([[260.2056, 246.7230, 252.6091,  ..., 254.3226, 249.2874, 248.9626],
        [259.4456, 242.5540, 252.3417,  ..., 250.8234, 256.4850, 249.8223],
        [253.2127, 239.0423, 241.5245,  ..., 247.8385, 248.0563, 249.0806],
        ...,
        [255.0454, 244.5129, 240.5248,  ..., 247.6823, 252.1554, 246.8359],
        [257.3079, 248.2373, 254.8701,  ..., 254.1974, 257.1751, 258.9467],
        [258.8436, 245.8529, 249.1706,  ..., 246.5582, 248.1608, 249.1446]])
GPU: time elapsed = 1.0105033419999927
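
For part c), one way to get a fair comparison is to time both devices with the same helper and to synchronize the GPU before reading the clock, since CUDA kernels run asynchronously. The sketch below is an illustration only: `time_matmul` and `percent_faster` are hypothetical helper names, the matrices are smaller than in the assignment to keep it quick, and "percent faster" is taken to mean $(t_{CPU}/t_{GPU} - 1) \times 100$, which is one possible reading of the question.

```python
import time
import torch

def time_matmul(b, a_t, repeats=3):
    """Return the best wall-clock time (seconds) of b @ a_t over a few runs."""
    use_cuda = b.is_cuda
    best = float("inf")
    for _ in range(repeats):
        if use_cuda:
            torch.cuda.synchronize()  # finish pending GPU work before starting the clock
        start = time.perf_counter()
        _ = b @ a_t
        if use_cuda:
            torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
        best = min(best, time.perf_counter() - start)
    return best

def percent_faster(t_slow, t_fast):
    """Speed-up of the fast run relative to the slow run, in percent."""
    return (t_slow / t_fast - 1.0) * 100.0

# Smaller shapes than the assignment, chosen only to keep the sketch quick
a_t = torch.rand(100, 2000)
b = torch.rand(200, 100)

t_cpu = time_matmul(b, a_t)
print(f"CPU: {t_cpu:.6f}s")

if torch.cuda.is_available():
    b_gpu, a_t_gpu = b.cuda(), a_t.cuda()
    t_gpu = time_matmul(b_gpu, a_t_gpu)
    print(f"GPU: {t_gpu:.6f}s, {percent_faster(t_cpu, t_gpu):.1f}% faster than CPU")
    # The two results should agree up to floating-point rounding
    print(torch.allclose(b @ a_t, (b_gpu @ a_t_gpu).cpu(), rtol=1e-4, atol=1e-4))
```

Taking the best of several runs reduces the effect of one-off costs such as CUDA initialization on the first call.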


## Question 2 - grad


Find the gradient (partial derivatives) of the function $g(w)$ below. 

Let  $w=[w_1,w_2]^T$

Consider  $g(w)=2w_1w_2+w_2\cos(w_1)$

a) In PyTorch, compute $\nabla g(w)$ and verify that $\nabla g([\pi,1])=[2,2\pi-1]^T$ using the grad function, where the first entry is the partial derivative with respect to $w_1$ and the second is the partial derivative with respect to $w_2$. 

b) You can also write a function that calculates these partial derivatives manually. You can review the relevant derivative rules [here](https://www.wolframalpha.com/input/?i=derivative+y+cos%28x%29) and implement this as a second function below to verify that it gives the same result. 


In [42]:
# write your solution here
import torch
import numpy as np
import math

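def g(w_t):
  # torch.cos keeps the cosine term in the autograd graph; math.cos would
  # convert w_t[0] to a plain float and drop its contribution to the gradient
  out = 2*w_t[0]*w_t[1] + w_t[1]*torch.cos(w_t[0])
  return out

w = torch.tensor([math.pi, 1.0], requires_grad=True)
g_out = g(w)
g_out.backward()
# Compare with a tolerance rather than exact floating-point equality
print(torch.isclose(w.grad, torch.tensor([2, 2*math.pi-1])))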



tensor([True, True])
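
The question mentions "the grad function". Besides calling `.backward()` and reading `w.grad`, PyTorch also provides `torch.autograd.grad`, which returns the gradient directly. A minimal sketch, with $g$ written out again using `torch.cos` so that the block is self-contained and every term is tracked by autograd:

```python
import math
import torch

def g(w):
    # g(w) = 2*w1*w2 + w2*cos(w1), built from torch ops so autograd can track it
    return 2 * w[0] * w[1] + w[1] * torch.cos(w[0])

w = torch.tensor([math.pi, 1.0], requires_grad=True)

# torch.autograd.grad returns a tuple with one gradient per input tensor
(grad_w,) = torch.autograd.grad(g(w), w)

expected = torch.tensor([2.0, 2 * math.pi - 1])
print(grad_w)
print(torch.allclose(grad_w, expected))
```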


In [44]:
def g_grad(w1, w2):
  w_1_grad = 2*w2 - w2*math.sin(w1)
  w_2_grad = 2*w1 + math.cos(w1)
  return [w_1_grad, w_2_grad]

print(torch.tensor(g_grad(math.pi, 1.0)))
# Compare with a tolerance rather than exact floating-point equality
print(torch.isclose(w.grad, torch.tensor(g_grad(math.pi, 1.0))))

tensor([2.0000, 5.2832])
tensor([True, True])
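
For reference, the partial derivatives implemented in `g_grad` follow from differentiating $g$ term by term and then substituting $w_1=\pi$, $w_2=1$:

```latex
\begin{aligned}
\frac{\partial g}{\partial w_1} &= 2w_2 - w_2\sin(w_1) \\
\frac{\partial g}{\partial w_2} &= 2w_1 + \cos(w_1) \\
\nabla g([\pi, 1]) &= [\,2\cdot 1 - 1\cdot\sin(\pi),\; 2\pi + \cos(\pi)\,]^T = [\,2,\; 2\pi - 1\,]^T
\end{aligned}
```

Numerically, $2\pi - 1 \approx 5.2832$, which matches the second entry printed above.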


## Question 3 - dance hit song prediction

Implement logistic regression in PyTorch for the following dance hit song prediction training dataset: 
https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

 * Input variables: a number of audio features (most already standardized so don't worry about that)
 * Target variable: Topclass1030: 
   * 1 means it was a top 10 hit song; 
   * 0 means it never went above top 30 position.

This dataset is derived from my paper on dance hit song prediction; for a full description of the features, see https://arxiv.org/abs/1905.08076. 

Print the evolution of the loss every few epochs and train the model until it converges. 
 
 After training the logistic regression model, calculate the prediction accuracy on the test set: 
 https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv








In [17]:
# Your code here
import pandas as pd
import torch.nn as nn
import torch
import numpy as np
# load data
!wget https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

data = pd.read_csv("/content/herremans_hit_1030training.csv")


--2022-06-24 06:14:48--  https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv
Resolving dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)... 52.219.66.103
Connecting to dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)|52.219.66.103|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 147372 (144K) [text/csv]
Saving to: ‘herremans_hit_1030training.csv.1’


2022-06-24 06:14:49 (206 KB/s) - ‘herremans_hit_1030training.csv.1’ saved [147372/147372]



In [46]:
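# Labels as a column vector of shape (n_samples, 1)
data_y = torch.tensor(data["Topclass1030"].values).view(-1,1)
# Features: every column except the target (axis must be passed by keyword in recent pandas)
data_x = torch.tensor(data.drop("Topclass1030", axis=1).values)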
num_features = data_x.shape[1]
print(num_features)


49


  


In [27]:
class LogisticRegression(nn.Module):
  # input_size: Dimensionality of input feature vector.
  # num_classes: Number of outputs (1 for binary classification).
  def __init__(self, input_size, num_classes):
    # Always call the superclass (nn.Module) constructor first!
    super(LogisticRegression, self).__init__()
    # Set up the linear transform
    self.linear = nn.Linear(input_size, num_classes)
    # The sigmoid is applied in forward(), so the model outputs probabilities
    # and is paired with nn.BCELoss (which expects probabilities, not logits)

  # Forward's sole argument is the input.
  # input is of shape (batch_size, input_size)
  def forward(self, x):
    # Apply the linear transform.
    # out is of shape (batch_size, num_classes). 
    out = self.linear(x)
    # Squash the logits to probabilities in (0, 1) with a sigmoid
    out = torch.sigmoid(out)
    return out

In [41]:
logreg_model = LogisticRegression(num_features, 1)
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# print(device)
# logreg_model = logreg_model.to(device)

In [42]:
lr_rate = 0.001

# data_x_gpu = data_x.to(device)
# data_y_gpu = data_y.to(device)


loss_function = nn.BCELoss() 
# SGD: stochastic gradient descent is used to train/fit the model
optimizer = torch.optim.SGD(logreg_model.parameters(), lr=lr_rate)

In [43]:
epochs = 100
step = data_x.shape[0]

for i in range(epochs):
  for j in range(step):
    # randomly sample one example (with replacement)
    index = np.random.randint(data_x.shape[0])
    # data_x and data_y are already tensors, so index them and cast to float
    x_var = data_x[index].float()
    y_var = data_y[index].float()

    optimizer.zero_grad()
    y_hat = logreg_model(x_var)

    loss = loss_function(y_hat, y_var) 
    loss.backward()
    optimizer.step()

  if i % 10 == 0:
    # loss is the loss of the last sampled example, not an average over the epoch
    print(f"epoch {i} complete, loss = {loss.item()}")


epoch 0 complete, loss = 0.5311824083328247
epoch 10 complete, loss = 1.346374750137329
epoch 20 complete, loss = 0.6477197408676147
epoch 30 complete, loss = 0.22680331766605377
epoch 40 complete, loss = 0.27648574113845825
epoch 50 complete, loss = 0.16359376907348633
epoch 60 complete, loss = 1.5551060438156128
epoch 70 complete, loss = 0.1125195249915123
epoch 80 complete, loss = 0.21069541573524475
epoch 90 complete, loss = 0.2937425673007965
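
The loss printed above is that of the last randomly sampled example in each epoch, which is why it jumps around rather than decreasing smoothly. To judge convergence it can help to report the mean loss over the whole training set instead. The sketch below illustrates this on synthetic data (the feature matrix, labels, learning rate, and epoch count are made-up stand-ins, not the hit song dataset). Note that it swaps in full-batch gradient descent and `nn.BCEWithLogitsLoss`, which applies the sigmoid inside the loss, in place of the per-example updates and `nn.BCELoss` used in this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: 200 examples, 5 features, labels from a noisy linear rule
n_samples, n_features = 200, 5
X = torch.randn(n_samples, n_features)
true_w = torch.randn(n_features, 1)
y = ((X @ true_w + 0.1 * torch.randn(n_samples, 1)) > 0).float()

model = nn.Linear(n_features, 1)           # outputs logits
loss_function = nn.BCEWithLogitsLoss()     # sigmoid + binary cross-entropy in one step
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

history = []
for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_function(model(X), y)      # mean loss over the full training set
    loss.backward()
    optimizer.step()
    history.append(loss.item())
    if epoch % 10 == 0:
        print(f"epoch {epoch} complete, mean loss = {loss.item():.4f}")

# Training accuracy: a logit above 0 corresponds to a probability above 0.5
with torch.no_grad():
    accuracy = ((model(X) > 0).float() == y).float().mean().item()
print(f"training accuracy = {accuracy:.3f}")
```

Because the loss is averaged over all examples, the printed values form a smooth curve, and training can be stopped once successive values differ by less than a chosen tolerance.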


Run the code below to test the accuracy of your model on the test set: 

In [44]:
!wget https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv

--2022-06-24 06:47:32--  https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv
Resolving dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)... 52.219.158.202
Connecting to dorax.s3.ap-south-1.amazonaws.com (dorax.s3.ap-south-1.amazonaws.com)|52.219.158.202|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36712 (36K) [text/csv]
Saving to: ‘herremans_hit_1030test.csv’


2022-06-24 06:47:33 (155 KB/s) - ‘herremans_hit_1030test.csv’ saved [36712/36712]



In [45]:
import pandas as pd


test = pd.read_csv('/content/herremans_hit_1030test.csv')
labels = test.iloc[:,-1]
test = test.drop('Topclass1030', axis=1)
testdata = torch.Tensor(test.values)
testlabels = torch.Tensor(labels.values).view(-1,1)

TP = 0
TN = 0
FN = 0
FP = 0

for i in range(0, testdata.size()[0]): 
  # print(testdata[i].size())
  Xtest = torch.Tensor(testdata[i])
  y_hat = logreg_model(Xtest)
  
  if y_hat > 0.5:
    prediction = 1
  else: 
    prediction = 0

  if (prediction == testlabels[i]):
    if (prediction == 1):
      TP += 1
    else: 
      TN += 1

  else:
    if (prediction == 1):
      FP += 1
    else: 
      FN += 1

print("True Positives: {0}, True Negatives: {1}".format(TP, TN))
print("False Positives: {0}, False Negatives: {1}".format(FP, FN))
# TP/(TP+FN) is the recall (true positive rate) for the hit class
rate = TP/(FN+TP)
print("Class specific accuracy of correctly predicting a hit song is {0}".format(rate))

True Positives: 43, True Negatives: 15
False Positives: 14, False Negatives: 7
Class specific accuracy of correctly predicting a hit song is 0.86
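
The value printed last is the recall for the hit class, TP / (TP + FN). The overall prediction accuracy asked for in the question follows from the same four counts. A small sketch using the counts printed above (the helper name `classification_metrics` is made up for illustration):

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision and recall from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total   # share of all songs classified correctly
    precision = tp / (tp + fp)     # share of predicted hits that are real hits
    recall = tp / (tp + fn)        # share of real hits that were found
    return accuracy, precision, recall

# Counts taken from the output above
accuracy, precision, recall = classification_metrics(tp=43, tn=15, fp=14, fn=7)
print(f"accuracy = {accuracy:.4f}, precision = {precision:.4f}, recall = {recall:.4f}")
# accuracy = 0.7342, precision = 0.7544, recall = 0.8600
```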
