# PyTorch - homework 1


Please run the whole notebook with your code and submit the `.ipynb` file that includes your answers. 

In [1]:
from termcolor import colored

student_number="1002911"
student_name="Calvin Yusnoveri"

print(colored("Homework by "  + student_name + ', number: ' + student_number,'red'))

[31mHomework by Calvin Yusnoveri, number: 1002911[0m


 ## Question 1 -- matrix multiplication

Implement the following mathematical operation on both the CPU and GPU (use Google Colab or another cloud service if you don't have a GPU in your computer). Print:

a) which type of GPU card you have and 

In [2]:
# implement solution here
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device_name = torch.cuda.get_device_name(device)
print(f"Q1. A) Device: {device_name} -> {device}")

Q1. A) Device: NVIDIA GeForce GTX 1060 -> cuda:0


b) show the computation time for both CPU and GPU (using PyTorch). 

c) How much % fast is the GPU? 

 The operation to implement is the dot product $C = B * A^T$

 whereby $A$ is a random matrix of size $20,000 \times 1000$ and $B$ is a random matrix of size $2000 \times 1000$. In addition to the required information asked above:
 
 d) please also print the resulting two $C$ matrices (they should be the same btw).

In [3]:
def calculate_dot(A, B, device):
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    
    B = B.to(device)
    A = torch.transpose(A, 0, 1).to(device)
    
    # only compute the matrix multiplication time
    start.record()
    C = torch.matmul(B, A)
    end.record()
    torch.cuda.synchronize()
    
    time = start.elapsed_time(end)
    return C, time

In [4]:
A = torch.rand((20000, 1000))
B = torch.rand((2000, 1000))

print('Q1 B) & D)')
device = 'cuda'
C_gpu, gpu_time = calculate_dot(A, B, device)
print(f'GPU compute time: {gpu_time} Result: \n {C_gpu}' )

device = 'cpu'
C_cpu, cpu_time = calculate_dot(A, B, device)
print(f'CPU compute time: {cpu_time} Result: \n {C_cpu}')

print(f'Q1 C) { ((gpu_time / cpu_time) -1) * 100} %') # this is because cpu_time > gpu_time! flip the ratio otherwise!
print('It seems the computation (although large), is not complex enough to benefit much from my GPU.')

Q1 B) & D)
GPU compute time: 241.7407684326172 Result: 
 tensor([[254.5484, 243.0836, 244.2404,  ..., 252.3349, 236.7377, 246.6936],
        [261.7545, 245.7598, 248.8301,  ..., 259.2577, 247.2546, 254.6792],
        [256.4694, 251.5695, 249.8525,  ..., 262.8980, 240.3078, 248.4840],
        ...,
        [250.9371, 248.6571, 245.4411,  ..., 259.8455, 237.5949, 244.0561],
        [259.5616, 250.7322, 246.5741,  ..., 255.1424, 239.9357, 252.3823],
        [256.1631, 245.3425, 248.6073,  ..., 260.5112, 242.2224, 248.9274]],
       device='cuda:0')
CPU compute time: 0.0010239999974146485 Result: 
 tensor([[254.5484, 243.0835, 244.2405,  ..., 252.3351, 236.7378, 246.6934],
        [261.7546, 245.7597, 248.8299,  ..., 259.2576, 247.2546, 254.6791],
        [256.4693, 251.5696, 249.8525,  ..., 262.8979, 240.3079, 248.4841],
        ...,
        [250.9372, 248.6573, 245.4409,  ..., 259.8455, 237.5949, 244.0561],
        [259.5616, 250.7323, 246.5742,  ..., 255.1427, 239.9358, 252.3823],
      

## Question 2 - grad


Find the gradient (partial derivatives) of the function $g(w)$ below. 

Let  $w=[w_1,w_2]^T$

Consider  $g(w)=2w_1w_2+w_2cos(w_1)$

a) In PyTorch, compute:   $\nabla g(w)$ 

 and verify that $\nabla g([\pi,1])=[2,2\piâˆ’1]^T$ using the grad function, whereby the first position is the partial for $w_1$ and the second position is the partial for $w_2$. 

b) You can also write a function to manually calculate these partial derivatives! You can review your differential equations math at [here](https://www.wolframalpha.com/input/?i=derivative+y+cos%28x%29) and implement this is a second function below to verify that it comes to the same solution. 


In [5]:
# write your solution here

import numpy as np

W = [[np.pi], [1]]
w = torch.tensor(W, requires_grad=True)
g = 2 * w[0] * w[1]  + w[1] * torch.cos(w[0])
dg = torch.autograd.grad(g, w)[0]
print(f'Q2 A) Autograd: {dg}')

dw1 = 2 * w[1] - torch.sin(w[0]) * w[1]
dw2 = 2 * w[0] + torch.cos(w[0])

dg_manual = torch.Tensor([[dw1], [dw2]])
print(f'Q2 B) Manual: {dg_manual}')

print(f'Answers are the same? {torch.eq(dg, dg_manual)} -> should be [[2.0000], [5.2832]]')

Q2 A) Autograd: tensor([[2.0000],
        [5.2832]])
Q2 B) Manual: tensor([[2.0000],
        [5.2832]])
Answers are the same? tensor([[True],
        [True]]) -> should be [[2.0000], [5.2832]]


## Question 3 - dance hit song prediction

Implement logistic regression in PyTorch for the following dance hit song prediction training dataset: 
https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

 * Input variables: a number of audio features (most already standardized so don't worry about that)
 * Target variable: Topclass1030: 
   * 1 means it was a top 10 hit song; 
   * 0 means it never went above top 30 position.

This dataset is derived from my paper on dance hit song prediction, for full description of features have a look at https://arxiv.org/abs/1905.08076. 

Print the evolution of the loss every few epochs and train the model until it converges. 
 
 After training the logistic regression model, calculate the prediction accuracy on the test set: 
 https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv








In [6]:
# Your code here

# load data

import pandas as pd 

# assuming the data is downloaded and saved in the same dir -> ./content
train_data = './content/herremans_hit_1030training.csv'
test_data = './content/herremans_hit_1030test.csv'
train_data = pd.read_csv(train_data)
test_data = pd.read_csv(test_data)
print(f"Data loaded. Train data shape: {train_data.shape}, Test data shape: {test_data.shape}")
# print(train_data.head(50))

# define logistic regression model
import torch.nn as nn
import torch.nn.functional as F

class LogisticRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, num_classes)
    def forward(self, x):
        out = self.linear(x)
        out = torch.sigmoid(out)
        return out

# train model
device = 'cuda'
epochs = 5000 + 1

num_out = 1
num_inp = 49 # 50 total dimensions (num of columns of dataset)

lr_rate = 0.001
loss_function = nn.BCELoss()
logreg_clf = LogisticRegression(num_inp, num_out).to(device)
optimizer = torch.optim.SGD(logreg_clf.parameters(), lr=lr_rate)

for epoch in range(epochs):
    features = torch.FloatTensor(train_data.loc[:, train_data.columns != 'Topclass1030'].values).to(device)
    target = torch.FloatTensor(train_data['Topclass1030']).to(device)
    
    optimizer.zero_grad()
    
#     print(prediction.size())
#     print(target.view(-1,1).size())
    
    prediction = logreg_clf(features)
    loss = loss_function(prediction, target.view(-1,1)).to(device)
    loss.backward()
    optimizer.step()
    
    if epoch % 50 == 0: # print every 50 epochs
        print (f"Epoch: {epoch}, Loss: {loss}")

Data loaded. Train data shape: (321, 50), Test data shape: (79, 50)
Epoch: 0, Loss: 0.7352050542831421
Epoch: 50, Loss: 0.7049029469490051
Epoch: 100, Loss: 0.6835950613021851
Epoch: 150, Loss: 0.6684138774871826
Epoch: 200, Loss: 0.6573933959007263
Epoch: 250, Loss: 0.649202823638916
Epoch: 300, Loss: 0.6429457664489746
Epoch: 350, Loss: 0.6380181312561035
Epoch: 400, Loss: 0.6340122818946838
Epoch: 450, Loss: 0.630652129650116
Epoch: 500, Loss: 0.6277498602867126
Epoch: 550, Loss: 0.6251769065856934
Epoch: 600, Loss: 0.6228451728820801
Epoch: 650, Loss: 0.620693564414978
Epoch: 700, Loss: 0.6186798214912415
Epoch: 750, Loss: 0.6167743802070618
Epoch: 800, Loss: 0.6149562001228333
Epoch: 850, Loss: 0.6132106184959412
Epoch: 900, Loss: 0.6115269064903259
Epoch: 950, Loss: 0.609897255897522
Epoch: 1000, Loss: 0.6083159446716309
Epoch: 1050, Loss: 0.606778621673584
Epoch: 1100, Loss: 0.6052818894386292
Epoch: 1150, Loss: 0.6038232445716858
Epoch: 1200, Loss: 0.6024004220962524
Epoch: 125

In [7]:
from datetime import datetime
now = datetime.now()
timestamp = now.strftime("%d%m-%H%M")
save_path = f'./logreg-clf-{timestamp}'
torch.save(logreg_clf.state_dict(), save_path) # model is saved in current dir for reproducibility
print(f'Model saved in {save_path}.')

Model saved in ./logreg-clf-1706-1340.


Run the below code to test the accuracy of your model on the training set: 

In [10]:
import pandas as pd 

test = pd.read_csv('./content/herremans_hit_1030test.csv')
labels = test.iloc[:,-1]
test = test.drop('Topclass1030', axis=1)
testdata = torch.Tensor(test.values) 
testlabels = torch.Tensor(labels.values).view(-1,1)

device = 'cuda'
trained_logreg_clf = LogisticRegression(num_inp, num_out).to(device) # reinitialize model
trained_logreg_clf.load_state_dict(torch.load(save_path)) # load trained model
logreg_clf.eval()

TP = 0
TN = 0
FN = 0
FP = 0

for i in range(0, testdata.size()[0]):
  Xtest = torch.Tensor(testdata[i]).to(device)
  y_hat = logreg_clf(Xtest)
  
  if y_hat > 0.5:
    prediction = 1
  else: 
    prediction = 0

  if (prediction == testlabels[i]):
    if (prediction == 1):
      TP += 1
    else: 
      TN += 1

  else:
    if (prediction == 1):
      FP += 1
    else: 
      FN += 1

print("True Positives: {0}, True Negatives: {1}".format(TP, TN))
print("False Positives: {0}, False Negatives: {1}".format(FP, FN))
rate = TP/(FN+TP)
print("Class specific accuracy of correctly predicting a hit song is {0}".format(rate))

True Positives: 45, True Negatives: 13
False Positives: 16, False Negatives: 5
Class specific accuracy of correctly predicting a hit song is 0.9
