# Homework 4 - PyTorch Introduction

Please run the whole notebook with your code and submit the `.ipynb` file that includes your answers. 

## Question 1 - matrix multiplication

Implement the following mathematical operation on both the CPU and the GPU (use Google Colab or another cloud service if you don't have a GPU in your computer). Print:

a) which type of GPU card you have,

b) the computation time for both CPU and GPU (using PyTorch), and

c) how much faster (in %) the GPU is than the CPU.

The operation to implement is the matrix product $C = B A^T$,

where $A$ is a random matrix of size $30{,}000 \times 1000$ and $B$ is a random matrix of size $3000 \times 1000$. In addition to the information requested above:

d) please also print the two resulting $C$ matrices (they should be the same, up to floating-point rounding).
 



In [7]:
# implement solution here
import time
import torch
# a) which type of GPU card you have and
!nvidia-smi
# b) show the computation time for both CPU and GPU (using PyTorch).
A = torch.rand(30000, 1000)
A_T = torch.transpose(A, 0, 1)
B = torch.rand(3000, 1000)

cpu_start = time.time()
C_cpu = torch.mm(B, A_T)
cpu_end = time.time()
cpu_time = cpu_end-cpu_start
print("The computational time for CPU is {}s".format(cpu_time))

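# Copy the operands to the GPU before starting the timer, so that only the
# multiplication itself is measured.
B_gpu, A_T_gpu = B.cuda(), A_T.cuda()
torch.cuda.synchronize()  # wait for the copies to finish
gpu_start = time.time()
C_gpu = torch.mm(B_gpu, A_T_gpu)
torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait for the result
gpu_end = time.time()
gpu_time = gpu_end-gpu_start
print("The computational time for GPU is {}s".format(gpu_time))

# c) How much faster (in %) is the GPU?
print("The GPU is {}% faster than the CPU".format((cpu_time/gpu_time - 1)*100))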

# d) please also print the resulting two  C  matrices (they should be the same btw).
print(C_cpu)
print(C_gpu)

Fri Jun 26 06:03:18 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   65C    P0    31W /  70W |   1609MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
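
The timing comparison asked for in this question can be sketched as below. This is only a minimal sketch, not the reference solution: the helper names `timed_matmul` and `percent_faster` are hypothetical, the matrices are shrunk so the example runs quickly, and the GPU branch is skipped when no CUDA device is available. It assumes that "x% faster" means `(cpu_time / gpu_time - 1) * 100`, and it calls `torch.cuda.synchronize()` because CUDA kernels are launched asynchronously.

```python
import time
import torch


def timed_matmul(B, A, device):
    """Return (C, seconds) for C = B @ A.T computed on `device`."""
    B_dev, A_dev = B.to(device), A.to(device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # make sure the host-to-device copies are done
    start = time.perf_counter()
    C = torch.mm(B_dev, A_dev.t())
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
    return C.cpu(), time.perf_counter() - start


def percent_faster(slow_seconds, fast_seconds):
    """How much faster the second timing is, in percent (100 means twice as fast)."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


torch.manual_seed(0)
# Small shapes keep the sketch quick; the assignment uses 30000 x 1000 and 3000 x 1000.
A = torch.rand(300, 100)
B = torch.rand(30, 100)

C_cpu, cpu_seconds = timed_matmul(B, A, torch.device("cpu"))
print("CPU: {:.6f}s".format(cpu_seconds))

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    timed_matmul(B, A, torch.device("cuda"))  # warm-up run (CUDA initialisation)
    C_gpu, gpu_seconds = timed_matmul(B, A, torch.device("cuda"))
    print("GPU: {:.6f}s".format(gpu_seconds))
    print("GPU is {:.1f}% faster".format(percent_faster(cpu_seconds, gpu_seconds)))
    # The results agree up to floating-point rounding, not bit for bit.
    print(torch.allclose(C_cpu, C_gpu, atol=1e-4))
```

With matrices this small the GPU may well come out slower than the CPU; the speed-up only becomes visible at sizes like those in the assignment.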

## Question 2 - grad


Find the gradient (partial derivatives) of the function $g(w)$ below. 

Let $w=[w_1,w_2]^T$

Consider $g(w)=2w_1w_2+w_2\cos(w_1)$

a) In PyTorch, compute: $\nabla_w g(w)$

and verify that $\nabla_w g([\pi,1])=[2, 2\pi-1]^T$ using the grad function, where the first entry is the partial derivative with respect to $w_1$ and the second entry is the partial derivative with respect to $w_2$.

b) You can also write a function to calculate these partial derivatives manually! You can review the relevant calculus [here](https://www.wolframalpha.com/input/?i=derivative+y+cos%28x%29) and implement this in a second function below to verify that it gives the same result.


In [3]:
# write your solution here
import math
# (a)
def g(w):
    return (2*w[0]*w[1]) + (w[1]*torch.cos(w[0]))

input_z = (torch.tensor(math.pi, requires_grad=True), torch.tensor(1.0, requires_grad=True))
output_z = g(input_z)
output_z.backward()
print("Using Pytorch grad: {}".format([input_z[0].grad.item(), input_z[1].grad.item()]))

# (b)
def manually_calculate(w):
    return (2*w[1] - w[1]*torch.sin(w[0]), 2*w[0] + torch.cos(w[0]))
manual_output_z = manually_calculate(input_z)
print("Manually calculate grad: {}".format([manual_output_z[0].item(), manual_output_z[1].item()]))

Using Pytorch grad: [2.0, 5.2831854820251465]
Manually calculate grad: [2.0, 5.2831854820251465]
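
As a third, independent check, the gradient can be approximated with central finite differences. The partial derivatives are $\partial g/\partial w_1 = 2w_2 - w_2\sin(w_1)$ and $\partial g/\partial w_2 = 2w_1 + \cos(w_1)$, which at $w=[\pi,1]^T$ give $[2, 2\pi-1]^T \approx [2, 5.2832]^T$. The sketch below uses only the standard library; the helper name `numerical_gradient` and the step size `h` are illustrative choices, not part of the assignment.

```python
import math


def g(w1, w2):
    # Same function as in the question: g(w) = 2*w1*w2 + w2*cos(w1)
    return 2 * w1 * w2 + w2 * math.cos(w1)


def numerical_gradient(f, w1, w2, h=1e-6):
    """Central-difference approximation of (df/dw1, df/dw2)."""
    d_w1 = (f(w1 + h, w2) - f(w1 - h, w2)) / (2 * h)
    d_w2 = (f(w1, w2 + h) - f(w1, w2 - h)) / (2 * h)
    return d_w1, d_w2


approx = numerical_gradient(g, math.pi, 1.0)
# Closed form at w = [pi, 1]: (2*w2 - w2*sin(w1), 2*w1 + cos(w1))
exact = (2.0, 2 * math.pi - 1)
print("Finite differences:", approx)
print("Closed form:       ", exact)
```

The two tuples should agree to several decimal places; a large mismatch would point to an error in the hand-derived partials.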


## Question 3 - dance hit song prediction

Implement logistic regression in PyTorch for the following dance hit song prediction training dataset: 
https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

 * Input variables: a number of audio features (most already standardized so don't worry about that)
 * Target variable: Topclass1030: 
   * 1 means it was a top 10 hit song; 
   * 0 means it never went above top 30 position.

This dataset is derived from my paper on dance hit song prediction; for a full description of the features, have a look at https://arxiv.org/abs/1905.08076.

Print the evolution of the loss every few epochs and train the model until it converges. 
 
 After training the logistic regression model, calculate the prediction accuracy on the test set: 
 https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv








In [4]:
# Your code here
import pandas as pd

# load data
csv = pd.read_csv("https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv")
targets = csv['Topclass1030']
inputs = csv.drop('Topclass1030', axis=1)

# define logistic regression model
import torch.nn as nn
import torch.nn.functional as F

class LogisticRegression(nn.Module):
  # input_size: Dimensionality of input feature vector.
  # num_classes: The number of classes in the classification problem.
  def __init__(self, input_size, num_classes):
    # Always call the superclass (nn.Module) constructor first!
    super(LogisticRegression, self).__init__()
    # Set up the linear transform
    self.linear = nn.Linear(input_size, num_classes)
    # The sigmoid activation is applied in forward(), so the model outputs
    # probabilities and is paired with nn.BCELoss below.

  # Forward's sole argument is the input.
  # input is of shape (batch_size, input_size)
  def forward(self, x):
    # Apply the linear transform.
    # out is of shape (batch_size, num_classes). 
    out = self.linear(x)
    # Squash the output with a sigmoid to get the probability of the
    # positive class (a top 10 hit) for each example.
    out = torch.sigmoid(out)
    return out

# train model
num_outputs = 1
num_input_features = csv.shape[1] - 1
logreg_clf = LogisticRegression(num_input_features, num_outputs)

import torch 
lr_rate = 0.001  # alpha
loss_function = nn.BCELoss() 
# SGD: stochastic gradient descent is used to train/fit the model
optimizer = torch.optim.SGD(logreg_clf.parameters(), lr=lr_rate)

import numpy as np 
#training loop:
epochs = 200 #how many times we go through the training set
steps = csv.shape[0]

for i in range(epochs):
    for j in range(steps):
        x_var = torch.tensor(inputs.loc[j].values).float()
        y_var = torch.tensor([targets.loc[j]]).float()          

        optimizer.zero_grad() # empty (zero) the gradient buffers
        y_hat = logreg_clf(x_var) #get the output from the model

        loss = loss_function(y_hat, y_var) #calculate the loss
        loss.backward() #backprop
        optimizer.step() #does the update

    if i % 10 == 0:
        # Note: this is the loss of the last training example in the epoch only.
        print ("Epoch: {0}, Loss: {1}, ".format(i, loss.item()))

Epoch: 0, Loss: 0.9041625261306763, 
Epoch: 10, Loss: 0.6218525171279907, 
Epoch: 20, Loss: 0.528566300868988, 
Epoch: 30, Loss: 0.46163666248321533, 
Epoch: 40, Loss: 0.42029836773872375, 
Epoch: 50, Loss: 0.39436790347099304, 
Epoch: 60, Loss: 0.37743163108825684, 
Epoch: 70, Loss: 0.365932822227478, 
Epoch: 80, Loss: 0.35787269473075867, 
Epoch: 90, Loss: 0.35208699107170105, 
Epoch: 100, Loss: 0.34787097573280334, 
Epoch: 110, Loss: 0.34477487206459045, 
Epoch: 120, Loss: 0.3425094783306122, 
Epoch: 130, Loss: 0.34087684750556946, 
Epoch: 140, Loss: 0.3397316038608551, 
Epoch: 150, Loss: 0.33897864818573, 
Epoch: 160, Loss: 0.3385392427444458, 
Epoch: 170, Loss: 0.3383553624153137, 
Epoch: 180, Loss: 0.33838361501693726, 
Epoch: 190, Loss: 0.3385876417160034, 
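
A common variant of the training setup above leaves the sigmoid out of the model and lets `nn.BCEWithLogitsLoss` apply it internally, which is numerically more stable than a separate sigmoid followed by `nn.BCELoss`. The sketch below illustrates that variant with full-batch updates, so the printed loss is the mean over the whole training set. It is not the solution used above: the data is synthetic (the tensors `X`, `y` and the weights `true_w` are made up for illustration), and the learning rate and epoch count are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the song features: 200 examples, 5 features.
X = torch.randn(200, 5)
true_w = torch.tensor([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w > 0).float().unsqueeze(1)   # shape (200, 1), values 0.0 or 1.0

model = nn.Linear(5, 1)                     # outputs raw logits, no sigmoid
loss_function = nn.BCEWithLogitsLoss()      # applies the sigmoid internally
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_function(model(X), y)       # mean loss over the whole training set
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
    if epoch % 50 == 0:
        print("Epoch: {0}, Loss: {1:.4f}".format(epoch, loss.item()))

# At prediction time the sigmoid has to be applied explicitly.
with torch.no_grad():
    probabilities = torch.sigmoid(model(X))
    accuracy = ((probabilities > 0.5).float() == y).float().mean().item()
print("Training accuracy: {:.3f}".format(accuracy))
```

To use this on the real data, `X` and `y` would be replaced by tensors built from the `inputs` and `targets` columns loaded above.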


Run the code below to test the accuracy of your model on the test set:

In [5]:
import pandas as pd 
test = pd.read_csv("https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv")
labels = test.iloc[:,-1]
test = test.drop('Topclass1030', axis=1)
testdata = torch.Tensor(test.values)
testlabels = torch.Tensor(labels.values).view(-1,1)

TP = 0
TN = 0
FN = 0
FP = 0

for i in range(0, testdata.size()[0]): 
  # print(testdata[i].size())
  Xtest = torch.Tensor(testdata[i])
  y_hat = logreg_clf(Xtest)
  
  if y_hat > 0.5:
    prediction = 1
  else: 
    prediction = 0

  if (prediction == testlabels[i]):
    if (prediction == 1):
      TP += 1
    else: 
      TN += 1

  else:
    if (prediction == 1):
      FP += 1
    else: 
      FN += 1

print("True Positives: {0}, True Negatives: {1}".format(TP, TN))
print("False Positives: {0}, False Negatives: {1}".format(FP, FN))
rate = TP/(FN+TP)
print("Class specific accuracy of correctly predicting a hit song is {0}".format(rate))

True Positives: 39, True Negatives: 20
False Positives: 9, False Negatives: 11
Class specific accuracy of correctly predicting a hit song is 0.78
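
The value printed above, TP / (TP + FN), is the recall for the hit class. The overall prediction accuracy that the question asks for can be computed from the same four counts. The sketch below does this for the counts in the output above; the helper name `classification_metrics` is hypothetical, and precision and F1 are included only for context.

```python
def classification_metrics(tp, tn, fp, fn):
    """Summary metrics computed from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,        # share of all songs classified correctly
        "precision": tp / (tp + fp),          # share of predicted hits that are hits
        "recall": tp / (tp + fn),             # share of actual hits that were found
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


# Counts taken from the test-set output above.
metrics = classification_metrics(tp=39, tn=20, fp=9, fn=11)
for name, value in metrics.items():
    print("{0}: {1:.4f}".format(name, value))
```

For these counts the overall accuracy is 59/79, roughly 0.747, slightly below the hit-class recall of 0.78.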
