# PyTorch - homework 1


Please run the whole notebook with your code and submit the `.ipynb` file that includes your answers. 

In [1]:
from termcolor import colored

student_number="1002819"
student_name="Samson Yu Bai Jian"

print(colored("Homework by "  + student_name + ', number: ' + student_number,'red'))

Homework by Samson Yu Bai Jian, number: 1002819


## Question 1 - matrix multiplication

Implement the following mathematical operation on both the CPU and the GPU (use Google Colab or another cloud service if your computer has no GPU). Print:

a) which type of GPU card you have,

b) the computation time on both the CPU and the GPU (using PyTorch), and

c) how much faster the GPU is, as a percentage.

The operation to implement is the matrix product $C = B A^T$,

where $A$ is a random matrix of size $30{,}000 \times 1000$ and $B$ is a random matrix of size $3000 \times 1000$. In addition to the information requested above:

d) please also print the two resulting $C$ matrices (they should be the same, by the way).
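
One detail that matters for part b) is that CUDA kernels launch asynchronously, so a GPU timing is only meaningful if the clock stops after the device has finished. The sketch below is one possible way to time the product; the helper name `time_matmul` and the small matrix sizes are assumptions made to keep the example quick, and the GPU branch only runs when a CUDA device is available.

```python
import time
import torch

def time_matmul(b, a, num_runs=3):
    """Average wall-clock seconds for b @ a.T on the device the tensors live on."""
    use_cuda = b.is_cuda
    # Warm-up run so one-time initialisation cost is not counted.
    b.mm(a.T)
    if use_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(num_runs):
        b.mm(a.T)
    if use_cuda:
        # Wait for all queued kernels to finish before stopping the clock.
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / num_runs

# Small stand-in shapes (assumption); the assignment uses 30,000 x 1000 and 3000 x 1000.
A = torch.rand(300, 100)
B = torch.rand(30, 100)
cpu_seconds = time_matmul(B, A)
print(f"CPU: {cpu_seconds:.6f} s per run")

if torch.cuda.is_available():
    A_gpu, B_gpu = A.cuda(), B.cuda()
    gpu_seconds = time_matmul(B_gpu, A_gpu)
    speedup = cpu_seconds / gpu_seconds
    print(f"GPU: {gpu_seconds:.6f} s per run")
    print(f"Speedup: {speedup:.1f}x, i.e. {(speedup - 1) * 100:.0f}% faster")
    # The two results should agree up to floating-point rounding.
    print(torch.allclose(B.mm(A.T), B_gpu.mm(A_gpu.T).cpu(), atol=1e-4))
```

A speedup of `s` times corresponds to `(s - 1) * 100` percent faster, which is the quantity part c) asks for.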
 



In [2]:
# implement solution here
# a)
!nvidia-smi
print("\n")

# b)
import torch
import timeit

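# Shapes follow the question: A is 30,000 x 1000 and B is 3000 x 1000.
A = torch.rand(30000, 1000)
B = torch.rand(3000, 1000)
A_gpu = A.cuda()
B_gpu = B.cuda()

num_runs = 3

cpu_duration = timeit.timeit('B.mm(A.T)', number=num_runs, globals=globals())
C = B.mm(A.T)
print("C for CPU is " + str(C))
print("C for CPU is calculated in " + str(cpu_duration / num_runs) + " seconds from the average of " + str(num_runs) + " runs.")

# Warm up the GPU first, then synchronize inside the timed statement:
# CUDA kernels run asynchronously, so without this only the launch is timed.
B_gpu.mm(A_gpu.T)
torch.cuda.synchronize()
gpu_duration = timeit.timeit('B_gpu.mm(A_gpu.T); torch.cuda.synchronize()', number=num_runs, globals=globals())
C_gpu = B_gpu.mm(A_gpu.T)
print("C for GPU is " + str(C_gpu))
print("C for GPU is calculated in " + str(gpu_duration / num_runs) + " seconds from the average of " + str(num_runs) + " runs.")
print("\n")

# c)
# A speedup of s times means the GPU is (s - 1) * 100 percent faster.
speedup = cpu_duration / gpu_duration
print("GPU is " + str(speedup) + " times as fast as the CPU, i.e. " + str((speedup - 1) * 100) + "% faster.")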

Fri Jun 26 06:13:33 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   58C    P8    31W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Question 2 - grad


Find the gradient (partial derivatives) of the function $g(w)$ below. 

Let  $w=[w_1,w_2]^T$

Consider  $g(w)=2w_1w_2+w_2\cos(w_1)$

a) In PyTorch, compute $\nabla_w g(w)$

 and verify that $\nabla_w g([\pi,1])=[2,2\pi-1]^T$ using the grad function, where the first entry is the partial derivative with respect to $w_1$ and the second entry is the partial derivative with respect to $w_2$.

b) You can also write a function that calculates these partial derivatives manually. You can review the relevant calculus [here](https://www.wolframalpha.com/input/?i=derivative+y+cos%28x%29) and implement it in a second function below to verify that it gives the same result.
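
For reference, the partial derivatives that part b) relies on follow from the product rule and the derivative of the cosine. Evaluating them at $w=[\pi,1]^T$ gives the values quoted in part a):

$$
\begin{aligned}
\frac{\partial g}{\partial w_1} &= 2w_2 - w_2\sin(w_1), &
\frac{\partial g}{\partial w_1}\Big|_{w=[\pi,1]^T} &= 2\cdot 1 - 1\cdot\sin(\pi) = 2, \\
\frac{\partial g}{\partial w_2} &= 2w_1 + \cos(w_1), &
\frac{\partial g}{\partial w_2}\Big|_{w=[\pi,1]^T} &= 2\pi + \cos(\pi) = 2\pi - 1 .
\end{aligned}
$$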


In [3]:
# write your solution here
# a)
import numpy as np

def g(w):
    return 2 * w[0] * w[1] + w[1] * torch.cos(w[0])

x = torch.tensor(np.pi, requires_grad=True)
y = torch.tensor(1., requires_grad=True)
z = g([x, y])
z.backward()
print("PyTorch grad function calculations: ", [x.grad.item(), y.grad.item()])

# b)
def manual_dg(w):
    return [2 * w[1] - w[1] * torch.sin(w[0]), 2 * w[0] + torch.cos(w[0])]
dg = manual_dg([x, y])
print("Manual calculations: ", [dg[0].item(), dg[1].item()])

PyTorch grad function calculations:  [2.0, 5.2831854820251465]
Manual calculations:  [2.0, 5.2831854820251465]
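
The question mentions the grad function, while the cell above uses `backward()`. As an alternative sketch, `torch.autograd.grad` returns the gradient directly without storing it in `.grad`. Here `w` is held as a single 1-D tensor, which is an assumption about how one might organise the inputs rather than a requirement of the exercise.

```python
import math
import torch

def g(w):
    # g(w) = 2*w1*w2 + w2*cos(w1), with w a 1-D tensor [w1, w2]
    return 2 * w[0] * w[1] + w[1] * torch.cos(w[0])

w = torch.tensor([math.pi, 1.0], requires_grad=True)

# autograd.grad returns a tuple with one gradient per input tensor.
(grad_w,) = torch.autograd.grad(g(w), w)
print(grad_w)

# Compare against the analytic result [2, 2*pi - 1] up to float32 rounding.
expected = torch.tensor([2.0, 2 * math.pi - 1])
print(torch.allclose(grad_w, expected, atol=1e-5))
```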


## Question 3 - dance hit song prediction

Implement logistic regression in PyTorch for the following dance hit song prediction training dataset: 
https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv

 * Input variables: a number of audio features (most already standardized so don't worry about that)
 * Target variable: Topclass1030: 
   * 1 means it was a top 10 hit song; 
   * 0 means it never went above top 30 position.

This dataset is derived from my paper on dance hit song prediction; for a full description of the features, see https://arxiv.org/abs/1905.08076. 

Print the evolution of the loss every few epochs and train the model until it converges. 
 
 After training the logistic regression model, calculate the prediction accuracy on the test set: 
 https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv
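
"Train until it converges" can be made concrete with a stopping rule. The sketch below is one possible reading, not the required solution. It uses synthetic stand-in data instead of the CSV files, full-batch gradient descent, and `nn.BCEWithLogitsLoss` (which applies the sigmoid internally). The names `tolerance`, `max_epochs` and `true_w` and all numeric settings are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 200 samples, 5 standardized features.
X = torch.randn(200, 5)
true_w = torch.tensor([[1.5], [-2.0], [0.5], [0.0], [1.0]])
y = (X @ true_w > 0).float()          # binary targets of shape (200, 1)

model = nn.Linear(5, 1)               # outputs logits; the sigmoid lives inside the loss
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

tolerance = 1e-5                      # hypothetical convergence threshold
max_epochs = 5000
print_every = 500
losses = []

for epoch in range(max_epochs):
    optimizer.zero_grad()
    loss = criterion(model(X), y)     # mean loss over the whole training set
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

    if epoch % print_every == 0:
        print(f"Epoch: {epoch}, Loss: {losses[-1]:.4f}")
    # Stop once the loss changes by less than the tolerance between epochs.
    if epoch > 0 and abs(losses[-2] - losses[-1]) < tolerance:
        print(f"Converged after {epoch + 1} epochs, loss {losses[-1]:.4f}")
        break

with torch.no_grad():
    predictions = (torch.sigmoid(model(X)) > 0.5).float()
    accuracy = (predictions == y).float().mean().item()
print(f"Training accuracy: {accuracy:.3f}")
```

Tracking the mean loss over the whole set, rather than the loss of a single sample, gives a smoother curve to judge convergence by.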








In [4]:
# Your code here

# load data
import pandas as pd

train = pd.read_csv('https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030training.csv')
labels = train.iloc[:,-1]
train = train.drop('Topclass1030', axis=1)
traindata = torch.Tensor(train.values)
trainlabels = torch.Tensor(labels.values).view(-1,1)

# define logistic regression model
import torch.nn as nn
import torch.nn.functional as F

class LogisticRegression(nn.Module):
  # input_size: Dimensionality of input feature vector.
  # num_classes: The number of classes in the classification problem.
  def __init__(self, input_size, num_classes):
    # Always call the superclass (nn.Module) constructor first!
    super(LogisticRegression, self).__init__()
    # Set up the linear transform
    self.linear = nn.Linear(input_size, num_classes)

  def forward(self, x):
    # Apply the linear transform.
    # out is of shape (batch_size, num_classes). 
    out = self.linear(x)
    # Squash the output with a sigmoid so that each value is a
    # probability in (0, 1) for the positive class.
    out = torch.sigmoid(out)
    return out

# train model
# Binary classification
num_outputs = 1
num_input_features = train.shape[1]

# Create the logistic regression model
logreg_clf = LogisticRegression(num_input_features, num_outputs)

print_every = 5
lr_rate = 0.001
criterion = nn.BCELoss() 
optimizer = torch.optim.SGD(logreg_clf.parameters(), lr=lr_rate)
prev_loss = 1e10
epoch = 200

# train till convergence (one SGD update per training sample)
for i in range(epoch):
    for j in range(traindata.size()[0]):
        Xtrain = traindata[j]
        # trainlabels[j] has shape (1,), the same as y_hat, so BCELoss
        # receives equal-sized tensors and raises no size-mismatch warning.
        y_var = trainlabels[j]
        y_hat = logreg_clf(Xtrain)

        optimizer.zero_grad()
        loss = criterion(y_hat, y_var)
        loss.backward()
        optimizer.step()

    if i % print_every == 0:
        # Note: this is the loss of the last training sample in the epoch.
        print ("Epoch: {0}, Loss: {1}, ".format(i, loss.item()))


Epoch: 0, Loss: 0.34890156984329224, 
Epoch: 5, Loss: 0.45001208782196045, 
Epoch: 10, Loss: 0.48037290573120117, 
Epoch: 15, Loss: 0.4771309792995453, 
Epoch: 20, Loss: 0.46074485778808594, 
Epoch: 25, Loss: 0.4417151212692261, 
Epoch: 30, Loss: 0.42400845885276794, 
Epoch: 35, Loss: 0.4087331295013428, 
Epoch: 40, Loss: 0.39593037962913513, 
Epoch: 45, Loss: 0.38530394434928894, 
Epoch: 50, Loss: 0.3764950931072235, 
Epoch: 55, Loss: 0.36917564272880554, 
Epoch: 60, Loss: 0.36307042837142944, 
Epoch: 65, Loss: 0.3579563498497009, 
Epoch: 70, Loss: 0.35365596413612366, 
Epoch: 75, Loss: 0.35002845525741577, 
Epoch: 80, Loss: 0.3469603359699249, 
Epoch: 85, Loss: 0.3443610668182373, 
Epoch: 90, Loss: 0.3421575427055359, 
Epoch: 95, Loss: 0.34029191732406616, 
Epoch: 100, Loss: 0.3387138545513153, 
Epoch: 105, Loss: 0.33738452196121216, 
Epoch: 110, Loss: 0.33626946806907654, 
Epoch: 115, Loss: 0.3353404700756073, 
Epoch: 120, Loss: 0.3345751166343689, 
Epoch: 125, Loss: 0.3339524567127

Run the code below to test the accuracy of your model on the test set: 

In [6]:
import pandas as pd 

# test = pd.read_csv('/content/herremans_hit_1030test.csv')
test = pd.read_csv('https://dorax.s3.ap-south-1.amazonaws.com/herremans_hit_1030test.csv')
labels = test.iloc[:,-1]
test = test.drop('Topclass1030', axis=1)
testdata = torch.Tensor(test.values)
testlabels = torch.Tensor(labels.values).view(-1,1)

TP = 0
TN = 0
FN = 0
FP = 0

for i in range(0, testdata.size()[0]): 
  # print(testdata[i].size())
  Xtest = torch.Tensor(testdata[i])
  y_hat = logreg_clf(Xtest)
  
  if y_hat > 0.5:
    prediction = 1
  else: 
    prediction = 0

  if (prediction == testlabels[i]):
    if (prediction == 1):
      TP += 1
    else: 
      TN += 1

  else:
    if (prediction == 1):
      FP += 1
    else: 
      FN += 1

print("True Positives: {0}, True Negatives: {1}".format(TP, TN))
print("False Positives: {0}, False Negatives: {1}".format(FP, FN))
rate = TP/(FN+TP)
print("Class specific accuracy of correctly predicting a hit song is {0}".format(rate))

True Positives: 39, True Negatives: 20
False Positives: 9, False Negatives: 11
Class specific accuracy of correctly predicting a hit song is 0.78
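
The value printed above is the recall for the hit class, TP / (TP + FN). The overall prediction accuracy that the question asks for can be derived from the same four counts. A minimal sketch, with the counts copied by hand from the output above:

```python
# Confusion-matrix counts copied from the output above.
TP, TN, FP, FN = 39, 20, 9, 11

total = TP + TN + FP + FN
accuracy = (TP + TN) / total      # share of all test songs classified correctly
recall = TP / (TP + FN)           # the "class specific accuracy" printed above
precision = TP / (TP + FP)        # share of predicted hits that really were hits

print(f"Accuracy: {accuracy:.3f}, recall: {recall:.3f}, precision: {precision:.3f}")
```

With these counts, 59 of the 79 test songs are classified correctly, an overall accuracy of about 0.747.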
