# Knorex's R&D Interview - Programming Test (code: BD-0412)

## Instructions: 

1. This test assesses your basic programming skills and your knowledge of machine learning and mathematics, using PyTorch.
2. You may use any material to complete the test (books, the internet, etc.). However, please keep this test to yourself during and after the test. You are not allowed to share the test or communicate with others to get help during the test.
3. You will receive partial credit if you show a clear thought process, even if your final answer is incorrect.
4. Please follow good programming practices and make your code easy to read (e.g. variable naming, documentation, comments); following PEP8 is a plus.
5. Make sure your code runs after the set-up commands have been executed. The reviewer is not responsible for installing any libraries you forgot to include.
6. Please read the instructions for each question carefully, especially what is and isn't allowed. If you don't follow the instructions, you won't receive any credit for that question.
7. You will most likely not be able to answer all the questions. If you score 40% or more, you can proceed to the next round.
8. If certain conditions of a question (e.g. hyperparameter values) are not stated, you are free to choose any values you want.
9. Please start your code right below the `# YOUR CODE STARTS HERE` line. If there is a `# YOUR CODE ENDS HERE` line, please put your implementation between `# YOUR CODE STARTS HERE` and `# YOUR CODE ENDS HERE`. 
10. Last but not least, please manage your time: you have 2 hours to finish the test, plus another 5 minutes to send your implementation to us (i.e. 2 hours and 5 minutes in total). We have already added a timestamp-printing line to record the output timestamp for every question. You will receive 0 credit for any question for which:
    - you remove our timestamp-tracking line
    - no timestamp is printed in the output
    - you exceed the time limit for the test

*One piece of good news to motivate you before the test: you don't need a GPU machine to finish the test; a CPU is enough.*

**Please try your best. Good luck!**

## Dependencies

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import os
os.chdir('/content/drive/MyDrive/Test')

In [None]:
# Set up dependencies if needed. Most likely it'll have been installed for you.
!pip install numpy torchvision
!pip install torch
import torch    
torch.__version__

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


'1.12.1+cu113'

## Question 1

Create a PyTorch tensor $x$ of type FloatTensor, size [5, 2, 7] and filled with random numbers. 

Print tensor $x$ and its type. 

Then, convert the same tensor $x$ to type LongTensor and print its value and type.
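
One detail of the conversion step is easy to miss: `torch.rand` samples from $[0, 1)$ and a cast to an integer type truncates toward zero, so calling `.long()` directly on such a tensor yields only zeros. The sketch below illustrates this. The scale factor of 1000 is an arbitrary choice made for illustration and is not prescribed by the question.

```python
import torch

x = torch.rand(5, 2, 7)          # float32 by default, values in [0, 1)
print(x.type())                  # torch.FloatTensor

x_direct = x.long()              # the cast truncates toward zero
print(x_direct.unique())         # tensor([0])

x_scaled = (x * 1000).long()     # scale first to keep some information (arbitrary factor)
print(x_scaled.type())           # torch.LongTensor
```

Whether scaling before the cast is appropriate depends on how the converted tensor is meant to be used.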


In [None]:
%reset -f
import torch
import datetime
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))


# YOUR CODE STARTS HERE
x = torch.rand(size=(5,2,7), dtype=torch.float32)
print(x, x.type())
x = (x*1000).long()
print(x, x.type())

Timestamp: 22-08-03--02-52-26
tensor([[[0.4023, 0.0305, 0.1618, 0.4873, 0.5966, 0.3101, 0.0112],
         [0.8004, 0.9805, 0.4842, 0.3297, 0.2915, 0.1008, 0.6466]],

        [[0.1416, 0.9133, 0.7419, 0.8351, 0.8044, 0.4760, 0.2419],
         [0.8304, 0.6532, 0.3948, 0.2341, 0.1168, 0.1288, 0.0119]],

        [[0.4419, 0.9825, 0.9454, 0.0597, 0.7139, 0.4074, 0.9987],
         [0.2894, 0.0708, 0.1804, 0.3068, 0.5872, 0.9262, 0.5145]],

        [[0.0627, 0.5143, 0.7386, 0.0733, 0.5312, 0.3972, 0.1765],
         [0.3518, 0.8106, 0.6606, 0.4525, 0.4671, 0.8246, 0.2272]],

        [[0.7734, 0.3351, 0.5695, 0.8969, 0.3788, 0.1784, 0.4622],
         [0.8906, 0.9751, 0.1470, 0.8762, 0.9770, 0.6081, 0.2377]]]) torch.FloatTensor
tensor([[[402,  30, 161, 487, 596, 310,  11],
         [800, 980, 484, 329, 291, 100, 646]],

        [[141, 913, 741, 835, 804, 476, 241],
         [830, 653, 394, 234, 116, 128,  11]],

        [[441, 982, 945,  59, 713, 407, 998],
         [289,  70, 180, 306, 587, 926

## Question 2

An input image of size (1, 3, 128, 128), in the form (batch size, number of channels, height, width), is passed to a 2D convolutional layer composed of 32 filters with filter size (3, 3), in the form (height, width), with no padding and a stride of 5. <br>
Implement this 2D convolutional layer and print the size of the output image. <br>

Hint: You may use the function `torch.nn.Conv2d`.

In [None]:
%reset -f
import torch
import datetime
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))

x = torch.rand(1, 3, 128, 128) # input image

# YOUR CODE STARTS HERE
conv_layer = torch.nn.Conv2d(in_channels=3, out_channels=32, kernel_size=(3, 3), stride=5)
output_image = conv_layer(x)
print(output_image.size())

Timestamp: 22-08-03--03-01-35
torch.Size([1, 32, 26, 26])
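
The printed size can be checked by hand. For a convolution with dilation 1, each spatial dimension has size $\lfloor (n + 2p - k)/s \rfloor + 1$, which gives $\lfloor (128 - 3)/5 \rfloor + 1 = 26$ here. The sketch below compares that formula with what `torch.nn.Conv2d` produces; the helper name `conv_output_size` is hypothetical and introduced only for this illustration.

```python
import torch


def conv_output_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution, assuming dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


expected = conv_output_size(128, kernel=3, stride=5)
print(expected)  # 26

layer = torch.nn.Conv2d(3, 32, kernel_size=3, stride=5)
out = layer(torch.rand(1, 3, 128, 128))
print(tuple(out.shape) == (1, 32, expected, expected))  # True
```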


## Question 3

Implement a Long Short-Term Memory (LSTM) cell **WITHOUT USING** the `torch.nn.LSTM` module. The LSTM cell that you are going to implement is represented by a Python function named `lstm_cell()` with predefined parameters. This function should return `c` (cell state) and `h` (hidden state).

<img src="./pic/lstm_wiki.png" width="400"/>
<center> Figure : LSTM cell </center> 

<img src="./pic/lstm_formulas.png" width="600"/>
<center> Figure : LSTM formulas </center> 

**HINT 1**: You can use `torch.sigmoid`, `torch.tanh` and `torch.mm`

**HINT 2**: The variable names in the code below correspond to the variables in `Figure : LSTM formulas`; please assign them properly.

In [None]:
def lstm_cell(x, h, c, Wf, Wi, Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc):
    # YOUR CODE STARTS HERE
    
    f_t = torch.sigmoid(Wf@x + Uf@h + bf)
    i_t = torch.sigmoid(Wi@x + Ui@h + bi)
    o_t = torch.sigmoid(Wo@x + Uo@h + bo)

    g_t = torch.tanh(Wc@x + Uc@h + bc)
    c = f_t * c + i_t * g_t
    h = o_t * torch.tanh(c)

    # YOUR CODE ENDS HERE
    return c,h

print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))

Timestamp: 22-08-03--03-23-59
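
The cell above can be exercised with a small shape check. This is a minimal sketch under assumptions that the notebook does not state: inputs and states are column vectors, each `W*` has shape (hidden size, input size), each `U*` has shape (hidden size, hidden size), and the sizes 4 and 3 are arbitrary. The function is repeated so that the block runs on its own.

```python
import torch


def lstm_cell(x, h, c, Wf, Wi, Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc):
    f_t = torch.sigmoid(Wf @ x + Uf @ h + bf)   # forget gate
    i_t = torch.sigmoid(Wi @ x + Ui @ h + bi)   # input gate
    o_t = torch.sigmoid(Wo @ x + Uo @ h + bo)   # output gate
    g_t = torch.tanh(Wc @ x + Uc @ h + bc)      # candidate cell state
    c = f_t * c + i_t * g_t
    h = o_t * torch.tanh(c)
    return c, h


torch.manual_seed(0)
input_size, hidden_size = 4, 3                  # arbitrary sizes for illustration

Wf, Wi, Wo, Wc = (torch.randn(hidden_size, input_size) for _ in range(4))
Uf, Ui, Uo, Uc = (torch.randn(hidden_size, hidden_size) for _ in range(4))
bf, bi, bo, bc = (torch.zeros(hidden_size, 1) for _ in range(4))

h = torch.zeros(hidden_size, 1)
c = torch.zeros(hidden_size, 1)

# Unroll the cell over a short random sequence of column vectors.
for step in range(5):
    x = torch.randn(input_size, 1)
    c, h = lstm_cell(x, h, c, Wf, Wi, Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc)

print(c.shape, h.shape)
```

Because `h` is the product of a sigmoid gate and a `tanh`, every entry of `h` stays strictly inside (-1, 1), which makes a convenient sanity check.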


## Question 4

Given a minibatch $s=[ [5.2,-1.2], [-1.0,4.0], [-1,2] ]$ of scores of three data points and two classes, implement and compute the mean cross entropy with the labels $[0,1,1]$ associated to the minibatch $s$.

As a reminder, the definition of the mean cross entropy loss is 

$$ L = -\frac{1}{N}\sum_{i=1}^N \log \Big(\textrm{entry cl($i$) of probability vector } p^{(i)} \Big)$$

where $N$ is the number of data points, $\textrm{cl}(i)$ is the class index of the $i^{th}$ training data point and $p^{(i)}$ is the probability vector computed by the network (using the score vector $s^{(i)}$ and the softmax operator).

Note that your implementation of the cross entropy **cannot** use the PyTorch function `nn.CrossEntropyLoss`. Please manually implement your own function based on the formula.

In [None]:
%reset -f
import torch
import datetime
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))

s = torch.Tensor( [ [5.2,-1.2], [-1.0,4.0], [-1,2] ] )
label = torch.LongTensor([0,1,1])

# YOUR CODE STARTS HERE
def softmax_operation(array):
  prob_vec = torch.zeros(size=array.size())
  for i in range(len(prob_vec)):
    prob_vec[i] = torch.exp(array[i]) / torch.sum(torch.exp(array))
  return prob_vec
loss = 0
for i in range(s.size()[0]):
  loss += torch.log(softmax_operation(s[i])[label[i]])
loss = -loss/s.size()[0]
print(loss)

Timestamp: 22-08-03--04-13-27
tensor(0.0190)
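
The same quantity can also be computed without Python loops. The sketch below is one possible vectorized variant, not the required solution: it forms log-probabilities with `torch.logsumexp`, which avoids exponentiating large scores directly, and picks the entry of the true class for each row with integer indexing.

```python
import torch

s = torch.tensor([[5.2, -1.2], [-1.0, 4.0], [-1.0, 2.0]])
label = torch.tensor([0, 1, 1])

# log softmax: log p_j = s_j - log(sum_k exp(s_k)), computed row by row
log_p = s - torch.logsumexp(s, dim=1, keepdim=True)

# Select the log-probability of the labelled class for every data point.
picked = log_p[torch.arange(s.size(0)), label]

loss = -picked.mean()
print(loss)  # tensor(0.0190)
```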


## Question 5

Design an MLP (i.e., a feed-forward neural network) to classify the two-moon dataset below (binary classification).

<img src="pic/two_moon.jpg" width="600"/>
<center> Figure : Two-moon dataset </center> 

__Requirements:__
1. Only use `train_data` and `train_label` for training.
1. The total number of epochs is limited to 30.
1. Print the training loss and accuracy during the training process.
1. Print the test accuracy for `test_data` and `test_label` after training.
1. The test accuracy must be above 90%.

Remark: If certain conditions of the question (e.g. hyperparameter values, network structure, etc.) are not stated, you are free to choose anything you want.
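
A common way to set up a two-class classifier in PyTorch is to let the network output raw scores (logits) and pass them, together with integer class labels, to `torch.nn.CrossEntropyLoss`, which applies the log-softmax internally. The sketch below shows that pattern on synthetic data. The helper `make_two_moons`, the network size, and the learning rate are assumptions made for illustration; the data it generates is only a stand-in for `dataset/two_moons.pt`.

```python
import math
import torch

torch.manual_seed(0)


def make_two_moons(n_per_class, noise=0.1):
    """Hypothetical stand-in generator for a two-moon style dataset."""
    t = math.pi * torch.rand(n_per_class)
    upper = torch.stack([torch.cos(t), torch.sin(t)], dim=1)
    lower = torch.stack([1.0 - torch.cos(t), 0.5 - torch.sin(t)], dim=1)
    data = torch.cat([upper, lower]) + noise * torch.randn(2 * n_per_class, 2)
    labels = torch.cat([torch.zeros(n_per_class), torch.ones(n_per_class)]).long()
    return data, labels


data, labels = make_two_moons(200)

# The last layer outputs logits; no Softmax layer is needed with CrossEntropyLoss.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 2),
)
loss_function = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(30):
    optimizer.zero_grad()
    logits = model(data)                  # full-batch forward pass
    loss = loss_function(logits, labels)  # labels are class indices, not one-hot
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

accuracy = (logits.argmax(dim=1) == labels).float().mean().item()
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}, accuracy {accuracy:.4f}")
```

With this setup, predictions are still obtained with `argmax` over the class dimension, since softmax does not change the ordering of the scores.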

In [None]:
%reset -f
import torch
import datetime
from tqdm.notebook import tqdm
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))

def get_accuracy(scores, labels):
    num_data = scores.size(0)
    predicted_labels = scores.argmax(dim=1)
    indicator = (predicted_labels == labels)
    num_matches = indicator.sum()
    return 100*num_matches.float()/num_data    

train_data, test_data, train_label, test_label = torch.load('dataset/two_moons.pt')
print(train_data.size(),train_label.size(),test_data.size(),test_label.size())

# YOUR CODE STARTS HERE

class MLP_Custom(torch.nn.Module):
  def __init__(self, inputs):
    super(MLP_Custom, self).__init__()
    self.linear0 = torch.nn.Linear(inputs, 64)
    self.linear1 = torch.nn.Linear(64, 128)
    self.linear2 = torch.nn.Linear(128, 32)
    self.linear3 = torch.nn.Linear(32, 2)
  def forward(self, x):
    x = self.linear0(x)
    x = torch.nn.ReLU()(x)
    x = self.linear1(x)
    x = torch.nn.ReLU()(x)
    x = self.linear2(x)
    x = torch.nn.ReLU()(x)
    x = self.linear3(x)
    x = torch.nn.Softmax(dim=1)(x)  # normalize over the class dimension

    return x

# Pre-processing data
train_label_onehot = torch.nn.functional.one_hot(train_label, num_classes=2)

class TwoMoonDataset(torch.utils.data.Dataset):
  def __init__(self, data, label):
    self.data = data
    self.label = label
  def __len__(self):
    return len(self.data)
  def __getitem__(self, idx):
    return self.data[idx], self.label[idx]

train_dataset = TwoMoonDataset(train_data, train_label_onehot)
dataloader_train = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=1)

# Building model
model = MLP_Custom(train_data.shape[1])
loss_function = torch.nn.BCELoss()
optimizer = torch.optim.Adam(params = model.parameters(), lr=1e-4)

# Training model
EPOCHS = 30
for epoch in range(EPOCHS):
  print(f"Epoch {epoch}/{EPOCHS}: ", end = ' ')
  current_loss = 0.0
  current_accuracy = 0.0
  for i, data in enumerate(dataloader_train, 0):
    inputs, targets = data
    optimizer.zero_grad()
    outputs = model(inputs)

    loss = loss_function(outputs, targets.type(torch.FloatTensor))
    loss.backward()
    optimizer.step()
    current_loss += loss.item()
    current_accuracy += (outputs.argmax(dim=1) == targets.argmax(dim=1)).sum()
  print("Loss : %.4f, Accuracy: %.4f" % (current_loss/len(train_dataset), current_accuracy/len(train_dataset)))

print('==================================')
predict = model(test_data)
print('Test accuracy: %.4f' %(get_accuracy(predict, test_label)))

Timestamp: 22-09-16--12-02-04
torch.Size([1600, 2]) torch.Size([1600]) torch.Size([400, 2]) torch.Size([400])
Epoch 0/30:  Loss : 0.0210, Accuracy: 0.5031
Epoch 1/30:  Loss : 0.0192, Accuracy: 0.7931
Epoch 2/30:  Loss : 0.0168, Accuracy: 0.8438
Epoch 3/30:  Loss : 0.0141, Accuracy: 0.8388
Epoch 4/30:  Loss : 0.0118, Accuracy: 0.8462
Epoch 5/30:  Loss : 0.0102, Accuracy: 0.8550
Epoch 6/30:  Loss : 0.0093, Accuracy: 0.8606
Epoch 7/30:  Loss : 0.0085, Accuracy: 0.8687
Epoch 8/30:  Loss : 0.0080, Accuracy: 0.8806
Epoch 9/30:  Loss : 0.0074, Accuracy: 0.8900
Epoch 10/30:  Loss : 0.0070, Accuracy: 0.9000
Epoch 11/30:  Loss : 0.0065, Accuracy: 0.9050
Epoch 12/30:  Loss : 0.0061, Accuracy: 0.9125
Epoch 13/30:  Loss : 0.0057, Accuracy: 0.9200
Epoch 14/30:  Loss : 0.0053, Accuracy: 0.9275
Epoch 15/30:  Loss : 0.0049, Accuracy: 0.9319
Epoch 16/30:  Loss : 0.0046, Accuracy: 0.9375
Epoch 17/30:  Loss : 0.0043, Accuracy: 0.9456
Epoch 18/30:  Loss : 0.0039, Accuracy: 0.9519
Epoch 19/30:  Loss : 0.0036, Accuracy: 0.9569
Epoch 20/30:  Loss : 0.0033, Accuracy: 0.9638
Epoch 21/30:  Loss : 0.0030, Accuracy: 0.9706
Epoch 22/30:  Loss : 0.0027, Accuracy: 0.9719
Epoch 23/3

## Question 6

Implement the 3-layer Feed-forward Neural Network:

$$s = \sigma(\sigma(XW_1 + b_1)W_2 + b_2)W_3 + b_3$$

with the activation function defined as

$$\sigma(x)=\frac{2e^{x}-e^{-x}}{2e^{x}+e^{-x}}$$
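
This activation is a shifted hyperbolic tangent. Dividing the numerator and the denominator by $\sqrt{2}$ and writing $\sqrt{2} = e^{\frac{1}{2}\ln 2}$ gives

$$\sigma(x)=\frac{e^{x+\frac{1}{2}\ln 2}-e^{-(x+\frac{1}{2}\ln 2)}}{e^{x+\frac{1}{2}\ln 2}+e^{-(x+\frac{1}{2}\ln 2)}}=\tanh\Big(x+\tfrac{1}{2}\ln 2\Big)$$

The sketch below checks this identity numerically and shows why the `tanh` form can be preferable: the direct formula overflows for large inputs in float32. The function names `sigma_direct` and `sigma_stable` are hypothetical and used only here.

```python
import math
import torch


def sigma_direct(x):
    """Activation written exactly as in the formula."""
    return (2 * torch.exp(x) - torch.exp(-x)) / (2 * torch.exp(x) + torch.exp(-x))


def sigma_stable(x):
    """Equivalent form using the identity sigma(x) = tanh(x + ln(2) / 2)."""
    return torch.tanh(x + 0.5 * math.log(2.0))


x = torch.linspace(-5.0, 5.0, steps=101)
print(torch.allclose(sigma_direct(x), sigma_stable(x), atol=1e-5))  # True

big = torch.tensor(100.0)
print(sigma_direct(big))  # tensor(nan): exp(100) overflows to inf in float32
print(sigma_stable(big))  # tensor(1.)
```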

Instantiate a network with 784 as input dimension, 25 as hidden dimension and 10 as output dimension.  

__Requirements:__
1. Only use `train_data` and `train_label` for training.
1. Use Adam optimizer instead of SGD : `torch.optim.Adam`.
1. The total number of epochs is limited to 10.
1. Print the training loss and accuracy during the training process.
1. Print the test accuracy for `test_data` and `test_label` after training.
1. The test accuracy must be above 82%.

Remark: If certain conditions of the question (e.g. hyperparameter values) are not stated, you are free to choose anything you want.


In [None]:
%reset -f
import torch
import datetime
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))

def get_accuracy(scores, labels):
    num_data = scores.size(0)
    predicted_labels = scores.argmax(dim=1)
    indicator = (predicted_labels == labels)
    num_matches = indicator.sum()
    return 100*num_matches.float()/num_data  

train_data, train_label, test_data, test_label = torch.load('dataset/small_MNIST.pt')
print(train_data.size(),train_label.size(),test_data.size(),test_label.size())

# YOUR CODE STARTS HERE  
class Custom_Activation(torch.nn.Module):
  def __init__(self):
    super().__init__()

  def forward(self, input):
    return torch.div((2*torch.exp(input) - torch.exp(-input)), (2*torch.exp(input) + torch.exp(-input)))

num_class = torch.unique(train_label).size(0)

class MNIST_Dataset(torch.utils.data.Dataset):
  def __init__(self, images, labels):
    self.images = images
    self.labels = labels

  def __len__(self):
    return len(self.images)
  
  def __getitem__(self, idx):
    return self.images[idx], self.labels[idx]

class MLP_three_layer(torch.nn.Module):
  def __init__(self):
    super(MLP_three_layer, self).__init__()
    self.hidden0 = torch.nn.Linear(784, 25)
    self.hidden1 = torch.nn.Linear(25, 25)
    self.output = torch.nn.Linear(25, 10)

  def forward(self, x):
    x = x.view(-1,784)
    x = self.hidden0(x)
    x = Custom_Activation()(x)
    x = self.hidden1(x)
    x = Custom_Activation()(x)
    x = self.output(x)
    x = torch.nn.Softmax(dim=1)(x)
    return x

#Preprocessing
train_label_onehot = torch.nn.functional.one_hot(train_label, num_classes=num_class).type(torch.FloatTensor)
train_dataset = MNIST_Dataset(train_data, train_label_onehot)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=1)

# Building the model
model = MLP_three_layer()
loss_function = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Run the training loop
EPOCHS = 10
for epoch in range(EPOCHS): 
  print(f"Epoch {epoch}/{EPOCHS}: ", end = ' ')
  current_loss = 0.0
  current_accuracy = 0.0
  
  for i, data in enumerate(train_dataloader,0):
    images, labels = data
    # Zero the gradients
    optimizer.zero_grad()
    # Perform forward pass
    outputs = model(images)
    # Compute loss
    loss = loss_function(outputs, labels)
    # Perform backward pass
    loss.backward()
    # Perform optimization
    optimizer.step()
    # Compute Loss
    current_loss += loss.item()
    # Check Accuracy
    indicator = (outputs.argmax(dim=1) == labels.argmax(dim=1))
    current_accuracy += indicator.sum()
    
  print("Loss : %.4f, Accuracy: %.4f" % (current_loss/len(train_dataset), current_accuracy/len(train_dataset)))

# Process is complete.
print('Training process has finished.')
print('==================================')
predict = model(test_data)
print('Test accuracy: %.4f' %(get_accuracy(predict, test_label)))

Timestamp: 22-09-16--19-18-37
torch.Size([1000, 28, 28]) torch.Size([1000]) torch.Size([500, 28, 28]) torch.Size([500])
Epoch 0/10:  Loss : 0.0729, Accuracy: 0.3140
Epoch 1/10:  Loss : 0.0704, Accuracy: 0.5780
Epoch 2/10:  Loss : 0.0669, Accuracy: 0.6700
Epoch 3/10:  Loss : 0.0631, Accuracy: 0.7440
Epoch 4/10:  Loss : 0.0594, Accuracy: 0.8020
Epoch 5/10:  Loss : 0.0568, Accuracy: 0.8390
Epoch 6/10:  Loss : 0.0547, Accuracy: 0.8810
Epoch 7/10:  Loss : 0.0532, Accuracy: 0.9010
Epoch 8/10:  Loss : 0.0521, Accuracy: 0.9240
Epoch 9/10:  Loss : 0.0512, Accuracy: 0.9310
Training process has finished.
Test accuracy: 85.8000


## Question 7

### Pro tip: Attempt this question only if you have plenty of time remaining during the test

Alice trained an MLP for odd digits on the MNIST dataset, while Bob trained an MLP for even digits. Now they want to join forces to predict all digits, but neither of their networks can do it alone. They do not want to retrain on the whole dataset since it takes too long. They decide to construct a new network from their existing MLPs as follows:

<img src="pic/new_network.svg" width="600"/>
<br>
<center> Figure : New network based on Alice's and Bob's networks </center> 

__Requirements:__
1. Use the pre-trained `net_alice` and `net_bob` to construct the new network. After loading their pre-trained parameters, these parameters must be frozen.
1. Only use `train_data` and `train_label` for training.
1. The total number of epochs is limited to 100.
1. Print the test accuracy for `test_data` and `test_label` after training.
1. The test accuracy must be above 85%.

Remark: If certain conditions of the question (e.g. hyperparameter values) are not stated, you are free to choose anything you want.


In [None]:
%reset -f
import torch
import datetime
print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))


def get_accuracy(scores, labels):
    num_data = scores.size(0)
    predicted_labels = scores.argmax(dim=1)
    indicator = (predicted_labels == labels)
    num_matches = indicator.sum()
    return 100*num_matches.float()/num_data  

train_data, train_label, test_data, test_label = torch.load('dataset/small_MNIST.pt')
print(train_data.size(),train_label.size(),test_data.size(),test_label.size())


# The network architecture used by Alice and Bob
import torch.nn as nn
class three_layer_MLP(nn.Module):
    def __init__(self, input_size, hidden_size1, hidden_size2, output_size):
        super(three_layer_MLP , self).__init__()
        self.fc1 = nn.Linear(  input_size, hidden_size1 )
        self.fc2 = nn.Linear(  hidden_size1, hidden_size2 )
        self.fc3 = nn.Linear(  hidden_size2, output_size )
    def forward(self, x):
        x = x.view(-1,784)
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.fc2(x)
        x = torch.relu(x)
        score = self.fc3(x)
        return score
    
# Load the pre-trained parameters
net_alice = three_layer_MLP(784,512,128,5)
print(net_alice)
net_alice.load_state_dict(torch.load("net/alice.pt"))
net_bob = three_layer_MLP(784,512,128,5)
net_bob.load_state_dict(torch.load("net/bob.pt"))


## YOUR CODE STARTS HERE 

Timestamp: 22-09-16--19-18-57
torch.Size([1000, 28, 28]) torch.Size([1000]) torch.Size([500, 28, 28]) torch.Size([500])
three_layer_MLP(
  (fc1): Linear(in_features=784, out_features=512, bias=True)
  (fc2): Linear(in_features=512, out_features=128, bias=True)
  (fc3): Linear(in_features=128, out_features=5, bias=True)
)


<All keys matched successfully>
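
No solution is recorded for this question. As a rough sketch of one possible approach, and assuming that the new network in the figure concatenates the two 5-dimensional outputs and feeds them to a small trainable layer (the figure itself is not reproduced here, so this layout is a guess), the frozen sub-networks and a trainable head could be combined as below. `ThreeLayerMLP`, `CombinedNet`, and the random inputs are hypothetical stand-ins; in the notebook, the pre-trained `net_alice` and `net_bob` would take the place of the freshly initialized networks.

```python
import torch
import torch.nn as nn


class ThreeLayerMLP(nn.Module):
    """Stand-in with the same layout as the three_layer_MLP class above."""

    def __init__(self, input_size, hidden_size1, hidden_size2, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size1)
        self.fc2 = nn.Linear(hidden_size1, hidden_size2)
        self.fc3 = nn.Linear(hidden_size2, output_size)

    def forward(self, x):
        x = x.view(-1, 784)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)


class CombinedNet(nn.Module):
    """Two frozen 5-class networks followed by a trainable linear head (assumed layout)."""

    def __init__(self, net_a, net_b, num_classes=10):
        super().__init__()
        self.net_a, self.net_b = net_a, net_b
        # Freeze the pre-trained parameters so only the head is updated.
        for p in list(net_a.parameters()) + list(net_b.parameters()):
            p.requires_grad = False
        self.head = nn.Linear(10, num_classes)

    def forward(self, x):
        feats = torch.cat([self.net_a(x), self.net_b(x)], dim=1)
        return self.head(feats)


torch.manual_seed(0)
model = CombinedNet(ThreeLayerMLP(784, 512, 128, 5), ThreeLayerMLP(784, 512, 128, 5))

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One training step on random stand-in data with the same shape as small_MNIST.
images = torch.rand(8, 28, 28)
labels = torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(sum(p.numel() for p in trainable))  # 110
```

Only the 110 parameters of the head receive gradients, so each epoch is cheap compared with retraining either sub-network.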