# **328 - Computer Vision - Final Exam**



## **Numerical Computations**



1.   Softmax function Calculator

* [Calculator](https://keisan.casio.com/exec/system/15168444286206)

2.   Cross Entropy Loss

*   [Function](https://keisan.casio.com/exec/system/15168444286206)


3.   Gradient Descent (Multivariable Function Calculator)

*   [Function](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=jgJQG1x5dQI7&line=13&uniqifier=1)

4.   Convolution layer Calculation


*   [Function for One channel](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=q8UohGir4oT4&line=3&uniqifier=1)
*   [Function for Multiple channels](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=MiwYsAZ26wuk&line=12&uniqifier=1)





In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as utils
import os

# **CNN Architectures**

Reference : [Link](https://d2l.ai/chapter_convolutional-modern/densenet.html)



### 1.   LeNet
<img src = "https://d2l.ai/_images/lenet.svg">
<img src= "https://d2l.ai/_images/lenet-vert.svg">



### 2.   AlexNet

<img src = "https://d2l.ai/_images/alexnet.svg">

LeNet (left) and AlexNet (right).

### 3.   VGG (Visual Geometry Group)

<img src = "https://d2l.ai/_images/vgg.svg">


### 4.   Network in Network (NiN)

<img src = "https://d2l.ai/_images/nin.svg">


### 5.   GoogLeNet

<img src = "https://d2l.ai/_images/inception-full.svg">

### 6.   ResNet

<img src = "https://d2l.ai/_images/residual-block.svg">

A regular block (left) and a residual block (right).

### 7.   DenseNet

<img src = "https://d2l.ai/_images/densenet-block.svg">

The main difference between ResNet (left) and DenseNet (right) in cross-layer connections: ResNet uses addition, whereas DenseNet uses concatenation.
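The two connection styles can be sketched in a few lines. This is only a minimal illustration, assuming each feature map is reduced to a plain Python list with one number per channel; `residual_connection` and `dense_connection` are hypothetical helper names, not d2l or PyTorch functions.

```python
def residual_connection(x, fx):
    """ResNet-style: add the block output f(x) to the input x, channel by channel."""
    if len(x) != len(fx):
        raise ValueError("addition needs the same number of channels")
    return [a + b for a, b in zip(x, fx)]


def dense_connection(x, fx):
    """DenseNet-style: concatenate the block output f(x) onto x along the channel axis."""
    return x + fx


x = [1.0, 2.0, 3.0]    # input with 3 channels (one value per channel)
fx = [0.5, 0.5, 0.5]   # block output f(x), also 3 channels

res_out = residual_connection(x, fx)
dense_out = dense_connection(x, fx)
print(len(res_out), res_out)      # channel count unchanged
print(len(dense_out), dense_out)  # channel count grows
```

Addition keeps the number of channels fixed, while concatenation increases it with every block.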



# **Cross Entropy Loss**

Formula :


${\displaystyle H(p,q)=-\sum _{x\in {\mathcal {X}}}p(x)\,\log q(x)}$

${\displaystyle \operatorname {E} _{p}[l]=-\operatorname {E} _{p}\left[{\frac {\ln {q(x)}}{\ln(2)}}\right]=-\operatorname {E} _{p}\left[\log _{2}{q(x)}\right]=-\sum _{x_{i}}p(x_{i})\,\log _{2}{q(x_{i})}=-\sum _{x}p(x)\,\log _{2}q(x)=H(p,q)}$

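Range : $H(p,q) \in [0, \infty)$ whenever $q(x) > 0$ for every $x$ with $p(x) > 0$

Maximum Value : unbounded ($H(p,q) \rightarrow \infty$ as $q(x) \rightarrow 0$ for some $x$ with $p(x) > 0$)

Minimum Value = 0 (attained when $p$ is one-hot and $q = p$)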

**Calculator : 1-dimensional case only**

For the general case, use the cross-entropy loss from PyTorch.

Documentation : [PyTorch](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html)



```
loss = nn.CrossEntropyLoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
output = loss(input, target)
output.backward()
```





In [None]:
# 1-D cross-entropy calculator: computes -sum(target * log(predicted)) with the natural log.
# NOTE: `predicted` must already be probabilities in (0, 1]. Raw scores (logits) have to go
# through a softmax first; otherwise negative entries give NaN logs and the result is not a
# valid cross entropy (the recorded run below feeds logits and prints a negative value).
predicted = input("Please enter the values : ").split(",")
predicted = np.array([float(i) for i in predicted])
inputed = torch.from_numpy(predicted)  # float64 tensor
log_dist = torch.log(inputed)
# The log of a negative number is NaN; such entries are zeroed so the dot product stays finite
log_dist[torch.isnan(log_dist)] = 0
print("Predicted Log distribution values are : ",log_dist)
truth_value = input("Please enter the ground truth values : ").split(",")
truth_value = np.array([float(i) for i in truth_value])
print("Target values are : ",truth_value)
target = torch.from_numpy(truth_value)
output = -torch.dot(target,log_dist)
print(output)

Please enter the values : -0.1,12.0,3.0,-3.2,-8.0,17.0,1.9,0.6,4.9,-12.0
Predicted Log distribution values are :  tensor([ 0.0000,  2.4849,  1.0986,  0.0000,  0.0000,  2.8332,  0.6419, -0.5108,
         1.5892,  0.0000], dtype=torch.float64)
Please enter the ground truth values : 0,1,0,0,0,0,0,0,0,0
Target values are :  [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
tensor(-2.4849, dtype=torch.float64)
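
For raw scores like the ones entered above, the values first have to be turned into probabilities with a softmax. The following is a minimal pure-Python sketch, assuming the inputs are logits and the target is one-hot. `softmax` and `cross_entropy` are illustrative helper names, and the natural logarithm is used, as in `nn.CrossEntropyLoss`.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """H(p, q) = -sum_x p(x) * ln q(x); terms with p(x) = 0 contribute nothing."""
    return -sum(t * math.log(q) for t, q in zip(target, probs) if t > 0)


logits = [-0.1, 12.0, 3.0, -3.2, -8.0, 17.0, 1.9, 0.6, 4.9, -12.0]
target = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

probs = softmax(logits)
loss = cross_entropy(target, probs)
print(round(loss, 4))
```

With these inputs the loss comes out at about 5.0067, a non-negative value as the range above requires.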


# **Gradient Descent Algorithm**

<img src = "https://miro.medium.com/max/2284/1*jNyE54fTVOH1203IwYeNEg.png">

<img src = "https://2.bp.blogspot.com/-ZxJ87cWjPJ8/TtLtwqv0hCI/AAAAAAAAAV0/9FYqcxJ6dNY/s1600/gradient+descent+algorithm+OLS.png">


In [None]:
from sympy import Symbol, N

x = Symbol('x')
y = Symbol('y')

f = x**2 + y**4

# Partial derivatives (the gradient of f)
fpx = f.diff(x)
fpy = f.diff(y)
grad = [fpx,fpy]

theta = 0.75   # current x
theta1 = 0.75  # current y
alpha = .01    # step size
iterations = 0
precision = 1/1000000
printData = True
maxIterations = 3

while iterations < maxIterations:
    # Evaluate both partial derivatives at the current point (theta, theta1)
    temptheta = theta - alpha*N(fpx.subs({x: theta, y: theta1}))
    temptheta1 = theta1 - alpha*N(fpy.subs({x: theta, y: theta1}))
    iterations += 1

    # Converged once both coordinates change by less than `precision`
    converged = abs(temptheta-theta) < precision and abs(temptheta1-theta1) < precision

    # Simultaneous update
    theta = temptheta
    theta1 = temptheta1
    print("Number of iterations:",iterations,sep=" ")
    print("Theta Value for x(x0) =",theta,sep=" ")
    print("Theta1 Value for y(y0) =",theta1,sep=" ")

    if converged:
        break

if printData:
    print("")
    print("Final Result ")
    print("The function is "+str(f))
    print("Number of iterations:",iterations,sep=" ")
    print("Theta Value for x(x0) =",theta,sep=" ")
    print("Theta1 Value for y(y0) =",theta1,sep=" ")

Number of iterations: 1
Theta Value for x(x0) = 0.735000000000000
Theta1 Value for y(y0) = 0.733125000000000
Number of iterations: 2
Theta Value for x(x0) = 0.720300000000000
Theta1 Value for y(y0) = 0.717363625810547
Number of iterations: 3
Theta Value for x(x0) = 0.705894000000000
Theta1 Value for y(y0) = 0.702597109588576

Final Result 
The function is x**2 + y**4
Number of iterations: 3
Theta Value for x(x0) = 0.705894000000000
Theta1 Value for y(y0) = 0.702597109588576
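
The same update rule can be written without symbolic differentiation when the gradient is known in closed form. This is a minimal sketch, assuming the function $f(x, y) = x^2 + y^4$ from the cell above, whose gradient is $(2x, 4y^3)$. `grad_f` and `gradient_descent` are illustrative names.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x**2 + y**4, derived by hand."""
    return 2 * x, 4 * y ** 3


def gradient_descent(x, y, alpha=0.01, steps=3):
    """Run `steps` updates of theta <- theta - alpha * grad and return every iterate."""
    history = []
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        # Simultaneous update: both gradients are evaluated before either coordinate moves
        x, y = x - alpha * gx, y - alpha * gy
        history.append((x, y))
    return history


history = gradient_descent(0.75, 0.75)
for step, (x_val, y_val) in enumerate(history, start=1):
    print(step, x_val, y_val)
```

The first three iterates agree with the values printed for iterations 1 to 3 above, up to floating-point rounding.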


# **Number of Learnable Parameters Calculator**

Reference : [Medium Article](https://medium.com/@iamvarman/how-to-calculate-the-number-of-parameters-in-the-cnn-5bd55364d7ca)


1.   Input layer : The input layer only reads the image, so it has no learnable parameters.

2.   Convolutional Layer : Consider a convolutional layer that takes $l$ feature maps as input and produces $k$ feature maps as output, with a filter of size $n*m$.
Example : the input has $l=32$ feature maps, the output has $k=64$ feature maps, and the filter size is $n=3$, $m=3$. Each filter is not simply $3*3$ but $3*3*32$, because the input has $32$ channels. The layer learns $64$ such $3*3*32$ filters, giving $n*m*l*k$ weights in total. Each output feature map also has one bias term, so the total number of parameters is $(n*m*l+1)*k$, which is $18496$ in this example.


3. Pooling (Max/Avg) Layer: A pooling layer has no learnable parameters; it only reduces the spatial size of its input.

4. Fully-connected Layer: Every input unit has a separate weight to each output unit. For $n$ inputs and $m$ outputs, the number of weights is $n*m$. Each output node also has a bias, giving $(n+1)*m$ parameters in total.

5. Output Layer: This is a fully connected layer, so it has $(n+1)*m$ parameters, where $n$ is the number of inputs and $m$ is the number of outputs.
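
The rules above can be applied layer by layer to a whole network. The sketch below does this for LeNet, assuming the layer sizes used in the d2l LeNet implementation (1 input channel, two 5x5 convolutions with 6 and 16 output channels, then fully connected layers of 120, 84 and 10 units on a 16x5x5 feature map). Those sizes and the helper names `conv_params` and `fc_params` are assumptions for illustration.

```python
def conv_params(n, m, l, k, bias=True):
    """Parameters of a conv layer: k filters of size n*m*l, plus one bias per filter."""
    return (n * m * l + (1 if bias else 0)) * k


def fc_params(n, m, bias=True):
    """Parameters of a fully connected layer with n inputs and m outputs."""
    return (n + (1 if bias else 0)) * m


# (layer name, parameter count); pooling layers contribute nothing
lenet_layers = [
    ("conv1", conv_params(5, 5, 1, 6)),
    ("pool1", 0),
    ("conv2", conv_params(5, 5, 6, 16)),
    ("pool2", 0),
    ("fc1", fc_params(16 * 5 * 5, 120)),
    ("fc2", fc_params(120, 84)),
    ("output", fc_params(84, 10)),
]

for name, count in lenet_layers:
    print(name, count)

total = sum(count for _, count in lenet_layers)
print("total", total)
```

Under these assumed sizes the total is 61706 parameters, most of them in the first fully connected layer.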

In [None]:
option = int(input("Please enter your option from 1 - 5 : "))
if option == 1:
  print("Input layer has 0 parameters")
elif option == 2:
  print("Convolutional Layer : ")
  input_features = int(input("Please enter the number of input_Features : "))
  output_features = int(input("Please enter the number of output_Features : "))
  filter_size = (input("Please enter the filter size : ")).split()
  filter_size = [int(i) for i in filter_size]
  print("Total Number of learnable parameters with bias is : ", (filter_size[0]*filter_size[1]*input_features + 1)*output_features)
  print("Total Number of learnable parameters without bias is : ", (filter_size[0]*filter_size[1]*input_features)*output_features)
elif option == 3:
  print("Pooling layer has 0 parameters")
elif option == 4:
  print("Fully connected Layer : ")
  input_features = int(input("Please enter the number of inputs : "))
  output_features = int(input("Please enter the number of outputs : "))
  print("Total Number of learnable parameters with bias is : ", (input_features + 1)*output_features)
  print("Total Number of learnable parameters without bias is : ", (input_features)*output_features)
else:
  print("Output Layer : ")
  input_features = int(input("Please enter the number of inputs : "))
  output_features = int(input("Please enter the number of outputs : "))
  print("Total Number of learnable parameters with bias is : ", (input_features + 1)*output_features)
  print("Total Number of learnable parameters without bias is : ", (input_features)*output_features)

Please enter your option from 1 - 5 : 2
Convolutional Layer : 
Please enter the number of input_Features : 32
Please enter the number of output_Features : 64
Please enter the filter size : 3 3
Total Number of learnable parameters with bias is :  18496
Total Number of learnable parameters without bias is :  18432


# **Convolution Layer/Pooling**


Reference : [Link](https://d2l.ai/chapter_convolutional-neural-networks/channels.html)

1.   [One channel ](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=q8UohGir4oT4&line=10&uniqifier=1)

    1.1 [Pooling](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=VxS0hVzWFSwR&line=8&uniqifier=1)

2. [Multiple Channels](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=MiwYsAZ26wuk&line=5&uniqifier=1)

    2.1 [Pooling](https://colab.research.google.com/drive/1hIfhMJpiQC_le7fiSa5E0B5B-GsYE5u1#scrollTo=riufTcYbIMkE&line=2&uniqifier=1) 

In [19]:
import torch
from torch import nn

def corr2d(X, K):
    """Compute the 2D cross-correlation of input X with kernel K (no padding, stride 1)."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Elementwise product of the current window with the kernel, then sum
            Y[i, j] = (X[i: i + h, j: j + w] * K).sum()
    return Y

In [None]:
I = int(input("How many rows do you want to enter for the Convolution Input Layer : "))
I_1 = []
for i in range(I):
  r = input("Enter row " + str(i+1) + ": ").split()
  r = [float(i) for i in r]
  I_1.append(r)
W = int(input("How many rows do you want to enter for the Filter Matrix : "))
I_2 = []
for i in range(W):
  r = input("Enter row " + str(i+1) + ": ").split()
  r = [float(i) for i in r]
  I_2.append(r)
X = torch.tensor(I_1)
K = torch.tensor(I_2)
corr2d(X, K)

How many rows do you want to enter for the Convolution Input Layer : 4
Enter row 1: 1 3 2 4 6 4
Enter row 2: 4 8 3 1 0 2
Enter row 3: 2 1 4 3 9 1
Enter row 4: 4 7 2 3 9 2
How many rows do you want to enter for the Filter Matrix : 3
Enter row 1: -1 0 1
Enter row 2: -1 0 1
Enter row 3: -1 0 1


tensor([[ 2., -4.,  6., -1.],
        [-1., -9.,  9., -2.]])

In [20]:
from d2l import torch as d2l
import torch
from torch import nn

def pool2d(X, pool_size, mode='max'):
    p_h, p_w = pool_size
    Y = torch.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            if mode == 'max':
                Y[i, j] = X[i: i + p_h, j: j + p_w].max()
            elif mode == 'avg':
                Y[i, j] = X[i: i + p_h, j: j + p_w].mean()
    return Y

In [23]:
I = int(input("How many rows do you want to enter for the Convolution Input Layer : "))
I_1 = []
for i in range(I):
  r = input("Enter row " + str(i+1) + ": ").split()
  r = [float(i) for i in r]
  I_1.append(r)
X = torch.tensor(I_1)
pool2d(X, (2, 2))

How many rows do you want to enter for the Convolution Input Layer : 2
Enter row 1: 1 2 3
Enter row 2: 1 2 3


tensor([[2., 3.]])

In [26]:
X = torch.arange(16, dtype=torch.float32).reshape((1, 1, 4, 4))
# Named max_pool so that it does not shadow the pool2d function defined above
max_pool = nn.MaxPool2d(2)
max_pool(X)

tensor([[[[ 5.,  7.],
          [13., 15.]]]])

## **Multiple Channels**

In [None]:
from d2l import torch as d2
import torch

def corr2d_multi_in(X, K):
    return sum(d2.corr2d(x, k) for x, k in zip(X, K))

In [None]:
cnn_final = []
filter_final = []
channel = int(input("How many channels do you have : "))
for c in range(channel):
  # Read the input matrix for channel c
  I = int(input("How many rows do you want to enter for the Convolution Input Layer " + str(c+1) +": "))
  I_1 = []
  for row in range(I):
    r = input("Enter row " + str(row+1) + ": ").split()
    r = [float(v) for v in r]
    I_1.append(r)
  cnn_final.append(I_1)
  # Read the filter matrix for channel c (distinct loop names keep the channel index intact)
  W = int(input("How many rows do you want to enter for the Filter Matrix "+ str(c+1) +": "))
  I_2 = []
  for row in range(W):
    r = input("Enter row " + str(row+1) + ": ").split()
    r = [float(v) for v in r]
    I_2.append(r)
  filter_final.append(I_2)

X = torch.tensor(cnn_final)
K = torch.tensor(filter_final)

corr2d_multi_in(X, K)

tensor([[ 56.,  72.],
        [104., 120.]])
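
The recorded run above does not show what was typed in. The sketch below reproduces the same result in plain Python, assuming the two-channel input and kernel from the d2l chapter referenced in this section; that choice of `X` and `K` is an assumption, and the functions are list-based stand-ins rather than the d2l versions.

```python
def corr2d(X, K):
    """2D cross-correlation of one channel (no padding, stride 1) on nested lists."""
    h, w = len(K), len(K[0])
    out_h, out_w = len(X) - h + 1, len(X[0]) - w + 1
    return [[sum(X[i + a][j + b] * K[a][b] for a in range(h) for b in range(w))
             for j in range(out_w)]
            for i in range(out_h)]


def corr2d_multi_in(X, K):
    """Cross-correlate each input channel with its own kernel slice, then sum over channels."""
    per_channel = [corr2d(x, k) for x, k in zip(X, K)]
    rows, cols = len(per_channel[0]), len(per_channel[0][0])
    return [[sum(ch[i][j] for ch in per_channel) for j in range(cols)]
            for i in range(rows)]


# Assumed inputs: 2 channels of 3x3 data and a 2-channel 2x2 kernel
X = [[[0, 1, 2], [3, 4, 5], [6, 7, 8]],
     [[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
K = [[[0, 1], [2, 3]],
     [[1, 2], [3, 4]]]

result = corr2d_multi_in(X, K)
print(result)
```

For example, the top-left entry is 19 from the first channel plus 37 from the second, giving 56.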

In [27]:
# X is the (1, 1, 4, 4) tensor from the nn.MaxPool2d cell above; stack X + 1 as a second channel
X = torch.cat((X, X + 1), 1)

In [29]:
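padding1 = int(input("please enter the padding size : "))
stride1 = int(input("please enter the stride size : "))
# 3x3 max pooling applied to each channel independently; named max_pool to avoid shadowing pool2d
max_pool = nn.MaxPool2d(3, padding=padding1, stride=stride1)
max_pool(X)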

please enter the padding size : 1
please enter the stride size : 2


tensor([[[[ 5.,  7.],
          [13., 15.]],

         [[ 6.,  8.],
          [14., 16.]]]])
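
The output shapes seen in this section follow from one formula per spatial dimension: $\lfloor (n + 2p - k)/s \rfloor + 1$ for input size $n$, window size $k$, padding $p$ and stride $s$. The helper below is a small sketch of that formula; `output_size` is a hypothetical name, and it assumes no dilation and floor rounding.

```python
def output_size(n, k, padding=0, stride=1):
    """Output length along one dimension for a convolution or pooling window."""
    return (n + 2 * padding - k) // stride + 1


# 4x6 input with a 3x3 filter, no padding, stride 1 (the one-channel example)
print(output_size(4, 3), output_size(6, 3))

# 4x4 input with a 2x2 max pool; the stride is set to the window size here
print(output_size(4, 2, stride=2))

# 4x4 input with a 3x3 max pool, padding 1, stride 2
print(output_size(4, 3, padding=1, stride=2))
```

These give a 2x4, a 2x2 and a 2x2 output respectively, matching the tensors printed in the cells above.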