1.Write the Python code to implement a single neuron.

In [71]:
from numpy import exp, array, random, dot, tanh
 
# Class to create a neural
# network with single neuron
class SingleNeuron():
     
    def __init__(self):
         
        # Using seed to make sure it'll 
        # generate same weights in every run
        random.seed(1)
         
        # 3x1 Weight matrix
        self.weight_matrix = 2 * random.random((3, 1)) - 1
 
    # tanh as activation function
    def tanh(self, x):
        return tanh(x)
 
    # derivative of tanh function.
    # Needed to calculate the gradients.
    def tanh_derivative(self, x):
        return 1.0 - tanh(x) ** 2
 
    # forward propagation
    def forward_propagation(self, inputs):
        return self.tanh(dot(inputs, self.weight_matrix))
     
    # training the neural network.
    def train(self, train_inputs, train_outputs,
                            num_train_iterations):
                                 
        # Number of iterations we want to
        # perform for this set of input.
        for iteration in range(num_train_iterations):
            output = self.forward_propagation(train_inputs)
 
            # Calculate the error in the output.
            error = train_outputs - output
 
            # multiply the error by input and then
            # by gradient of tanh function to calculate
            # the adjustment needs to be made in weights
            adjustment = dot(train_inputs.T, error *
                             self.tanh_derivative(output))
                              
            # Adjust the weight matrix
            self.weight_matrix += adjustment
 
# Driver Code
if __name__ == "__main__":
     
    single_neuron = SingleNeuron()
     
    print ('Random weights at the start of the training')
    print (single_neuron.weight_matrix)
 
    train_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
    train_outputs = array([[0, 1, 1, 0]]).T
 
    neural_network.train(train_inputs, train_outputs, 10000)
 
    print ('New weights after the training')
    print (single_neuron.weight_matrix)
 
    # Test the neural network with a new situation.
    print ("Testing network on new examples : ")
    print (single_neuron.forward_propagation(array([1, 0, 0])))

Random weights at the start of the training
[[-0.16595599]
 [ 0.44064899]
 [-0.99977125]]
New weights after the training
[[-0.16595599]
 [ 0.44064899]
 [-0.99977125]]
Testing network on new examples : 
[-0.16444904]


In [2]:
#2.Write the Python code to implement ReLU.

try:
    def relu(a):
        return max(0.0,a)
    a = 1.0
    print('Implementing Relu on (%.1f) gives %.1f' % (a, relu(a)))
    a = 0.0
    print('Implementing Relu on (%.1f) gives %.1f' % (a, relu(a)))
    a = -36.0
    print('Implementing Relu on (%.1f) gives %.1f' % (a, relu(a)))
    a = 43.0
    print('Implementing Relu on (%.1f) gives %.1f' % (a, relu(a)))
    a = -33.3
    print('Implementing Relu on (%.1f) gives %.1f' % (a, relu(a)))
except:
    print('Implementation of ReLU')

Implementing Relu on (1.0) gives 1.0
Implementing Relu on (0.0) gives 0.0
Implementing Relu on (-36.0) gives 0.0
Implementing Relu on (43.0) gives 43.0
Implementing Relu on (-33.3) gives 0.0


3.Write the Python code for a dense layer in terms of matrix multiplication.

Let's write a function that computes the matrix product of two tensors. We'll need three nested for loops: one for the row indices, one for the column indices, and one for the inner sum. ac and ar stand for number of columns of a and number of rows of a, respectively (the same convention is followed for b), and we make sure calculating the matrix product is possible by checking that a has as many columns as b has rows:

In [67]:
import torch 
from torch import tensor

def matmul(a,b):
    ar,ac = a.shape # n_rows * n_cols
    br,bc = b.shape
    assert ac==br
    c = torch.zeros(ar, bc)
    for i in range(ar):
        for j in range(bc):
            for k in range(ac): c[i,j] += a[i,k] * b[k,j]
    return c

4.Write the Python code for a dense layer in plain Python (that is, with list comprehensions and functionality built into Python).

In [68]:
def matmul(a,b):
    ar,ac = a.shape
    br,bc = b.shape
    assert ac==br
    c = torch.zeros(ar, bc)
    for i in range(ar):
        for j in range(bc): c[i,j] = (a[i] * b[:,j]).sum()
    return c

5.What is the “hidden size” of a layer?

The size of the hidden layer is normally between the size of the input and output. It should be 2/3 the size of the input layerplus the size of the o/p layer. The number of hidden neurons should be less than twice the size of the input layer.

6.What does the t method do in PyTorch?

torch.t(input) → Tensor
Expects input to be <= 2-D tensor and transposes dimensions 0 and 1.

0-D and 1-D tensors are returned as is. When input is a 2-D tensor this is equivalent to transpose(input, 0, 1).

Parameters
input (Tensor) – the input tensor.

In [40]:
#Example of t method in Pytorch
import torch
a = torch.randn(())
print(a)
torch.t(a)
print(torch.t(a))
a = torch.randn(16)
print(a)
a = torch.randn(4,5)
print(a)
torch.t(a)

tensor(1.5827)
tensor(1.5827)
tensor([ 1.5105,  0.0037, -0.6219, -2.1718,  0.3947,  1.1170, -0.3889, -0.5528,
         0.5944, -0.0591,  0.9541,  0.5999, -0.1821,  1.2880, -0.7170,  1.0794])
tensor([[ 0.7269,  1.3612, -0.4919, -1.1367,  0.3969],
        [ 0.8873, -0.7430, -0.2489, -0.7647,  0.2062],
        [ 0.7858,  0.0079, -0.2266, -1.4388,  0.0361],
        [ 0.2394,  0.1977,  0.9185, -0.6987, -0.4942]])


tensor([[ 0.7269,  0.8873,  0.7858,  0.2394],
        [ 1.3612, -0.7430,  0.0079,  0.1977],
        [-0.4919, -0.2489, -0.2266,  0.9185],
        [-1.1367, -0.7647, -1.4388, -0.6987],
        [ 0.3969,  0.2062,  0.0361, -0.4942]])

7.Why is matrix multiplication written in plain Python very slow?

Matrix multiplication is executed in plain Python with respect to nested for loops. And that is not a very good idea as Python is a slow language and that is not going to be efficient. That is why matrix multiplication written in plain Python very slow. 

8.In matmul, why is ac==br?

In matmul or matrix multiplication, ac == br because number of columns in left matrix must be equal to number of rows in right matrix.

9.In Jupyter Notebook, how do you measure the time taken for a single cell to execute?

Magic Commands are succinct solutions to common obstacles. They can be identified by their prefix % or %%. There are two kinds of magic commands: Line Magics (% prefix) and Cell Magics (%% prefix). Line magics operate only on their line when Cell Magics operate on their full cell. We can measure the time taken for a single cell to execute in Jupyter Notebook by the magic command or cell magic %%time. Let's have a look through the following example.

In [4]:
%%time
for i in range(5):
    sqrts = [n ** (1/2) for n in range(10**i)]

Wall time: 3.99 ms


10.What is elementwise arithmetic?

All the basic operators (+, -, *,  /, >, <, == ) can be applied elementwise. That means if we write a+b for two tensors a and b that have the same shape, we will get a tensor composed of the sums the elements of a and b:

In [64]:
import torch
from torch import tensor
a = tensor([2.0,4.5,-5.6])
b = tensor([43.43,36.04,108.89])
a + b

tensor([ 45.4300,  40.5400, 103.2900])

In [50]:
# The Booleans operators will return an array of Booleans:

a > b

tensor([False, False, False])

In [51]:
a < b

tensor([True, True, True])

If we want to know if every element of a is less than the corresponding element in b, or if two tensors are equal, we need to combine those elementwise operations with torch.all:

In [54]:
(a < b).all(), (a == b).all()

(tensor(True), tensor(False))

11.Write the PyTorch code to test whether every element of a is greater than the corresponding element of b.

In [55]:
(a > b).all()

tensor(False)

12.What is a rank-0 tensor? How do you convert it to a plain Python data type?

A tensor with rank 0 is a zero-dimensional array. The element of a zero-dimensional array is a point. This is represented as a Scalar in Math and has magnitude. Eg: s = 48.3. Shape - []
Reduction operations like all(), sum() and mean() return tensors with only one element, called rank-0 tensors. If you want to convert this to a plain Python Boolean or number, you need to call .item():

In [63]:
# Conversion of rank-0 tensor into a Python data type

import torch
from torch import tensor
a = tensor([2.0,4.5,-5.6])
b = tensor([43.43,36.04,108.89])
(a + b).mean().item()

63.086669921875

13.How does elementwise arithmetic help us speed up matmul?

With elementwise arithmetic, we can remove one of our three nested loops: we can multiply the tensors that correspond to the i-th row of a and the j-th column of b before summing all the elements, which will speed things up because the inner loop will now be executed by PyTorch at C speed. That means that our matrix multiplication will be speeded up eliminating loops and replacing them with PyTorch functionalities. This will give us C speed (underneath PyTorch) instead of Python speed.

14.What are the broadcasting rules?

Broadcasting Rules: 
Broadcasting two arrays together follow these rules:
 
If the arrays don’t have the same rank then prepend the shape of the lower rank array with 1s until both shapes have the same length.
The two arrays are compatible in a dimension if they have the same size in the dimension or if one of the arrays has size 1 in that dimension.
The arrays can be broadcast together if they are compatible with all dimensions.
After broadcasting, each array behaves as if it had shape equal to the element-wise maximum of shapes of the two input arrays.
In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.

15.What is expand_as? Show an example of how it can be used to match the results of broadcasting.

Tensor.expand_as(other) → Tensor
Expand this tensor to the same size as other. self.expand_as(other) is equivalent to self.expand(other.size()).

In [41]:
a = torch.rand(2, 3)
b = torch.rand(2,2, 3)
print('a:',a)
print('b:',b)
c = a.expand_as(b)
print('c:',c)

a: tensor([[0.0772, 0.4048, 0.4750],
        [0.7310, 0.8615, 0.9692]])
b: tensor([[[0.0082, 0.2850, 0.5373],
         [0.7049, 0.9183, 0.9978]],

        [[0.6508, 0.4721, 0.2031],
         [0.7776, 0.5318, 0.2108]]])
c: tensor([[[0.0772, 0.4048, 0.4750],
         [0.7310, 0.8615, 0.9692]],

        [[0.0772, 0.4048, 0.4750],
         [0.7310, 0.8615, 0.9692]]])
