In [None]:
Write the Python code to implement a single neuron.
# Python program to implement a
# single neuron neural network

# import all necessary libraries
from numpy import array, random, dot, tanh

# Class to create a neural
# network with single neuron
class NeuralNetwork():
	
	def __init__(self):
		
		# Using seed to make sure it'll
		# generate same weights in every run
		random.seed(1)
		
		# 3x1 Weight matrix
		self.weight_matrix = 2 * random.random((3, 1)) - 1

	# tanh as activation function
	def tanh(self, x):
		return tanh(x)

	# derivative of tanh, written in terms of the
	# already-activated output y = tanh(z): 1 - y ** 2.
	# Needed to calculate the gradients.
	def tanh_derivative(self, x):
		return 1.0 - x ** 2

	# forward propagation
	def forward_propagation(self, inputs):
		return self.tanh(dot(inputs, self.weight_matrix))
	
	# training the neural network.
	def train(self, train_inputs, train_outputs,
							num_train_iterations):
								
		# Number of iterations we want to
		# perform for this set of input.
		for iteration in range(num_train_iterations):
			output = self.forward_propagation(train_inputs)

			# Calculate the error in the output.
			error = train_outputs - output

			# multiply the error by the gradient of the
			# tanh function and then by the input to get
			# the adjustment to be made to the weights
			adjustment = dot(train_inputs.T, error *
							self.tanh_derivative(output))
							
			# Adjust the weight matrix
			self.weight_matrix += adjustment

# Driver Code
if __name__ == "__main__":
	
	neural_network = NeuralNetwork()
	
	print('Random weights at the start of training')
	print(neural_network.weight_matrix)

	train_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
	train_outputs = array([[0, 1, 1, 0]]).T

	neural_network.train(train_inputs, train_outputs, 10000)

	print('New weights after training')
	print(neural_network.weight_matrix)

	# Test the neural network with a new situation.
	print("Testing network on new examples ->")
	print(neural_network.forward_propagation(array([1, 0, 0])))


In [None]:
Write the Python code to implement ReLU.
def relu(x):
    return max(0.0, x)
 
x = 1.0
print('Applying Relu on (%.1f) gives %.1f' % (x, relu(x)))
x = -10.0
print('Applying Relu on (%.1f) gives %.1f' % (x, relu(x)))
x = 0.0
print('Applying Relu on (%.1f) gives %.1f' % (x, relu(x)))
x = 15.0
print('Applying Relu on (%.1f) gives %.1f' % (x, relu(x)))
x = -20.0
print('Applying Relu on (%.1f) gives %.1f' % (x, relu(x)))
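
The scalar version above handles one number at a time. As a rough sketch, assuming NumPy is available, the same function can be applied to a whole array at once with np.maximum; the name relu_array is a hypothetical helper, not part of the original answer.

```python
import numpy as np

def relu_array(x):
    # Hypothetical helper: elementwise max(0, x) over a whole array.
    return np.maximum(0.0, x)

values = np.array([1.0, -10.0, 0.0, 15.0, -20.0])
activated = relu_array(values)
print(activated)
```

Negative entries are clipped to zero and non-negative entries pass through unchanged.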


In [None]:
Write the Python code for a dense layer in terms of matrix multiplication.
# Program to multiply two matrices using nested loops
 
# take a 3x3 matrix
A = [[12, 7, 3],
    [4, 5, 6],
    [7, 8, 9]]
 
# take a 3x4 matrix   
B = [[5, 8, 1, 2],
    [6, 7, 3, 0],
    [4, 5, 9, 1]]
     
result = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
 
# iterating by row of A
for i in range(len(A)):
 
    # iterating by column of B
    for j in range(len(B[0])):
 
        # iterating by rows of B
        for k in range(len(B)):
            result[i][j] += A[i][k] * B[k][j]
 
for r in result:
    print(r)
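
The loops above compute a plain matrix product. A dense layer is usually that product plus a bias vector, so a minimal sketch, assuming NumPy and reusing the matrices A and B as inputs and weights, might look like this. The names dense, x, w and b are hypothetical, and the zero bias is only a placeholder.

```python
import numpy as np

def dense(x, w, b):
    # Hypothetical helper: dense layer output = inputs @ weights + bias.
    return x @ w + b

# Three input rows with three features each (matrix A above).
x = np.array([[12, 7, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Weights mapping 3 input features to 4 outputs (matrix B above).
w = np.array([[5, 8, 1, 2],
              [6, 7, 3, 0],
              [4, 5, 9, 1]])

# Placeholder bias, one value per output feature.
b = np.zeros(4)

out = dense(x, w, b)
print(out)
```

With a zero bias the output matches the result of the nested loops; a trained layer would learn both w and b.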

In [None]:
What is the “hidden size” of a layer?
Hidden size is the number of features in the hidden state of an RNN. If you increase the hidden size, the hidden state output becomes a larger feature vector. By contrast, num_layers is the number of stacked RNN units, each of which holds a hidden state of the given hidden size.
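
To make the shapes concrete, here is a sketch of a single step of a plain tanh RNN written in NumPy. The weight names w_ih and w_hh, the sizes, and the random initialization are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, input_size, hidden_size = 2, 3, 5

# Hypothetical weights: input-to-hidden and hidden-to-hidden.
w_ih = rng.normal(size=(input_size, hidden_size))
w_hh = rng.normal(size=(hidden_size, hidden_size))

x = rng.normal(size=(batch_size, input_size))   # one time step of input
h = np.zeros((batch_size, hidden_size))         # initial hidden state

# One recurrent step: the new hidden state has hidden_size features per item.
h = np.tanh(x @ w_ih + h @ w_hh)
print(h.shape)
```

Changing hidden_size changes only the last dimension of the hidden state, and with it the shapes of both weight matrices.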

In [None]:
What does the t method do in PyTorch?

The t method returns the transpose of a 2D tensor, swapping its rows and columns. PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, and primarily developed by Facebook's AI Research lab (FAIR).

In [None]:
Matrix multiplications in NumPy are reasonably fast without the need for optimization. However, if every second counts, it is possible to significantly improve performance (even without a GPU).

Below is a collection of small tricks that can help with large (~4000x4000) matrix multiplications. I have used them to reduce inference time in a deep neural network from 24 seconds to less than one second. In fact, in one case, my optimized code on a CPU turned out to run faster than TensorFlow using a GPU (1 second vs 7 seconds).

In [None]:
You can use the timeit magic function to time code in Jupyter. Do not forget that cell magic starts with %% and line magic starts with %. For example, I simply added %%time at the beginning of the cell and got the time. An easier way is to use the ExecuteTime plugin in the jupyter_contrib_nbextensions package.
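
Outside a notebook the magics are not available, but the standard library timeit module gives a comparable measurement. This is only a sketch; the statement being timed and the repeat count are arbitrary choices.

```python
import timeit

# Time 100 executions of a small statement and report the total in seconds.
elapsed = timeit.timeit("sum(range(1000))", number=100)
print(f"{elapsed:.6f} seconds for 100 runs")
```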

In [None]:
What is elementwise arithmetic?
Elementwise arithmetic applies an operation to each pair of elements in corresponding locations of two tensors, producing a new tensor of the same shape. In addition, for example, each pair of corresponding elements is added together. In fact, all the basic arithmetic operations (add, subtract, multiply, and divide) are element-wise operations.
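
A small illustration, using NumPy arrays as a stand-in for tensors (an assumption so the example runs without PyTorch; the same operators behave the same way on tensors of matching shape):

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[10.0, 20.0],
              [30.0, 40.0]])

# Each result entry depends only on the entries at the same position.
added = a + b
multiplied = a * b
print(added)
print(multiplied)
```

Note that `a * b` here is the elementwise product, not the matrix product.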

In [None]:
>>> my_list1 = [30, 34, 56]
>>> my_list2 = [29, 500, 43]
>>> all(i >= 30 for i in my_list1)
True
>>> all(i >= 30 for i in my_list2)
False

In [None]:
Rank 0 Tensor: The familiar scalar is the simplest tensor and is a rank 0 tensor. Scalars are just single real numbers like ½, 99 or -1002 that are used to measure magnitude (size). Scalars can technically be written as a one-unit array: [½], or [-1002], but it’s not usual practice to do so. Rank 1 Tensor: Vectors are rank 1 tensors.
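
As a quick check, assuming NumPy, the ndim attribute of an array reports its number of axes, which corresponds to the rank described above:

```python
import numpy as np

scalar = np.array(99)               # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])  # rank 1: a vector

print(scalar.ndim, scalar.shape)
print(vector.ndim, vector.shape)
```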

In [None]:
With elementwise arithmetic, we can remove one of our three nested loops: we can multiply the tensors that correspond to the i-th row of a and the j-th column of b before summing all the elements, which will speed things up because the inner loop will now be executed by PyTorch at C speed.
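
A minimal sketch of this idea, using NumPy arrays as a stand-in for PyTorch tensors (an assumption so the example runs without PyTorch). The function name matmul and the sample matrices are hypothetical.

```python
import numpy as np

def matmul(a, b):
    # Hypothetical helper: matrix product with only two Python loops.
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br, "inner dimensions must match"
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            # Elementwise product of row i of a and column j of b, then sum.
            # This replaces the innermost loop over k.
            c[i, j] = (a[i, :] * b[:, j]).sum()
    return c

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])

c = matmul(a, b)
print(c)
```

The two remaining loops still run in Python; only the sum over the shared dimension has been handed to the library.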


In [None]:
Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded...
Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that...
Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
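
The three rules can be observed directly in NumPy. The shapes below are arbitrary examples chosen for this sketch.

```python
import numpy as np

# Rule 1: shapes (2, 3) and (3,) differ in number of dimensions;
# the 1-D array behaves as if it had shape (1, 3).
a = np.ones((2, 3))
b = np.arange(3)
shape_rule1 = (a + b).shape

# Rule 2: shapes (3, 1) and (1, 3); each size-1 dimension is stretched
# to match the other array.
col = np.arange(3).reshape(3, 1)
row = np.arange(3).reshape(1, 3)
shape_rule2 = (col + row).shape

# Rule 3: shapes (3,) and (4,) disagree and neither size is 1.
try:
    np.arange(3) + np.arange(4)
    raised = False
except ValueError:
    raised = True

print(shape_rule1, shape_rule2, raised)
```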