
# 1. Write the Python code to implement a single neuron.

In [2]:
import numpy as np

class Neuron:
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def forward(self, inputs):
        weighted_sum = np.dot(inputs, self.weights) + self.bias
        return weighted_sum

inputs = np.array([1, 2, 3])
weights = np.array([0.1, 0.2, 0.3])
bias = 0.1
neuron = Neuron(weights, bias)

output = neuron.forward(inputs)
print(output)

1.5


# 2. Write the Python code to implement ReLU.

In [3]:
def relu(x):
    return max(0, x)
# This function takes a single input x and returns 0 if x is less than 0, and x otherwise.
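
- The scalar version above raises an error when given an array, because Python's built-in max cannot compare a whole array with 0. A minimal vectorized sketch, assuming NumPy is available (the name relu_array is introduced here only for illustration):

```python
import numpy as np

def relu_array(x):
    """Apply ReLU elementwise to a scalar, a list, or a NumPy array."""
    # np.maximum compares 0 with every element and keeps the larger value.
    return np.maximum(0, x)

print(relu_array(np.array([-2.0, 0.0, 3.5])))
```

- Negative entries become 0 and all other entries pass through unchanged.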

# 3. Write the Python code for a dense layer in terms of matrix multiplication.

In [4]:
import numpy as np

class DenseLayer:
    def __init__(self, input_dim, output_dim, use_bias=True):
        self.weights = np.random.randn(input_dim, output_dim)
        self.bias = np.zeros((1, output_dim)) if use_bias else None

    def forward(self, inputs):
        """Performs matrix multiplication and returns the output of the dense layer."""
        output = np.dot(inputs, self.weights)
        if self.bias is not None:
            output = output + self.bias
        return output

# 4. Write the Python code for a dense layer in plain Python (that is, with list comprehensions and functionality built into Python).

In [5]:
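def dense_layer(inputs, weights, biases):
    # inputs: list of input values; weights: one list of weights per output neuron;
    # biases: one bias per output neuron.
    outputs = [sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
               for neuron_weights, bias in zip(weights, biases)]
    return outputs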

# 5. What is the “hidden size” of a layer?

- The "hidden size" of a layer refers to the number of neurons or nodes in that layer. It represents the number of internal representation or features learned by the network in that layer. The hidden size determines the capacity or expressiveness of the network and can affect its ability to model the relationships in the data. Increasing the hidden size increases the capacity of the network but also the risk of overfitting, whereas decreasing the hidden size reduces the capacity and the risk of overfitting. The choice of the hidden size is often a trade-off between modeling power and overfitting risk.
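
- In code, the hidden size shows up as the shared dimension between the weight matrices on either side of the hidden layer. A small sketch with NumPy, where the sizes 4, 8, and 2 are arbitrary values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_size, output_dim = 4, 8, 2   # illustrative sizes

W1 = rng.standard_normal((input_dim, hidden_size))   # input -> hidden
W2 = rng.standard_normal((hidden_size, output_dim))  # hidden -> output

x = rng.standard_normal((5, input_dim))   # batch of 5 examples
hidden = np.maximum(0, x @ W1)            # hidden activations after ReLU
out = hidden @ W2                         # network output

print(hidden.shape, out.shape)            # (5, 8) (5, 2)
```

- Changing hidden_size changes the number of columns of W1 and rows of W2, and therefore the number of learnable parameters, without affecting the input or output shapes.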

# 6. What does the t method do in PyTorch?

- In PyTorch, the t method transposes a tensor that has at most two dimensions. It takes no arguments: a 2D tensor has its two dimensions swapped, and a 0D or 1D tensor is returned unchanged. The result is a view that shares its data with the original tensor rather than a copy. For tensors with more dimensions, transpose or permute is used instead.

- For example, if a is a 2D tensor with shape (m, n), then a.t() returns a tensor with shape (n, m). The elements of the transposed tensor correspond to the elements of the original tensor but with the rows and columns exchanged.
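
- A short sketch of this behaviour, assuming PyTorch is installed. The tensor values are arbitrary:

```python
import torch

a = torch.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]], shape (2, 3)
b = a.t()                           # shape (3, 2)

print(b.shape)                           # torch.Size([3, 2])
print(b[2, 1].item() == a[1, 2].item())  # True: rows and columns are swapped
```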

# 7. Why is matrix multiplication written in plain Python very slow?

- Matrix multiplication written in plain Python is slow because the work is done in nested loops that the interpreter executes one step at a time. Every multiplication and addition involves dynamic type checks and the creation of Python number objects, which costs far more than the arithmetic itself. Plain Python loops also cannot use vectorized (SIMD) CPU instructions, multiple cores, or accelerators such as GPUs and TPUs. Libraries such as NumPy, TensorFlow, and PyTorch avoid this overhead by running the loops in optimized compiled code.

- Therefore, to perform matrix multiplication efficiently in Python, it is usually recommended to use a numerical computing library like NumPy, TensorFlow, or PyTorch, which are optimized for matrix multiplication and other numerical operations.
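
- The gap can be measured directly. The sketch below assumes NumPy is installed and uses an illustrative helper named matmul_plain. It times a triple-loop multiplication of two 50 x 50 matrices against NumPy's @ operator. Absolute timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np

def matmul_plain(a, b):
    """Multiply two matrices given as lists of rows, using Python loops only."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(n_inner):
                result[i][j] += a[i][k] * b[k][j]
    return result

rng = np.random.default_rng(0)
a = rng.standard_normal((50, 50))
b = rng.standard_normal((50, 50))
a_list, b_list = a.tolist(), b.tolist()

# Time three runs of each version.
plain_time = timeit.timeit(lambda: matmul_plain(a_list, b_list), number=3)
numpy_time = timeit.timeit(lambda: a @ b, number=3)
print(f"plain Python: {plain_time:.4f}s, NumPy: {numpy_time:.6f}s")
```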

# 8. In matmul, why is ac==br?

- Here ac is the number of columns of matrix A and br is the number of rows of matrix B. Each entry of the product AB is the dot product of a row of A with a column of B. A row of A has ac elements and a column of B has br elements, and a dot product is defined only for vectors of equal length, so ac must equal br. If they differ, the product is undefined, and libraries such as NumPy and PyTorch raise an error.
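
- The check can be written out explicitly. This is a plain-Python sketch that follows the question's naming (ar, ac, br, bc); the function itself is illustrative and not taken from any library:

```python
def matmul(a, b):
    """Multiply matrices stored as lists of rows, checking the inner dimensions."""
    ar, ac = len(a), len(a[0])   # rows and columns of a
    br, bc = len(b), len(b[0])   # rows and columns of b
    assert ac == br, f"inner dimensions differ: {ac} != {br}"
    # Entry (i, j) pairs the ac elements of row i of a with the br elements of column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(ac)) for j in range(bc)]
            for i in range(ar)]

print(matmul([[1, 2, 3]], [[1], [2], [3]]))   # (1, 3) @ (3, 1) -> [[14]]
```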

# 9. In Jupyter Notebook, how do you measure the time taken for a single cell to execute?

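- We can measure the time taken for a single cell to execute in Jupyter Notebook with the %%timeit cell magic, placed on the first line of the cell. It runs the cell body several times and reports the mean and standard deviation of the run time. The single-percent form, %timeit, times one statement written on the same line, and %%time times a single run of the cell.

- Here's an example:

In [6]:
%%timeit
# code to be executed here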

- Alternatively, we can use the time module in Python to measure the time taken for a cell to execute:

In [7]:
import time
start_time = time.time()

# code to be executed here

print("--- %s seconds ---" % (time.time() - start_time))

--- 4.482269287109375e-05 seconds ---


# 10. What is elementwise arithmetic?

- Elementwise arithmetic refers to mathematical operations performed between two tensors (multi-dimensional arrays) of the same shape on an element-by-element basis. This means that the operation is applied individually to each element of the tensors, without considering their relationships with other elements. For example, if two tensors A and B have the same shape, and we perform the elementwise addition operation A + B, then the resulting tensor C will have the same shape as A and B, with each element c_ij = a_ij + b_ij. The most common elementwise arithmetic operations include addition, subtraction, multiplication, and division.
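
- A small worked example, using NumPy arrays with arbitrary values. The matrix product is included only to contrast it with elementwise multiplication:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

print(a + b)   # [[11, 22], [33, 44]]: each c_ij = a_ij + b_ij
print(a * b)   # [[10, 40], [90, 160]]: elementwise product
print(a @ b)   # [[70, 100], [150, 220]]: matrix product, for contrast
```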

# 11. Write the PyTorch code to test whether every element of a is greater than the corresponding element of b.

In [8]:
import torch

a = torch.tensor([1, 2, 3, 4, 5])
b = torch.tensor([5, 4, 3, 2, 1])

# Compare elementwise, then reduce to a single Python bool with all() and item()
result = (a > b).all().item()

if result:
    print("All elements of a are greater than the corresponding elements of b.")
else:
    print("Not all elements of a are greater than the corresponding elements of b.")

Not all elements of a are greater than the corresponding elements of b.


# 12. What is a rank-0 tensor? How do you convert it to a plain Python data type?

- A rank-0 tensor, also known as a scalar tensor, holds exactly one value and has zero dimensions. In PyTorch its shape is the empty torch.Size([]) and its ndim is 0. To convert a rank-0 tensor to a plain Python data type (such as an int or a float), we can use the item method. For example:

In [9]:
import torch

tensor = torch.tensor(10.0)
value = tensor.item()
print(value) # prints 10.0

10.0


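- The item method also works on any tensor that holds exactly one element, whatever its rank. For a tensor with several elements, we can index down to a single element and call item on it, or call tolist to convert the whole tensor to (nested) Python lists.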
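
- A short sketch of both conversions for a tensor with several elements, assuming PyTorch is installed. The values are arbitrary:

```python
import torch

t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

print(t[1, 0].item())   # index down to a rank-0 tensor, then convert: 3.0
print(t.tolist())       # nested Python lists: [[1.0, 2.0], [3.0, 4.0]]
```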

# 13. How does elementwise arithmetic help us speed up matmul?

- A matrix multiplication written with three nested Python loops performs one multiplication and one addition per iteration of the innermost loop, and every iteration pays the interpreter's overhead.

- Each entry of the product is the dot product of a row of A with a column of B, which can be computed as an elementwise multiplication of the two vectors followed by a sum. Writing it this way removes the innermost Python loop: the multiplications and the sum run inside the library's optimized compiled code, which can also use vectorized CPU instructions. The number of arithmetic operations is unchanged; the speedup comes from doing them outside the Python interpreter.

- In PyTorch, combining elementwise arithmetic with broadcasting lets us remove the remaining loops in the same way, so that a whole row of the product, or the whole product, is computed with a few tensor operations.
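
- The idea of replacing the innermost loop with an elementwise product and a sum can be sketched as follows. NumPy is used here for convenience, and the same pattern applies to PyTorch tensors. The function names are illustrative:

```python
import numpy as np

def matmul_loops(a, b):
    """Three nested Python loops: one multiply-add per iteration."""
    (ar, ac), (br, bc) = a.shape, b.shape
    assert ac == br
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            for k in range(ac):
                c[i, j] += a[i, k] * b[k, j]
    return c

def matmul_elementwise(a, b):
    """Innermost loop replaced by an elementwise product followed by a sum."""
    (ar, ac), (br, bc) = a.shape, b.shape
    assert ac == br
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            c[i, j] = (a[i, :] * b[:, j]).sum()
    return c

rng = np.random.default_rng(0)
a, b = rng.standard_normal((5, 4)), rng.standard_normal((4, 3))
print(np.allclose(matmul_loops(a, b), matmul_elementwise(a, b)))  # True
print(np.allclose(matmul_elementwise(a, b), a @ b))               # True
```

- Both functions perform the same arithmetic. The second one simply hands the work of the k loop to compiled code.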

# 14. What are the broadcasting rules?

- In PyTorch (and in most numerical libraries), broadcasting is a mechanism that allows operations to be performed on arrays of different shapes, as long as some rules are followed. The rules for broadcasting are as follows:

    - Shapes are compared dimension by dimension, starting from the last (rightmost) dimension and moving left.

    - If the two arrays have different numbers of dimensions, the shape of the array with fewer dimensions is padded with 1s on the left.

    - Two dimensions are compatible if they are equal or if one of them is 1. An array with size 1 in a dimension is stretched, without copying data, to match the size of the other array in that dimension.

- In general, broadcasting can be used to perform elementwise operations between arrays of different shapes, as long as their shapes are broadcastable to a common shape. If the shapes are not broadcastable, a runtime error will be raised.
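
- The rules can be expressed as a small shape calculation. The function below is a plain-Python sketch written for illustration, not a library API. It takes two shapes as tuples and returns the shape of the broadcast result:

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    result = []
    # Walk the dimensions from the right; a missing dimension counts as 1.
    for da, db in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not broadcastable")
    return tuple(reversed(result))

print(broadcast_shape((3, 1), (3,)))       # (3, 3)
print(broadcast_shape((5, 1, 4), (6, 1)))  # (5, 6, 4)
```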

# 15. What is expand_as? Show an example of how it can be used to match the results of broadcasting.

- In PyTorch, "expand_as" is a tensor method: "t.expand_as(other)" returns a view of "t" expanded to the shape of "other", and is equivalent to "t.expand(other.size())". Only dimensions of size 1 can be expanded to a larger size, and dimensions missing from "t" are added on the left, which is what broadcasting does. No data is copied; the expanded view repeats the existing values.

- Here is an example of how "expand_as" can be used to match the results of broadcasting:

In [10]:
import torch

# Define tensor t with shape (3,)
t = torch.tensor([1, 2, 3])

# Define tensor other with shape (3, 3)
other = torch.tensor([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# Add a trailing dimension so t has shape (3, 1), then expand it to the shape of other
t_expanded = t.unsqueeze(1).expand_as(other)

# The shape of t_expanded is now (3, 3)
print(t_expanded.shape)

# The expanded tensor can now be used in operations with other
result = t_expanded + other
print(result)

torch.Size([3, 3])
tensor([[11, 21, 31],
        [42, 52, 62],
        [73, 83, 93]])


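- In this example, "unsqueeze(1)" first turns "t" from shape "(3,)" into a column of shape "(3, 1)", and "expand_as" then repeats that column to shape "(3, 3)". The sum is therefore the same as the broadcast result of "t.unsqueeze(1) + other", where every element in a row of "other" has the same element of "t" added to it. Adding "t" directly would broadcast it as a row instead and give a different result.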
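
- The equivalence with broadcasting can be checked directly. This sketch assumes PyTorch is installed and reuses the tensors from the example. The variable names col, explicit, and implicit are introduced here for illustration:

```python
import torch

t = torch.tensor([1, 2, 3])
other = torch.tensor([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

col = t.unsqueeze(1)                       # shape (3, 1)
explicit = col.expand_as(other) + other    # expand first, then add
implicit = col + other                     # let broadcasting do the expansion

print(torch.equal(explicit, implicit))     # True
# The expanded dimension has stride 0, so the values are repeated without being copied.
print(col.expand_as(other).stride())
```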