#Question 1

Write the Python code to implement a single neuron.

...............

Answer 1 -



In [2]:
import numpy as np

class SingleNeuron:
    def __init__(self, input_size):
        # Initialize weights and bias
        self.weights = np.random.rand(input_size)
        self.bias = np.random.rand()

    def activate(self, inputs):
        # Perform weighted sum and apply activation function (linear in this case)
        weighted_sum = np.dot(inputs, self.weights) + self.bias
        output = self.activation_function(weighted_sum)
        return output

    def activation_function(self, x):
        # Linear activation function
        return x

In [3]:
# Example usage
if __name__ == "__main__":
    # Create a single neuron with 3 input weights
    neuron = SingleNeuron(input_size=3)

    # Provide input values
    inputs = np.array([0.5, 0.3, 0.8])

    # Activate the neuron
    output = neuron.activate(inputs)

    # Display the result
    print("Input:", inputs)
    print("Output:", output)

Input: [0.5 0.3 0.8]
Output: 1.2371180238169326


#Question 2

Write the Python code to implement ReLU.

...............

Answer 2 -

In [4]:
import numpy as np

def relu(x):
    # ReLU activation function
    return np.maximum(0, x)

In [5]:
# Example usage

if __name__ == "__main__":
    # Example input values
    input_values = np.array([-2, -1, 0, 1, 2])

    # Apply ReLU activation
    output_values = relu(input_values)

    # Display the result
    print("Input values:", input_values)
    print("Output values after ReLU activation:", output_values)

Input values: [-2 -1  0  1  2]
Output values after ReLU activation: [0 0 0 1 2]


#Question 3

Write the Python code for a dense layer in terms of matrix multiplication.

...............

Answer 3 -

In [6]:
import numpy as np

class DenseLayer:
    def __init__(self, input_size, output_size):
        # Initialize weights and biases
        self.weights = np.random.rand(input_size, output_size)
        self.biases = np.random.rand(output_size)

    def forward(self, inputs):
        # Perform matrix multiplication and add biases
        weighted_sum = np.dot(inputs, self.weights) + self.biases

        # Apply activation function (ReLU in this case)
        output = self.relu_activation(weighted_sum)
        return output

    def relu_activation(self, x):
        # ReLU activation function
        return np.maximum(0, x)

In [7]:
# Example usage
if __name__ == "__main__":
    # Create a dense layer with 3 input neurons and 2 output neurons
    dense_layer = DenseLayer(input_size=3, output_size=2)

    # Provide input values
    inputs = np.array([0.5, 0.3, 0.8])

    # Forward pass through the dense layer
    output = dense_layer.forward(inputs)

    # Display the result
    print("Input values:", inputs)
    print("Output values after dense layer with ReLU activation:", output)

Input values: [0.5 0.3 0.8]
Output values after dense layer with ReLU activation: [0.5451737  0.95509519]
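
The example above passes a single input vector through the layer. The matrix-multiplication view becomes clearer with a batch of inputs, where one product handles every sample at once. The following is a minimal sketch of that idea; the fixed seed, the batch size of 4, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed (an assumption, for reproducibility)
weights = rng.random((3, 2))       # shape (input_size, output_size)
biases = rng.random(2)             # shape (output_size,)

batch = rng.random((4, 3))         # 4 samples with 3 features each

# One matrix multiplication processes the whole batch;
# the biases are broadcast across the rows of the result.
batch_output = np.maximum(0, batch @ weights + biases)

# The first row of the batch result equals the single-sample computation.
single_output = np.maximum(0, batch[0] @ weights + biases)

print(batch_output.shape)
print(np.allclose(batch_output[0], single_output))
```

The output has one row per sample and one column per output neuron.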


#Question 4

Write the Python code for a dense layer in plain Python (that is, with list comprehensions
and functionality built into Python).

..............

Answer 4 -

In [8]:
import random

class DenseLayer:
    def __init__(self, input_size, output_size):
        # Initialize weights and biases
        self.weights = [[random.random() for _ in range(output_size)] for _ in range(input_size)]
        self.biases = [random.random() for _ in range(output_size)]

    def forward(self, inputs):
        # Perform matrix multiplication and add biases using list comprehensions.
        # weights[i][j] connects input i to output j, so output j sums over column j.
        weighted_sum = [sum(x * row[j] for x, row in zip(inputs, self.weights)) + b for j, b in enumerate(self.biases)]

        # Apply activation function (ReLU in this case) using list comprehensions
        output = [max(0, x) for x in weighted_sum]
        return output

In [9]:
# Example usage
if __name__ == "__main__":
    # Create a dense layer with 3 input neurons and 2 output neurons
    dense_layer = DenseLayer(input_size=3, output_size=2)

    # Provide input values
    inputs = [0.5, 0.3, 0.8]

    # Forward pass through the dense layer
    output = dense_layer.forward(inputs)

    # Display the result
    print("Input values:", inputs)
    print("Output values after dense layer with ReLU activation:", output)


Input values: [0.5, 0.3, 0.8]
Output values after dense layer with ReLU activation: [0.8585575754586925, 0.3699997105629127]


#Question 5

What is the “hidden size” of a layer?

...............

Answer 5 -

The "hidden size" of a layer in a neural network refers to the number of neurons or units in that layer. It is called "hidden" because it represents the intermediate computational layer between the input layer and the output layer.

In a neural network, information is passed through a series of layers, each layer having a certain number of neurons. The input layer receives the initial data, and the output layer produces the final output or prediction. The layers between the input and output layers are referred to as hidden layers.

The hidden size of a layer is a hyperparameter that you can set when designing your neural network. It determines the capacity or complexity of the model. Larger hidden sizes allow the model to learn more complex representations but may also increase the risk of overfitting, especially if the dataset is not large enough.

For example, in a dense (fully connected) layer of a neural network, the hidden size corresponds to the number of neurons in that layer. If you have a dense layer with 100 neurons, the hidden size of that layer is 100.
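
To make this concrete, here is a minimal sketch of a two-layer network in NumPy in which the hidden size sets the shapes of both weight matrices. The sizes (3 inputs, 100 hidden units, 2 outputs) and all variable names are illustrative assumptions.

```python
import numpy as np

input_size, hidden_size, output_size = 3, 100, 2   # illustrative values

rng = np.random.default_rng(0)
w1 = rng.standard_normal((input_size, hidden_size))   # input  -> hidden
b1 = np.zeros(hidden_size)
w2 = rng.standard_normal((hidden_size, output_size))  # hidden -> output
b2 = np.zeros(output_size)

x = rng.random((5, input_size))          # batch of 5 samples
hidden = np.maximum(0, x @ w1 + b1)      # hidden activations (ReLU)
out = hidden @ w2 + b2                   # final output

# The parameter count grows with the hidden size.
n_params = w1.size + b1.size + w2.size + b2.size
print(hidden.shape, out.shape, n_params)
```

Changing `hidden_size` changes the width of the intermediate activations and the number of parameters, but not the shape of the input or the output.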

#Question 6

What does the t method do in PyTorch?

...............

Answer 6 -

In PyTorch, the `t` method transposes a tensor. For a 2D tensor (a matrix), calling `t()` swaps its rows and columns. It is defined only for tensors with at most two dimensions; 0-D and 1-D tensors are returned unchanged.

Here's a simple example:

In [10]:
import torch

# Create a 2x3 tensor
tensor_2d = torch.tensor([[1, 2, 3],
                          [4, 5, 6]])

# Transpose the tensor
transposed_tensor = tensor_2d.t()

# Display original and transposed tensors
print("Original Tensor:")
print(tensor_2d)

print("\nTransposed Tensor:")
print(transposed_tensor)

Original Tensor:
tensor([[1, 2, 3],
        [4, 5, 6]])

Transposed Tensor:
tensor([[1, 4],
        [2, 5],
        [3, 6]])


#Question 7

Why is matrix multiplication written in plain Python very slow?

...............

Answer 7 -

Matrix multiplication written in plain Python is slow compared to optimized implementations using libraries like NumPy or TensorFlow. There are several reasons for this slowness:

1) `Lack of Vectorization` : Plain Python code is not inherently optimized for vectorized operations. Matrix multiplication involves a large number of scalar multiplications and additions, and performing these operations in a loop in Python is much slower than taking advantage of vectorized operations provided by optimized libraries.

2) `Interpretation Overhead` : CPython compiles source code to bytecode and then interprets that bytecode one instruction at a time. Every multiplication and addition in a Python loop pays this dispatch overhead, which makes numerical loops much slower than the same loops in a compiled language.

3) `Dynamic Typing` : Python is dynamically typed, meaning that variable types are not explicitly declared, and type checking is done at runtime. This flexibility comes at a cost, as it adds extra overhead compared to statically-typed languages like C or Fortran, which can result in slower execution of numerical operations.

4) `Memory Management` : A Python list stores references to separately allocated number objects, so the values of a matrix are scattered in memory. Optimized libraries like NumPy are implemented in lower-level languages such as C and store elements of a single type in one contiguous block, which allows memory access patterns to be finely tuned.

5) `GIL (Global Interpreter Lock)` : In CPython (the reference implementation of Python), the Global Interpreter Lock (GIL) can impact the performance of multi-threaded numerical computations. The GIL ensures that only one thread executes Python bytecode at a time, limiting parallelism in multi-threaded programs.

To overcome these issues and achieve better performance, it's recommended to use libraries like NumPy, which is implemented in C and optimized for numerical operations. NumPy leverages efficient algorithms, parallel processing, and optimized memory management, leading to significantly faster matrix multiplication compared to plain Python implementations. Additionally, using specialized numerical libraries like TensorFlow or PyTorch can further accelerate matrix operations by leveraging GPU acceleration for large-scale computations.
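
The difference can be observed directly. The sketch below times a triple-loop implementation against NumPy's `@` operator using the standard `timeit` module. The function name `matmul_plain` and the matrix sizes are assumptions made for illustration, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

def matmul_plain(a, b):
    """Triple-loop matrix multiplication on lists of lists."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(n_inner):
                result[i][j] += a[i][k] * b[k][j]
    return result

rng = np.random.default_rng(0)
a_np = rng.random((50, 40))
b_np = rng.random((40, 30))
a_list, b_list = a_np.tolist(), b_np.tolist()

# Average the time over a few calls of each implementation.
plain_seconds = timeit.timeit(lambda: matmul_plain(a_list, b_list), number=3) / 3
numpy_seconds = timeit.timeit(lambda: a_np @ b_np, number=3) / 3

print(f"plain Python: {plain_seconds:.6f} s per call")
print(f"NumPy:        {numpy_seconds:.6f} s per call")
```

Both implementations compute the same product; only the time per call differs.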

#Question 8

In matmul, why is ac==br?

...............

Answer 8 -

In the context of matrix multiplication, the requirement that `ac == br` refers to the dimensions of the matrices involved in the multiplication operation.

Let's consider two matrices:

1) A matrix of size `(a, b)` denoted as `A` , where `a` is the number of `rows` and `b` is the number of `columns` .

2) Another matrix of size `(c, d)` denoted as `B` , where `c` is the number of `rows` and `d` is the number of `columns` .

The product `A @ B` is defined only when the condition `ac == br` holds, where:

- `ac` is the number of columns in matrix `A` (which is `b`).

- `br` is the number of rows in matrix `B` (which is `c`).

Therefore, for the multiplication `A @ B` to be valid, the number of columns in `A` must be equal to the number of rows in `B` . This condition ensures that the inner dimensions match, allowing the matrix multiplication to proceed and resulting in a matrix with dimensions `(a, d)` .

In NumPy, the `@` operator and the `np.matmul` function check the shapes automatically and raise a `ValueError` if the condition `ac == br` is not satisfied; PyTorch raises a `RuntimeError` in the same situation. This catches shape errors before any multiplication is attempted.
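
The shape rule can be checked with a short NumPy sketch. The names `ar`, `ac`, `br`, and `bc` (rows and columns of each matrix) and the example shapes are assumptions chosen for illustration.

```python
import numpy as np

A = np.ones((2, 3))   # 2 rows, 3 columns
B = np.ones((3, 4))   # 3 rows, 4 columns

ar, ac = A.shape
br, bc = B.shape
assert ac == br       # inner dimensions must match

C = A @ B
print(C.shape)        # (2, 4): rows of A by columns of B

# A mismatch in the inner dimensions is rejected by NumPy.
bad_B = np.ones((4, 3))
try:
    A @ bad_B
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print(type(mismatch_error).__name__)   # ValueError
```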

#Question 9

In Jupyter Notebook, how do you measure the time taken for a single cell to execute?

................

Answer 9 -

In Jupyter Notebook, you can use the `%%time` or `%%timeit` cell magics to measure the time taken to execute a single cell. Their single-percent forms, `%time` and `%timeit`, are line magics that time only the statement written on the same line.

1) **%%time Magic Command**

- `%%time` provides a simple way to measure the elapsed time for one execution of a cell.

- Place `%%time` on the first line of the cell, followed by the code you want to measure.

In [12]:
%%time
# Your code here

2) **%%timeit Magic Command**

- `%%timeit` is more sophisticated than `%%time` and runs the cell many times to get a more accurate estimate.

- It reports statistics such as the mean and standard deviation.

- Place `%%timeit` on the first line of the cell, followed by the code you want to measure.

In [13]:
%%timeit
# Your code here

Here's an example that uses the line-magic forms to time a single statement:

In [14]:
import numpy as np

# Using %time
%time sum(range(1000000))

CPU times: user 20.7 ms, sys: 789 µs, total: 21.5 ms
Wall time: 21.4 ms


499999500000

In [15]:
import numpy as np

# Using %timeit
%timeit sum(range(1000000))

22.7 ms ± 3.91 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


#Question 10

What is elementwise arithmetic?

...............

Answer 10 -

Elementwise arithmetic refers to performing arithmetic operations on the corresponding elements of two or more arrays or matrices. In elementwise arithmetic, each element of one array is paired with the corresponding element of another array, and the arithmetic operation is performed on those pairs of elements.

For example, given two arrays A and B, elementwise addition would involve adding each element of A to the corresponding element of B, resulting in a new array C where each element is the sum of the corresponding elements in A and B.

Elementwise arithmetic is commonly used in numerical computations, particularly in linear algebra and machine learning. In Python, libraries such as NumPy provide efficient support for elementwise arithmetic operations on arrays and matrices.

Elementwise arithmetic is different from matrix multiplication, where each element of the result is the sum of the products of a row of the first matrix with a column of the second. In matrix multiplication, the number of columns of the first matrix must equal the number of rows of the second, whereas in elementwise arithmetic the input arrays must have the same shape (or shapes that can be broadcast to a common shape).
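
The contrast is easy to see on two small matrices. This is a minimal NumPy sketch; the values of `A` and `B` are arbitrary examples.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

# Elementwise: each entry is combined with the entry at the same position.
elementwise_sum = A + B        # [[11, 22], [33, 44]]
elementwise_product = A * B    # [[10, 40], [90, 160]]

# Matrix multiplication: rows of A are combined with columns of B.
matrix_product = A @ B         # [[70, 100], [150, 220]]

print(elementwise_sum)
print(elementwise_product)
print(matrix_product)
```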

#Question 11

Write the PyTorch code to test whether every element of a is greater than the
corresponding element of b.

.................

Answer 11 -

In [16]:
import torch

# Sample tensors a and b
a = torch.tensor([1, 2, 3])
b = torch.tensor([0, 2, 2])

# Test whether every element of a is greater than the corresponding element of b
result = torch.all(a > b)

# Display the result
print("Every element of 'a' is greater than the corresponding element of 'b':", result)

Every element of 'a' is greater than the corresponding element of 'b': tensor(False)


#Question 12

What is a rank-0 tensor? How do you convert it to a plain Python data type?

................

Answer 12 -

A rank-0 tensor is essentially a scalar, meaning it has zero dimensions. It represents a single numerical value and doesn't have any axes or dimensions. In PyTorch, rank-0 tensors are created using `torch.tensor()` or `torch.scalar_tensor()` .

To convert a rank-0 tensor (scalar) to a plain Python data type, you can use the `.item()` method. This method extracts the single numerical value from the tensor and returns it as a native Python scalar.

Here's an example:

In [17]:
import torch

# Creating a rank-0 tensor (scalar)
scalar_tensor = torch.tensor(42)

# Converting the rank-0 tensor to a plain Python data type
scalar_value = scalar_tensor.item()

# Displaying the result
print("Rank-0 Tensor (Scalar):", scalar_tensor)
print("Converted to Python Scalar:", scalar_value)

Rank-0 Tensor (Scalar): tensor(42)
Converted to Python Scalar: 42


#Question 13

How does elementwise arithmetic help us speed up matmul?

...............

Answer 13 -

Elementwise arithmetic speeds up matrix multiplication by letting us replace the innermost Python loop with a single vectorized expression: each entry of the result is the elementwise product of a row of the first matrix and a column of the second, summed. That work then runs in optimized compiled code instead of in the Python interpreter. Several factors make this fast in libraries such as NumPy or TensorFlow:

1) `Vectorization` :

- Elementwise arithmetic operations can be vectorized, meaning they can be applied to entire arrays or vectors at once, avoiding the need for explicit loops.

- Vectorized operations take advantage of low-level optimizations implemented in numerical libraries, which often use highly efficient, parallelized implementations written in languages like C or Fortran.

2) `Optimized BLAS Libraries` :

- Many numerical libraries, including NumPy and TensorFlow, use optimized Basic Linear Algebra Subprograms (BLAS) libraries under the hood.

- BLAS libraries are highly optimized and are designed to perform efficient linear algebra operations, including matrix multiplication, using low-level, hardware-specific optimizations.

3) `Strided Memory Access` :

- Arrays store their elements in one contiguous block of memory and describe rows and columns with strides, so a row or column can be read without copying.

- Elementwise operations walk through this memory in a regular pattern, which makes good use of CPU caches.

4) `Parallelism` :

- Modern hardware, including multi-core CPUs and GPUs, supports parallel processing.

- Elementwise operations, when applied to large arrays, can be parallelized to take advantage of the available parallel processing capabilities.
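
As a minimal sketch of the idea, the innermost loop of a matrix multiplication can be replaced by one elementwise product followed by a sum. The function name `matmul_elementwise` is a hypothetical name used only for this illustration, and NumPy is used here in place of PyTorch.

```python
import numpy as np

def matmul_elementwise(a, b):
    """Matrix multiplication with the innermost loop vectorized."""
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br
    result = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            # Row i of a times column j of b, elementwise, then summed.
            result[i, j] = (a[i, :] * b[:, j]).sum()
    return result

rng = np.random.default_rng(0)
a = rng.random((5, 4))
b = rng.random((4, 3))

print(np.allclose(matmul_elementwise(a, b), a @ b))
```

Two Python loops remain, but the loop over the inner dimension now runs in compiled code.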

#Question 14

What are the broadcasting rules?

................

Answer 14 -

Broadcasting is a mechanism in NumPy (and other similar libraries) that allows for performing elementwise operations on arrays of different shapes and sizes. The broadcasting rules define how these operations are carried out when the shapes of the input arrays are not identical.

Here are the broadcasting rules:

1) `Dimensions Compatibility` : If the arrays have a different number of dimensions, pad the shape with fewer dimensions with ones on its left side until both shapes have the same length.

- Example: If one array has shape (3, 1, 6) and the other has shape (5, 6), the second shape is padded to (1, 5, 6).

2) `Size Compatibility` : For each dimension, the sizes must either be equal, or one of them must be 1.

- Example: Arrays with shapes (5, 1, 3) and (1, 4, 3) are compatible because the sizes along each dimension are either equal or one of them is 1.

3) `Broadcasting` : Once the shapes are compatible, NumPy performs the elementwise operation as if each array were repeated along every dimension where its size is 1 to match the size of the other array. No data is actually copied.

- Example: Arrays with shapes (3, 1, 6) and (1, 5, 6) are compatible; the first is broadcast along the second dimension and the second along the first dimension.

4) `Size of Output` : The size of the output array is determined by taking the maximum size along each dimension.

- Example: If the input arrays have shapes (3, 1, 6) and (1, 5, 6), the output shape will be (3, 5, 6).

Here's an example to illustrate these rules:

In [19]:
import numpy as np

# Example arrays
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([10, 20, 30])

# Broadcasting: B is broadcasted to match the shape of A
result = A + B

# Display the result
print(result)

[[11 22 33]
 [14 25 36]]
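
The rules can also be checked on shapes alone. The sketch below uses arrays of ones so that only the resulting shapes matter; the variable names are assumptions chosen for illustration.

```python
import numpy as np

# Different number of dimensions: (5, 6) is treated as (1, 5, 6).
x = np.ones((3, 1, 6))
y = np.ones((5, 6))
padded_result_shape = (x + y).shape
print(padded_result_shape)     # (3, 5, 6)

# Same number of dimensions, with sizes equal or 1 in each position.
same_rank_shape = (np.ones((5, 1, 3)) + np.ones((1, 4, 3))).shape
print(same_rank_shape)         # (5, 4, 3)

# Sizes 3 and 2 in the last dimension are neither equal nor 1.
try:
    np.ones((2, 3)) + np.ones(2)
    broadcast_error = None
except ValueError as err:
    broadcast_error = err

print(type(broadcast_error).__name__)   # ValueError
```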


#Question 15

What is expand_as? Show an example of how it can be used to match the results of
broadcasting.

................

Answer 15 -

In PyTorch, `expand_as` is a method that is used to expand the size of one tensor to match the size of another tensor. It is commonly used to ensure that two tensors have the same size or shape before performing elementwise operations, similar to broadcasting in NumPy.

Here's an example of how `expand_as` can be used to match the results of broadcasting:

In [22]:
import torch

# Example tensors
tensor_A = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor_B = torch.tensor([10, 20, 30])

# Broadcasting without matching shapes
result_broadcasting = tensor_A + tensor_B

# Using expand_as to match shapes before addition
tensor_B_expanded = tensor_B.expand_as(tensor_A)
result_expanded = tensor_A + tensor_B_expanded

# Display the results
print("Result with broadcasting:\n", result_broadcasting)
print("\nResult with expand_as:\n", result_expanded)

Result with broadcasting:
 tensor([[11, 22, 33],
        [14, 25, 36]])

Result with expand_as:
 tensor([[11, 22, 33],
        [14, 25, 36]])
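
A useful detail is that `expand_as` returns a view rather than a copy. The following sketch inspects the expanded tensor's shape, strides, and storage pointer to show this; the variable name `expanded` is an assumption.

```python
import torch

tensor_A = torch.tensor([[1, 2, 3], [4, 5, 6]])
tensor_B = torch.tensor([10, 20, 30])

expanded = tensor_B.expand_as(tensor_A)

print(expanded.shape)      # torch.Size([2, 3])
# A stride of 0 along the first dimension means every row
# reads the same three values in memory.
print(expanded.stride())   # (0, 1)
# The expanded tensor shares storage with the original.
print(expanded.data_ptr() == tensor_B.data_ptr())   # True
```

Because the rows share memory, writing to an expanded tensor in place is best avoided; call `.clone()` first if an independent copy is needed.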
