# Assignment 11 Solutions

1. Write the Python code to implement a single neuron.

Here is a minimal Python implementation of a single neuron with a sigmoid activation:

```python
import numpy as np

class Neuron:
    def __init__(self, num_inputs):
        # One random weight per input, plus a single random bias
        self.weights = np.random.rand(num_inputs)
        self.bias = np.random.rand()
    
    def activate(self, inputs):
        # Weighted sum of the inputs plus bias, passed through the sigmoid
        weighted_sum = np.dot(inputs, self.weights) + self.bias
        output = self.sigmoid(weighted_sum)
        return output
    
    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

# Example usage:
neuron = Neuron(2)  # Create a neuron with 2 inputs
inputs = np.array([0.5, 0.7])  # Input values
output = neuron.activate(inputs)
print("Output:", output)
```

Sample output (the weights and bias are random, so the value differs between runs):
```
Output: 0.846625859073371
```


In this code, we define a `Neuron` class that represents a single neuron. The constructor `__init__` initializes the neuron with random weights and a bias term. The `activate` method takes an array of input values, computes the weighted sum of inputs, adds the bias, and applies the sigmoid activation function to obtain the output of the neuron.

In the example usage, we create an instance of the `Neuron` class with 2 inputs. We then provide input values in the form of a NumPy array and call the `activate` method to compute the output of the neuron. Finally, we print the output.

2. Write the Python code to implement ReLU.

Here is a Python implementation of the Rectified Linear Unit (ReLU) activation function:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Example usage:
x = np.array([-2, -1, 0, 1, 2])
output = relu(x)
print("Output:", output)
```

Output:
```
Output: [0 0 0 1 2]
```


In this code, we define a function called `relu` that takes an input `x` and applies the ReLU activation function element-wise using NumPy's `maximum` function. The ReLU function returns the maximum value between 0 and `x`, effectively setting negative values to zero.

In the example usage, we create an input array `x` with a range of values. We then pass `x` to the `relu` function to compute the output. Finally, we print the output.

3. Write the Python code for a dense layer in terms of matrix multiplication.

```python
import numpy as np

class DenseLayer:
    def __init__(self, input_size, output_size):
        # Weight matrix of shape (input_size, output_size), one bias per output
        self.weights = np.random.randn(input_size, output_size)
        self.bias = np.random.randn(output_size)
    
    def forward(self, inputs):
        # Matrix product of inputs and weights, plus the bias
        output = np.dot(inputs, self.weights) + self.bias
        return output

# Example usage:
dense_layer = DenseLayer(3, 2)  # Create a dense layer with 3 inputs and 2 outputs
inputs = np.array([1, 2, 3])  # Input values
output = dense_layer.forward(inputs)
print("Output:", output)
```

Sample output (the weights and bias are random, so the values differ between runs):
```
Output: [ 3.10862775 -1.60617565]
```


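In this code, the `DenseLayer` class stores a weight matrix of shape `(input_size, output_size)` and a bias vector of length `output_size`. The `forward` method computes the layer output as a single matrix product of the inputs and the weights, then adds the bias, so a 3-element input produces a 2-element output.

In the example usage, we create an instance of the `DenseLayer` class with 3 inputs and 2 outputs. We then provide input values in the form of a NumPy array and call the `forward` method to compute the output of the layer. Finally, we print the output.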

4. Write the Python code for a dense layer in plain Python (that is, with list comprehensions
and functionality built into Python).

```python
import random

class DenseLayer:
    def __init__(self, input_size, output_size):
        # weights[j][i] connects input j to output i (input_size x output_size)
        self.weights = [[random.random() for _ in range(output_size)] for _ in range(input_size)]
        self.bias = [random.random() for _ in range(output_size)]
    
    def forward(self, inputs):
        # One weighted sum per output unit i, plus that unit's bias
        output = [sum(inputs[j] * self.weights[j][i] for j in range(len(inputs))) + self.bias[i] for i in range(len(self.bias))]
        return output

# Example usage:
dense_layer = DenseLayer(3, 2)  # Create a dense layer with 3 inputs and 2 outputs
inputs = [1, 2, 3]  # Input values
output = dense_layer.forward(inputs)
print("Output:", output)
```




In this code, we define a `DenseLayer` class that represents a dense layer in a neural network. The constructor `__init__` initializes the layer with random weights and biases using list comprehensions. The `forward` method takes a list of input values and, for each output unit, sums the products of the inputs with that unit's weights using a list comprehension and the built-in `sum` function, then adds the unit's bias. This is the same computation as the matrix multiplication in the NumPy version.

In the example usage, we create an instance of the DenseLayer class with 3 inputs and 2 outputs. We then provide input values in the form of a list and call the forward method to compute the output of the layer. Finally, we print the output. Note that the weights and biases are generated using the random.random() function to generate random values between 0 and 1.
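
As a sanity check, the plain-Python computation can be compared against the NumPy matrix product from the previous question. The sketch below is illustrative only: the function name `dense_forward_plain` and the fixed weights are assumptions chosen so that the result is reproducible, not part of the original solution.

```python
import numpy as np

def dense_forward_plain(inputs, weights, bias):
    """Dense layer in plain Python; weights[j][i] links input j to output i."""
    return [
        sum(inputs[j] * weights[j][i] for j in range(len(inputs))) + bias[i]
        for i in range(len(bias))
    ]

# Fixed (hypothetical) parameters: 3 inputs, 2 outputs
weights = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
bias = [0.01, 0.02]
inputs = [1, 2, 3]

plain = dense_forward_plain(inputs, weights, bias)
reference = np.dot(inputs, weights) + bias  # NumPy matrix-product version

print("Plain Python:", plain)
print("NumPy:       ", reference)
print("Match:", np.allclose(plain, reference))
```

If the two results agree, the indexing of the nested weight list is consistent with the `(input_size, output_size)` layout used by the NumPy layer.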

5. What is the “hidden size” of a layer?

The "hidden size" of a layer is the number of neurons (units) in a hidden layer, that is, a layer that sits between the network's input and its output. It is also the length of the activation vector that the layer produces for each input example.

The hidden size determines the capacity and complexity of the layer's representation. Each neuron in the hidden layer processes and transforms the input data using its weights and activation function. The hidden size controls the number of learnable parameters in the layer, influencing the layer's ability to capture and represent complex patterns or relationships in the data.

For example, in a fully connected layer of a neural network, the hidden size refers to the number of neurons in that layer, each of which receives input from all the neurons in the previous layer and provides output to all the neurons in the next layer. The hidden size is a hyperparameter of the neural network, and its choice can impact the model's performance, capacity, and training time.
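
To make the link between hidden size and parameter count concrete, here is a small sketch of a two-layer network in NumPy. The sizes (4 inputs, hidden size 8, 3 outputs) and all variable names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sizes for illustration
input_size, hidden_size, output_size = 4, 8, 3

rng = np.random.default_rng(0)
w1 = rng.standard_normal((input_size, hidden_size))   # input -> hidden
b1 = np.zeros(hidden_size)
w2 = rng.standard_normal((hidden_size, output_size))  # hidden -> output
b2 = np.zeros(output_size)

x = rng.standard_normal((5, input_size))  # batch of 5 examples

hidden = np.maximum(0, x @ w1 + b1)  # ReLU activations of the hidden layer
out = hidden @ w2 + b2

n_params = w1.size + b1.size + w2.size + b2.size

print("Hidden activations shape:", hidden.shape)  # (5, 8): one row per example, hidden_size columns
print("Output shape:", out.shape)                 # (5, 3)
print("Learnable parameters:", n_params)          # 4*8 + 8 + 8*3 + 3 = 67
```

Changing `hidden_size` changes the width of `hidden` and the shapes of both weight matrices, while the input and output sizes stay fixed by the data and the task.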

6. What does the t method do in PyTorch?
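
The `t` method returns the transpose of a 2-D tensor: rows become columns and columns become rows. It is intended for tensors with at most two dimensions, and the result is a view that shares storage with the original tensor. A minimal sketch, with variable names chosen for illustration:

```python
import torch

m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])  # shape (2, 3)

mt = m.t()  # transpose: shape (3, 2)

print("Original shape:", m.shape)
print("Transposed shape:", mt.shape)
print(mt)
```

For a 2-D tensor, `m.t()` gives the same result as `m.transpose(0, 1)`.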

7. Why is matrix multiplication written in plain Python very slow?

Matrix multiplication implemented in plain Python is typically much slower, often by orders of magnitude, than optimized implementations in libraries such as NumPy or PyTorch, which run compiled code written in lower-level languages like C or CUDA. There are several reasons for this performance difference:

1. Lack of vectorization: In plain Python, matrix multiplication is typically implemented using nested loops. This approach processes each element of the resulting matrix individually, which can be inefficient. On the other hand, optimized libraries like NumPy use vectorized operations, which allow for parallelized computations and take advantage of underlying hardware acceleration.

2. Dynamic typing: Python is a dynamically typed language, which means that the type of variables can change at runtime. This dynamic typing introduces additional overhead compared to statically typed languages like C or Fortran, which can have a negative impact on performance, especially for large-scale matrix multiplication.

3. Interpretation overhead: CPython compiles source code to bytecode and then executes that bytecode one instruction at a time in an interpreter loop, so every iteration of the innermost loop pays for instruction dispatch, name lookups, and object creation. Compiled languages translate the loop to machine code ahead of time and can perform extensive optimizations during compilation, resulting in faster execution.

4. Python's Global Interpreter Lock (GIL): The GIL is a mechanism in CPython (the reference implementation of Python) that allows only one thread to execute Python bytecode at a time. This can limit the performance of multi-threaded matrix multiplication, as the GIL prevents true parallel execution on multi-core systems.

To achieve faster matrix multiplication in Python, it is recommended to leverage optimized libraries like NumPy, which internally use highly optimized routines written in low-level languages. These libraries take advantage of efficient algorithms, hardware-specific optimizations, and parallelization to perform matrix operations efficiently, resulting in significant speed improvements compared to plain Python implementations.
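
The difference can be observed directly with a rough timing sketch like the one below. The function name `matmul_plain` and the matrix size are assumptions for illustration, and the measured times depend entirely on the machine, so no specific numbers are claimed here.

```python
import time
import numpy as np

def matmul_plain(a, b):
    """Triple-loop matrix product on nested Python lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "columns of a must equal rows of b"
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
    return c

n = 60  # kept small so the plain-Python version finishes quickly
rng = np.random.default_rng(0)
a = rng.random((n, n))
b = rng.random((n, n))

start = time.perf_counter()
c_plain = matmul_plain(a.tolist(), b.tolist())
plain_seconds = time.perf_counter() - start

start = time.perf_counter()
c_numpy = a @ b
numpy_seconds = time.perf_counter() - start

print(f"Plain Python: {plain_seconds:.4f} s")
print(f"NumPy:        {numpy_seconds:.6f} s")
print("Results agree:", np.allclose(c_plain, c_numpy))
```

Both versions compute the same product; only where the loops run (in the interpreter versus in compiled code) differs.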

8. In matmul, why is ac==br?

In matrix multiplication, `ac` is the number of columns of the left matrix `a`, and `br` is the number of rows of the right matrix `b`. For an `m x n` matrix `a` and an `n x p` matrix `b`, both values equal `n`, so the condition `ac == br` checks that the matrices are compatible for multiplication.

Each entry of the product is the dot product of one row of `a` with one column of `b`, and a dot product is only defined for two vectors of the same length. A row of `a` has `ac` elements and a column of `b` has `br` elements, so the two counts must be equal. The resulting matrix has dimensions `m x p`, where `m` is the number of rows in the left matrix and `p` is the number of columns in the right matrix.

If the condition `ac == br` is not satisfied, the matrix multiplication operation cannot be performed, as the dimensions of the matrices are incompatible.
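
A minimal sketch of a loop-based `matmul` that performs this check is shown below. The function name `matmul_checked` and the error message are illustrative assumptions; the names `ar`, `ac`, `br`, `bc` follow the question's notation for the rows and columns of `a` and `b`.

```python
import numpy as np

def matmul_checked(a, b):
    ar, ac = a.shape  # rows and columns of a
    br, bc = b.shape  # rows and columns of b
    assert ac == br, f"cannot multiply: a has {ac} columns but b has {br} rows"
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            for k in range(ac):  # k runs over the shared dimension
                c[i, j] += a[i, k] * b[k, j]
    return c

a = np.ones((2, 3))
b = np.ones((3, 4))
print("Result shape:", matmul_checked(a, b).shape)  # (2, 4)

try:
    matmul_checked(a, np.ones((2, 4)))  # 3 columns vs 2 rows
except AssertionError as err:
    print("AssertionError:", err)
```

The index `k` is used for both the columns of `a` and the rows of `b`, which is why the two sizes have to match.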

9. In Jupyter Notebook, how do you measure the time taken for a single cell to execute?
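
In a notebook, placing the `%%time` cell magic on the first line of a cell reports the time taken by a single run of that cell. The related `%%timeit` magic runs the cell repeatedly and reports an average. These magics only work inside IPython or Jupyter, so the sketch below shows a rough plain-Python equivalent based on `time.perf_counter`. The helper name `time_once` is hypothetical.

```python
import time

def time_once(fn, *args):
    """Run fn(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Time a single call, similar in spirit to running a cell under %%time
result, elapsed = time_once(sum, range(1_000_000))
print(f"sum = {result}, took {elapsed:.6f} s")
```

A single measurement can be noisy, which is why `%%timeit` repeats the run several times before reporting.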

10. What is elementwise arithmetic?

Elementwise arithmetic refers to performing arithmetic operations on corresponding elements of two or more arrays or tensors, resulting in a new array or tensor with the same shape as the original arrays.

In elementwise arithmetic, each element of the output array is computed by applying the arithmetic operation to the corresponding elements from the input arrays. The operation is performed independently on each pair of corresponding elements without any dependencies on neighboring elements.

Elementwise arithmetic operations can include addition, subtraction, multiplication, division, exponentiation, and more. These operations can be performed on arrays or tensors of the same shape, or arrays and tensors that can be broadcasted to have compatible shapes.

For example, given two arrays `A = [1, 2, 3]` and `B = [4, 5, 6]`, elementwise addition would result in `C = [5, 7, 9]`, where each element of `C` is the sum of the corresponding elements from `A` and `B`.

Elementwise arithmetic is a fundamental operation in many numerical computing libraries and frameworks, such as NumPy, PyTorch, and TensorFlow. It allows for efficient and convenient computation of operations on arrays and tensors, enabling vectorized and parallelized calculations.
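
The example above can be reproduced in NumPy. This short sketch also shows that multiplication, powers, and comparisons are applied element by element in the same way; the variable names are chosen for illustration.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

total = A + B     # elementwise sum: 5, 7, 9
product = A * B   # elementwise product: 4, 10, 18
squares = A ** 2  # elementwise power: 1, 4, 9
smaller = A < B   # elementwise comparison: True, True, True

print("A + B :", total)
print("A * B :", product)
print("A ** 2:", squares)
print("A < B :", smaller)
```

Note that `A * B` is the elementwise product, not a matrix product; each result has the same shape as the inputs.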

11. Write the PyTorch code to test whether every element of a is greater than the
corresponding element of b.

Here is PyTorch code to test whether every element of `a` is greater than the corresponding element of `b`:

```python
import torch

def compare_elements(a, b):
    return torch.all(a > b)

# Example usage:
a = torch.tensor([1, 2, 3])
b = torch.tensor([0, 1, 2])

result = compare_elements(a, b)
print("All elements of 'a' are greater than corresponding elements of 'b':", result)
```

In this code, we define a function called `compare_elements` that takes two PyTorch tensors `a` and `b`. The `>` operator compares each element of `a` with the corresponding element of `b` and produces a boolean tensor of the same shape. The `torch.all` function then reduces that tensor to a single rank-0 boolean tensor (`tensor(True)` or `tensor(False)`) indicating whether every comparison was true. Calling `.item()` on the result gives a plain Python `bool`.

In the example usage, we create two tensors `a` and `b` with corresponding elements. We then call the `compare_elements` function with `a` and `b` as arguments and store the result in the `result` variable. Finally, we print the result indicating whether all elements of `a` are greater than the corresponding elements of `b`.

12. What is a rank-0 tensor? How do you convert it to a plain Python data type?
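
A rank-0 tensor is a tensor with no dimensions: it holds a single scalar value and its shape is empty. Reductions such as `sum` over a whole tensor return rank-0 tensors. The `item` method converts one to a plain Python number. A minimal sketch, with variable names chosen for illustration:

```python
import torch

s = torch.tensor(3.5)  # rank-0 tensor holding one value

print("Number of dimensions:", s.ndim)  # 0
print("Shape:", s.shape)                # torch.Size([])

value = s.item()  # plain Python float
print(type(value), value)

total = torch.tensor([1, 2, 3]).sum()  # reductions also return rank-0 tensors
print(total.item())                    # plain Python int 6
```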

13. How does elementwise arithmetic help us speed up matmul?

Elementwise arithmetic speeds up a hand-written matrix multiplication (`matmul`) by removing the innermost Python loop. Instead of multiplying and accumulating one pair of numbers per iteration, a whole row of `a` is multiplied elementwise by a whole column of `b` and the products are summed, so that work runs in optimized, compiled code rather than in the interpreter. Several related techniques build on the same idea:

1. Broadcasting: Elementwise arithmetic operations often involve arrays of different shapes. Broadcasting is a technique in NumPy/PyTorch that allows elementwise operations between arrays of different shapes by treating the smaller array as if it were expanded to the shape of the larger one. In `matmul`, broadcasting lets a row of `a` be combined with all of `b` at once, removing a second explicit loop.

2. Parallelization: Many elementwise arithmetic operations, such as addition, subtraction, or multiplication, can be parallelized efficiently using SIMD (Single Instruction, Multiple Data) instructions or GPU acceleration. Because the operation on each element is independent of the others, many elements can be processed simultaneously, leading to faster execution of the matrix multiplication as a whole.

3. Fusion with other operations: Elementwise arithmetic operations can be fused with other operations within a computation graph or computational kernel. Frameworks such as TensorFlow and PyTorch can, in some configurations, fuse elementwise operations with neighboring operations like matrix multiplication, which improves memory access patterns and reduces overhead.

Elementwise arithmetic does not change the number of multiplications and additions that `matmul` requires. The speedup comes from moving those operations out of Python loops and into vectorized code.
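
One concrete way elementwise arithmetic enters a hand-written `matmul` is by replacing the innermost loop with an elementwise product followed by a sum. The sketch below assumes NumPy arrays, and the function name `matmul_elementwise` is illustrative.

```python
import numpy as np

def matmul_elementwise(a, b):
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br, "columns of a must equal rows of b"
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            # Row i of a times column j of b, elementwise, then summed.
            # This single line replaces the loop over the shared dimension.
            c[i, j] = (a[i, :] * b[:, j]).sum()
    return c

rng = np.random.default_rng(0)
a = rng.random((5, 4))
b = rng.random((4, 3))

c = matmul_elementwise(a, b)
print("Shape:", c.shape)
print("Matches a @ b:", np.allclose(c, a @ b))
```

Two Python loops remain here; broadcasting can be used to remove more of them, at which point most of the work happens inside vectorized code.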

14. What are the broadcasting rules?

Broadcasting is a technique used in NumPy and PyTorch to perform elementwise operations between arrays of different shapes. The broadcasting rules determine how arrays of different shapes can be broadcasted to have compatible shapes for elementwise operations. The broadcasting rules are as follows:

1. Align the number of dimensions: If the two arrays have different numbers of dimensions, the shape of the array with fewer dimensions is padded with ones on the left until both shapes have the same length.

2. Stretch dimensions of size 1: For each dimension where one array has size 1 and the other array has a greater size, the array with size 1 behaves as if it were replicated along that dimension to match the size of the other array. No data is actually copied.

3. Check compatibility: After padding, the sizes in every dimension must either be equal or one of them must be 1. Otherwise, broadcasting cannot be performed and the operation raises an error (a `ValueError` in NumPy, a `RuntimeError` in PyTorch).

4. Output shape: The resulting shape of the broadcasted arrays is determined by taking the elementwise maximum of the sizes along each dimension of the input arrays.

Here's an example to illustrate the broadcasting rules:

```python
import numpy as np

A = np.array([[1, 2, 3]])
B = np.array([[4], [5]])

C = A + B  # Elementwise addition

print("Array A:\n", A)
print("Array B:\n", B)
print("Array C:\n", C)
```

Output:
```
Array A:
 [[1 2 3]]
Array B:
 [[4]
 [5]]
Array C:
 [[5 6 7]
 [6 7 8]]
```

In this example, `A` has shape 1x3 and `B` has shape 2x1. When they are added, the size-1 dimension of each array is stretched to match the other: `A` is treated as if its row were repeated twice, and `B` as if its column were repeated three times. The resulting array `C` has shape 2x3, and each element `C[i, j]` equals `A[0, j] + B[i, 0]`.

By following the broadcasting rules, arrays of different shapes can be combined in elementwise operations, avoiding the need to explicitly reshape or duplicate arrays to match their shapes. Broadcasting allows for more concise and efficient code when performing elementwise arithmetic operations.
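
The rules can also be checked on a case that fails. In the sketch below (array names are illustrative), a 1-D array is padded on the left to become a row, which works when its length matches the last dimension and raises a `ValueError` in NumPy when it does not.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3,))  # treated as shape (1, 3): compatible with (2, 3)
c = np.ones((2,))  # treated as shape (1, 2): last dimension 3 vs 2 is incompatible

ok_shape = (a + b).shape
print("a + b shape:", ok_shape)  # (2, 3)

try:
    a + c
except ValueError as err:
    print("ValueError:", err)

# Adding an explicit trailing axis gives c the shape (2, 1), which is compatible
fixed_shape = (a + c[:, None]).shape
print("a + c[:, None] shape:", fixed_shape)  # (2, 3)
```

Because padding always happens on the left, a 1-D array lines up with the last dimension of the other operand unless an axis is added explicitly.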

15. What is expand_as? Show an example of how it can be used to match the results of
broadcasting.

In PyTorch, `x.expand_as(other)` returns a view of `x` expanded to the shape of `other`; it is equivalent to `x.expand(other.size())`. Only dimensions of size 1 can be expanded, and no data is copied: the expanded tensor shares storage with `x`.

Here's an example to illustrate how `expand_as` can be used to match the results of broadcasting:

```python
import torch

A = torch.tensor([[1, 2, 3]])  # shape 1x3
B = torch.tensor([[4], [5]])   # shape 2x1

C = A + B  # Broadcasting produces a 2x3 result

# Expand each operand explicitly to the broadcast shape (views, no data copied)
expanded_A = A.expand_as(C)
expanded_B = B.expand_as(C)

print("Tensor A:\n", A)
print("Tensor B:\n", B)
print("Expanded A:\n", expanded_A)
print("Expanded B:\n", expanded_B)
print("Resulting Tensor C:\n", C)
print("Matches broadcasting:", torch.equal(expanded_A + expanded_B, C))
```

Output:
```
Tensor A:
 tensor([[1, 2, 3]])
Tensor B:
 tensor([[4],
        [5]])
Expanded A:
 tensor([[1, 2, 3],
        [1, 2, 3]])
Expanded B:
 tensor([[4, 4, 4],
        [5, 5, 5]])
Resulting Tensor C:
 tensor([[5, 6, 7],
        [6, 7, 8]])
Matches broadcasting: True
```

In this example, we have tensors `A` and `B` with different shapes. `A` is of shape 1x3, and `B` is of shape 2x1. Neither tensor can be expanded to the other's shape directly, because `expand_as` can only stretch dimensions of size 1: `B.expand_as(A)` would fail, since `B` has size 2 in a dimension where `A` has size 1. Instead, both tensors are expanded to the broadcast shape 2x3, which is the shape of `C`.

Adding `expanded_A` and `expanded_B` elementwise gives exactly the same tensor as `A + B`. This shows what broadcasting does implicitly: each operand is treated as if it had been expanded to the common shape before the elementwise operation is applied.

`expand_as` is a convenient method to align the shapes of tensors for elementwise operations, eliminating the need to manually reshape or duplicate tensors. It helps to ensure consistent and compatible shapes for efficient elementwise arithmetic operations.