### 1. Write the Python code to implement a single neuron.

In [18]:
import numpy as np

class SingleNeuron:
    def __init__(self, input_size, activation_function):
        # Initialize weights and bias randomly
        self.weights = np.random.rand(input_size)
        self.bias = np.random.rand(1)
        self.activation_function = activation_function

    def forward(self, inputs):
        # Calculate weighted sum and apply activation function
        weighted_sum = np.dot(inputs, self.weights) + self.bias
        output = self.activation_function(weighted_sum)
        return output

# Example usage:
# Define an activation function (e.g., sigmoid)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Create a single neuron with 3 input features and the sigmoid activation function
neuron = SingleNeuron(input_size=3, activation_function=sigmoid)

# Input features
inputs = np.array([0.5, -0.2, 0.1])

# Forward pass through the neuron
output = neuron.forward(inputs)

# Print the output
print("Output:", output)

Output: [0.72916074]


### 2. Write the Python code to implement ReLU.

In [19]:
import numpy as np

class ReLU:
    def forward(self, x):
        # Apply ReLU activation element-wise
        return np.maximum(0, x)

# Example usage:
# Create an instance of the ReLU activation function
relu_activation = ReLU()

# Input values
input_values = np.array([-2, -1, 0, 1, 2])

# Apply ReLU activation to the input values
output_values = relu_activation.forward(input_values)

# Print the output
print("Input Values:", input_values)
print("Output Values (ReLU):", output_values)

Input Values: [-2 -1  0  1  2]
Output Values (ReLU): [0 0 0 1 2]


### 3. Write the Python code for a dense layer in terms of matrix multiplication.

In [20]:
import numpy as np

class DenseLayer:
    def __init__(self, input_size, output_size, activation_function):
        # Initialize weights and bias randomly
        self.weights = np.random.rand(input_size, output_size)
        self.bias = np.random.rand(output_size)
        self.activation_function = activation_function

    def forward(self, inputs):
        # Calculate weighted sum and apply activation function
        weighted_sum = np.dot(inputs, self.weights) + self.bias
        output = self.activation_function(weighted_sum)
        return output

# Example usage:
# Define an activation function (e.g., sigmoid)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Create a dense layer with 3 input features, 2 output features, and the sigmoid activation function
dense_layer = DenseLayer(input_size=3, output_size=2, activation_function=sigmoid)

# Input features
inputs = np.array([0.5, -0.2, 0.1])

# Forward pass through the dense layer
output = dense_layer.forward(inputs)

# Print the output
print("Output:", output)

Output: [0.57959155 0.64617456]


### 4. Write the Python code for a dense layer in plain Python (that is, with list comprehensions and functionality built into Python).

In [21]:
import math
import random

class DenseLayer:
    def __init__(self, input_size, output_size, activation_function):
        # Initialize weights (one row per output neuron) and biases with random values in [0, 1)
        self.weights = [[random.random() for _ in range(input_size)]
                        for _ in range(output_size)]
        self.bias = [random.random() for _ in range(output_size)]
        self.activation_function = activation_function

    def forward(self, inputs):
        # For each output neuron: dot product of its weight row with the inputs, plus its bias
        weighted_sums = [sum(w * x for w, x in zip(row, inputs)) + b
                         for row, b in zip(self.weights, self.bias)]
        # Apply the activation function to each weighted sum
        return [self.activation_function(z) for z in weighted_sums]

# Example usage:
# Define an activation function (e.g., sigmoid)
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Create a dense layer with 3 input neurons and 2 output neurons using sigmoid activation
dense_layer = DenseLayer(input_size=3, output_size=2, activation_function=sigmoid)

# Input features
inputs = [0.5, -0.2, 0.1]

# Forward pass through the dense layer
output = dense_layer.forward(inputs)

# Print the output: a list of 2 values in (0, 1); the exact numbers depend on the random weights
print("Output of Dense Layer:", output)


### 5. What is the “hidden size” of a layer?


The "hidden size" of a layer in a neural network is the number of neurons (units) in that layer, and therefore the number of activation values the layer produces for each input during the forward pass. It sets the dimensionality of the hidden representation in which the network learns to extract features from the input data. The hidden size is a hyperparameter chosen when designing the network. Larger hidden sizes let the network capture more intricate relationships in the data, but they also increase the number of parameters, which can lead to overfitting if not carefully controlled.
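To make the link between hidden size and parameter count concrete, here is a minimal sketch that counts the weights and biases of a network with one hidden layer. The function name `count_parameters` and the layer sizes are illustrative assumptions, not part of any library.

```python
def count_parameters(input_size, hidden_size, output_size):
    """Count weights and biases in a network with one hidden layer (hypothetical helper)."""
    # Hidden layer: one weight per (input, hidden unit) pair, plus one bias per hidden unit
    hidden_layer = input_size * hidden_size + hidden_size
    # Output layer: one weight per (hidden unit, output) pair, plus one bias per output
    output_layer = hidden_size * output_size + output_size
    return hidden_layer + output_layer

for hidden_size in (4, 16, 64):
    total = count_parameters(input_size=3, hidden_size=hidden_size, output_size=2)
    print(f"hidden_size={hidden_size:>2}: {total} parameters")
```

With 3 inputs and 2 outputs, the count grows from 26 to 98 to 386 parameters as the hidden size goes from 4 to 16 to 64.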

### 6. What does the t method do in PyTorch?

In PyTorch, the `t()` method transposes a tensor that has at most two dimensions. For a 2D tensor (matrix), it swaps the rows and columns; 0-D and 1-D tensors are returned unchanged. It does not accept tensors with more than two dimensions: for those, use `transpose(dim0, dim1)` or `permute(...)` instead.

In [22]:
import torch

# Create a 2x3 matrix
matrix = torch.tensor([[1, 2, 3],
                       [4, 5, 6]])

# Transpose the matrix
transposed_matrix = matrix.t()

print("Original Matrix:")
print(matrix)

print("\nTransposed Matrix:")
print(transposed_matrix)

Original Matrix:
tensor([[1, 2, 3],
        [4, 5, 6]])

Transposed Matrix:
tensor([[1, 4],
        [2, 5],
        [3, 6]])


### 7. Why is matrix multiplication written in plain Python very slow?


Matrix multiplication written in plain Python (using nested loops and list comprehensions) tends to be slow for several reasons:

1. **Interpreter Overhead per Element:** A naive multiplication of two n x n matrices performs on the order of n³ multiply-adds. In plain Python, each one is executed by the interpreter as several bytecode instructions, with list indexing and object lookups at every step, so the overhead per element is large.

2. **Lack of Vectorization:** Python lists and loops are not optimized for numerical computation. Libraries such as NumPy and PyTorch run the same loops in compiled code and can use vectorized (SIMD) instructions that process several elements at once.

3. **Dynamic Typing Overhead:** Python is dynamically typed, so the interpreter checks operand types at run time for every operation, and each number is stored as a full Python object. A NumPy array holds values of a single fixed data type, so compiled code can operate on the raw values without per-element type checks.

4. **Global Interpreter Lock (GIL):** Python's Global Interpreter Lock (GIL) prevents threads from executing Python bytecode in parallel, so a plain Python `matmul` effectively runs on one core. Optimized libraries can release the GIL during heavy computations and use multithreaded routines.

5. **Memory Management:** A list of lists stores references to separately allocated objects that may be scattered across memory, which leads to poor cache usage. NumPy stores array data in a contiguous buffer, and its operations are optimized for cache-friendly access.
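The points above can be seen in a minimal sketch of a plain Python `matmul` built from three nested loops and timed with the standard-library `timeit` module. The function and variable names are illustrative assumptions, and the measured time will vary from machine to machine.

```python
import timeit

def matmul(a, b):
    """Multiply two matrices given as lists of lists, using three nested loops."""
    ar, ac = len(a), len(a[0])
    br, bc = len(b), len(b[0])
    assert ac == br, "inner dimensions must match"
    c = [[0.0] * bc for _ in range(ar)]
    for i in range(ar):
        for j in range(bc):
            for k in range(ac):
                # Each multiply-add is a separate trip through the interpreter
                c[i][j] += a[i][k] * b[k][j]
    return c

n = 30
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]

# n**3 = 27,000 multiply-adds per call, all executed as Python bytecode
seconds = timeit.timeit(lambda: matmul(a, b), number=5) / 5
print(f"{n}x{n} matmul in plain Python: {seconds * 1e3:.2f} ms per call")
```

Doubling `n` multiplies the number of multiply-adds by eight, which is why the cost of the interpreter overhead becomes noticeable quickly.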

### 8. In matmul, why is ac==br?


In the context of matrix multiplication, `ac` is the number of columns in the first matrix (A) and `br` is the number of rows in the second matrix (B). The condition `ac == br` ensures that the two matrices are conformable. Standard matrix multiplication multiplies each element of a row of A by the corresponding element of a column of B and sums these products, so a row of A and a column of B must have the same length. If A has dimensions ar x ac and B has dimensions br x bc, the resulting matrix C (the product of A and B) has dimensions ar x bc.

Therefore, the multiplication requires `ac == br`. If this condition is not satisfied, the matrices are not conformable for multiplication, and attempting the operation results in a shape mismatch error.
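As a small illustration of the shape check, the sketch below computes the result shape of a matrix product or raises an error when the inner dimensions differ. The helper `matmul_shape` is a hypothetical name introduced here for illustration.

```python
def matmul_shape(a_shape, b_shape):
    """Return the shape of A @ B, or raise if the shapes are not conformable."""
    ar, ac = a_shape  # rows and columns of A
    br, bc = b_shape  # rows and columns of B
    if ac != br:
        raise ValueError(f"shape mismatch: A has {ac} columns but B has {br} rows")
    return (ar, bc)

print(matmul_shape((5, 3), (3, 2)))  # (5, 2)

try:
    matmul_shape((5, 3), (2, 3))
except ValueError as err:
    print("Error:", err)
```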

### 9. In Jupyter Notebook, how do you measure the time taken for a single cell to execute?


In Jupyter Notebook, you can measure the time taken for a single cell to execute using the `%%time` or `%%timeit` cell magics, placed on the first line of the cell. The single-percent forms, `%time` and `%timeit`, are line magics: they time only the statement written on the same line, so a bare `%time` at the top of a cell times nothing and reports 0 ns.

1. **`%%time` Magic Command:** `%%time` measures the execution time of a single run of the cell's code. It reports the CPU and wall-clock times along with the cell's output (the timing figures depend on the machine).

In [23]:
%%time
result = 0
for i in range(1000000):
    result += i
print(result)

499999500000


2. **`%%timeit` Magic Command:** `%%timeit` gives more accurate and reliable timing by running the cell many times and reporting the mean and standard deviation of the run times. It is particularly useful for measuring short execution times. Because the cell is executed repeatedly, leave out `print` calls and other side effects.

In [24]:
%%timeit
result = 0
for i in range(1000000):
    result += i


### 10. What is elementwise arithmetic?


Elementwise arithmetic refers to performing arithmetic operations on corresponding elements of two or more arrays or tensors. In the context of numerical computing, elementwise operations are applied independently to each element at corresponding positions in the input arrays, resulting in a new array with the same shape. The most common elementwise arithmetic operations include addition, subtraction, multiplication, division, exponentiation, and more. The key characteristic is that the operation is applied element by element, and the corresponding elements in the input arrays contribute to the corresponding elements in the output array.
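A minimal plain Python sketch of the idea, assuming two equal-length lists; the helper name `elementwise` is hypothetical. With PyTorch tensors or NumPy arrays, the same results are written directly as `a + b`, `a * b`, and `a < b`.

```python
import operator

def elementwise(op, a, b):
    """Apply a binary operation to corresponding elements of two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [op(x, y) for x, y in zip(a, b)]

a = [10, 6, -4]
b = [2, 8, 7]

print(elementwise(operator.add, a, b))  # [12, 14, 3]
print(elementwise(operator.mul, a, b))  # [20, 48, -28]
print(elementwise(operator.lt, a, b))   # [False, True, True]
```

Each output element depends only on the pair of input elements at the same position, which is what makes these operations easy to run in parallel.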

### 11. Write the PyTorch code to test whether every element of a is greater than the corresponding element of b.

In [25]:
import torch

# Example tensors a and b
a = torch.tensor([1, 2, 3])
b = torch.tensor([0, 2, 2])

# Check if every element of a is greater than the corresponding element of b
result = torch.all(a > b)

# Print the result
print("Every element of 'a' is greater than the corresponding element of 'b':", result.item())

Every element of 'a' is greater than the corresponding element of 'b': False


### 12. What is a rank-0 tensor? How do you convert it to a plain Python data type?


A rank-0 tensor is a scalar: a tensor with zero dimensions (its shape is `torch.Size([])`) that holds a single value. It is still a tensor object rather than a Python number. To convert a rank-0 tensor to a plain Python data type, use the `item()` method, which extracts the value and returns it as a native Python number.

Here's an example:

In [26]:
import torch

# Create a rank-0 tensor (scalar)
rank_0_tensor = torch.tensor(42)

# Convert the rank-0 tensor to a plain Python data type
python_scalar = rank_0_tensor.item()

# Print the result
print("Rank-0 Tensor:", rank_0_tensor)
print("Converted to Python Scalar:", python_scalar)

Rank-0 Tensor: tensor(42)
Converted to Python Scalar: 42


### 13. How does elementwise arithmetic help us speed up matmul?


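Elementwise arithmetic lets us remove the innermost Python loop from a hand-written matrix multiplication (`matmul`). Instead of multiplying and accumulating one pair of numbers at a time, we multiply a whole row of the first matrix by a whole column of the second in a single elementwise operation and then sum the result. That work runs in the library's optimized, compiled code rather than in the Python interpreter, where it can take advantage of parallelism, vectorization, and optimized low-level implementations.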

Here's how elementwise arithmetic contributes to the optimization of `matmul`:

1. **Parallelism and Vectorization:**
   - Modern CPUs and GPUs are capable of performing parallel operations on arrays of data. Elementwise arithmetic operations can be parallelized, allowing multiple elements to be processed simultaneously.
   - When performing matrix multiplication, the individual multiplications and additions involved can be parallelized across multiple processor cores or SIMD (Single Instruction, Multiple Data) units, leading to faster computation.

2. **Vectorized Implementations:**
   - Libraries like NumPy, PyTorch, and TensorFlow are designed to take advantage of vectorized implementations. Their core routines are written in low-level languages (such as C, C++, or Fortran), and they leverage efficient implementations of elementwise operations for various hardware architectures.
   - Matrix multiplication involves a combination of elementwise multiplications and additions. By using optimized and vectorized implementations of these operations, the overall performance of matrix multiplication is improved.
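The sketch below contrasts the two approaches using NumPy (an assumption made for illustration; the same indexing expression works on PyTorch tensors). `matmul_loops` uses three Python loops, while `matmul_elementwise` replaces the innermost loop with one elementwise multiply followed by a sum. Both function names are hypothetical.

```python
import numpy as np

def matmul_loops(a, b):
    """Three nested Python loops: one multiply-add per iteration."""
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            for k in range(ac):
                c[i, j] += a[i, k] * b[k, j]
    return c

def matmul_elementwise(a, b):
    """Two Python loops: the inner loop becomes one elementwise multiply and a sum."""
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br
    c = np.zeros((ar, bc))
    for i in range(ar):
        for j in range(bc):
            # Row i of a times column j of b, elementwise, then summed
            c[i, j] = (a[i, :] * b[:, j]).sum()
    return c

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([[7.0, 8.0], [9.0, 10.0]])

print(matmul_elementwise(a, b))
print(np.allclose(matmul_loops(a, b), matmul_elementwise(a, b)))  # True
```

The number of Python-level iterations drops from ar x bc x ac to ar x bc; the remaining work per output element is done inside the library.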

### 14. What are the broadcasting rules?

Broadcasting is a mechanism in array-oriented computing libraries (such as NumPy, PyTorch, and TensorFlow) that allows operations on arrays of different shapes and sizes to be performed without explicitly copying the data. Broadcasting follows a set of rules to determine how the smaller array can be "broadcast" across the larger array to make their shapes compatible for elementwise operations. The broadcasting rules are as follows:

1. **Rule 1: Align the Shapes:**
   - Compare the shapes starting from the trailing (rightmost) dimension. If the arrays have a different number of dimensions, pad the smaller array's shape on its left side with ones until the shapes have the same length.

2. **Rule 2: Compatible Dimensions:**
   - After applying Rule 1, two dimensions are compatible if their sizes are the same or one of them is 1.

   Example:
   ```text
   A:      4 x 3
   B:          3   # Rule 1: treated as 1 x 3 after padding with ones on the left
   Result: 4 x 3   # Rule 2: 3 matches 3, and 1 is compatible with 4
   ```

3. **Rule 3: Size 1 Expansion:**
   - If the sizes of a dimension are different but one of them is 1, the array with size 1 is expanded along that dimension to match the size of the other array. The expansion is virtual: no data is copied.

   Example:
   ```text
   A:      4 x 3
   B:      1 x 3   # Same number of dimensions, so no padding is needed
   Result: 4 x 3   # Rule 3: B is expanded along its first dimension from 1 to 4
   ```

4. **Rule 4: Incompatible Dimensions:**
   - If the sizes of a dimension are different and neither of them is 1, the arrays are not broadcast-compatible and the operation raises an error.

   Example:
   ```text
   A:      4 x 3
   B:      2 x 3
   Result: error   # Rule 4: 4 and 2 differ and neither is 1
   ```

5. **Rule 5: Broadcasting along Multiple Dimensions:**
   - The rules are applied to every dimension independently, so both arrays can be expanded at once. In each dimension, the sizes must either be the same or one of them must be 1, and the result takes the larger size.

   Example:
   ```text
   A:      4 x 1
   B:      1 x 3
   Result: 4 x 3   # Rule 3 applied twice: A expands along its second dimension, B along its first
   ```
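The rules above can be condensed into a small shape calculator. This is a plain Python sketch, not the implementation used by NumPy or PyTorch, and the function name `broadcast_shape` is an assumption made for illustration.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with ones on the left
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        if size_a == size_b or size_b == 1:
            # Sizes match, or b is stretched to the size of a
            result.append(size_a)
        elif size_a == 1:
            # a is stretched to the size of b
            result.append(size_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))    # (4, 3)
print(broadcast_shape((4, 1), (1, 3)))  # (4, 3)

try:
    broadcast_shape((4, 3), (2, 3))
except ValueError as err:
    print("Error:", err)
```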

### 15. What is expand_as? Show an example of how it can be used to match the results of broadcasting.


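In PyTorch, the `expand_as` method expands a tensor to the shape of another tensor, following the broadcasting rules: `x.expand_as(other)` is equivalent to `x.expand(other.size())`. Only dimensions of size 1 can be expanded, and the result is a view of the original data rather than a copy. Expanding each operand to the broadcast shape by hand and then applying an elementwise operation gives the same result as letting PyTorch broadcast implicitly.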

Here's an example of how `expand_as` can be used to match the results of broadcasting:

In [27]:
import torch

# Example tensors
tensor_a = torch.tensor([[1, 2, 3]])    # shape (1, 3)
tensor_b = torch.tensor([[4], [5]])     # shape (2, 1)

# Implicit broadcasting: the result has shape (2, 3)
result_broadcast = tensor_a + tensor_b

# Explicit expansion with expand_as: expand both operands to the broadcast shape (2, 3)
expanded_a = tensor_a.expand_as(result_broadcast)
expanded_b = tensor_b.expand_as(result_broadcast)
result_expand_as = expanded_a + expanded_b

# Print the results
print("Result without expand_as (broadcasting):")
print(result_broadcast)

print("\nResult with expand_as:")
print(result_expand_as)

Result without expand_as (broadcasting):
tensor([[5, 6, 7],
        [6, 7, 8]])

Result with expand_as:
tensor([[5, 6, 7],
        [6, 7, 8]])