# Introduction to Computer Vision. Lab 04: Back-propagation With Scalar Variable


## Introduction

In this lesson, we will explore the back-propagation algorithm and how it can be applied to find optimal parameters for image classification models.


### Finding a Good $W$

Last week we saw that finding a ‘good’ $W$, one that minimizes the average loss function, by brute-force search is so time-consuming that it is impractical even for small images. Is there a mathematical approach to finding a good $W$?

From optimization, you may know that there is a very efficient algorithm for minimizing a convex function (or maximizing a concave one), and that for a convex function any minimum it finds is the global minimum. Is our average loss function convex with respect to $W$, so that we can apply this algorithm? That depends on the function $f(x, W)$: for some choices of $f(x, W)$ the loss is convex, and for others it is not.

But even if the average loss function is non-convex with respect to $W$, we will still use this minimization algorithm, simply because no one has invented an efficient algorithm for minimizing non-convex functions that we could use instead (maybe you will invent one and become famous and rich!). In that case the algorithm may settle in a local minimum rather than the global one.
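
The text above does not name the minimization algorithm. The sketch below assumes it is plain gradient descent and applies it to two scalar functions chosen only for illustration: a convex one, where the result is the global minimum, and a non-convex one, where the result depends on the starting point. The function names, learning rates, and step counts are all hypothetical choices, not part of the lab.

```python
def gradient_descent(grad_fn, w0, learning_rate=0.1, num_steps=100):
    """Repeatedly step a scalar w against the derivative returned by grad_fn."""
    w = w0
    for _ in range(num_steps):
        w = w - learning_rate * grad_fn(w)
    return w


# Convex example (assumed for illustration): loss(w) = (w - 3)^2.
def convex_grad(w):
    return 2.0 * (w - 3.0)


# Non-convex example (assumed for illustration): loss(w) = w^4 - 3w^2 + w.
def nonconvex_loss(w):
    return w ** 4 - 3.0 * w ** 2 + w


def nonconvex_grad(w):
    return 4.0 * w ** 3 - 6.0 * w + 1.0


# On the convex function, the starting point does not matter.
w_convex = gradient_descent(convex_grad, w0=-5.0)
print("Convex minimizer:", w_convex)

# On the non-convex function, different starts reach different minima.
w_from_left = gradient_descent(nonconvex_grad, w0=-2.0, learning_rate=0.01, num_steps=500)
w_from_right = gradient_descent(nonconvex_grad, w0=2.0, learning_rate=0.01, num_steps=500)
print("Start at -2:", w_from_left, "loss:", nonconvex_loss(w_from_left))
print("Start at +2:", w_from_right, "loss:", nonconvex_loss(w_from_right))
```

In the non-convex case, the run that starts at $w = 2$ stops in a minimum with a higher loss than the run that starts at $w = -2$, which is the risk accepted when the same algorithm is used on a non-convex loss.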


## Exercise 1

Create two functions: one that performs forward propagation over the softmax and one that performs backward propagation over the softmax, both without the normalization layer included.

- You can, if you wish, combine both functions into one.
- These functions must be created for any number of labels $C$.


In [None]:
import numpy as np

# Function to perform forward propagation over the softmax
def softmax_forward(logits):
    exps = np.exp(logits)
    return ''' TO DO '''

# Function to perform backward propagation over the softmax
def softmax_backward(softmax_output, true_labels):
    return ''' TO DO '''

# Example usage
logits = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
true_labels = np.array([[0, 0, 1], [0, 1, 0]])

softmax_output = softmax_forward(logits)
print("Softmax Output:\n", softmax_output)

grad = softmax_backward(softmax_output, true_labels)
print("Gradient:\n", grad)

# Assert statements to check correctness
assert softmax_output.shape == logits.shape, "Output shape mismatch"
assert grad.shape == logits.shape, "Gradient shape mismatch"


### Backpropagation

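Back-propagation is the algorithm that makes gradient-based minimization practical. It applies the chain rule backward through the model to compute the gradient of the loss function with respect to every parameter efficiently. These gradients are then used to update the parameters so that the loss decreases.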
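
As a minimal sketch of the idea with scalar variables, the example below runs a forward pass through a tiny computation, then a backward pass that applies the chain rule one step at a time. The model $z = wx + b$, the squared-error loss, and all names and values are assumptions made for illustration; they are not taken from the exercises.

```python
def forward(w, b, x, y):
    """Forward pass: compute the loss and cache what the backward pass needs."""
    z = w * x + b      # linear step
    r = z - y          # residual between prediction and target
    loss = r ** 2      # squared error
    cache = (x, r)
    return loss, cache


def backward(cache):
    """Backward pass: chain rule from the loss back to w and b."""
    x, r = cache
    dloss_dr = 2.0 * r             # d(r^2)/dr
    dloss_dz = dloss_dr * 1.0      # dr/dz = 1
    dloss_dw = dloss_dz * x        # dz/dw = x
    dloss_db = dloss_dz * 1.0      # dz/db = 1
    return dloss_dw, dloss_db


w, b, x, y = 0.5, -1.0, 2.0, 3.0
loss, cache = forward(w, b, x, y)
dw, db = backward(cache)
print("loss:", loss, "dL/dw:", dw, "dL/db:", db)

# Sanity check with a central finite difference on w.
eps = 1e-6
numeric_dw = (forward(w + eps, b, x, y)[0] - forward(w - eps, b, x, y)[0]) / (2 * eps)
print("numeric dL/dw:", numeric_dw)
```

Comparing the analytic gradient with a finite-difference estimate, as in the last lines, is a common way to check a hand-written backward pass.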


## Exercise 2

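Create functions that perform forward propagation and backward propagation over the softmax with the normalization layer included, that is, with the maximum logit subtracted before exponentiation, as described under "Softmax with Normalization Layer".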

- You can, if you wish, combine both functions into one.
- These functions must be created for any number of labels $C$.


In [None]:
# Function to perform forward propagation over the softmax with normalization
def softmax_with_normalization_forward(logits):
    logits_normalized = ''' TO DO '''
    exps = ''' TO DO '''
    return ''' TO DO '''

# Function to perform backward propagation over the softmax with normalization
def softmax_with_normalization_backward(softmax_output, true_labels):
    return ''' TO DO '''

# Example usage
logits = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
true_labels = np.array([[0, 0, 1], [0, 1, 0]])

softmax_output = softmax_with_normalization_forward(logits)
print("Softmax Output with Normalization:\n", softmax_output)

grad = softmax_with_normalization_backward(softmax_output, true_labels)
print("Gradient with Normalization:\n", grad)

# Assert statements to check correctness
assert softmax_output.shape == logits.shape, "Output shape mismatch"
assert grad.shape == logits.shape, "Gradient shape mismatch"


### Softmax with Normalization Layer

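In practice, when the logits (the inputs to the softmax) are very large, exponentiating them can overflow the floating-point range, causing numerical instability. To counter this, we normalize the logits by subtracting each sample's maximum logit value from every logit of that sample before applying the softmax. Because the same constant is subtracted from every logit, it cancels between the numerator and the denominator, so the softmax output is unchanged.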
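
The effect can be seen in a short NumPy sketch. The function names `naive_softmax` and `shifted_softmax` are hypothetical, and the code assumes that each row of `logits` holds the logits of one sample; it is one possible way to write the idea, not the required solution to the exercises.

```python
import numpy as np


def naive_softmax(logits):
    """Softmax without any shift; overflows when logits are large."""
    exps = np.exp(logits)
    return exps / np.sum(exps, axis=1, keepdims=True)


def shifted_softmax(logits):
    """Softmax after subtracting each row's maximum logit."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=1, keepdims=True)


small = np.array([[1.0, 2.0, 3.0]])
large = small + 1000.0  # same differences between logits, much larger values

# For small logits both versions agree.
print("naive, small logits:  ", naive_softmax(small))
print("shifted, small logits:", shifted_softmax(small))

# For large logits, exp overflows to inf and inf / inf gives nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = naive_softmax(large)
print("naive, large logits:  ", naive_large)

# The shifted version stays finite and matches the small-logit result.
shifted_large = shifted_softmax(large)
print("shifted, large logits:", shifted_large)
```

After the shift, the largest exponent in each row is $e^0 = 1$, so the exponentials can no longer overflow.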


## Conclusion

In this lesson, we explored the back-propagation algorithm and its application in optimizing parameters for image classification models. We implemented forward and backward propagation for the softmax function, both with and without normalization, and discussed the importance of normalization in preventing numerical instability.


## Additional Resources

For further reading and more complex image processing techniques, consider exploring the following resources:

- [OpenCV Documentation](https://docs.opencv.org/)
- [scikit-image Documentation](https://scikit-image.org/docs/stable/)
- [Computer Vision: Algorithms and Applications](https://szeliski.org/Book/)

Feel free to search for more information and examples online to enhance your understanding of computer vision.
