# The Optimization Function

An optimization function (more commonly called an optimizer) in a neural network is an algorithm that adjusts the model's weights to minimize the error (or loss) between the network's predictions and the actual target values. Common optimizers include **Gradient Descent** and variants that build on it, such as **Adam** and **RMSprop**.

Today we will focus on perhaps the simplest one: **Gradient Descent**.


Before doing the activity below, familiarize yourself with some key concepts by completing the following two activities:
- [What is a gradient and how does it descend?](<3.1.1 gradient_descent.ipynb>)
- [Backpropagation or how to calculate the gradients?](<3.1.2 concept_of_backpropagation.ipynb>)

In summary:
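- A gradient tells us in which direction, and how strongly, to adjust each parameter: it points toward the steepest increase of the loss, so we move the parameters the opposite way.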
- Backpropagation allows us to compute the gradients from the output layer all the way back to the input of our model.
- Gradient Descent is an algorithm which will iteratively adjust the parameters of our model in order to minimize the error (loss) of our predictions.
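
To make the summary concrete, here is a minimal sketch of gradient descent on a one-parameter toy problem. The loss `(w - 3) ** 2`, the starting point, the learning rate, and the number of steps are all assumptions chosen for illustration; they are not part of the activity below.

```python
def loss(w):
    """Toy loss with its minimum at w = 3 (hypothetical example)."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0              # arbitrary starting point
learning_rate = 0.1  # fraction of the gradient applied at each step

for step in range(50):
    # Move against the gradient, since it points toward increasing loss.
    w = w - learning_rate * gradient(w)

print("w:", w, "loss:", loss(w))
```

Each pass through the loop recomputes the gradient at the current `w`, so the steps shrink as `w` approaches the minimum.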

In [None]:
import numpy as np


input = np.array([1.0, 2.0, 3.0, 4.0])
target = np.array([2.0, 3.0, 4.0, 5.0])

W1 = np.array([[1.5, 1.3, 1.8, 1.1],
              [1.5, 1.3, 1.8, 1.1],
              [1.5, 1.3, 1.8, 1.1],
              [1.5, 1.3, 1.8, 1.1]]) # input-layer weights, hard-coded here in place of a random initialization; sized for a single example, adjust if you want to test with more examples

prediction = input @ W1

loss = np.mean((prediction - target) ** 2) # mean squared error
print("loss", loss)

# TODO: calculate the gradient of the loss with respect to W1 (it has the same shape as W1)
G1 = ...
print("gradient", G1)
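
As a hint for the TODO above, the gradient of the mean squared error can be derived by hand. The notation here is an assumption introduced for this sketch: $x$ is the input, $W$ the weight matrix, $p = xW$ the prediction, $t$ the target, and $n$ the number of outputs (4 in this cell).

$$
L = \frac{1}{n}\sum_{j=1}^{n}\left(p_j - t_j\right)^2, \qquad p_j = \sum_{i} x_i W_{ij}
$$

$$
\frac{\partial L}{\partial W_{ij}} = \frac{2}{n}\,\left(p_j - t_j\right)\,x_i
$$

In other words, entry $(i, j)$ of the gradient is the input $x_i$ multiplied by the scaled error of output $j$. This is an outer product, which NumPy provides as `np.outer`.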


Now that we have a gradient, let's try to bring the loss **below 1.30**. Try your best!

In [None]:
# TODO: update the weights with the gradient
New_W1 = ...

prediction = input @ New_W1 # same operand order as the forward pass in the previous cell
loss = np.mean((prediction - target) ** 2)
print("new loss", loss) # should be smaller than the previous loss

W1 = New_W1
# Run this cell several times to watch the loss decrease or increase

Great job! Remember, when updating the weights using the gradient, you typically apply only a fraction of the gradient. This fraction is controlled by the **learning rate**. You can set the learning rate manually or use more advanced optimizers, such as Adam or RMSprop, that adapt the step size automatically.
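
The effect of the learning rate can be seen in a small experiment. This is a sketch on a hypothetical one-parameter loss, `(w - 3) ** 2`, not on the activity's weights; the helper `final_loss` and the three learning rates are assumptions chosen to show a step that is too small, one that works well, and one that is too large.

```python
def final_loss(learning_rate, steps=20, w=0.0):
    """Run gradient descent on the toy loss (w - 3)^2 and return the final loss."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)        # gradient of (w - 3)^2 at the current w
        w = w - learning_rate * grad  # apply a fraction of the gradient
    return (w - 3.0) ** 2


for lr in (0.01, 0.1, 1.1):
    print(f"learning rate {lr}: final loss {final_loss(lr):.4g}")
```

With the smallest rate the loss barely moves in 20 steps, with the middle one it gets close to zero, and with the largest one every step overshoots the minimum and the loss grows.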

Now you understand how a model learns and adapts by updating its weights through gradient descent. Keep in mind that the gradient depends on the current weights, so it must be recalculated after every update. In frameworks that accumulate gradients, it must also be reset to zero before each new computation.
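
Putting the pieces together, a full training loop might look like the sketch below. It reuses the values from the activity, but renames `input` to `x` to avoid shadowing the Python built-in. The learning rate of 0.01 and the 50 steps are assumed values, and the gradient formula in the comment holds only for this linear model with a mean squared error loss.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])         # same values as `input` in the activity
target = np.array([2.0, 3.0, 4.0, 5.0])
W = np.tile([1.5, 1.3, 1.8, 1.1], (4, 1))  # same starting weights as W1

learning_rate = 0.01  # assumed value; larger steps can make the loss grow
losses = []

for step in range(50):
    prediction = x @ W
    error = prediction - target
    losses.append(np.mean(error ** 2))

    # Recompute the gradient from the current weights on every step.
    # For this linear model: dL/dW[i, j] = 2 / n * error[j] * x[i].
    grad = np.outer(x, 2.0 * error / error.size)

    W = W - learning_rate * grad

print("first loss:", losses[0])
print("last loss:", losses[-1])
```

Reusing a gradient computed from older weights would push the parameters in a direction that no longer matches the current loss surface, which is why `grad` is rebuilt inside the loop.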