# Optimizing with gradient descent

Our neural network framework is almost complete. The `Scalar` class has nearly everything we need; the only thing left is to implement gradient descent.

Let's recap: suppose that we want to find a minimum of the function $ f(x) $. Analytic solutions are rarely possible, so quite often the best option is an iterative method called gradient descent. According to theory, if the starting point $ x_0 $ and the learning rate $ h $ are <strike>luckily</strike> properly selected, the sequence

$$
x_{n + 1} = x_n - h f^\prime(x_n)
$$ (eq:computational-graphs/gradient-descent/gradient-update)

converges to a local minimum. I know, it feels like black magic, but it works.
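To take some of the magic out of it, here is a minimal sketch of the iteration on a function whose minimum we already know. The helper `gradient_descent`, the test function $ f(x) = (x - 3)^2 $, and the values of `x0`, `h`, and `n_steps` are all chosen just for this illustration; none of them is part of our framework.

```python
def gradient_descent(df, x0, h, n_steps):
    """Iterate x_{n+1} = x_n - h * f'(x_n), starting from x0."""
    x = x0
    for _ in range(n_steps):
        x = x - h * df(x)
    return x


# Illustrative example: f(x) = (x - 3)^2 has its only minimum at x = 3,
# and its derivative is f'(x) = 2 * (x - 3).
def df(x):
    return 2 * (x - 3)


x_min = gradient_descent(df, x0=0.0, h=0.1, n_steps=100)
print(x_min)  # very close to 3
```

For this particular function, each step multiplies the distance from the minimum by $ 1 - 2h $, so the iteration only converges when $ 0 < h < 1 $. With a larger learning rate, the iterates move away from the minimum instead of towards it, which is why the choice of $ h $ matters.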

Here's our training loop so far.

```
# -- inputs --
# model: the machine learning model
# xs: training data
# ys: the ground truth
# n_epochs: the number of iterations
# lr: learning rate

for _ in range(n_epochs):
    ys_pred = [model(x) for x in xs]
    loss = loss_function(ys, ys_pred)

    loss.backward()

    for p in model.parameters:
        p.gradient_update(lr)
```

There's only one thing missing: the gradient update. Based on {eq}`eq:computational-graphs/gradient-descent/gradient-update`, it's quite simple to add.