### Overview

This section illustrates how gradient descent can be used to minimize a function.

Suppose we have an error surface defined by $E(u, v) = (ue^v-2ve^{-u})^2$.

By the chain rule, the partial derivatives are

$$
\begin{aligned}
\frac{\partial E}{\partial u} &= 2(ue^v-2ve^{-u})(e^v + 2ve^{-u})\\
\frac{\partial E}{\partial v} &= 2(ue^v-2ve^{-u})(ue^v - 2e^{-u})
\end{aligned}
$$
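A quick way to sanity-check these formulas is to compare them against a finite-difference approximation. The sketch below is only illustrative: the helper names `grad_E` and `numeric_grad_E` and the step size `h` are assumptions introduced here, not part of the original notebook.

```python
from math import exp

def E(u, v):
    return (u*exp(v) - 2*v*exp(-u))**2

def grad_E(u, v):
    """Analytic partial derivatives (dE/du, dE/dv) from the formulas above."""
    common = 2*(u*exp(v) - 2*v*exp(-u))
    return common*(exp(v) + 2*v*exp(-u)), common*(u*exp(v) - 2*exp(-u))

def numeric_grad_E(u, v, h=1e-6):
    """Central-difference approximation; h is an illustrative step size."""
    dE_du = (E(u + h, v) - E(u - h, v)) / (2*h)
    dE_dv = (E(u, v + h) - E(u, v - h)) / (2*h)
    return dE_du, dE_dv

# Compare the two at the starting point (u, v) = (1, 1).
analytic = grad_E(1.0, 1.0)
numeric = numeric_grad_E(1.0, 1.0)
print(analytic, numeric)
```

If the derivation is right, the two pairs should agree to several decimal places.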

In [18]:
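from math import exp

def E(u, v):
    """Error surface E(u, v) = (u*e^v - 2*v*e^(-u))^2."""
    return (u*exp(v) - 2*v*exp(-u))**2

def gradient_descent(tolerance=1e-14, learning_rate=0.05):
    # Start at (u, v) = (1, 1) and step until the error drops below tolerance.
    u = 1
    v = 1
    iterations = 0

    while E(u, v) >= tolerance:
        # Step u along -dE/du.
        u -= learning_rate * (2*(u*exp(v) - 2*v*exp(-u))*(exp(v) + 2*v*exp(-u)))
        # Step v along -dE/dv. Note: this uses the u updated on the line above,
        # so the coordinates are updated one after the other, not simultaneously.
        v -= learning_rate * (2*(u*exp(v) - 2*v*exp(-u))*(u*exp(v) - 2*exp(-u)))
        iterations += 1
        print(iterations, u, v, E(u, v))

    return u, v, iterations

gradient_descent()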

1 0.31522850340158015 0.9637157183161631 0.33634280992962556
2 0.4447107652506278 0.9629051461150087 0.0048489441416542985
3 0.46141807889386727 0.9628768614283155 2.9357732555861024e-05
4 0.4627306728624182 0.9628750332606111 1.503952909869561e-07
5 0.4628246919848664 0.9628749043188305 7.597967090200997e-10
6 0.4628313749841686 0.9628748951636231 3.834645788663458e-12
7 0.4628318497582682 0.9628748945132695 1.9351834628142132e-14
8 0.46283188348587945 0.9628748944670691 9.766002182388126e-17


(0.46283188348587945, 0.9628748944670691, 8)
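In the loop above, `v` is updated using the value of `u` that was just modified, so each iteration moves along one coordinate at a time. Textbook gradient descent instead evaluates both partial derivatives at the same point before moving. The following is a minimal sketch of that variant; the names `grad_E` and `gradient_descent_simultaneous` and the `max_iterations` safeguard are assumptions added for illustration.

```python
from math import exp

def E(u, v):
    return (u*exp(v) - 2*v*exp(-u))**2

def grad_E(u, v):
    """Both partial derivatives, evaluated at the same point (u, v)."""
    common = 2*(u*exp(v) - 2*v*exp(-u))
    return common*(exp(v) + 2*v*exp(-u)), common*(u*exp(v) - 2*exp(-u))

def gradient_descent_simultaneous(tolerance=1e-14, learning_rate=0.05,
                                  max_iterations=1000):
    u, v = 1.0, 1.0
    iterations = 0
    # max_iterations is a hypothetical cap so the loop cannot run forever.
    while E(u, v) >= tolerance and iterations < max_iterations:
        du, dv = grad_E(u, v)  # gradient at the current point
        u, v = u - learning_rate*du, v - learning_rate*dv  # move both at once
        iterations += 1
    return u, v, iterations

u, v, iterations = gradient_descent_simultaneous()
print(iterations, u, v, E(u, v))
```

Because the two schemes take different steps from the first iteration onward, the iterates and the point reached will generally differ from the run shown above, even though both drive the error below the tolerance.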

In [17]:
exp(-5)

0.006737946999085467