## Extrema

There are two common ways to find the extrema of a function:

1. Find the roots of the equation $\frac{\partial f}{\partial x}=0$.
2. Approach a local extremum step by step.

Since the first method is covered in the note on finding roots, we only consider the second method here.

Finding extrema is a core topic in machine learning, and many methods have been developed for it as the field has grown, e.g. OGD, SGD, momentum, Adam and so on. Here, we only discuss ordinary gradient descent (OGD).

### Steepest-Descent Method

The gradient of a function points in the direction of steepest ascent, so the negative gradient points in the direction of steepest descent. We can therefore move $x$ along the negative gradient to approach a local minimum.

$$x_{i+1} = x_i -\alpha\nabla f(x_i)$$

where $\alpha$ is the learning rate, a hyperparameter that is set manually.
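To make the role of $\alpha$ concrete, here is a minimal sketch that applies the update rule to the one-dimensional function $f(x) = x^2$, whose derivative is $2x$. The helper name `descend`, the starting point and the step count are illustrative assumptions, not part of the original note.

```julia
# Repeatedly apply x ← x - α f'(x) for f(x) = x^2, where f'(x) = 2x.
# `descend` is a hypothetical helper written only for this illustration.
function descend(x0::Float64, α::Float64, steps::Int)
    x = x0
    for _ in 1:steps
        x -= α * 2x
    end
    return x
end

small = descend(1.0, 0.1, 50)   # each step multiplies x by 1 - 2α = 0.8
large = descend(1.0, 1.1, 50)   # each step multiplies x by 1 - 2α = -1.2

println("α = 0.1: x after 50 steps = ", small)
println("α = 1.1: x after 50 steps = ", large)
```

For this particular function every step multiplies $x$ by $1-2\alpha$, so the iteration converges to the minimum only when $0 < \alpha < 1$. With $\alpha = 1.1$ the iterates overshoot and grow in magnitude instead.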

#### Example:

Let's find the minimum of the function $f(x, y) = x^2+y^2$.

#### Code Example (Julia language):

In machine learning, we usually use automatic differentiation to obtain the derivatives of a function, e.g. with the Zygote package for Julia. Here, however, we use the analytic gradient directly:

$$\nabla f(x,y) = 2x\vec{e_x}+2y\vec{e_y}$$
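As a sanity check, the analytic gradient can be compared against automatic differentiation. The sketch below assumes the Zygote package is installed (for example via `import Pkg; Pkg.add("Zygote")`). The evaluation point $(1, 2)$ is an arbitrary choice.

```julia
using Zygote

f(x, y) = x^2 + y^2

# gradient returns a tuple with one partial derivative per argument of f
g = gradient(f, 1.0, 2.0)

# should agree with the analytic gradient (2x, 2y) evaluated at (1.0, 2.0)
println(g)
```

If the two disagree, the hand-written derivative is the first thing to re-check.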

# define the function and its gradient
function func(x::Float64, y::Float64)
    return x^2 + y^2
end

function derivedFunc(x::Float64, y::Float64)
    return 2x, 2y
end

# define variables
lr = 0.001  # learning rate
ϵ = 1.0e-8  # tolerance on the change in function value
x = rand()  # initial x, drawn uniformly from [0, 1)
y = rand()  # initial y, drawn uniformly from [0, 1)

# iterate until the function value changes by less than ϵ
while true
    global x, y  # required when run as a script; harmless in a notebook or the REPL
    originalValue = func(x, y)
    dx, dy = derivedFunc(x, y)
    x -= lr * dx
    y -= lr * dy
    newValue = func(x, y)

    if abs(newValue - originalValue) < ϵ
        break
    end
end

println("x: ", x)
println("y: ", y)
println("Function value: ", func(x, y))

x: 0.00014094409376036872
y: 0.0015717102227927895
Function value: 2.4901382619972914e-6
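
The run above stops at a function value of about $2.5\times10^{-6}$ rather than at the exact minimum $0$. For this function each step scales $f$ by $(1-2\alpha)^2$, so the change per step is $4\alpha(1-\alpha)f$, which falls below $\epsilon = 10^{-8}$ once $f \approx 2.5\times10^{-6}$ when $\alpha = 0.001$. One common alternative, sketched below, is to stop when the gradient norm is small and to cap the number of iterations. The name `gradientDescent`, its keyword arguments and their default values are assumptions made for illustration.

```julia
# Gradient descent that stops on a small gradient norm or after maxIter steps.
# `gradientDescent` and its keyword names are hypothetical, chosen for this sketch.
function gradientDescent(grad, x0::Vector{Float64};
                         lr::Float64 = 0.1, tol::Float64 = 1.0e-8,
                         maxIter::Int = 10_000)
    x = copy(x0)
    for i in 1:maxIter
        g = grad(x)
        # Euclidean norm of the gradient, computed without extra packages
        if sqrt(sum(abs2, g)) < tol
            return x, i - 1   # number of updates actually performed
        end
        x -= lr * g
    end
    return x, maxIter
end

gradF(v) = 2 .* v   # gradient of f(x, y) = x^2 + y^2 for a vector v = [x, y]

xmin, iters = gradientDescent(gradF, [0.5, 0.5])
println("minimizer: ", xmin)
println("iterations: ", iters)
```

Wrapping the loop in a function also avoids updating global variables. The iteration cap guards against an endless loop when the learning rate is too large for the iteration to converge.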
