The idea is to add Gaussian noise to a descent algorithm so that it converges to the global minimum rather than a local one. Three algorithms are implemented:
- Gradient Descent with Noise
- Momentum 1 (Langevin)
- Momentum 2
The goal of this repository is to find the global minimum of a non-convex function. To reach the global minimum, we add noise to three different descent algorithms; this approach is known as simulated annealing. For each algorithm to converge to the global minimum, the noise must fade as the iterations progress (a cooling schedule), as sketched below.
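As a rough illustration (this is only a sketch of the idea, not the stdsct implementation), a noisy descent step adds a Gaussian perturbation whose amplitude shrinks with the iteration count; the 1/it cooling schedule below is an assumed, common choice:

import numpy as np

def noisy_descent_sketch(Func, Jac, x_init, gamma=1.0, amp=1.0, tol=1e-5, itmax=1000, seed=0):
    # Sketch only: a plain gradient step plus Gaussian noise whose
    # amplitude decays (assumed 1/it schedule), playing the role of
    # the cooling schedule in simulated annealing.
    rng = np.random.default_rng(seed)
    x = np.asarray(x_init, dtype=float)
    for it in range(1, itmax + 1):
        step = -Jac(x) / gamma + (amp / it) * rng.standard_normal(x.shape)
        if np.linalg.norm(step) < tol:
            break
        x = x + step
    return x, Func(x), it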
For example, let us pick a non-convex function F.
The plot of F is as follows:
We can see that F has a global minimum at (x_1, 0) for some x_1 < 0 and a local minimum at (x_2, 0) for some x_2 > 0. To make the problem harder, we choose (1, 1) as the initial guess, which lies in the basin of the local minimum. We obtain the following results:
| Algorithm | (x, y) | F(x, y) |
| --- | --- | --- |
| Gradient Descent | [0.94, 0] | -0.152 |
| Gradient Descent with Noise | [-1.04, 0] | -0.35 |
| Momentum 1 | [-1.04, 0] | -0.35 |
| Momentum 2 | [-1.04, 0] | -0.35 |
The table shows that plain gradient descent converges to the local minimum, while the three noisy algorithms converge to the global minimum.
"""
Gradient Descent with Noise Algorithm
PARAMETERS:
-----------
* Func : callable
Function to minimize
* Jac : callable
Jacobian of the function to minimize
* gamma : float
Real number by which the gradient is divided (step-size divisor)
* amp : float
Real number, amplitude of the noise
* tol : float
Real number, tolerance for convergence
* x_init : ndarray
Initial guess of x
* itmax : int
maximum number of iterations
* K : int
number of times to repeat the algorithm
* it_init : int, optional
iteration number at which to start the algorithm
* bounds : sequence
sequence of (min, max) pairs; each component of x is kept within its pair
* DirDoF : list of tuple
degree(s) of freedom (component indices) of x that are held fixed
RETURNS:
-------
* x : ndarray
value of x at which Func is minimal, same shape as x_init
* min_Func : float
minimum value of Func(x)
* min_it : int
iteration number at which the algorithm stopped
"""
import numpy as np
from stdsct import *

# Non-convex test function: a quadratic bowl with a cosine ripple along x.
def f(xt):
    return 0.5*xt[0]**2 + 0.5*(1.0 - np.cos(2*xt[0])) + xt[1]**2

# Its gradient (Jacobian), component by component.
def df(xt):
    return np.array([xt[0] + np.sin(2*xt[0]), 2.0*xt[1]])

K = 10          # number of restarts of the algorithm
itmax = 1000    # maximum number of iterations per run
tol = 1e-5      # convergence tolerance
gamma = 1.0     # gradient divisor (step-size control)
amp = 1.0       # noise amplitude
x0 = np.array([3.0, -5.0])  # initial guess

x1, min1, it1 = GradientNoise(f, df, gamma, amp, tol, x0, itmax, K)
"""
Momentum 1 Algorithm
PARAMETERS:
-----------
* Func : callable
Function to minimize
* Jac : callable
Jacobian of the function to minimize
* lda : float
Real number between 0 and 1 (momentum coefficient)
* gamma : float
Real number by which the gradient is divided (step-size divisor)
* amp : float
Real number, amplitude of the noise
* tol : float
Real number, tolerance for convergence
* x_init : ndarray
Initial guess of x
* v_init : ndarray
Initial guess for the Momentum 1 variable (velocity), same shape as x_init
* itmax : int
maximum number of iterations
* K : int
number of times to repeat the algorithm
* it_init : int, optional
iteration number at which to start the algorithm
* bounds : sequence
sequence of (min, max) pairs; each component of x is kept within its pair
* DirDoF : list of tuple
degree(s) of freedom (component indices) of x that are held fixed
RETURNS:
-------
* x : ndarray
value of x at which Func is minimal, same shape as x_init
* v : ndarray
final value of the Momentum 1 variable (velocity)
* min_Func : float
minimum value of Func(x)
* min_it : int
iteration number at which the algorithm stopped
"""
import numpy as np
from stdsct import *

# Same non-convex test function and gradient as above.
def f(xt):
    return 0.5*xt[0]**2 + 0.5*(1.0 - np.cos(2*xt[0])) + xt[1]**2

def df(xt):
    return np.array([xt[0] + np.sin(2*xt[0]), 2.0*xt[1]])

K = 10          # number of restarts
itmax = 1000    # maximum number of iterations per run
tol = 1e-5      # convergence tolerance
lda = 0.1       # momentum coefficient
gamma = 1.0     # gradient divisor
amp = 1.0       # noise amplitude
x0 = np.array([3.0, -5.0])  # initial guess
v0 = np.copy(x0)            # initial velocity

x2, v2, min2, it2 = Momentum1(f, df, lda, gamma, amp, tol, x0, v0, itmax, K)
"""
Momentum 2 Algorithm
PARAMETERS:
-----------
* Func : callable
Function to minimize
* Jac : callable
Jacobian of the function to minimize
* lda : ndarray
vector of length 3 whose entries are between 0 and 1
* gamma : float
Real number by which the gradient is divided (step-size divisor)
* amp : float
Real number, amplitude of the noise
* tol : float
Real number, tolerance for convergence
* x_init : ndarray
Initial guess of x
* v_init : ndarray
Initial guess for the Momentum 1 variable (velocity), same shape as x_init
* z_init : ndarray
Initial guess for the Momentum 2 variable, same shape as x_init
* itmax : int
maximum number of iterations
* K : int
number of times to repeat the algorithm
* it_init : int, optional
iteration number at which to start the algorithm
* bounds : sequence
sequence of (min, max) pairs; each component of x is kept within its pair
* DirDoF : list of tuple
degree(s) of freedom (component indices) of x that are held fixed
RETURNS:
-------
* x : ndarray
value of x at which Func is minimal, same shape as x_init
* v : ndarray
final value of the Momentum 1 variable (velocity)
* z : ndarray
final value of the Momentum 2 variable
* min_Func : float
minimum value of Func(x)
* min_it : int
iteration number at which the algorithm stopped
"""
import numpy as np
from stdsct import *

# Same non-convex test function and gradient as above.
def f(xt):
    return 0.5*xt[0]**2 + 0.5*(1.0 - np.cos(2*xt[0])) + xt[1]**2

def df(xt):
    return np.array([xt[0] + np.sin(2*xt[0]), 2.0*xt[1]])

K = 10          # number of restarts
itmax = 1000    # maximum number of iterations per run
tol = 1e-5      # convergence tolerance
lda = np.array([0.1, 0.1, 0.1])  # the three momentum coefficients
gamma = 1.0     # gradient divisor
amp = 1.0       # noise amplitude
x0 = np.array([3.0, -5.0])  # initial guess
v0 = np.copy(x0)            # initial velocity
z0 = np.copy(x0)            # initial Momentum 2 variable

x3, v3, z3, min3, it3 = Momentum2(f, df, lda, gamma, amp, tol, x0, v0, z0, itmax, K)
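Putting the three algorithms together, a small comparison script (assuming stdsct is importable and using the call signatures documented above) reproduces the kind of table given at the top of this README; exact numbers vary from run to run because of the noise:

import numpy as np
from stdsct import *

def f(xt):
    return 0.5*xt[0]**2 + 0.5*(1.0 - np.cos(2*xt[0])) + xt[1]**2

def df(xt):
    return np.array([xt[0] + np.sin(2*xt[0]), 2.0*xt[1]])

x0 = np.array([3.0, -5.0])
v0 = np.copy(x0)
z0 = np.copy(x0)

# Run each algorithm with the same settings and compare the minima found.
xg, fg, _ = GradientNoise(f, df, 1.0, 1.0, 1e-5, x0, 1000, 10)
xa, _, fa, _ = Momentum1(f, df, 0.1, 1.0, 1.0, 1e-5, x0, v0, 1000, 10)
xb, _, _, fb, _ = Momentum2(f, df, np.array([0.1, 0.1, 0.1]), 1.0, 1.0, 1e-5, x0, v0, z0, 1000, 10)

for name, x, fx in [("Gradient Descent with Noise", xg, fg),
                    ("Momentum 1", xa, fa),
                    ("Momentum 2", xb, fb)]:
    print(f"{name:28s} x = {np.round(x, 3)}  F(x) = {fx:.4f}")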