<a href="https://colab.research.google.com/github/fbeilstein/machine_learning/blob/master/workbook_08_optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Exploring simple gradient descent

Explore gradient descent with the following code.
Make sure you understand how it works and where it breaks down.
If you run out of ideas, try the following parameters:

function_name | x_ini | y_ini | max_iter | theta | comment
---|---|---|---|---|---
```x**2 + y**2``` | -5.0 | -4.0 | 20 | 0.1 | good convex function
```x**2 + 9*y**2``` | -5.0 | -4.0 | 20 | 0.1 | convex function, see oscillations
```x**2 - y**2``` | 1.5 | 0.0 | 50 | 0.25 | gets stuck at the saddle point
```(x/2)**4+(y/2)**4``` | -1.0 | -8.0 | 30 | 0.1 | first step overshoots to the far side of the minimum (diverges once y_ini is beyond about ±8.9)
```(x/2)**4+(y/2)**4``` | -1.0 | -1.0 | 50 | 0.1 | very slow convergence
```sin(x) + sin(y)``` | 1.0 | 1.0 | 10 | 0.01 | many global minima
```(1-x)**2+100*(y-x**2)**2``` | -1.0 | -1.0 | 10 | 0.01 | famous Rosenbrock function

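The update behind the demo is `x_new = x - theta * gradient`. As a minimal sketch of that rule in plain NumPy, assuming a hand-written gradient for `x**2 + 9*y**2` (the helper names `grad` and `gradient_descent` are illustrative and not part of the demo code):

```python
import numpy as np

def grad(p):
    # gradient of f(x, y) = x**2 + 9*y**2, written out by hand
    x, y = p
    return np.array([2.0 * x, 18.0 * y])

def gradient_descent(p0, theta, max_iter):
    # fixed-step gradient descent; returns every visited point
    trace = [np.array(p0, dtype=float)]
    for _ in range(max_iter):
        trace.append(trace[-1] - theta * grad(trace[-1]))
    return np.array(trace)

trace = gradient_descent([-5.0, -4.0], theta=0.1, max_iter=20)
print(trace[:4])
```

With these parameters each step multiplies the y coordinate by 1 - 18*theta = -0.8, so its sign flips at every iteration; this is the oscillation mentioned in the table. The x coordinate is multiplied by 1 - 2*theta = 0.8 and approaches zero from one side.
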
In [None]:
#@title # Exploring simple gradient descent

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import numpy as np

#@markdown ---
#@markdown ##Gradient descent options
#@markdown These variables correspond to the parameters of the minimizer
x_ini = -1 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

y_ini = -1 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

max_iter = 50 #@param {type:"slider", min:2, max:50, step:1}

theta = 0.101 #@param {type:"slider", min:0, max:1, step:0.001}
#@markdown ---
#@markdown ##Function to minimize
#@markdown Write function using sympy syntax.
#@markdown Use x and y as variables.
#@markdown You can use all elementary functions, (inverse)trigonometric, (inverse)hyperbolic functions, etc.
#@markdown for more details visit http://www.cfm.brown.edu/people/dobrush/am33/SymPy/function.html

function_name = '(x/2)**4+(y/2)**4' #@param {type:"string"}

from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
transformations = (standard_transformations + (implicit_multiplication_application,))
f = parse_expr(function_name, transformations=transformations)

from sympy import diff
Gx = diff(f, 'x')
Gy = diff(f, 'y')

trace = [[x_ini, y_ini, f.evalf(subs={'x':x_ini, 'y':y_ini})]]
for i in range(max_iter):
  g_x = Gx.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  g_y = Gy.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  x_new = trace[-1][0] - theta * g_x
  y_new = trace[-1][1] - theta * g_y
  z_new = f.evalf(subs={'x':x_new, 'y':y_new})
  trace.append([x_new, y_new, z_new])
trace = np.array(trace, dtype=float)

x_min = min(-10.0, np.min(trace[:,0]))
x_max = max(10.0, np.max(trace[:,0]))
y_min = min(-10.0, np.min(trace[:,1]))
y_max = max(10.0, np.max(trace[:,1]))

# function calculated for 3d plot
x_ = np.linspace(x_min, x_max, num=50)
y_ = np.linspace(y_min, y_max, num=50)
z_ = np.array([[f.evalf(subs={'x':x__, 'y':y__}) for x__ in x_] for y__ in y_], dtype=float)

fig = make_subplots(rows=1, cols=2, specs=[[{"type": "scene"}, {"type": "xy"}]])

fig.add_trace(go.Scatter3d(x=trace[:, 0], y=trace[:, 1], z=trace[:, 2], 
                           marker=dict(size=4, colorscale='Viridis'),
                           line=dict(color='red', width=2)),
              row=1, col=1)

fig.add_trace(go.Surface(x=x_, y=y_, z=z_, opacity=0.9, showscale=False),
              row=1, col=1)

fig.add_trace(go.Contour(z=z_, x=x_, y=y_, contours=dict(showlabels=True)),
              row=1, col=2)

fig.add_trace(go.Scatter(x=trace[:, 0], y=trace[:, 1], line=dict(color='red', width=2)),
              row=1, col=2)

fig.update_layout(width=1200, height=600, autosize=False, 
                  title_text="Gradient descent demonstration",
                  #scene=dict(aspectratio = dict(x=1, y=1, z=1)),
                  showlegend=False)
fig.show()

#Exploring sliced function approximation

The following code shows a function sliced by a vertical plane.
You can vary the point through which the plane passes and the angle of the plane (see the x_ini and y_ini sliders and the angle slider).
The plot on the right shows the slice of the function in red.
The blue curve is the second-order approximation of the slice:
$$
\bar{f}(\alpha)=f(\boldsymbol {x})+\alpha \nabla f(\boldsymbol {x})\boldsymbol {h} +\frac{\alpha^2}{2!}\boldsymbol {h}^{\top} \boldsymbol {H} \boldsymbol {h}.
$$

We can choose the gradient descent step from the minimum of the approximation of the slice (remember that in the many-dimensional case you cannot afford to evaluate the whole function).
The direction of the slice should be determined by the gradient, but this demo lets you explore every direction.
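
To make the step choice concrete, here is a short derivation under the assumption that $\boldsymbol{h}^{\top}\boldsymbol{H}\boldsymbol{h} > 0$ (otherwise the parabola $\bar{f}(\alpha)$ has no minimum). Setting the derivative of the approximation with respect to $\alpha$ to zero gives
$$
\frac{d\bar{f}}{d\alpha} = \nabla f(\boldsymbol{x})\boldsymbol{h} + \alpha\,\boldsymbol{h}^{\top}\boldsymbol{H}\boldsymbol{h} = 0
\quad\Longrightarrow\quad
\alpha^{*} = -\frac{\nabla f(\boldsymbol{x})\boldsymbol{h}}{\boldsymbol{h}^{\top}\boldsymbol{H}\boldsymbol{h}}.
$$
For the steepest-descent direction $\boldsymbol{h} = -\nabla f(\boldsymbol{x})^{\top}$ this becomes
$$
\alpha^{*} = \frac{\nabla f(\boldsymbol{x})\,\nabla f(\boldsymbol{x})^{\top}}{\nabla f(\boldsymbol{x})\,\boldsymbol{H}\,\nabla f(\boldsymbol{x})^{\top}},
$$
so the step length is set by the curvature along the gradient rather than by a hand-tuned constant.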


* Try different functions and gain intuition for how well they can be approximated.
* Consider the functions you explored with simple gradient descent.
Understand which problems of gradient descent can be solved this way.


In [None]:
#@title #Exploring function slices

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import numpy as np
from scipy.optimize import minimize

#@markdown ---
#@markdown ##Minimization options
#@markdown These variables correspond to the point through which the rotation axis passes
x_ini = -1.1 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}
y_ini = 1.6 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

#@markdown ---
#@markdown ##Function to explore
#@markdown Write function using sympy syntax.
#@markdown You can use all elementary functions, (inverse)trigonometric, (inverse)hyperbolic functions, etc.
#@markdown for more details visit http://www.cfm.brown.edu/people/dobrush/am33/SymPy/function.html


function_name = 'x**2+(3*y)**3' #@param {type:"string"}


from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
transformations = (standard_transformations + (implicit_multiplication_application,))
f = parse_expr(function_name, transformations=transformations)

from sympy import diff
g_x = lambda x,y: diff(f, 'x').evalf(subs={'x':x, 'y':y})
g_y = lambda x,y: diff(f, 'y').evalf(subs={'x':x, 'y':y})
jacobian = lambda x: np.array([g_x(x[0], x[1]), g_y(x[0], x[1])], dtype=float)

g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
H = lambda x,y: [[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]]
hessian = lambda x: np.array(H(x[0], x[1]), dtype=float)


func_to_minimize = lambda x: float(f.evalf(subs={'x':x[0], 'y':x[1]}))
history = [np.array([x_ini, y_ini])]
store_data = lambda xk : history.append(xk)


x_min = -10.0
x_max =  10.0
y_min = -10.0
y_max =  10.0

x_ = np.linspace(x_min, x_max, num=50)
y_ = np.linspace(y_min, y_max, num=50)
z_ = np.array([[func_to_minimize([x,y]) for x in x_] for y in y_], dtype=float)

z_max = np.max(z_)
z_min = np.min(z_)

in_square = lambda p, v: x_min < p * v[0] + x_ini< x_max and y_min < p * v[1] + y_ini < y_max

def slice_XY(angle):
  param = np.linspace(-np.sqrt((x_max-x_min)**2 + (y_max-y_min)**2), 
                      +np.sqrt((x_max-x_min)**2 + (y_max-y_min)**2), 
                      50)
  vec = np.array([np.cos(angle / 180.0 * np.pi), np.sin(angle / 180.0 * np.pi)])
  in_s = lambda p: in_square(p, vec)
  return np.array([[p * vec[0] + x_ini, p * vec[1] + y_ini, p] 
                    for p in filter(in_s, param)], dtype=float)

def slice(angle):
  xy_coords = slice_XY(angle)
  return np.array([[p[0], p[1], func_to_minimize(p[:2]), p[2]] for p in xy_coords], dtype=float)

def plane(angle):
  xy = slice_XY(angle)
  return np.array([[xy[0,0], xy[0,1], z_min], 
                   [xy[0,0], xy[0,1], z_max],
                   [xy[-1,0], xy[-1,1], z_max], 
                   [xy[-1,0], xy[-1,1], z_min],
                   [x_ini, y_ini, z_min]], dtype=float)

def approximation(x_list, x0):
  G = jacobian(x0)
  H = hessian(x0)
  F = func_to_minimize(x0)
  return np.array([F + np.dot(G, (x - x0)) + 0.5 * np.dot(x - x0, H.dot(x - x0)) for x in x_list], dtype=float)
  
def approx(angle):
  xy = slice_XY(angle)
  #print(np.max(xy[:, 2]))
  return xy[:,2], approximation(xy[:,:2], np.array([x_ini, y_ini]))
  
def plot_data(angle):
  return [{'type': 'scatter3d', 
           'mode': 'lines', 
           'name': 's3', 
           'x': slice(angle)[:,0], 
           'y': slice(angle)[:,1], 
           'z': slice(angle)[:,2], 
           'line': {'color': 'red', 'width': 2}
          },
          {'type': 'surface', 
           'name': 'f2', 
           'x': x_, 
           'y': y_, 
           'z': z_, 
           'opacity': 0.8, 
           'showscale': False
          },
          {'type': 'mesh3d', 
           'name': 'f2', 
           'alphahull':0, 
           'x': plane(angle)[:,0], 
           'y': plane(angle)[:,1], 
           'z': plane(angle)[:,2], 
           'color':'blue', 
           'opacity': 0.1, 
           'showscale': False
          },
          {'type': 'scatter3d', 
           'mode': 'lines', 
           'name': 's3', 
           'x': [x_ini, x_ini], 
           'y': [y_ini, y_ini], 
           'z': [z_min, z_max], 
           'line': {'color': 'green', 'width': 2}
          },
          {'type': 'scatter', 
           'name': 's2', 
           'x': slice(angle)[:,3], 
           'y': slice(angle)[:,2], 
           'line': {'color': 'red', 'width': 2}
          },
          {'type': 'scatter', 
           'name': 's2', 
           'x': approx(angle)[0], 
           'y': approx(angle)[1], 
           'line': {'color': 'blue', 'width': 2}
          },
         ]

fig = dict(
    layout = dict(
        width=1200, height=600, autosize=False,
        showlegend = False,
        scene = { 'domain': { 'x': [0.0, 0.44], 'y': [0, 1] } },
        xaxis1 = {'domain': [0.55, 1], 'autorange':True},
        yaxis1 = {'domain': [0.0, 1.0], 'autorange':True},
        title  = 'Minimization',
        margin = {'t': 50, 'b': 50, 'l': 50, 'r': 50},
        shapes = [
        # Line Vertical
        {
            'type': 'line',
            'xref': 'x',
            'yref': 'paper',
            'x0': 0,
            'y0': 0,
            'x1': 0,
            'y1': 1,
            'line': {
                'color': 'rgb(0, 255, 0)',
                'width': 3,
            },
        }],
        sliders = [{'yanchor': 'top',
                    'xanchor': 'left',
                    'currentvalue': {'font': {'size': 16}, 
                                     'prefix': 'Angle: ', 
                                     'visible': True, 
                                     'xanchor': 'right'},
                    'transition': {'duration': 0.0},
                    'pad': {'b': 10, 't': 50},
                    'len': 0.9,
                    'x': 0.1,
                    'y': 0,
                    'steps': [{'args': [[k], {'frame': 
                                              {'duration': 0.0, 
                                               'easing': 'linear', 
                                               'redraw': True},
                                              'transition': 
                                              {'duration': 0, 
                                               'easing': 'linear'
                                               }
                                              }
                                        ],
                               'label': k * 5.0,
                               'method': 'animate'} for k in range(36)
                    ]}]
    ),
    data = plot_data(0.01),
    frames=[
        {'name': k,
         'data': plot_data(5.0 * k + 0.01)} for k in range(36) ]
)
#plot(fig, auto_open=False)
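# use a separate name so the sympy expression f, which the lambdas above close over, stays intact
figure = go.Figure(fig)
figure.show()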

#Exploring modified gradient descent

Once you have gained some intuition from the two previous demos, test it on the following one.
Try the functions that caused trouble for plain gradient descent.
Check whether choosing the step from a second-order approximation fixes them.
Test whether your intuition about which problems are "fixable" is right.
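
As a point of comparison before running the demo, here is a minimal NumPy sketch of gradient descent whose step is taken from the minimum of the second-order slice, `theta = g.g / (g.H.g)`. The gradient and Hessian are written out by hand for `x**2 + 9*y**2`, and the helper names are illustrative assumptions rather than part of the demo code.

```python
import numpy as np

def grad(p):
    # gradient of f(x, y) = x**2 + 9*y**2
    return np.array([2.0 * p[0], 18.0 * p[1]])

def hess(p):
    # Hessian of the same function (constant for a quadratic)
    return np.array([[2.0, 0.0], [0.0, 18.0]])

def descent_with_quadratic_step(p0, max_iter):
    trace = [np.array(p0, dtype=float)]
    for _ in range(max_iter):
        g = grad(trace[-1])
        curvature = g @ hess(trace[-1]) @ g
        if curvature <= 0.0:
            # zero gradient, or no minimum along the slice: stop
            break
        theta = (g @ g) / curvature
        trace.append(trace[-1] - theta * g)
    return np.array(trace)

trace = descent_with_quadratic_step([-5.0, -4.0], max_iter=20)
f_values = trace[:, 0]**2 + 9.0 * trace[:, 1]**2
print(f_values[:5])
```

On a quadratic this step is the exact minimizer along the gradient, so the function value decreases at every iteration without tuning theta by hand. The path can still zigzag, because the direction is still the raw gradient.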

In [None]:
#@title # Exploring gradient+hessian descent

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import numpy as np

#@markdown ---
#@markdown ##Gradient descent options
#@markdown These variables correspond to the parameters of the minimizer
x_ini = -6.4 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

y_ini = -9.4 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

max_iter = 50 #@param {type:"slider", min:2, max:50, step:1}

#@markdown ---
#@markdown ##Function to minimize
#@markdown Write function using sympy syntax.
#@markdown Use x and y as variables.
#@markdown You can use all elementary functions, (inverse)trigonometric, (inverse)hyperbolic functions, etc.
#@markdown for more details visit http://www.cfm.brown.edu/people/dobrush/am33/SymPy/function.html

function_name = '(x/2)**4+(y/2)**4' #@param {type:"string"}

from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
transformations = (standard_transformations + (implicit_multiplication_application,))
f = parse_expr(function_name, transformations=transformations)

from sympy import diff
Gx = diff(f, 'x')
Gy = diff(f, 'y')
g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
H = lambda x,y: np.array([[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]], dtype=float)

trace = [[x_ini, y_ini, f.evalf(subs={'x':x_ini, 'y':y_ini})]]
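for i in range(max_iter):
  g_x = Gx.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  g_y = Gy.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  h = np.array([g_x, g_y], dtype=float)
  H_ = H(trace[-1][0], trace[-1][1])
  curvature = np.dot(h, H_.dot(h))
  if curvature == 0.0:
    # zero gradient or zero curvature along it: theta would be 0/0 or infinite
    break
  # step that minimizes the second-order approximation along the gradient
  theta = np.dot(h, h) / curvature
  x_new = float(trace[-1][0] - theta * h[0])
  y_new = float(trace[-1][1] - theta * h[1])
  z_new = f.evalf(subs={'x':x_new, 'y':y_new})
  trace.append([x_new, y_new, z_new])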
trace = np.array(trace, dtype=float)

x_min = min(-10.0, np.min(trace[:,0]))
x_max = max(10.0, np.max(trace[:,0]))
y_min = min(-10.0, np.min(trace[:,1]))
y_max = max(10.0, np.max(trace[:,1]))

# function calculated for 3d plot
x_ = np.linspace(x_min, x_max, num=50)
y_ = np.linspace(y_min, y_max, num=50)
z_ = np.array([[f.evalf(subs={'x':x__, 'y':y__}) for x__ in x_] for y__ in y_], dtype=float)

fig = make_subplots(rows=1, cols=2, specs=[[{"type": "scene"}, {"type": "xy"}]])

fig.add_trace(go.Scatter3d(x=trace[:, 0], y=trace[:, 1], z=trace[:, 2], 
                           marker=dict(size=4, colorscale='Viridis'),
                           line=dict(color='red', width=2)),
              row=1, col=1)

fig.add_trace(go.Surface(x=x_, y=y_, z=z_, opacity=0.9, showscale=False),
              row=1, col=1)

fig.add_trace(go.Contour(z=z_, x=x_, y=y_, contours=dict(showlabels=True)),
              row=1, col=2)

fig.add_trace(go.Scatter(x=trace[:, 0], y=trace[:, 1], line=dict(color='red', width=2)),
              row=1, col=2)

fig.update_layout(width=1200, height=600, autosize=False, 
                  title_text="Gradient descent demonstration",
                  #scene=dict(aspectratio = dict(x=1, y=1, z=1)),
                  showlegend=False)
fig.show()

#Exploring second-order approximation

Now we are on the way to Newton's method.
Use the following demo to explore the second-order approximation of a function.
Try different functions and approximation points until you get an intuitive feeling for what the second-order approximation looks like.
You can find some thought-provoking examples in the following table, but feel free to try anything that comes to mind.

function | x_ini | y_ini | comment
---|---|---|---
```sin(x)+4*sin(y)``` | 2.9 | 4.9 | Hessian is not positive definite
```x**2``` | 0.0 | 0.0 | long valley, no unique minimum
```x**2 - y**2``` | 0.0 | 0.0 | saddle point
```x**4 + y**4``` | 8.0 | 8.0 | convex, but not quadratic
```x**4 + y**4``` | 1.0 | 1.0 | convex, but not quadratic

On the left you will see the eigenvalues of the Hessian matrix.
Gain some intuition for how the signs of these eigenvalues depend on the function you approximate and how they shape the approximation.
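
The link between eigenvalue signs and the shape of the approximation can be sketched as follows. This is an illustrative helper, not part of the demo: the name `classify_hessian`, its tolerance and its labels are assumptions, and the Hessians are written out by hand for three rows of the table.

```python
import numpy as np

def classify_hessian(H, tol=1e-9):
    # eigvalsh is meant for symmetric matrices and returns
    # real eigenvalues in ascending order
    eig = np.linalg.eigvalsh(np.asarray(H, dtype=float))
    if np.all(eig > tol):
        kind = "bowl (positive definite)"
    elif np.all(eig < -tol):
        kind = "dome (negative definite)"
    elif eig[0] < -tol and eig[-1] > tol:
        kind = "saddle (indefinite)"
    else:
        kind = "degenerate (some eigenvalue is zero)"
    return eig, kind

print(classify_hessian([[2.0, 0.0], [0.0, -2.0]]))   # x**2 - y**2 at any point
print(classify_hessian([[2.0, 0.0], [0.0, 0.0]]))    # x**2 at any point
print(classify_hessian([[12.0, 0.0], [0.0, 12.0]]))  # x**4 + y**4 at (1, 1)
```

Two positive eigenvalues give an approximation that curves upward in every direction, mixed signs give a saddle, and a zero eigenvalue gives a direction along which the approximation is flat or only linear.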

In [None]:
#@title # Geometric interpretation

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import numpy as np
from scipy.optimize import minimize
from ipywidgets import interact

#@markdown ---
#@markdown ##Point of approximation
#@markdown These variables correspond to the point at which we perform the Taylor expansion
x_ini = 8.5 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}
y_ini = 8.4 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

#@markdown ---
#@markdown ##Function to explore
#@markdown Write function using sympy syntax.
#@markdown You can use all elementary functions, (inverse)trigonometric, (inverse)hyperbolic functions, etc.
#@markdown for more details visit http://www.cfm.brown.edu/people/dobrush/am33/SymPy/function.html


function_name = 'x**2+y**2' #@param {type:"string"}


from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
transformations = (standard_transformations + (implicit_multiplication_application,))
f = parse_expr(function_name, transformations=transformations)

from sympy import diff
g_x = lambda x,y: diff(f, 'x').evalf(subs={'x':x, 'y':y})
g_y = lambda x,y: diff(f, 'y').evalf(subs={'x':x, 'y':y})
jacobian = lambda x: np.array([g_x(x[0], x[1]), g_y(x[0], x[1])], dtype=float)

g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
H = lambda x,y: [[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]]
hessian = lambda x: np.array(H(x[0], x[1]), dtype=float)


func_to_minimize = lambda x: f.evalf(subs={'x':x[0], 'y':x[1]})
history = [np.array([x_ini, y_ini])]
store_data = lambda xk : history.append(xk)


x_min = -10.0
x_max =  10.0
y_min = -10.0
y_max =  10.0

x_ = np.linspace(x_min, x_max, num=50)
y_ = np.linspace(y_min, y_max, num=50)
z_ = np.array([[func_to_minimize([x,y]) for x in x_] for y in y_], dtype=float)

z_max = np.max(z_)
z_min = np.min(z_)

G = jacobian([x_ini, y_ini])
H = hessian([x_ini, y_ini])
F = func_to_minimize([x_ini, y_ini])
appr = lambda x,y: F + np.dot(G, np.array([x-x_ini, y - y_ini])) + 0.5 * np.dot(np.array([x-x_ini, y - y_ini]), H.dot(np.array([x-x_ini, y - y_ini])))
Z_ = np.array([[appr(x, y) for x in x_] for y in y_], dtype=float)

# eigenvalues of the symmetric Hessian; eigvalsh returns them in ascending order
l2, l1 = np.linalg.eigvalsh(H)
txt = "H eigenvalues: ({:.2f};  {:.2f})".format(l1, l2)
  
def plot_data(angle):
  return [{'type': 'surface', 
           'name': 'f2', 
           'x': x_, 
           'y': y_, 
           'z': z_, 
           'opacity': 0.8, 
           'showscale': False,
           'colorscale': 'Viridis'
          },
          {'type': 'surface', 
           'name': 'f1', 
           'x': x_, 
           'y': y_, 
           'z': Z_, 
           'opacity': 0.6,
           'surfacecolor': [[1.0 for x in range(len(x_))] for y in range(len(y_))],
           'cauto': False,
           'colorscale': [[0.0, "rgb(0, 0, 0)"], [1.0, "rgb(255, 0, 0)"]],
           'cmax': 1,
           'cmin': 0,
           'showscale': False
          },
          {'type': 'scatter3d', 
           'mode': 'lines', 
           'name': 's3', 
           'x': [x_ini, x_ini], 
           'y': [y_ini, y_ini], 
           'z': [z_min, z_max], 
           'line': {'color': 'blue', 'width': 3}
          }         ]

fig = dict(
    layout = dict(
        width=1200, height=600, autosize=False,
        showlegend = False,
        scene = {'domain': { 'x': [0.0, 1.0], 'y': [0, 1] },
                'zaxis' : {'range': [z_min, z_max]}},
        title  = 'Approximation',
        margin = {'t': 50, 'b': 50, 'l': 50, 'r': 50},
        annotations = [{'text': txt, 
                        'xref':'paper', 'yref': 'paper', 
                        'x':0.0, 'y':0.5,
                        'showarrow': False,
                        'font': {'family': "sans serif",
                                 'size': 20,
                                 'color': "Red"}
                       }]
    ),
    data = plot_data(0.01),
)
#plot(fig, auto_open=False)
# use a separate name so the sympy expression f is not overwritten
figure = go.Figure(fig)
figure.show()

#Exploring Newton's method

Newton's method is a well-known method for minimization problems (many modifications exist).
Try the following demo with the simplest implementation of Newton's method and gain some intuition on the following:
* Does it always find a minimum, or a maximum? Or both?
* How does it behave at inflection points?
* How many iterations does it need if the function is quadratic?
* Which problems of the simplest gradient descent and of the modified gradient descent from the previous examples does it fix?

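A single Newton update can be sketched in a few lines of NumPy. The example below is a hedged illustration with hand-written gradients and Hessians for two quadratics; the names `newton_step`, `grad_bowl`, `hess_bowl`, `grad_saddle` and `hess_saddle` are assumptions made for this sketch and do not appear in the demo.

```python
import numpy as np

def newton_step(p, grad, hess):
    # solve H * step = gradient rather than inverting H explicitly
    return p - np.linalg.solve(hess(p), grad(p))

# f(x, y) = x**2 + 9*y**2: quadratic with a minimum at the origin
grad_bowl = lambda p: np.array([2.0 * p[0], 18.0 * p[1]])
hess_bowl = lambda p: np.array([[2.0, 0.0], [0.0, 18.0]])

# f(x, y) = x**2 - y**2: quadratic with a saddle at the origin
grad_saddle = lambda p: np.array([2.0 * p[0], -2.0 * p[1]])
hess_saddle = lambda p: np.array([[2.0, 0.0], [0.0, -2.0]])

start = np.array([4.2, -1.5])
print(newton_step(start, grad_bowl, hess_bowl))
print(newton_step(start, grad_saddle, hess_saddle))
```

Both calls land on the origin (up to rounding) after one step: the update jumps to the stationary point of the quadratic model, whether that point is a minimum or a saddle.
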
In [None]:
#@title # Exploring Newton's method

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import numpy as np

#@markdown ---
#@markdown ##Newton's method options
#@markdown These variables correspond to the parameters of the minimizer
x_ini = 4.2 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

y_ini = -1.5 #@param {type:"slider", min:-10.0, max:10.0, step:0.1}

max_iter = 50 #@param {type:"slider", min:2, max:50, step:1}

#@markdown ---
#@markdown ##Function to minimize
#@markdown Write function using sympy syntax.
#@markdown Use x and y as variables.
#@markdown You can use all elementary functions, (inverse)trigonometric, (inverse)hyperbolic functions, etc.
#@markdown for more details visit http://www.cfm.brown.edu/people/dobrush/am33/SymPy/function.html

function_name = 'sin(x) + sin(y)*2' #@param {type:"string"}

from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
transformations = (standard_transformations + (implicit_multiplication_application,))
f = parse_expr(function_name, transformations=transformations)

from sympy import diff
Gx = diff(f, 'x')
Gy = diff(f, 'y')
g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
H = lambda x,y: np.array([[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]], dtype=float)

trace = [[x_ini, y_ini, f.evalf(subs={'x':x_ini, 'y':y_ini})]]
for i in range(max_iter):
  g_x = Gx.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  g_y = Gy.evalf(subs={'x':trace[-1][0], 'y':trace[-1][1]})
  h = np.array([g_x, g_y], dtype=float)
  H_ = H(trace[-1][0], trace[-1][1])
  try:
    # Newton step: solve H_ * step = h instead of forming the inverse explicitly
    step = np.linalg.solve(H_, h)
  except np.linalg.LinAlgError:
    # singular Hessian: the Newton step is undefined, so stop here
    break
  x_new = float(trace[-1][0] - step[0])
  y_new = float(trace[-1][1] - step[1])
  z_new = f.evalf(subs={'x':x_new, 'y':y_new})
  trace.append([x_new, y_new, z_new])
trace = np.array(trace, dtype=float)

x_min = min(-10.0, np.min(trace[:,0]))
x_max = max(10.0, np.max(trace[:,0]))
y_min = min(-10.0, np.min(trace[:,1]))
y_max = max(10.0, np.max(trace[:,1]))

# function calculated for 3d plot
x_ = np.linspace(x_min, x_max, num=50)
y_ = np.linspace(y_min, y_max, num=50)
z_ = np.array([[f.evalf(subs={'x':x__, 'y':y__}) for x__ in x_] for y__ in y_], dtype=float)

fig = make_subplots(rows=1, cols=2, specs=[[{"type": "scene"}, {"type": "xy"}]])

fig.add_trace(go.Scatter3d(x=trace[:, 0], y=trace[:, 1], z=trace[:, 2], 
                           marker=dict(size=4, colorscale='Viridis'),
                           line=dict(color='red', width=2)),
              row=1, col=1)

fig.add_trace(go.Surface(x=x_, y=y_, z=z_, opacity=0.9, showscale=False),
              row=1, col=1)

fig.add_trace(go.Contour(z=z_, x=x_, y=y_, contours=dict(showlabels=True)),
              row=1, col=2)

fig.add_trace(go.Scatter(x=trace[:, 0], y=trace[:, 1], line=dict(color='red', width=2)),
              row=1, col=2)

fig.update_layout(width=1200, height=600, autosize=False, 
                  title_text="Newton's method demonstration",
                  #scene=dict(aspectratio = dict(x=1, y=1, z=1)),
                  showlegend=False)
fig.show()

In [60]:
#@title #Exploring Newton's method


from IPython.display import display
import ipywidgets as widgets


class Memory:
  def __init__(self):
    self.x_ini = 0.0
    self.y_ini = 0.0
    self.iters = 2

current_memory = Memory()

button = widgets.Button(description="Recalculate")
iters = widgets.IntSlider(min=2, max=50, value=current_memory.iters)

fnc = widgets.Text(value='sin(x) + sin(y)*2',
                   #placeholder='function',
                   description='Function:',
                   disabled=False)

def set_iter(val):
  current_memory.iters = val.new
iters.observe(set_iter, names='value')

display(widgets.HBox([button, fnc, iters]))

def on_button_clicked(b):
  function = fnc.value
  from IPython.display import clear_output
  clear_output()
  display(widgets.HBox([button, fnc, iters]))
  print('Doing Science...')

  x_ini = current_memory.x_ini
  y_ini = current_memory.y_ini
  max_iter = current_memory.iters


  from plotly.subplots import make_subplots
  import plotly.graph_objects as go
  import numpy as np
  from scipy.optimize import minimize

  from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
  transformations = (standard_transformations + (implicit_multiplication_application,))
  f = parse_expr(function, transformations=transformations)

  from sympy import diff
  g_x = lambda x,y: diff(f, 'x').evalf(subs={'x':x, 'y':y})
  g_y = lambda x,y: diff(f, 'y').evalf(subs={'x':x, 'y':y})

  jacobian = lambda x: np.array([g_x(x[0], x[1]), g_y(x[0], x[1])], dtype=float)


  g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
  g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
  g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
  H = lambda x,y: [[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]]

  hessian = lambda x: np.array(H(x[0], x[1]), dtype=float)

  func_to_minimize = lambda x: float(f.evalf(subs={'x':x[0], 'y':x[1]}))
  history = [np.array([x_ini, y_ini])]

  for i in range(max_iter):
    h = jacobian(history[-1])
    H_ =  hessian(history[-1])
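    # near-singular Hessian: compare the magnitude, since a negative determinant (saddle) is still invertible
    if abs(np.linalg.det(H_)) < 1E-10: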
      H_ =  hessian(history[-1] + np.array([1E-5, 1E-5]))
      #H_ += np.array([[1E-5, 0.0], [0.0, 1E-5]])
    H_inv = np.linalg.inv(H_)
    theta = H_inv.dot(h)
    x_new = history[-1][0] - theta[0]
    y_new = history[-1][1] - theta[1]
    history.append(np.array([x_new, y_new], dtype='float'))

  _h = np.array(history)
  _x = _h.T[0]
  _y = _h.T[1]
  _z = np.array([func_to_minimize(x) for x in history], dtype=float)

  x_min = min(-10.0, np.min(_x))
  x_max = max(10.0, np.max(_x))
  y_min = min(-10.0, np.min(_y))
  y_max = max(10.0, np.max(_y))

  x_ = np.linspace(x_min, x_max, num=50)
  y_ = np.linspace(y_min, y_max, num=50)
  z_ = np.array([[func_to_minimize([x,y]) for x in x_] for y in y_], dtype=float)


  import matplotlib.pyplot as plt
  import base64
  import io
  fig = plt.figure(figsize=(20,20))
  ax = fig.gca()
  ax.axis('off')
  ax.contourf(x_, y_, z_, alpha=0.3)
  plt.close(fig)
  buf = io.BytesIO()
  fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0);
  image_base64 = u'data:image/png;base64,' + base64.b64encode(buf.getvalue()).decode('utf-8').replace('\n', '')
  buf.close()


  fig = dict(
      layout = dict(
          width=1200, height=600, autosize=False,
          showlegend = False,
          scene = {'domain': { 'x': [0.0, 0.44], 'y': [0, 1] } },
          xaxis1 = {'domain': [0.55, 1], 'range': [x_min, x_max], 'fixedrange': True},
          yaxis1 = {'domain': [0.0, 1.0], 'range': [y_min, y_max], 'fixedrange': True},
          title  = 'Minimization',
          margin = {'t': 50, 'b': 50, 'l': 50, 'r': 50},
          updatemenus = [{'buttons': [{'args': [[k for k in range(len(_h))],
                                                {'frame': {'duration': 500.0, 'redraw': True},
                                                'fromcurrent': False, 'transition': {'duration': 0, 'easing': 'linear'}}],
                                      'label': 'Play',
                                      'method': 'animate'},
                                      {'args': [[None], {'frame': {'duration': 0, 'redraw': True},
                                                        'mode': 'immediate',
                                                        'transition': {'duration': 0}}
                                                ],
                                      'label': 'Pause',
                                      'method': 'animate'
                                      }
                                      ],
                          'direction': 'left',
                          'pad': {'r': 10, 't': 85},
                          'showactive': True,
                          'type': 'buttons', 'x': 0.1, 'y': 0, 'xanchor': 'right', 'yanchor': 'top'}],
          sliders = [{'yanchor': 'top',
                      'xanchor': 'left',
                      'currentvalue': {'font': {'size': 16}, 'prefix': 'Step: ', 'visible': True, 'xanchor': 'right'},
                      'transition': {'duration': 0.0},
                      'pad': {'b': 10, 't': 50},
                      'len': 0.9,
                      'x': 0.1,
                      'y': 0,
                      'steps': [{'args': [[k], {'frame': {'duration': 500.0, 'easing': 'linear', 'redraw': True},
                                                'transition': {'duration': 0, 'easing': 'linear'}}
                                          ],
                                'label': k,
                                'method': 'animate'} for k in range(len(_h))
                      ]}],
          images = [{'source' : image_base64,
                    'xref': 'x', 'yref': 'y',
                    'sizing': 'stretch',
                    'sizex': x_max - x_min, 'sizey': y_max - y_min,
                    'layer': 'below', 'opacity':1.0,
                    'x': x_min, 'y': y_max}]
      ),
      data = [
          {'type': 'scatter3d', 'name': 's3', 'x': _x, 'y': _y, 'z': _z, 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
          {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
          #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
          {'type': 'scatter', 'name': 's2', 'x': _x, 'y': _y, 
          'line': {'color': 'red', 'width': 2}
          },
          #{
          #    'type': 'scatter', 'name': 'trust radii', 
          #    'x': _x, 'y': _y, 'mode': 'markers',
          #    'marker': {'size': trust_radii, 'sizeref': 0.05}
          #} if len(trust_radii) > 0 else {}
      ],
      frames=[
          # one frame per slider step; frame k shows the path up to and including point k
          {'name': str(k),
          'data': [
            {'type': 'scatter3d', 'name': 's3', 'x': _x[:k+1], 'y': _y[:k+1], 'z': _z[:k+1], 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
            {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
            #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
            {'type': 'scatter', 'name': 's2', 'x': _x[:k+1], 'y': _y[:k+1], 
            'line': {'color': 'red', 'width': 2}
            },
          ]} for k in range(len(_h)) ]
  )
  #plot(fig, auto_open=False)
  clear_output()
  display(widgets.HBox([button, fnc, iters]))
  f = go.Figure(fig)
  f.show()

  def save_pos(pos):
    global current_memory
    current_memory.x_ini = (x_max - x_min) * pos[0] + x_min
    current_memory.y_ini = (y_max - y_min) * (1.0 - pos[1]) + y_min

  main_str = '''
  <canvas id="paint_here"
          onmousedown="mdown_handle(event)"
          onmousemove="mmove_handle(event)"
          onmouseup="mup_handle(event)"></canvas>
  <script>

  var el = document.getElementsByClassName("layer-subplot")[0];
  var rect = el.getBoundingClientRect();

  var canvas = document.getElementById("paint_here");
  canvas.style.cssText = "position:absolute; top:" + rect.top
                      + "px; left: " + rect.left
                      + "px; width:" + rect.width
                      + "px; height:" + rect.height
                      + "px; z-index:1000;";
  canvas.width = rect.width;
  canvas.height = rect.height;
  var ctx = canvas.getContext('2d');
  ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
  //ctx.fillStyle="#00FF00";
  //ctx.fillRect(0, 0, canvas.width, canvas.height); // field
  ''' + 'var x_ini = ' + str((current_memory.x_ini - x_min)/(x_max - x_min)) + ';' + 'var y_ini = ' + str(1.0 - (current_memory.y_ini - y_min)/(y_max - y_min)) + ';' + '''
  var active_pt = [canvas.width * x_ini, canvas.height * y_ini];

  function draw() {
      ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
      //ctx.fillText("drawing", 20, 20);

      ctx.beginPath();
      ctx.arc(active_pt[0], active_pt[1], 10, 0.0, 2.0 * Math.PI, 0);
      ctx.fillStyle = "rgba(210, 0, 0, 0.75)";
      ctx.fill();
  }

  var do_move = false;

  function is_close(pt1, pt2) {
    return   (pt1[0] - pt2[0])*(pt1[0] - pt2[0])
          +  (pt1[1] - pt2[1])*(pt1[1] - pt2[1])
          <= 10*10;
  }

  function mdown_handle(evt) {
    // declare locals explicitly instead of creating implicit globals
    var x = evt.offsetX;
    var y = evt.offsetY;
    do_move = is_close(active_pt, [x, y]);
  }
      
  function mmove_handle(evt) {
    if (!do_move)
        return;
    active_pt[0] = evt.offsetX;
    active_pt[1] = evt.offsetY;
  }
      
  function mup_handle(evt) {
    do_move = false;
    remember();
  }

  var w = canvas.width;
  var h = canvas.height;

  async function remember() {
    var x = active_pt[0] / w;
    var y = active_pt[1] / h;
    const result = await google.colab.kernel.invokeFunction('notebook.rememberPos', [[x, y]], {});
  }

  var timer = setInterval(draw, 10);

  </script>
  '''

  import IPython
  from google.colab import output
  display(IPython.display.HTML(main_str))
  output.register_callback('notebook.rememberPos', save_pos)


button.on_click(on_button_clicked)


#Exploring minimization methods from SciPy

Explore the different minimization methods available through the function `minimize` from `scipy.optimize`.
Try to find out their strengths and weaknesses.
Try different functions and initial conditions.
If a method needs the gradient (Jacobian) or the Hessian matrix -- set the corresponding option to 'custom' (unlike the other options, this is NOT an option of SciPy -- I implemented it for you using sympy).
Consult the [scipy.optimize.minimize help](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html) to understand the parameters better.
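
As a rough, stand-alone sketch of the same idea (not the exact code of the interactive cell), the snippet below minimizes `x**2 + 9*y**2` with `scipy.optimize.minimize`, supplies a gradient and Hessian derived with sympy, and records every iterate through `callback`. The names `fun`, `grad`, `hess`, `record`, `path`, the start point, and the choice of `Newton-CG` are assumptions made for illustration.

```python
import numpy as np
import sympy as sp
from scipy.optimize import minimize

x, y = sp.symbols('x y')
expr = x**2 + 9*y**2  # example function, chosen only for illustration

# Turn the symbolic expression and its derivatives into numeric callables.
f_num = sp.lambdify((x, y), expr, 'numpy')
g_num = sp.lambdify((x, y), [sp.diff(expr, v) for v in (x, y)], 'numpy')
h_num = sp.lambdify((x, y), sp.hessian(expr, (x, y)), 'numpy')

# Wrappers with the single-vector signature that `minimize` expects.
fun = lambda p: float(f_num(p[0], p[1]))
grad = lambda p: np.array(g_num(p[0], p[1]), dtype=float)
hess = lambda p: np.array(h_num(p[0], p[1]), dtype=float)

start = np.array([-5.0, -4.0])  # assumed initial point
path = [start.copy()]

def record(xk, *args):
    # Store a copy: a solver may reuse the same array between iterations.
    path.append(np.copy(xk))

result = minimize(fun, start, method='Newton-CG', jac=grad, hess=hess,
                  callback=record, options={'maxiter': 50})
print(result.x, len(path))
```

Swapping `method` for a derivative-free choice such as `'Nelder-Mead'` (and dropping `jac` and `hess`) gives a quick way to compare how many iterations each method records in `path`.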

In [36]:
#@title #Exploring minimization methods


from IPython.display import display
import ipywidgets as widgets

possible_methods = ['Nelder-Mead','Powell','CG','BFGS','Newton-CG','L-BFGS-B','TNC','COBYLA','SLSQP','trust-constr','dogleg','trust-ncg','trust-exact','trust-krylov']
possible_jacs = ['none', '2-point', '3-point', 'cs', 'custom']
possible_hess = ['none', '2-point', '3-point', 'cs', 'custom']

class Memory:
  def __init__(self):
    self.x_ini = 0.0
    self.y_ini = 0.0
    self.method = 'Nelder-Mead'
    self.jac = 'none'
    self.hess = 'none'
    self.iters = 2

current_memory = Memory()

#if not 'current_memory' in globals():
#  global current_memory
#  current_memory = Memory()
#
#for attr in ['iters', 'jac', 'hess', 'method', 'x_ini', 'y_ini']:
#  if not hasattr(current_memory, attr):
#    current_memory = Memory()

button = widgets.Button(description="Recalculate")
iters = widgets.IntSlider(min=2, max=50, value=current_memory.iters)
method_widg = widgets.Dropdown(options=possible_methods,
                          value=current_memory.method,
                          description='Method:',
                           disabled=False)
jac = widgets.Dropdown(options=possible_jacs,
                       value=current_memory.jac,
                       description='Jacobian:',
                       disabled=False)
hess = widgets.Dropdown(options=possible_hess,
                        value=current_memory.hess,
                        description='Hessian:',
                        disabled=False)
fnc = widgets.Text(value='x**2+sin(y)*20',
                   #placeholder='function',
                   description='Function:',
                   disabled=False)

def set_method(val):
  current_memory.method = val.new
method_widg.observe(set_method, names='value')
def set_jac(val):
  current_memory.jac = val.new
jac.observe(set_jac, names='value')
def set_hess(val):
  current_memory.hess = val.new
hess.observe(set_hess, names='value')
def set_iter(val):
  current_memory.iters = val.new
iters.observe(set_iter, names='value')

display(widgets.HBox([button, fnc, iters, method_widg, jac, hess]))

def on_button_clicked(b):
  function = fnc.value
  from IPython.display import clear_output
  clear_output()
  display(widgets.HBox([button, fnc, iters, method_widg, jac, hess]))
  print('Doing Science...')

  x_ini = current_memory.x_ini
  y_ini = current_memory.y_ini
  method = current_memory.method
  max_iter = current_memory.iters
  jacobian = current_memory.jac
  hessian = current_memory.hess


  from plotly.subplots import make_subplots
  import plotly.graph_objects as go
  import numpy as np
  from scipy.optimize import minimize

  from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
  transformations = (standard_transformations + (implicit_multiplication_application,))
  f = parse_expr(function, transformations=transformations)

  from sympy import diff
  g_x = lambda x,y: diff(f, 'x').evalf(subs={'x':x, 'y':y})
  g_y = lambda x,y: diff(f, 'y').evalf(subs={'x':x, 'y':y})

  if jacobian == 'custom':
    jacobian = lambda x: np.array([g_x(x[0], x[1]), g_y(x[0], x[1])], dtype=float)
  if jacobian == 'none':
    jacobian = None

  g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
  g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
  g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
  H = lambda x,y: [[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]]

  if hessian == 'custom':
    hessian = lambda x: np.array(H(x[0], x[1]), dtype=float)
  if hessian == 'none':
    hessian = None


  func_to_minimize = lambda x: float(f.evalf(subs={'x':x[0], 'y':x[1]}))
  history = [np.array([x_ini, y_ini])]
  def store_data(xk, *args):
    # store a copy: some solvers reuse the same array between iterations
    history.append(np.copy(xk))

  minimize(func_to_minimize, [x_ini, y_ini], 
          method=method, jac=jacobian, hess=hessian,
          options={'maxiter':max_iter}, callback=store_data)

  _h = np.array(history)
  _x = _h.T[0]
  _y = _h.T[1]
  _z = np.array([func_to_minimize(x) for x in history], dtype=float)

  x_min = min(-10.0, np.min(_x))
  x_max = max(10.0, np.max(_x))
  y_min = min(-10.0, np.min(_y))
  y_max = max(10.0, np.max(_y))

  x_ = np.linspace(x_min, x_max, num=50)
  y_ = np.linspace(y_min, y_max, num=50)
  z_ = np.array([[func_to_minimize([x,y]) for x in x_] for y in y_], dtype=float)



  import matplotlib.pyplot as plt
  import base64
  import io
  fig = plt.figure(figsize=(20,20))
  ax = fig.gca()
  ax.axis('off')
  ax.contourf(x_, y_, z_, alpha=0.3)
  plt.close(fig)
  buf = io.BytesIO()
  fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
  # the MIME type must be exactly 'image/png' for the data URI to be valid
  image_base64 = u'data:image/png;base64,' + base64.b64encode(buf.getvalue()).decode('utf-8').replace('\n', '')
  buf.close()


  fig = dict(
      layout = dict(
          width=1200, height=600, autosize=False,
          showlegend = False,
          scene = {'domain': { 'x': [0.0, 0.44], 'y': [0, 1] } },
          xaxis1 = {'domain': [0.55, 1], 'range': [x_min, x_max], 'fixedrange': True},
          yaxis1 = {'domain': [0.0, 1.0], 'range': [y_min, y_max], 'fixedrange': True},
          title  = 'Minimization',
          margin = {'t': 50, 'b': 50, 'l': 50, 'r': 50},
          updatemenus = [{'buttons': [{'args': [[k for k in range(len(_h))],
                                                {'frame': {'duration': 500.0, 'redraw': True},
                                                'fromcurrent': False, 'transition': {'duration': 0, 'easing': 'linear'}}],
                                      'label': 'Play',
                                      'method': 'animate'},
                                      {'args': [[None], {'frame': {'duration': 0, 'redraw': True},
                                                        'mode': 'immediate',
                                                        'transition': {'duration': 0}}
                                                ],
                                      'label': 'Pause',
                                      'method': 'animate'
                                      }
                                      ],
                          'direction': 'left',
                          'pad': {'r': 10, 't': 85},
                          'showactive': True,
                          'type': 'buttons', 'x': 0.1, 'y': 0, 'xanchor': 'right', 'yanchor': 'top'}],
          sliders = [{'yanchor': 'top',
                      'xanchor': 'left',
                      'currentvalue': {'font': {'size': 16}, 'prefix': 'Step: ', 'visible': True, 'xanchor': 'right'},
                      'transition': {'duration': 0.0},
                      'pad': {'b': 10, 't': 50},
                      'len': 0.9,
                      'x': 0.1,
                      'y': 0,
                      'steps': [{'args': [[k], {'frame': {'duration': 500.0, 'easing': 'linear', 'redraw': True},
                                                'transition': {'duration': 0, 'easing': 'linear'}}
                                          ],
                                'label': k,
                                'method': 'animate'} for k in range(len(_h))
                      ]}],
          images = [{'source' : image_base64,
                    'xref': 'x', 'yref': 'y',
                    'sizing': 'stretch',
                    'sizex': x_max - x_min, 'sizey': y_max - y_min,
                    'layer': 'below', 'opacity':1.0,
                    'x': x_min, 'y': y_max}]
      ),
      data = [
          {'type': 'scatter3d', 'name': 's3', 'x': _x, 'y': _y, 'z': _z, 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
          {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
          #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
          {'type': 'scatter', 'name': 's2', 'x': _x, 'y': _y, 
          'line': {'color': 'red', 'width': 2}
          },
          #{
          #    'type': 'scatter', 'name': 'trust radii', 
          #    'x': _x, 'y': _y, 'mode': 'markers',
          #    'marker': {'size': trust_radii, 'sizeref': 0.05}
          #} if len(trust_radii) > 0 else {}
      ],
      frames=[
          {'name': str(k),
          'data': [
            {'type': 'scatter3d', 'name': 's3', 'x': _x[:k], 'y': _y[:k], 'z': _z[:k], 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
            {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
            #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
            {'type': 'scatter', 'name': 's2', 'x': _x[:k], 'y': _y[:k], 
            'line': {'color': 'red', 'width': 2}
            },
            #{
            #  'type': 'scatter', 'name': 'trust radii', 
            #  'x': _x[:k], 'y': _y[:k], 'mode': 'markers',
            #  'marker': {'size': trust_radii[:k]}
            #} if len(trust_radii) > 0 else {}
          ]} for k in range(len(_h)-1) ]
  )
  #plot(fig, auto_open=False)
  clear_output()
  display(widgets.HBox([button, fnc, iters, method_widg, jac, hess]))
  f = go.Figure(fig)
  f.show()

  def save_pos(pos):
    global current_memory
    current_memory.x_ini = (x_max - x_min) * pos[0] + x_min
    current_memory.y_ini = (y_max - y_min) * (1.0 - pos[1]) + y_min

  main_str = '''
  <canvas id="paint_here"
          onmousedown="mdown_handle(event)"
          onmousemove="mmove_handle(event)"
          onmouseup="mup_handle(event)"></canvas>
  <script>

  var el = document.getElementsByClassName("layer-subplot")[0];
  var rect = el.getBoundingClientRect();

  var canvas = document.getElementById("paint_here");
  canvas.style.cssText = "position:absolute; top:" + rect.top
                      + "px; left: " + rect.left
                      + "px; width:" + rect.width
                      + "px; height:" + rect.height
                      + "px; z-index:1000;";
  canvas.width = rect.width;
  canvas.height = rect.height;
  var ctx = canvas.getContext('2d');
  ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
  //ctx.fillStyle="#00FF00";
  //ctx.fillRect(0, 0, canvas.width, canvas.height); // field
  ''' + 'var x_ini = ' + str((current_memory.x_ini - x_min)/(x_max - x_min)) + ';' + 'var y_ini = ' + str(1.0 - (current_memory.y_ini - y_min)/(y_max - y_min)) + ';' + '''
  var active_pt = [canvas.width * x_ini, canvas.height * y_ini];

  function draw() {
      ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
      //ctx.fillText("drawing", 20, 20);

      ctx.beginPath();
      ctx.arc(active_pt[0], active_pt[1], 10, 0.0, 2.0 * Math.PI, 0);
      ctx.fillStyle = "rgba(210, 0, 0, 0.75)";
      ctx.fill();
  }

  var do_move = false;

  function is_close(pt1, pt2) {
    return   (pt1[0] - pt2[0])*(pt1[0] - pt2[0])
          +  (pt1[1] - pt2[1])*(pt1[1] - pt2[1])
          <= 10*10;
  }

  function mdown_handle(evt) {
    // declare locals explicitly instead of creating implicit globals
    var x = evt.offsetX;
    var y = evt.offsetY;
    do_move = is_close(active_pt, [x, y]);
  }
      
  function mmove_handle(evt) {
    if (!do_move)
        return;
    active_pt[0] = evt.offsetX;
    active_pt[1] = evt.offsetY;
  }
      
  function mup_handle(evt) {
    do_move = false;
    remember();
  }

  var w = canvas.width;
  var h = canvas.height;

  async function remember() {
    var x = active_pt[0] / w;
    var y = active_pt[1] / h;
    const result = await google.colab.kernel.invokeFunction('notebook.rememberPos', [[x, y]], {});
  }

  var timer = setInterval(draw, 10);

  </script>
  '''

  import IPython
  from google.colab import output
  display(IPython.display.HTML(main_str))
  output.register_callback('notebook.rememberPos', save_pos)


button.on_click(on_button_clicked)


#Implementing your own minimization method

Implement your own minimization method.
Visualization and iteration are already implemented; only the core function `next_step` is needed.
Modify it to get different minimization methods; use the lecture notes if you need formulas.
The point $\vec{x}$, the function value $f(\vec{x})$, its gradient, and its Hessian matrix are already passed to `next_step` as parameters.
If your method requires additional parameters -- modify `initial_additional_args` to store them.
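
For instance, a Newton step and gradient descent with momentum both fit the `next_step` signature described above. This is only a sketch under assumptions: the function names `newton_step` and `momentum_step`, the dictionary keys `'velocity'`, `'beta'`, and `'theta'`, and the fallback to a small gradient step when the Hessian is not positive definite are illustrative choices, not part of the workbook.

```python
import numpy as np

def newton_step(x, f, g, h, additional_args):
    # Newton's method: solve h @ d = -g for the step d.
    try:
        np.linalg.cholesky(h)        # raises LinAlgError unless h is positive definite
        d = np.linalg.solve(h, -g)
    except np.linalg.LinAlgError:
        d = -0.1 * g                 # assumed fallback: small plain gradient step
    return x + d, additional_args

def momentum_step(x, f, g, h, additional_args):
    # Heavy-ball momentum: the velocity accumulates past gradients.
    beta = additional_args.get('beta', 0.9)     # assumed momentum coefficient
    theta = additional_args.get('theta', 0.1)   # assumed step size
    v_prev = additional_args.get('velocity', np.zeros_like(x))
    v = beta * v_prev - theta * g
    # Return a new dict so the caller's arguments are not modified in place.
    return x + v, dict(additional_args, velocity=v)

# Quick check on f(x, y) = x**2 + 9*y**2 at the point (-5, -4).
x0 = np.array([-5.0, -4.0])
f0 = x0[0]**2 + 9 * x0[1]**2
g0 = np.array([2 * x0[0], 18 * x0[1]])
h0 = np.array([[2.0, 0.0], [0.0, 18.0]])

x_newton, _ = newton_step(x0, f0, g0, h0, {})
x_momentum, args_momentum = momentum_step(x0, f0, g0, h0, {})
print(x_newton, x_momentum)
```

To try one of these in the workbook, rename it to `next_step` (and put any starting values it needs into `initial_additional_args`). On a quadratic the Newton step lands on the minimum in a single iteration, which makes it a convenient sanity check.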

In [48]:
import numpy as np

# put here all additional arguments you need
# initialize with values you need at the first iteration
# update them on function call as needed
# and return them as function finishes
initial_additional_args = {'step': 0}

# the following function takes as input
# x -- numpy array [x,y] that is current point vector
# f -- value of the function at x
# g -- gradient vector (numpy array) [g_x, g_y]
# h -- hessian matrix (numpy array) [[g_xx, g_xy], 
#                                    [g_xy, g_yy]]
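def next_step(x, f, g, h, additional_args):
  theta = 0.1  # step size
  additional_args['step'] += 1
  if additional_args['step'] == 20: # long jump on the 20th step
    # build a new array; `x += ...` would modify the stored history point in place
    x = x + np.array([5.0, 5.0])
  return x - theta * g, additional_args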

In [44]:
#@title #Implementing your own minimization method


from IPython.display import display
import ipywidgets as widgets


class Memory:
  def __init__(self):
    self.x_ini = 0.0
    self.y_ini = 0.0
    self.iters = 2

current_memory = Memory()

button = widgets.Button(description="Recalculate")
iters = widgets.IntSlider(min=2, max=50, value=current_memory.iters)

fnc = widgets.Text(value='x**2+sin(y)*20',
                   #placeholder='function',
                   description='Function:',
                   disabled=False)

def set_iter(val):
  current_memory.iters = val.new
iters.observe(set_iter, names='value')

display(widgets.HBox([button, fnc, iters]))

def on_button_clicked(b):
  function = fnc.value
  from IPython.display import clear_output
  clear_output()
  display(widgets.HBox([button, fnc, iters]))
  print('Doing Science...')

  x_ini = current_memory.x_ini
  y_ini = current_memory.y_ini
  max_iter = current_memory.iters


  from plotly.subplots import make_subplots
  import plotly.graph_objects as go
  import numpy as np
  from scipy.optimize import minimize

  from sympy.parsing.sympy_parser import standard_transformations, implicit_multiplication_application, parse_expr
  transformations = (standard_transformations + (implicit_multiplication_application,))
  f = parse_expr(function, transformations=transformations)

  from sympy import diff
  g_x = lambda x,y: diff(f, 'x').evalf(subs={'x':x, 'y':y})
  g_y = lambda x,y: diff(f, 'y').evalf(subs={'x':x, 'y':y})

  jacobian = lambda x: np.array([g_x(x[0], x[1]), g_y(x[0], x[1])], dtype=float)


  g_xx = lambda x,y: diff(diff(f, 'x'), 'x').evalf(subs={'x':x, 'y':y})
  g_xy = lambda x,y: diff(diff(f, 'x'), 'y').evalf(subs={'x':x, 'y':y})
  g_yy = lambda x,y: diff(diff(f, 'y'), 'y').evalf(subs={'x':x, 'y':y})
  H = lambda x,y: [[g_xx(x, y), g_xy(x, y)], [g_xy(x, y), g_yy(x, y)]]

  hessian = lambda x: np.array(H(x[0], x[1]), dtype=float)

  func_to_minimize = lambda x: float(f.evalf(subs={'x':x[0], 'y':x[1]}))
  history = [np.array([x_ini, y_ini])]
  def store_data(xk, *args): 
    history.append(xk) 


  a = initial_additional_args.copy()
  for step in range(max_iter):
    x_next, a = next_step(history[-1], func_to_minimize(history[-1]),
                          jacobian(history[-1]), hessian(history[-1]), a)
    store_data(x_next)


  _h = np.array(history)
  _x = _h.T[0]
  _y = _h.T[1]
  _z = np.array([func_to_minimize(x) for x in history], dtype=float)

  x_min = min(-10.0, np.min(_x))
  x_max = max(10.0, np.max(_x))
  y_min = min(-10.0, np.min(_y))
  y_max = max(10.0, np.max(_y))

  x_ = np.linspace(x_min, x_max, num=50)
  y_ = np.linspace(y_min, y_max, num=50)
  z_ = np.array([[func_to_minimize([x,y]) for x in x_] for y in y_], dtype=float)


  import matplotlib.pyplot as plt
  import base64
  import io
  fig = plt.figure(figsize=(20,20))
  ax = fig.gca()
  ax.axis('off')
  ax.contourf(x_, y_, z_, alpha=0.3)
  plt.close(fig)
  buf = io.BytesIO()
  fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
  # the MIME type must be exactly 'image/png' for the data URI to be valid
  image_base64 = u'data:image/png;base64,' + base64.b64encode(buf.getvalue()).decode('utf-8').replace('\n', '')
  buf.close()


  fig = dict(
      layout = dict(
          width=1200, height=600, autosize=False,
          showlegend = False,
          scene = {'domain': { 'x': [0.0, 0.44], 'y': [0, 1] } },
          xaxis1 = {'domain': [0.55, 1], 'range': [x_min, x_max], 'fixedrange': True},
          yaxis1 = {'domain': [0.0, 1.0], 'range': [y_min, y_max], 'fixedrange': True},
          title  = 'Minimization',
          margin = {'t': 50, 'b': 50, 'l': 50, 'r': 50},
          updatemenus = [{'buttons': [{'args': [[k for k in range(len(_h))],
                                                {'frame': {'duration': 500.0, 'redraw': True},
                                                'fromcurrent': False, 'transition': {'duration': 0, 'easing': 'linear'}}],
                                      'label': 'Play',
                                      'method': 'animate'},
                                      {'args': [[None], {'frame': {'duration': 0, 'redraw': True},
                                                        'mode': 'immediate',
                                                        'transition': {'duration': 0}}
                                                ],
                                      'label': 'Pause',
                                      'method': 'animate'
                                      }
                                      ],
                          'direction': 'left',
                          'pad': {'r': 10, 't': 85},
                          'showactive': True,
                          'type': 'buttons', 'x': 0.1, 'y': 0, 'xanchor': 'right', 'yanchor': 'top'}],
          sliders = [{'yanchor': 'top',
                      'xanchor': 'left',
                      'currentvalue': {'font': {'size': 16}, 'prefix': 'Step: ', 'visible': True, 'xanchor': 'right'},
                      'transition': {'duration': 0.0},
                      'pad': {'b': 10, 't': 50},
                      'len': 0.9,
                      'x': 0.1,
                      'y': 0,
                      'steps': [{'args': [[k], {'frame': {'duration': 500.0, 'easing': 'linear', 'redraw': True},
                                                'transition': {'duration': 0, 'easing': 'linear'}}
                                          ],
                                'label': k,
                                'method': 'animate'} for k in range(len(_h))
                      ]}],
          images = [{'source' : image_base64,
                    'xref': 'x', 'yref': 'y',
                    'sizing': 'stretch',
                    'sizex': x_max - x_min, 'sizey': y_max - y_min,
                    'layer': 'below', 'opacity':1.0,
                    'x': x_min, 'y': y_max}]
      ),
      data = [
          {'type': 'scatter3d', 'name': 's3', 'x': _x, 'y': _y, 'z': _z, 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
          {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
          #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
          {'type': 'scatter', 'name': 's2', 'x': _x, 'y': _y, 
          'line': {'color': 'red', 'width': 2}
          },
          #{
          #    'type': 'scatter', 'name': 'trust radii', 
          #    'x': _x, 'y': _y, 'mode': 'markers',
          #    'marker': {'size': trust_radii, 'sizeref': 0.05}
          #} if len(trust_radii) > 0 else {}
      ],
      frames=[
          {'name': str(k),
          'data': [
            {'type': 'scatter3d', 'name': 's3', 'x': _x[:k], 'y': _y[:k], 'z': _z[:k], 'line': {'color': 'red', 'width': 2}, 'marker': {'size': 4, 'colorscale': 'Viridis'}},
            {'type': 'surface', 'name': 'f2', 'x': x_, 'y': y_, 'z': z_, 'opacity': 0.8, 'showscale': False},
            #{'type': 'contour', 'name': 'c1', 'x':x_, 'y':y_, 'z':z_, 'contours': {'showlabels': True}},
            {'type': 'scatter', 'name': 's2', 'x': _x[:k], 'y': _y[:k], 
            'line': {'color': 'red', 'width': 2}
            },
          ]} for k in range(len(_h)-1) ]
  )
  #plot(fig, auto_open=False)
  clear_output()
  display(widgets.HBox([button, fnc, iters]))
  f = go.Figure(fig)
  f.show()

  def save_pos(pos):
    global current_memory
    current_memory.x_ini = (x_max - x_min) * pos[0] + x_min
    current_memory.y_ini = (y_max - y_min) * (1.0 - pos[1]) + y_min

  main_str = '''
  <canvas id="paint_here"
          onmousedown="mdown_handle(event)"
          onmousemove="mmove_handle(event)"
          onmouseup="mup_handle(event)"></canvas>
  <script>

  var el = document.getElementsByClassName("layer-subplot")[0];
  var rect = el.getBoundingClientRect();

  var canvas = document.getElementById("paint_here");
  canvas.style.cssText = "position:absolute; top:" + rect.top
                      + "px; left: " + rect.left
                      + "px; width:" + rect.width
                      + "px; height:" + rect.height
                      + "px; z-index:1000;";
  canvas.width = rect.width;
  canvas.height = rect.height;
  var ctx = canvas.getContext('2d');
  ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
  //ctx.fillStyle="#00FF00";
  //ctx.fillRect(0, 0, canvas.width, canvas.height); // field
  ''' + 'var x_ini = ' + str((current_memory.x_ini - x_min)/(x_max - x_min)) + ';' + 'var y_ini = ' + str(1.0 - (current_memory.y_ini - y_min)/(y_max - y_min)) + ';' + '''
  var active_pt = [canvas.width * x_ini, canvas.height * y_ini];

  function draw() {
      ctx.clearRect(0, 0, canvas.width, canvas.height); // cleanup before start
      //ctx.fillText("drawing", 20, 20);

      ctx.beginPath();
      ctx.arc(active_pt[0], active_pt[1], 10, 0.0, 2.0 * Math.PI, 0);
      ctx.fillStyle = "rgba(210, 0, 0, 0.75)";
      ctx.fill();
  }

  var do_move = false;

  function is_close(pt1, pt2) {
    return   (pt1[0] - pt2[0])*(pt1[0] - pt2[0])
          +  (pt1[1] - pt2[1])*(pt1[1] - pt2[1])
          <= 10*10;
  }

  function mdown_handle(evt) {
    // declare locals explicitly instead of creating implicit globals
    var x = evt.offsetX;
    var y = evt.offsetY;
    do_move = is_close(active_pt, [x, y]);
  }
      
  function mmove_handle(evt) {
    if (!do_move)
        return;
    active_pt[0] = evt.offsetX;
    active_pt[1] = evt.offsetY;
  }
      
  function mup_handle(evt) {
    do_move = false;
    remember();
  }

  var w = canvas.width;
  var h = canvas.height;

  async function remember() {
    var x = active_pt[0] / w;
    var y = active_pt[1] / h;
    const result = await google.colab.kernel.invokeFunction('notebook.rememberPos', [[x, y]], {});
  }

  var timer = setInterval(draw, 10);

  </script>
  '''

  import IPython
  from google.colab import output
  display(IPython.display.HTML(main_str))
  output.register_callback('notebook.rememberPos', save_pos)


button.on_click(on_button_clicked)


In [None]:
initial_additional_args

{'step': 100}