# Copyright and License.

In [None]:
# Copyright (c) 2021 Kevin P. Murphy (murphyk@gmail.com) and Mahmoud Soliman (mjs@aucegypt.edu)
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

![GitHub](https://img.shields.io/github/license/probml/pyprobml)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/probml/pyprobml/blob/master/notebooks/figures/chapter5_figures.ipynb)

## Figure 5.4:

  Steepest descent on a simple convex function, starting from $(0,0)$, for 20 steps, using a fixed step size. The global minimum is at $(1,1)$. (a) $\eta =0.1$. (b) $\eta =0.6$. 

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/steepestDescentDemo.m >> _
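
The fixed-step update illustrated here is $\theta_{t+1} = \theta_t - \eta \nabla \mathcal{L}(\theta_t)$. The following is a minimal Python sketch of that update, not a port of the Octave demo. The objective $0.5(x_1^2 - x_2)^2 + 0.5(x_1 - 1)^2$ is an assumption, chosen because it is a simple function whose minimum is at $(1,1)$. The names `objective`, `gradient`, and `steepest_descent` are illustrative.

```python
import numpy as np

def objective(x):
    # Assumed test function (not confirmed by the caption); its global minimum is at (1, 1).
    return 0.5 * (x[0] ** 2 - x[1]) ** 2 + 0.5 * (x[0] - 1.0) ** 2

def gradient(x):
    # Analytic gradient of the assumed objective.
    r = x[0] ** 2 - x[1]
    return np.array([2.0 * x[0] * r + (x[0] - 1.0), -r])

def steepest_descent(x0, step_size, n_steps=20):
    # Fixed-step steepest descent; returns the trajectory, shape (n_steps + 1, 2).
    path = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        path.append(path[-1] - step_size * gradient(path[-1]))
    return np.array(path)

path_small = steepest_descent([0.0, 0.0], step_size=0.1)
path_large = steepest_descent([0.0, 0.0], step_size=0.6)
print("eta=0.1 final objective:", objective(path_small[-1]))
print("eta=0.6 final objective:", objective(path_large[-1]))
```

Plotting `path_small` and `path_large` over a contour plot of `objective` gives a rough analogue of panels (a) and (b).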

## Figure 5.6:

  (a) Steepest descent on the same function as in Figure 5.4, starting from $(0,0)$, using line search.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/steepestDescentDemo.m >> _
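
Line search replaces the fixed step size with one chosen afresh at every iteration. The sketch below uses backtracking with the Armijo sufficient-decrease condition. This is a swapped-in technique that is simple to write down, and the Octave demo may well use a different (for example exact) line search. The objective is again an assumed function with its minimum at $(1,1)$, and `eta0`, `shrink`, and `c` are illustrative parameters.

```python
import numpy as np

def objective(x):
    # Assumed test function with global minimum at (1, 1).
    return 0.5 * (x[0] ** 2 - x[1]) ** 2 + 0.5 * (x[0] - 1.0) ** 2

def gradient(x):
    r = x[0] ** 2 - x[1]
    return np.array([2.0 * x[0] * r + (x[0] - 1.0), -r])

def backtracking_step(x, g, eta0=1.0, shrink=0.5, c=1e-4):
    # Shrink the step until the Armijo sufficient-decrease condition holds.
    eta = eta0
    fx = objective(x)
    while objective(x - eta * g) > fx - c * eta * (g @ g):
        eta *= shrink
    return eta

def steepest_descent_line_search(x0, n_steps=20):
    x = np.asarray(x0, dtype=float)
    path = [x]
    for _ in range(n_steps):
        g = gradient(x)
        if np.linalg.norm(g) < 1e-12:
            break  # already at a stationary point
        x = x - backtracking_step(x, g) * g
        path.append(x)
    return np.array(path)

path = steepest_descent_line_search([0.0, 0.0])
values = [objective(p) for p in path]
print("final point:", path[-1], "final objective:", values[-1])
```

Because every accepted step satisfies the Armijo condition, the objective cannot increase from one iterate to the next.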

## Figure 5.8:

  Illustration of Newton's method for minimizing a 1d function. (a) The solid curve is the function $\mathcal{L}(x)$. The dotted line $\mathcal{L}_{\mathrm{quad}}(\theta)$ is its second-order approximation at $\theta_t$. The Newton step $d_t$ is what must be added to $\theta_t$ to get to the minimum of $\mathcal{L}_{\mathrm{quad}}(\theta)$. Adapted from Figure 13.4 of \citep{Vandenberghe06}.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/newtonsMethodMinQuad.m >> _

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/newtonsMethodNonConvex.m >> _
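
In 1d, setting the derivative of the quadratic approximation to zero gives the Newton step $d_t = -\mathcal{L}'(\theta_t) / \mathcal{L}''(\theta_t)$. The sketch below applies it to a hypothetical convex objective, $\mathcal{L}(\theta) = e^{\theta} - 2\theta$, which is not the function plotted in the figure. It was picked because its minimizer, $\log 2$, is known in closed form.

```python
import math

def loss(theta):
    # Hypothetical convex 1d objective, minimized at theta = log(2).
    return math.exp(theta) - 2.0 * theta

def grad(theta):
    return math.exp(theta) - 2.0

def hess(theta):
    return math.exp(theta)

def newton_step(theta):
    # d_t = -L'(theta_t) / L''(theta_t): offset from theta_t to the
    # minimizer of the second-order approximation at theta_t.
    return -grad(theta) / hess(theta)

def quad_approx_grad(theta, theta_t):
    # Derivative of L_quad(theta) = L(theta_t) + L'(theta_t) (theta - theta_t)
    #                               + 0.5 L''(theta_t) (theta - theta_t)^2.
    return grad(theta_t) + hess(theta_t) * (theta - theta_t)

theta = 0.0
for t in range(6):
    theta = theta + newton_step(theta)
print("Newton iterate:", theta, "exact minimizer:", math.log(2.0))
```

This only behaves well where $\mathcal{L}''(\theta_t) > 0$. When the curvature is negative, the same formula moves toward a maximum of the quadratic approximation instead.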

## Figure 5.11:

  Illustration of the benefits of natural gradient (NG) vs steepest descent on a 2D problem. (a) Trajectories of the two methods in parameter space (red = steepest descent, blue = NG). Both start at $(1,-1)$, at the bottom right. (b) Objective vs number of iterations. (c) Gradient field in the $\boldsymbol{\theta}$ parameter space. (d) Gradient field in the whitened parameter space $\boldsymbol{\phi} = \mathbf{F}^{\frac{1}{2}} \boldsymbol{\theta}$ used by NG.

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/nat_grad_demo.py
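
Natural gradient preconditions the gradient with the inverse Fisher information matrix, $\boldsymbol{\theta}_{t+1} = \boldsymbol{\theta}_t - \eta \mathbf{F}^{-1} \nabla \mathcal{L}(\boldsymbol{\theta}_t)$. The sketch below uses a hypothetical toy problem, which is not necessarily the one in the demo script: estimating the mean of a 2D Gaussian with known covariance `Sigma`, for which $\mathbf{F} = \boldsymbol{\Sigma}^{-1}$. It also checks numerically that a plain gradient step in the whitened coordinates $\boldsymbol{\phi} = \mathbf{F}^{\frac{1}{2}} \boldsymbol{\theta}$ matches an NG step in $\boldsymbol{\theta}$.

```python
import numpy as np

# Hypothetical toy problem: fit the mean theta of N(x | theta, Sigma) with known Sigma.
# The Fisher information for the mean is F = Sigma^{-1}.
Sigma = np.array([[2.0, 1.2], [1.2, 1.0]])
F = np.linalg.inv(Sigma)
target = np.zeros(2)  # assumed optimum (the true mean)

def loss(theta):
    # Expected negative log-likelihood, up to an additive constant.
    d = theta - target
    return 0.5 * d @ F @ d

def grad(theta):
    return F @ (theta - target)

def run(theta0, step_size, n_steps, natural):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        g = grad(theta)
        if natural:
            g = np.linalg.solve(F, g)  # precondition with the inverse Fisher
        theta = theta - step_size * g
    return theta

theta_sd = run([1.0, -1.0], step_size=0.3, n_steps=30, natural=False)
theta_ng = run([1.0, -1.0], step_size=0.3, n_steps=30, natural=True)
print("steepest descent loss:", loss(theta_sd))
print("natural gradient loss:", loss(theta_ng))

# Whitened coordinates phi = F^{1/2} theta, with F^{1/2} built from an eigendecomposition.
evals, evecs = np.linalg.eigh(F)
F_half = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
phi0 = F_half @ np.array([1.0, -1.0])
phi1 = phi0 - 0.3 * (phi0 - F_half @ target)    # plain gradient step in phi
theta_from_phi = np.linalg.solve(F_half, phi1)  # map back to theta
```

In this toy problem the NG direction is simply $\boldsymbol{\theta} - \text{target}$, so the trajectory is a straight line to the optimum, whereas steepest descent is slowed down by the ill-conditioning of $\mathbf{F}$.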

## Figure 5.12:

  Illustration of the LMS algorithm. Left: we start from $\boldsymbol{\theta} = (-0.5, 2)$ and slowly converge to the least squares solution $\boldsymbol{\theta} = (1.45, 0.93)$ (red cross). Right: plot of the objective function over time. Note that it does not decrease monotonically.

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/lms_demo.py
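
LMS is stochastic gradient descent on the squared error, processing one example at a time: $\boldsymbol{\theta}_{t+1} = \boldsymbol{\theta}_t - \eta (\boldsymbol{\theta}_t^\top \mathbf{x}_n - y_n) \mathbf{x}_n$. The sketch below runs it on synthetic data. The data, the step size, and the number of epochs are assumptions, so the least squares solution will differ from the $(1.45, 0.93)$ quoted in the caption. Only the starting point $(-0.5, 2)$ is taken from the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1d regression data (hypothetical; not the data used in the figure).
N = 50
x = rng.uniform(-1.0, 1.0, size=N)
X = np.column_stack([np.ones(N), x])               # bias feature plus input
y = 1.5 + 1.0 * x + 0.1 * rng.standard_normal(N)

theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # batch least squares solution

def lms(X, y, theta0, step_size=0.1, n_epochs=20):
    theta = np.asarray(theta0, dtype=float)
    losses = []
    for _ in range(n_epochs):
        for n in rng.permutation(len(y)):
            residual = X[n] @ theta - y[n]
            theta = theta - step_size * residual * X[n]   # single-example gradient step
            losses.append(np.mean((X @ theta - y) ** 2))  # full-data objective after the update
    return theta, np.array(losses)

theta_lms, losses = lms(X, y, theta0=[-0.5, 2.0])
print("LMS estimate:", theta_lms)
print("least squares:", theta_ls)
```

Each update lowers the error on the current example only, which is why the full-data objective stored in `losses` need not decrease at every step.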

## Figure 5.16:

  Illustration of a bound optimization algorithm. Adapted from Figure 9.14 of \citep{BishopBook}.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/emLogLikelihoodMax.m >> _

## Figure 5.18:

  Illustration of the EM algorithm for a GMM applied to the Old Faithful data for the first 6 steps. The degree of redness indicates the degree to which the point belongs to the red cluster, and similarly for blue; thus purple points have a roughly 50/50 split in their responsibilities to the two clusters. Adapted from Figure 9.8 of \citep{BishopBook}.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixGaussDemoFaithful.m >> _
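
The responsibilities that color the points are computed in the E step, and the M step then re-estimates the mixture parameters from them. The sketch below is a minimal NumPy version of EM for a GMM, not a port of the Octave demo. It runs on synthetic two-cluster data rather than Old Faithful, and the initial means and the helper names `log_gauss` and `em_gmm` are assumptions.

```python
import numpy as np

def log_gauss(X, mu, Sigma):
    # Log density of N(x | mu, Sigma) for each row of X.
    D = X.shape[1]
    diff = X - mu
    maha = np.sum(diff * np.linalg.solve(Sigma, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (D * np.log(2.0 * np.pi) + logdet + maha)

def em_gmm(X, init_means, n_iters=6):
    N, D = X.shape
    mus = np.array(init_means, dtype=float)
    K = len(mus)
    Sigmas = np.stack([np.eye(D)] * K)
    pis = np.full(K, 1.0 / K)
    log_liks = []
    for _ in range(n_iters):
        # E step: responsibilities r[n, k] = p(z_n = k | x_n, current parameters).
        log_joint = np.stack(
            [np.log(pis[k]) + log_gauss(X, mus[k], Sigmas[k]) for k in range(K)], axis=1)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        resp = np.exp(log_joint - log_norm[:, None])
        log_liks.append(log_norm.sum())  # log-likelihood under the current parameters
        # M step: responsibility-weighted maximum likelihood updates.
        Nk = resp.sum(axis=0)
        pis = Nk / N
        for k in range(K):
            mus[k] = resp[:, k] @ X / Nk[k]
            diff = X - mus[k]
            Sigmas[k] = (resp[:, k, None] * diff).T @ diff / Nk[k]
    # resp holds the responsibilities from the last E step.
    return pis, mus, Sigmas, resp, np.array(log_liks)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-1.0, -1.0], 0.4, size=(100, 2)),
               rng.normal([1.0, 1.0], 0.4, size=(100, 2))])
pis, mus, Sigmas, resp, log_liks = em_gmm(X, init_means=[[-2.0, 0.0], [2.0, 0.0]])
print("estimated means:\n", mus)
```

Using `resp[:, 0]` as the red channel and `resp[:, 1]` as the blue channel of each point's color reproduces the kind of shading described in the caption.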

## Figure 5.19:

  (a) Illustration of how singularities can arise in the likelihood function of GMMs. Here $K=2$, but the first mixture component is a narrow spike (with $\sigma_1 \approx 0$) centered on a single data point $x_1$. Adapted from Figure 9.7 of \citep{BishopBook}.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixGaussSingularity.m >> _

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixGaussMLvsMAP.m >> _
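
The singularity can be seen numerically. If the first component is centered exactly on $x_1$, the term for that point grows like $-\log \sigma_1$ as $\sigma_1 \to 0$, while the second component keeps the terms for the other points bounded below. The sketch below evaluates the log-likelihood of a two-component 1d GMM for a shrinking $\sigma_1$. The data and the second component's parameters are hypothetical.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def gmm_log_lik(x, mu1, sigma1, mu2, sigma2, pi1=0.5):
    # Log-likelihood of a two-component 1d GMM.
    mix = pi1 * gauss_pdf(x, mu1, sigma1) + (1.0 - pi1) * gauss_pdf(x, mu2, sigma2)
    return np.sum(np.log(mix))

x = np.linspace(-2.0, 2.0, 9)      # hypothetical data set
mu2, sigma2 = x.mean(), x.std()    # broad component covering all points
sigmas = np.array([1e-1, 1e-2, 1e-3, 1e-4])

# Center the first component on the data point x[0] and shrink its width.
log_liks = np.array([gmm_log_lik(x, x[0], s, mu2, sigma2) for s in sigmas])
for s, ll in zip(sigmas, log_liks):
    print(f"sigma_1 = {s:g}: log-likelihood = {ll:.3f}")
```

The log-likelihood keeps increasing as `sigma1` shrinks, so unregularized maximum likelihood has no finite optimum in this configuration.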

## Figure 5.20:

  Left: $N=200$ data points sampled from a mixture of 2 Gaussians in 1d, with $\pi_k=0.5$, $\sigma_k=5$, $\mu_1=-10$ and $\mu_2=10$. Right: Likelihood surface $p(\mathcal{D}|\mu_1,\mu_2)$, with all other parameters set to their true values. We see the two symmetric modes, reflecting the unidentifiability of the parameters.

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixGaussLikSurfaceDemo.m >> _
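
The symmetry follows from the fact that swapping $\mu_1$ and $\mu_2$ leaves the mixture density unchanged when $\pi_k$ and $\sigma_k$ are shared. The sketch below samples data with the parameters stated in the caption and evaluates the log-likelihood (rather than the likelihood, for numerical stability) on a grid. The grid range, its resolution, and the random seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma = 200, 5.0
z = rng.integers(0, 2, size=N)   # component labels, pi_k = 0.5
x = np.where(z == 0, -10.0, 10.0) + sigma * rng.standard_normal(N)

grid = np.linspace(-25.0, 25.0, 51)
# dens[i, n] = N(x_n | grid[i], sigma^2)
dens = np.exp(-0.5 * ((x[None, :] - grid[:, None]) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
# surface[i, j] = log p(D | mu_1 = grid[i], mu_2 = grid[j])
surface = np.log(0.5 * dens[:, None, :] + 0.5 * dens[None, :, :]).sum(axis=2)

i, j = np.unravel_index(np.argmax(surface), surface.shape)
print("one mode at (mu_1, mu_2) =", (grid[i], grid[j]))
print("mirror mode at (mu_1, mu_2) =", (grid[j], grid[i]))
```

`surface` equals its transpose, so every mode at $(\mu_1, \mu_2) = (a, b)$ has a twin at $(b, a)$.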

## Figure 5.21:

  Illustration of data imputation. (a) Scatter plot of true values vs imputed values using true parameters. (b) Same as (a), but using parameters estimated with EM. We just show the first four variables, for brevity. 

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/gaussImputationDemoEM.m >> _
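
For a Gaussian model, missing entries can be imputed with their conditional mean given the observed entries, $\boldsymbol{\mu}_h + \boldsymbol{\Sigma}_{hv} \boldsymbol{\Sigma}_{vv}^{-1} (\mathbf{x}_v - \boldsymbol{\mu}_v)$, where $h$ indexes the hidden and $v$ the visible dimensions. The sketch below covers case (a), with the parameters assumed known. The function name `impute_gaussian`, the NaN encoding of missing values, and the example parameters are assumptions. For case (b), the same function would be called with parameters estimated by EM.

```python
import numpy as np

def impute_gaussian(x, mu, Sigma):
    # Replace NaN entries of x with their conditional mean given the observed entries.
    x = np.asarray(x, dtype=float)
    h = np.isnan(x)   # hidden (missing) entries
    v = ~h            # visible (observed) entries
    x_imputed = x.copy()
    if h.any() and v.any():
        Sigma_hv = Sigma[np.ix_(h, v)]
        Sigma_vv = Sigma[np.ix_(v, v)]
        x_imputed[h] = mu[h] + Sigma_hv @ np.linalg.solve(Sigma_vv, x[v] - mu[v])
    elif h.any():
        x_imputed[h] = mu[h]  # nothing observed: fall back to the marginal mean
    return x_imputed

# Hypothetical parameters: two standardized variables with correlation 0.8.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
print(impute_gaussian([1.0, np.nan], mu, Sigma))
```

Plotting the imputed values against the held-out true values, one panel per variable, gives scatter plots of the kind shown in the figure.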