In [None]:
# Copyright 2021 Google LLC
# Use of this source code is governed by an MIT-style
# license that can be found in the LICENSE file or at
# https://opensource.org/licenses/MIT.

# Author(s): Kevin P. Murphy (murphyk@gmail.com) and Mahmoud Soliman (mjs@aucegypt.edu)

<a href="https://opensource.org/licenses/MIT" target="_parent"><img src="https://img.shields.io/github/license/probml/pyprobml"/></a>

<a href="https://colab.research.google.com/github/probml/pyprobml/blob/master/notebooks/figures//chapter5_figures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cloning the pyprobml repo

In [None]:
!git clone https://github.com/probml/pyprobml 
%cd pyprobml/scripts

# Installing required software (this may take a few minutes)

In [None]:
!apt install octave  -qq > /dev/null
!apt-get install liboctave-dev -qq > /dev/null

## Figure 5.4:

  Steepest descent on a simple convex function, starting from $(0,0)$, for 20 steps, using a fixed step size. The global minimum is at $(1,1)$. (a) $\eta =0.1$. (b) $\eta =0.6$.  
Figure(s) generated by [steepestDescentDemo.m](https://github.com/probml/pmtk3/blob/master/demos/steepestDescentDemo.m) 
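
As a rough Python counterpart to the Octave demo, the sketch here runs fixed-step steepest descent for 20 steps from $(0,0)$ with both step sizes from the caption. The objective `aoki` is an assumption: it was chosen because its minimum is at $(1,1)$, and it may differ from the function used in `steepestDescentDemo.m`. All helper names are illustrative.

```python
import numpy as np

def aoki(x):
    """Assumed test objective with its global minimum at (1, 1)."""
    return 0.5 * (x[0] ** 2 - x[1]) ** 2 + 0.5 * (x[0] - 1.0) ** 2

def aoki_grad(x):
    """Analytic gradient of aoki."""
    g0 = 2.0 * x[0] * (x[0] ** 2 - x[1]) + (x[0] - 1.0)
    g1 = x[1] - x[0] ** 2
    return np.array([g0, g1])

def steepest_descent(x0, eta, n_steps=20):
    """Fixed-step gradient descent; returns the (n_steps + 1, 2) trajectory."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(traj[-1] - eta * aoki_grad(traj[-1]))
    return np.array(traj)

for eta in (0.1, 0.6):
    traj = steepest_descent([0.0, 0.0], eta)
    print(f"eta={eta}: final point {traj[-1]}, objective {aoki(traj[-1]):.4f}")
```

Plotting the rows of `traj` over the contours of `aoki` would give a picture in the spirit of panels (a) and (b).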

In [None]:
!octave -W steepestDescentDemo.m >> _

## Figure 5.6:

  (a) Steepest descent on the same function as Figure 5.4, starting from $(0,0)$, using line search.  
Figure(s) generated by [steepestDescentDemo.m](https://github.com/probml/pmtk3/blob/master/demos/steepestDescentDemo.m) 
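
One way to pick the step size automatically is a backtracking (Armijo) line search, sketched here in Python. This is a stand-in: the Octave demo may use a different line search (for example an exact one), and the objective `aoki`, the constants `shrink` and `c`, and all function names are assumptions made for illustration.

```python
import numpy as np

def aoki(x):
    """Assumed test objective with its global minimum at (1, 1)."""
    return 0.5 * (x[0] ** 2 - x[1]) ** 2 + 0.5 * (x[0] - 1.0) ** 2

def aoki_grad(x):
    """Analytic gradient of aoki."""
    return np.array([2.0 * x[0] * (x[0] ** 2 - x[1]) + (x[0] - 1.0),
                     x[1] - x[0] ** 2])

def backtracking_step(x, g, step=1.0, shrink=0.5, c=1e-4, max_halvings=50):
    """Shrink the step until the Armijo sufficient-decrease condition holds."""
    fx = aoki(x)
    for _ in range(max_halvings):
        if aoki(x - step * g) <= fx - c * step * (g @ g):
            break
        step *= shrink
    return step

def steepest_descent_line_search(x0, n_steps=20, tol=1e-8):
    """Gradient descent where each step size comes from backtracking."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for _ in range(n_steps):
        g = aoki_grad(x)
        if np.linalg.norm(g) < tol:  # already (numerically) stationary
            break
        x = x - backtracking_step(x, g) * g
        traj.append(x)
    return np.array(traj)

traj = steepest_descent_line_search([0.0, 0.0])
print("final point:", traj[-1], "objective:", aoki(traj[-1]))
```

Because every accepted step satisfies the Armijo condition, the objective decreases at each iteration, unlike the fixed-step runs with a large $\eta$.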

In [None]:
!octave -W steepestDescentDemo.m >> _

## Figure 5.8:

  Illustration of Newton's method for minimizing a 1d function. (a) The solid curve is the function $\mathcal{L}(\theta)$. The dotted line $\mathcal{L}_{\mathrm{quad}}(\theta)$ is its second-order approximation at $\theta_t$. The Newton step $d_t$ is what must be added to $\theta_t$ to get to the minimum of $\mathcal{L}_{\mathrm{quad}}(\theta)$. Adapted from Figure 13.4 of \citep{Vandenberghe06}.  
Figure(s) generated by [newtonsMethodMinQuad.m](https://github.com/probml/pmtk3/blob/master/demos/newtonsMethodMinQuad.m) [newtonsMethodNonConvex.m](https://github.com/probml/pmtk3/blob/master/demos/newtonsMethodNonConvex.m) 
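
The Newton step in the caption has the closed form $d_t = -\mathcal{L}'(\theta_t) / \mathcal{L}''(\theta_t)$, which is the minimizer of the quadratic approximation when $\mathcal{L}''(\theta_t) > 0$. A minimal Python sketch follows; the objective $e^{\theta} - 2\theta$ is a hypothetical convex example, not the function plotted by the Octave demos.

```python
import math

def loss(theta):
    """Hypothetical convex 1d objective; its minimum is at theta = log(2)."""
    return math.exp(theta) - 2.0 * theta

def grad(theta):
    """First derivative of loss."""
    return math.exp(theta) - 2.0

def hess(theta):
    """Second derivative of loss (always positive here)."""
    return math.exp(theta)

def newton_minimize(theta, n_steps=10, tol=1e-12):
    """Repeatedly jump to the minimum of the local quadratic approximation."""
    for _ in range(n_steps):
        d = -grad(theta) / hess(theta)  # Newton step d_t
        theta = theta + d
        if abs(d) < tol:
            break
    return theta

theta_hat = newton_minimize(0.0)
print(theta_hat, math.log(2.0))
```

If the second derivative were negative at $\theta_t$, the same formula would move toward a maximum of the quadratic, which is the non-convex failure case.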

In [None]:
!octave -W newtonsMethodMinQuad.m >> _

In [None]:
!octave -W newtonsMethodNonConvex.m >> _

## Figure 5.11:

  Illustration of the benefits of natural gradient vs steepest descent on a 2D problem. (a) Trajectories of the two methods in parameter space (red = steepest descent, blue = NG). They both start at $(1,-1)$, bottom right location. (b) Objective vs number of iterations. (c) Gradient field in the $\boldsymbol{\theta}$ parameter space. (d) Gradient field in the whitened $\boldsymbol{\phi} = \mathbf{F}^{\frac{1}{2}} \boldsymbol{\theta}$ parameter space used by NG.  
Figure(s) generated by [nat_grad_demo.py](https://github.com/probml/pyprobml/blob/master/scripts/nat_grad_demo.py) 
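
To make the link between panels (c) and (d) concrete, the sketch here checks numerically that one natural gradient step, $\boldsymbol{\theta} - \eta \mathbf{F}^{-1} \nabla \mathcal{L}$, equals one plain steepest descent step in the whitened coordinates $\boldsymbol{\phi} = \mathbf{F}^{\frac{1}{2}} \boldsymbol{\theta}$. It assumes a constant $\mathbf{F}$; the matrices `A` and `F` and the quadratic objective are hypothetical and unrelated to `nat_grad_demo.py`.

```python
import numpy as np

# Hypothetical quadratic objective L(theta) = 0.5 * theta^T A theta.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
# Hypothetical constant Fisher matrix (symmetric positive definite).
F = np.array([[4.0, 1.5], [1.5, 1.0]])

def grad(theta):
    """Gradient of the quadratic objective with respect to theta."""
    return A @ theta

def sym_power(M, p):
    """Power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w ** p) @ V.T

eta = 0.1
theta = np.array([1.0, -1.0])  # starting point from the caption

# Natural gradient step taken directly in theta space.
theta_ng = theta - eta * np.linalg.solve(F, grad(theta))

# The same step as steepest descent in the whitened space phi = F^{1/2} theta.
F_half, F_neg_half = sym_power(F, 0.5), sym_power(F, -0.5)
phi = F_half @ theta
phi_new = phi - eta * (F_neg_half @ grad(theta))  # chain rule: dL/dphi = F^{-1/2} dL/dtheta
theta_from_phi = F_neg_half @ phi_new

print(theta_ng, theta_from_phi)
```

In general $\mathbf{F}$ depends on $\boldsymbol{\theta}$, so this equivalence holds only locally; the constant case is used to keep the check short.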

In [None]:
%run ./nat_grad_demo.py

## Figure 5.12:

  Illustration of the LMS algorithm. Left: we start from $\boldsymbol{\theta} = (-0.5, 2)$ and slowly converge to the least squares solution of $\boldsymbol{\theta} = (1.45, 0.93)$ (red cross). Right: plot of objective function over time. Note that it does not decrease monotonically.  
Figure(s) generated by [lms_demo.py](https://github.com/probml/pyprobml/blob/master/scripts/lms_demo.py) 
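
The LMS update processes one example at a time: $\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \eta (\boldsymbol{\theta}^\top \mathbf{x}_n - y_n) \mathbf{x}_n$. A minimal sketch on synthetic data follows. The data, the step size `eta`, and the number of epochs are assumptions, so the fitted values will not match the $(1.45, 0.93)$ reported in the caption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1d regression data (hypothetical; not the data used by lms_demo.py).
N = 50
x = rng.uniform(-1.0, 1.0, size=N)
X = np.column_stack([np.ones(N), x])  # bias column plus one feature
y = X @ np.array([1.5, 1.0]) + 0.1 * rng.standard_normal(N)

def lms(X, y, theta0, eta=0.1, n_epochs=100):
    """One example at a time: theta <- theta - eta * (theta^T x_n - y_n) * x_n."""
    theta = np.asarray(theta0, dtype=float)
    history = []
    for _ in range(n_epochs):
        for n in rng.permutation(len(y)):
            residual = X[n] @ theta - y[n]
            theta = theta - eta * residual * X[n]
            # Full-data objective after each update, kept for plotting.
            history.append(0.5 * np.mean((X @ theta - y) ** 2))
    return theta, np.array(history)

theta_lms, history = lms(X, y, theta0=[-0.5, 2.0])
theta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("LMS:", theta_lms, "least squares:", theta_ols)
```

Each update follows the gradient of a single example's loss, so individual updates can increase the full objective stored in `history`.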

In [None]:
%run ./lms_demo.py

## Figure 5.16:

  Illustration of a bound optimization algorithm. Adapted from Figure 9.14 of \citep{BishopBook}.  
Figure(s) generated by [emLogLikelihoodMax.m](https://github.com/probml/pmtk3/blob/master/demos/emLogLikelihoodMax.m) 

In [None]:
!octave -W emLogLikelihoodMax.m >> _

## Figure 5.18:

  Illustration of the EM algorithm for a GMM applied to the Old Faithful data for the first 6 steps. The degree of redness indicates the degree to which the point belongs to the red cluster, and similarly for blue; thus purple points have a roughly 50/50 split in their responsibilities to the two clusters. Adapted from Figure 9.8 of \citep{BishopBook}.  
Figure(s) generated by [mixGaussDemoFaithful.m](https://github.com/probml/pmtk3/blob/master/demos/mixGaussDemoFaithful.m) 
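
For readers without Octave, here is a compact NumPy sketch of EM for a two-component GMM. It runs on synthetic 2d data rather than the Old Faithful data, and the initial parameters and the function names `gauss_pdf` and `em_gmm` are assumptions for illustration. The returned responsibilities `R` are the quantities the caption maps to red and blue.

```python
import numpy as np

def gauss_pdf(X, mu, Sigma):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mu
    quad = np.sum(diff * np.linalg.solve(Sigma, diff.T).T, axis=1)
    return np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma))

def em_gmm(X, mus, Sigmas, pis, n_iters=6):
    """Plain maximum likelihood EM; also tracks the log-likelihood per iteration."""
    N, K = X.shape[0], len(pis)
    log_liks = []
    for _ in range(n_iters):
        # E step: responsibilities under the current parameters.
        weighted = np.column_stack(
            [pis[k] * gauss_pdf(X, mus[k], Sigmas[k]) for k in range(K)])
        log_liks.append(np.sum(np.log(weighted.sum(axis=1))))
        R = weighted / weighted.sum(axis=1, keepdims=True)
        # M step: responsibility-weighted proportions, means, and covariances.
        Nk = R.sum(axis=0)
        pis = Nk / N
        mus = (R.T @ X) / Nk[:, None]
        Sigmas = np.stack(
            [(R[:, k, None] * (X - mus[k])).T @ (X - mus[k]) / Nk[k] for k in range(K)])
    return pis, mus, Sigmas, R, np.array(log_liks)

# Synthetic two-cluster data (hypothetical stand-in for Old Faithful).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-1.0, -1.0], 0.4, size=(100, 2)),
               rng.normal([1.0, 1.0], 0.4, size=(100, 2))])

mus0 = np.array([[-1.5, 1.5], [1.5, -1.5]])
Sigmas0 = np.stack([np.eye(2), np.eye(2)])
pis0 = np.array([0.5, 0.5])

pis, mus, Sigmas, R, log_liks = em_gmm(X, mus0, Sigmas0, pis0, n_iters=6)
print("log-likelihood per iteration:", np.round(log_liks, 2))
```

Note that `R` comes from the final E step, so it reflects the parameters before the last M step. The log-likelihood sequence should never decrease, which is a quick sanity check for any EM implementation.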

In [None]:
!octave -W mixGaussDemoFaithful.m >> _

## Figure 5.19:

  (a) Illustration of how singularities can arise in the likelihood function of GMMs. Here $K=2$, but the first mixture component is a narrow spike (with $\sigma_1 \approx 0$) centered on a single data point $x_1$. Adapted from Figure 9.7 of \citep{BishopBook}.  
Figure(s) generated by [mixGaussSingularity.m](https://github.com/probml/pmtk3/blob/master/demos/mixGaussSingularity.m) [mixGaussMLvsMAP.m](https://github.com/probml/pmtk3/blob/master/demos/mixGaussMLvsMAP.m) 
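
The singularity can be seen numerically: if the first component is centered exactly on one data point, the log-likelihood keeps growing as $\sigma_1$ shrinks. The sketch uses a hypothetical 1d data set and a fixed broad second component; none of these values come from the Octave demos.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate normal density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def gmm_log_lik(x, mu1, sigma1, mu2=0.0, sigma2=2.0):
    """Log-likelihood of an equal-weight two-component 1d GMM."""
    mix = 0.5 * normal_pdf(x, mu1, sigma1) + 0.5 * normal_pdf(x, mu2, sigma2)
    return np.sum(np.log(mix))

x = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])  # hypothetical data
sigmas = np.array([1e-1, 1e-2, 1e-3, 1e-4])

# Center the first component on the data point x[0] and shrink its width.
log_liks = np.array([gmm_log_lik(x, mu1=x[0], sigma1=s) for s in sigmas])
for s, ll in zip(sigmas, log_liks):
    print(f"sigma_1={s:g}: log-likelihood={ll:.2f}")
```

The density at `x[0]` scales like $1/\sigma_1$ while the other points are still covered by the second component, so the total is unbounded. A prior on the variances (the MAP approach) is one way to avoid this.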

In [None]:
!octave -W mixGaussSingularity.m >> _

In [None]:
!octave -W mixGaussMLvsMAP.m >> _

## Figure 5.20:

  Left: $N=200$ data points sampled from a mixture of 2 Gaussians in 1d, with $\pi_k=0.5$, $\sigma_k=5$, $\mu_1=-10$ and $\mu_2=10$. Right: Likelihood surface $p(\mathcal{D} \mid \mu_1, \mu_2)$, with all other parameters set to their true values. We see the two symmetric modes, reflecting the unidentifiability of the parameters.  
Figure(s) generated by [mixGaussLikSurfaceDemo.m](https://github.com/probml/pmtk3/blob/master/demos/mixGaussLikSurfaceDemo.m) 
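
The symmetry of the surface follows from label switching: swapping $\mu_1$ and $\mu_2$ leaves the mixture density unchanged. A short NumPy sketch under the caption's settings follows; the random seed and the grid range are assumptions.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate normal density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Sample N=200 points from the equal-weight mixture described in the caption.
rng = np.random.default_rng(0)
N, sigma, true_mus = 200, 5.0, np.array([-10.0, 10.0])
z = rng.integers(0, 2, size=N)  # component label of each point
x = rng.normal(true_mus[z], sigma)

# Log-likelihood on a grid of (mu1, mu2), other parameters fixed at their true values.
grid = np.linspace(-20.0, 20.0, 41)
dens = normal_pdf(x[None, :], grid[:, None], sigma)    # shape (41, N)
mix = 0.5 * dens[:, None, :] + 0.5 * dens[None, :, :]  # shape (41, 41, N)
log_lik = np.log(mix).sum(axis=-1)                     # log p(D | mu1, mu2)

i, j = np.unravel_index(np.argmax(log_lik), log_lik.shape)
print("one mode near", (grid[i], grid[j]), "with a mirror image at", (grid[j], grid[i]))
```

The array `log_lik` equals its own transpose, so any maximizer $(\mu_1, \mu_2)$ has a twin at $(\mu_2, \mu_1)$.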

In [None]:
!octave -W mixGaussLikSurfaceDemo.m >> _

## Figure 5.21:

  Illustration of data imputation. (a) Scatter plot of true values vs imputed values using true parameters. (b) Same as (a), but using parameters estimated with EM. We just show the first four variables, for brevity.  
Figure(s) generated by [gaussImputationDemoEM.m](https://github.com/probml/pmtk3/blob/master/demos/gaussImputationDemoEM.m) 
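
With known parameters, as in panel (a), a missing block can be filled with the conditional mean $\boldsymbol{\mu}_h + \boldsymbol{\Sigma}_{hv} \boldsymbol{\Sigma}_{vv}^{-1} (\mathbf{x}_v - \boldsymbol{\mu}_v)$, where $h$ indexes hidden entries and $v$ visible ones. A minimal sketch follows; the function name `impute_gaussian` and the toy parameters are hypothetical. It does not cover panel (b), where the parameters are first estimated with EM.

```python
import numpy as np

def impute_gaussian(x, mu, Sigma):
    """Fill NaNs in x with E[x_hidden | x_visible] under N(mu, Sigma)."""
    x = np.asarray(x, dtype=float)
    h = np.isnan(x)  # hidden (missing) entries
    v = ~h           # visible entries
    filled = x.copy()
    if not h.any():
        return filled
    if not v.any():
        filled[h] = mu[h]  # nothing observed: fall back to the marginal mean
        return filled
    Sigma_hv = Sigma[np.ix_(h, v)]
    Sigma_vv = Sigma[np.ix_(v, v)]
    filled[h] = mu[h] + Sigma_hv @ np.linalg.solve(Sigma_vv, x[v] - mu[v])
    return filled

# Toy example: two correlated variables, the second one missing.
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
print(impute_gaussian([2.0, np.nan], mu, Sigma))
```

Here the missing value is filled with $0.8 \times 2 = 1.6$, the regression of the second variable on the first.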

In [None]:
!octave -W gaussImputationDemoEM.m >> _