In [None]:
# Copyright 2021 Google LLC
# Use of this source code is governed by an MIT-style
# license that can be found in the LICENSE file or at
# https://opensource.org/licenses/MIT.

# Author(s): Kevin P. Murphy (murphyk@gmail.com) and Mahmoud Soliman (mjs@aucegypt.edu)

<a href="https://opensource.org/licenses/MIT" target="_parent"><img src="https://img.shields.io/github/license/probml/pyprobml"/></a>

<a href="https://colab.research.google.com/github/probml/pyprobml/blob/master/notebooks/figures//chapter3_figures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cloning the pyprobml repo

In [None]:
!git clone https://github.com/probml/pyprobml 
%cd pyprobml/scripts

# Installing required software (this may take a few minutes)

In [None]:
!apt install octave  -qq > /dev/null
!apt-get install liboctave-dev -qq > /dev/null

## Figure 3.1:

  Illustration of the binomial distribution with $N=10$ and (a) $\theta =0.25$ and (b) $\theta =0.9$.  
Figure(s) generated by [binom_dist_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/binom_dist_plot.py) 

In [None]:
%run ./binom_dist_plot.py
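
As a quick numerical companion to the figure, the binomial pmf can be evaluated directly from its definition. This is a minimal sketch using only the standard library; the helper name `binom_pmf` is an illustrative choice and is not taken from `binom_dist_plot.py`.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of k successes in n independent trials with success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


N = 10
for theta in (0.25, 0.9):
    pmf = [binom_pmf(k, N, theta) for k in range(N + 1)]
    # The mode is the count k with the highest probability.
    mode = max(range(N + 1), key=lambda k: pmf[k])
    print(f"theta={theta}: mode at k={mode}, total mass={sum(pmf):.6f}")
```

The two values of `theta` match the two panels of the figure, so the printed modes indicate where each bar chart peaks.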

## Figure 3.2:

  (a) The sigmoid (logistic) function $\sigma(a)=(1+e^{-a})^{-1}$. (b) The Heaviside function $\mathbb{I}\left(a>0\right)$.  
Figure(s) generated by [activation_fun_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/activation_fun_plot.py) 

In [None]:
%run ./activation_fun_plot.py
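
The two functions in the caption are short enough to write out by hand. The sketch below assumes NumPy is available; the function names are illustrative and may differ from those used in `activation_fun_plot.py`.

```python
import numpy as np


def sigmoid(a):
    """Logistic function sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(a, dtype=float)))


def heaviside(a):
    """Indicator function I(a > 0): 1 where a is strictly positive, else 0."""
    return (np.asarray(a, dtype=float) > 0).astype(float)


a = np.array([-4.0, 0.0, 4.0])
print("sigmoid  :", sigmoid(a))
print("heaviside:", heaviside(a))
```

The sigmoid can be read as a smooth relaxation of the step function: both map large negative inputs towards 0 and large positive inputs towards 1, but only the sigmoid is differentiable at 0.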

## Figure 3.3:

  Logistic regression applied to a 1-dimensional, 2-class version of the Iris dataset.  
Figure(s) generated by [iris_logreg.py](https://github.com/probml/pyprobml/blob/master/scripts/iris_logreg.py) 

In [None]:
%run ./iris_logreg.py

## Figure 3.4:

  Softmax distribution $\mathcal{S}(\mathbf{a}/T)$, where $\mathbf{a}=(3,0,1)$, at temperatures of $T=100$, $T=2$ and $T=1$. When the temperature is high (left), the distribution is nearly uniform, whereas when the temperature is low (right), the distribution is "spiky", with most of its mass on the largest element.  
Figure(s) generated by [softmax_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/softmax_plot.py) 

In [None]:
%run ./softmax_plot.py
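
The temperature effect described in the caption can be checked numerically. This is a minimal sketch assuming NumPy; the `softmax` helper and its `T` argument are illustrative names, not the interface of `softmax_plot.py`.

```python
import numpy as np


def softmax(a, T=1.0):
    """Softmax of the logits a divided by the temperature T."""
    z = np.asarray(a, dtype=float) / T
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


a = [3, 0, 1]
for T in (100, 2, 1):
    print(f"T={T}:", softmax(a, T).round(3))
```

Dividing the logits by a large `T` shrinks the gaps between them, which pushes the output towards the uniform distribution; a small `T` magnifies the gaps and concentrates the mass on the largest logit.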

## Figure 3.5:

  Logistic regression on the 3-class, 2-feature version of the Iris dataset. Adapted from Figure 4.25 of \citep{Geron2019}.  
Figure(s) generated by [iris_logreg.py](https://github.com/probml/pyprobml/blob/master/scripts/iris_logreg.py) 

In [None]:
%run ./iris_logreg.py

## Figure 3.6:

  (a) Cumulative distribution function (cdf) for the standard normal.  
Figure(s) generated by [gauss_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot.py) [quantile_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/quantile_plot.py) 

In [None]:
%run ./gauss_plot.py

In [None]:
%run ./quantile_plot.py

## Figure 3.7:

  Linear regression using Gaussian output with mean $\mu(x)=b + w x$ and (a) fixed variance $\sigma^2$ (homoscedastic) or (b) input-dependent variance $\sigma(x)^2$ (heteroscedastic).  
Figure(s) generated by [linreg_1d_hetero_tfp.py](https://github.com/probml/pyprobml/blob/master/scripts/linreg_1d_hetero_tfp.py) 

In [None]:
%run ./linreg_1d_hetero_tfp.py

## Figure 3.8:

  (a) The pdf's for a $\mathcal{N}(0,1)$, $\mathcal{T}(\mu=0,\sigma=1,\nu=1)$, $\mathcal{T}(\mu=0,\sigma=1,\nu=2)$, and $\mathrm{Lap}(0,1/\sqrt{2})$. The mean is 0 and the variance is 1 for both the Gaussian and Laplace. When $\nu=1$, the Student is the same as the Cauchy, which does not have a well-defined mean and variance. (b) Log of these pdf's. Note that the Student distribution is not log-concave for any parameter value, unlike the Laplace distribution. Nevertheless, both are unimodal.  
Figure(s) generated by [student_laplace_pdf_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/student_laplace_pdf_plot.py) 

In [None]:
%run ./student_laplace_pdf_plot.py
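
The log-concavity claim in the caption can be probed numerically: a log-concave density has a log pdf whose second differences on a uniform grid are never positive. The sketch below is an illustrative check, assuming NumPy and hand-written log pdfs with $\mu=0$ and $\sigma=1$; the grid and the helper names are assumptions, not part of `student_laplace_pdf_plot.py`.

```python
import math
import numpy as np


def student_logpdf(x, nu):
    """Log pdf of a Student t with location 0, scale 1 and nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return log_norm - (nu + 1) / 2 * np.log1p(x**2 / nu)


def laplace_logpdf(x, b):
    """Log pdf of a Laplace distribution with location 0 and scale b."""
    return -np.abs(x) / b - math.log(2 * b)


def max_second_difference(f):
    """Largest discrete second difference; a positive value signals local convexity."""
    return np.diff(f, n=2).max()


x = np.linspace(-6, 6, 1201)
b = 1 / math.sqrt(2)  # this scale gives the Laplace variance 2 * b**2 = 1

for nu in (1, 2):
    print(f"Student nu={nu}:", max_second_difference(student_logpdf(x, nu)))
print("Laplace      :", max_second_difference(laplace_logpdf(x, b)))
```

For the Student log pdf the second derivative turns positive once $x^2 > \nu$, so the check reports a positive value in the tails, whereas the Laplace log pdf is piecewise linear with a single downward kink at 0.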

## Figure 3.9:

  Illustration of the effect of outliers on fitting Gaussian, Student and Laplace distributions. (a) No outliers (the Gaussian and Student curves are on top of each other). (b) With outliers. We see that the Gaussian is more affected by outliers than the Student and Laplace distributions. Adapted from Figure 2.16 of \citep{BishopBook}.  
Figure(s) generated by [robust_pdf_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/robust_pdf_plot.py) 

In [None]:
%run ./robust_pdf_plot.py

## Figure 3.10:

  (a) Some beta distributions.  
Figure(s) generated by [beta_dist_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/beta_dist_plot.py) [gamma_dist_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/gamma_dist_plot.py) 

In [None]:
%run ./beta_dist_plot.py

In [None]:
%run ./gamma_dist_plot.py

## Figure 3.11:

  Visualization of a 2d Gaussian density as a surface plot. (a) Distribution using a full covariance matrix can be oriented at any angle. (b) Distribution using a diagonal covariance matrix must be parallel to the axes. (c) Distribution using a spherical covariance matrix must have a symmetric shape.  
Figure(s) generated by [gauss_plot_2d.py](https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot_2d.py) 

In [None]:
%run ./gauss_plot_2d.py

## Figure 3.12:

  Visualization of a 2d Gaussian density in terms of level sets of constant probability density. (a) A full covariance matrix has elliptical contours. (b) A diagonal covariance matrix has **axis-aligned** elliptical contours. (c) A spherical covariance matrix has circular contours.  
Figure(s) generated by [gauss_plot_2d.py](https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot_2d.py) 

In [None]:
%run ./gauss_plot_2d.py

## Figure 3.13:

  Illustration of data imputation using an MVN. (a) Visualization of the data matrix. Blank entries are missing (not observed). Red are positive, green are negative. Area of the square is proportional to the value. (This is known as a **Hinton diagram**, named after Geoff Hinton, a famous ML researcher.) (b) True data matrix (hidden). (c) Mean of the posterior predictive distribution, based on partially observed data in that row, using the true model parameters.  
Figure(s) generated by [gaussImputationDemo.m](https://github.com/probml/pmtk3/blob/master/demos/gaussImputationDemo.m) 

In [None]:
!octave -W gaussImputationDemo.m >> _
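
Panel (c) relies on the standard formula for conditioning a multivariate Gaussian: for hidden entries $h$ and observed entries $o$, the posterior mean is $\boldsymbol{\mu}_h + \boldsymbol{\Sigma}_{ho}\boldsymbol{\Sigma}_{oo}^{-1}(\mathbf{x}_o - \boldsymbol{\mu}_o)$. The Python sketch below illustrates that formula for a single row; it is not a port of `gaussImputationDemo.m`, and the function name, the use of `NaN` to mark missing values, and the example parameters are all assumptions made for illustration.

```python
import numpy as np


def impute_row(x, mu, Sigma):
    """Fill NaN entries of x with their conditional mean given the observed entries."""
    x = np.asarray(x, dtype=float)
    hidden = np.isnan(x)
    obs = ~hidden
    x_filled = x.copy()
    if hidden.any() and obs.any():
        S_oo = Sigma[np.ix_(obs, obs)]     # covariance among observed entries
        S_ho = Sigma[np.ix_(hidden, obs)]  # cross-covariance hidden vs. observed
        x_filled[hidden] = mu[hidden] + S_ho @ np.linalg.solve(S_oo, x[obs] - mu[obs])
    elif hidden.any():
        # Nothing observed: fall back to the prior mean.
        x_filled[hidden] = mu[hidden]
    return x_filled


# Hypothetical 2d model with strongly correlated dimensions.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
print(impute_row([1.0, np.nan], mu, Sigma))
```

Using `np.linalg.solve` rather than an explicit inverse is a common choice for numerical stability when the observed block is large.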

## Figure 3.14:

  Illustration of Bayesian inference for a 2d Gaussian random vector $\mathbf{z}$. (a) The data is generated from $\mathbf{y}_n \sim \mathcal{N}(\mathbf{z},\boldsymbol{\Sigma}_y)$, where $\mathbf{z}=[0.5, 0.5]^{\top}$ and $\boldsymbol{\Sigma}_y=0.1\,[2, 1; 1, 1]$. We assume the sensor noise covariance $\boldsymbol{\Sigma}_y$ is known but $\mathbf{z}$ is unknown. The black cross represents $\mathbf{z}$. (b) The prior is $p(\mathbf{z}) = \mathcal{N}(\mathbf{z}|\boldsymbol{0},0.1\,\mathbf{I}_2)$. (c) We show the posterior after 10 data points have been observed.  
Figure(s) generated by [gaussInferParamsMean2d.m](https://github.com/probml/pmtk3/blob/master/demos/gaussInferParamsMean2d.m) 

In [None]:
!octave -W gaussInferParamsMean2d.m >> _
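
With a Gaussian prior and a known noise covariance, the posterior over $\mathbf{z}$ is again Gaussian, with precision $\boldsymbol{\Sigma}_0^{-1} + N\boldsymbol{\Sigma}_y^{-1}$. The Python sketch below applies that standard result to the setup in the caption. It is not a translation of `gaussInferParamsMean2d.m`: the function name and the random seed are assumptions, so the sampled data will differ from the figure.

```python
import numpy as np


def posterior_mean_cov(Y, Sigma_y, mu0, Sigma0):
    """Posterior N(mu_post, Sigma_post) over z given rows of Y ~ N(z, Sigma_y)."""
    N = Y.shape[0]
    prec_y = np.linalg.inv(Sigma_y)
    prec0 = np.linalg.inv(Sigma0)
    Sigma_post = np.linalg.inv(prec0 + N * prec_y)
    mu_post = Sigma_post @ (prec_y @ Y.sum(axis=0) + prec0 @ mu0)
    return mu_post, Sigma_post


# Values taken from the caption.
z_true = np.array([0.5, 0.5])
Sigma_y = 0.1 * np.array([[2.0, 1.0], [1.0, 1.0]])
mu0 = np.zeros(2)
Sigma0 = 0.1 * np.eye(2)

rng = np.random.default_rng(0)  # arbitrary seed, chosen for reproducibility
Y = rng.multivariate_normal(z_true, Sigma_y, size=10)

mu_post, Sigma_post = posterior_mean_cov(Y, Sigma_y, mu0, Sigma0)
print("posterior mean:", mu_post)
print("posterior cov :\n", Sigma_post)
```

Because the posterior precision is the sum of the prior precision and $N$ copies of the likelihood precision, the posterior covariance shrinks as more observations arrive.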

## Figure 3.15:

  We observe $\mathbf{y}_1=(0,-1)$ (red cross) and $\mathbf{y}_2=(1,0)$ (green cross) and estimate $\mathbb{E}\left[\mathbf{z}|\mathbf{y}_1,\mathbf{y}_2\right]$ (black cross). (a) Equally reliable sensors, so the posterior mean estimate is in between the two circles. (b) Sensor 2 is more reliable, so the estimate shifts more towards the green circle. (c) Sensor 1 is more reliable in the vertical direction, Sensor 2 is more reliable in the horizontal direction. The estimate is an appropriate combination of the two measurements.  
Figure(s) generated by [sensor_fusion_2d.py](https://github.com/probml/pyprobml/blob/master/scripts/sensor_fusion_2d.py) 

In [None]:
%run ./sensor_fusion_2d.py

## Figure 3.16:

  A mixture of 3 Gaussians in 2d. (a) We show the contours of constant probability for each component in the mixture. (b) A surface plot of the overall density. Adapted from Figure 2.23 of \citep{BishopBook}.  
Figure(s) generated by [mixGaussPlotDemo.m](https://github.com/probml/pmtk3/blob/master/demos/mixGaussPlotDemo.m) 

In [None]:
!octave -W mixGaussPlotDemo.m >> _
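
The overall density in panel (b) is a weighted sum of the component densities, $p(\mathbf{x}) = \sum_k \pi_k \mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_k,\boldsymbol{\Sigma}_k)$. The Python sketch below evaluates such a mixture on a grid. The weights, means and covariances are hypothetical values picked for illustration and are not read from `mixGaussPlotDemo.m`.

```python
import numpy as np


def gauss_pdf(X, mu, Sigma):
    """Multivariate normal density evaluated at each row of X."""
    d = mu.shape[0]
    diff = X - mu
    prec = np.linalg.inv(Sigma)
    maha = np.einsum("ni,ij,nj->n", diff, prec, diff)  # squared Mahalanobis distance
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * maha) / norm


def gmm_pdf(X, weights, mus, Sigmas):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * gauss_pdf(X, m, S) for w, m, S in zip(weights, mus, Sigmas))


# Hypothetical 3-component mixture in 2d.
weights = [0.5, 0.3, 0.2]
mus = [np.array([0.22, 0.45]), np.array([0.5, 0.5]), np.array([0.77, 0.55])]
Sigmas = [
    np.array([[0.018, 0.01], [0.01, 0.011]]),
    np.array([[0.018, -0.01], [-0.01, 0.018]]),
    np.array([[0.018, 0.01], [0.01, 0.011]]),
]

# Evaluate on a regular grid and check that the density integrates to about 1.
h = 0.01
g = np.arange(-1.0, 2.0 + h, h)
xx, yy = np.meshgrid(g, g)
grid = np.column_stack([xx.ravel(), yy.ravel()])
density = gmm_pdf(grid, weights, mus, Sigmas)
print("approximate integral:", density.sum() * h**2)
```

Reshaping `density` to `xx.shape` gives an array that could be passed to a contour or surface plotting routine.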

## Figure 3.17:

  (a) Some data in 2d. (b) A possible clustering using $K=3$ clusters computed using a GMM.  
Figure(s) generated by [gmm_2d.py](https://github.com/probml/pyprobml/blob/master/scripts/gmm_2d.py) 

In [None]:
%run ./gmm_2d.py

## Figure 3.19:

  (a) Visualization of the first 25 digits from the **MNIST** dataset \citep{LeCun98,Yadav19}.  
Figure(s) generated by [mnist_viz_tf.py](https://github.com/probml/pyprobml/blob/master/scripts/mnist_viz_tf.py) [mixBerMnistEM.m](https://github.com/probml/pmtk3/blob/master/demos/mixBerMnistEM.m) 

In [None]:
%run ./mnist_viz_tf.py

In [None]:
!octave -W mixBerMnistEM.m >> _

## Figure 3.20:

  Water sprinkler PGM with corresponding binary CPTs. T and F stand for true and false. See [sprinkler_pgm.py](https://github.com/probml/pyprobml/blob/master/scripts/sprinkler_pgm.py) for some code that implements inference in this model.  
Figure(s) generated by [sprinkler_pgm.py](https://github.com/probml/pyprobml/blob/master/scripts/sprinkler_pgm.py) 

In [None]:
%run ./sprinkler_pgm.py