# Copyright and License.

In [None]:
# Copyright (c) 2021 Kevin P. Murphy (murphyk@gmail.com) and Mahmoud Soliman (mjs@aucegypt.edu)
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

![GitHub](https://img.shields.io/github/license/probml/pyprobml)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/probml/pyprobml/blob/master/notebooks/figures/chapter3_figures.ipynb)

## Figure 3.1:

  Illustration of the binomial distribution with $N=10$ and (a) $\theta =0.25$ and (b) $\theta =0.9$. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/binom_dist_plot.py
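
As a rough companion to the figure, the binomial pmf for $N=10$ can be evaluated directly. This is a minimal sketch using only the standard library, not the plotting script itself; the helper name `binom_pmf` is made up for illustration.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of k successes in n independent trials with success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


N = 10
for theta in (0.25, 0.9):
    pmf = [binom_pmf(k, N, theta) for k in range(N + 1)]
    # The mode is the value of k that carries the most probability mass.
    mode = max(range(N + 1), key=lambda k: pmf[k])
    print(f"theta={theta}: mode at k={mode}, total mass={sum(pmf):.6f}")
```

With $\theta=0.25$ the mass concentrates on small counts, and with $\theta=0.9$ it concentrates near $N$.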

## Figure 3.2:

  (a) The sigmoid (logistic) function $\sigma(a)=(1+e^{-a})^{-1}$. (b) The Heaviside function $\mathbb{I}\left(a>0\right)$. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/activation_fun_plot.py
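
The two functions in the caption are short enough to write out. The sketch below is an illustration rather than the contents of the plotting script; the branch on the sign of `a` is an added numerical-stability choice so that `math.exp` is never called on a large positive argument.

```python
import math


def sigmoid(a):
    """Logistic function 1 / (1 + exp(-a)), evaluated without overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # a < 0, so exp(a) lies in (0, 1)
    return e / (1.0 + e)


def heaviside(a):
    """Indicator function I(a > 0)."""
    return 1.0 if a > 0 else 0.0


for a in (-4.0, 0.0, 4.0):
    print(f"a={a:+.1f}  sigmoid={sigmoid(a):.4f}  heaviside={heaviside(a):.0f}")
```

The sigmoid can be read as a smooth version of the step: both map large negative inputs towards 0 and large positive inputs towards 1.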

## Figure 3.3:

  Logistic regression applied to a 1-dimensional, 2-class version of the Iris dataset. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/iris_logreg.py

## Figure 3.4:

  Softmax distribution $\mathcal{S}(\mathbf{a}/T)$, where $\mathbf{a}=(3,0,1)$, at temperatures of $T=100$, $T=2$ and $T=1$. When the temperature is high (left), the distribution is nearly uniform, whereas when the temperature is low (right), the distribution is "spiky", with most of its mass on the largest element. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/softmax_plot.py
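
The effect of temperature can be checked numerically for the logits $(3,0,1)$ from the caption. This is a minimal sketch, not the plotting script; subtracting the maximum before exponentiating is an added stability step that does not change the result.

```python
import math


def softmax(a, T=1.0):
    """Softmax of the logits a divided by temperature T."""
    scaled = [x / T for x in a]
    m = max(scaled)  # shift by the max so the largest exponent is exp(0)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


a = (3.0, 0.0, 1.0)
for T in (100.0, 2.0, 1.0):
    probs = softmax(a, T)
    print(f"T={T:>5}: " + ", ".join(f"{p:.3f}" for p in probs))
```

At $T=100$ the three probabilities differ by about 0.01, while at $T=1$ most of the mass sits on the first element.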

## Figure 3.5:

  Logistic regression on the 3-class, 2-feature version of the Iris dataset. Adapted from Figure 4.25 of \citep{Geron2019}. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/iris_logreg.py

## Figure 3.6:

  (a) Cumulative distribution function (cdf) for the standard normal. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot.py

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/quantile_plot.py
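
The standard normal cdf has a closed form in terms of the error function, $\Phi(x)=\tfrac{1}{2}\left(1+\mathrm{erf}(x/\sqrt{2})\right)$. The sketch below evaluates it with the standard library and, on the assumption that the second cell concerns quantiles, also inverts it with `statistics.NormalDist.inv_cdf`. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def std_normal_cdf(x):
    """Phi(x) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_quantile(p):
    """Inverse cdf of the standard normal, for 0 < p < 1."""
    return NormalDist().inv_cdf(p)


for p in (0.025, 0.5, 0.975):
    x = std_normal_quantile(p)
    print(f"p={p:.3f}  quantile={x:+.4f}  cdf(quantile)={std_normal_cdf(x):.4f}")
```

Applying the cdf to a quantile recovers the original probability, which is a quick consistency check on both functions.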

## Figure 3.7:

  Linear regression using Gaussian output with mean $\mu(x)=b + w x$ and (a) fixed variance $\sigma^2$ (homoscedastic) or (b) input-dependent variance $\sigma(x)^2$ (heteroscedastic). 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/linreg_1d_hetero_tfp.py

## Figure 3.8:

  (a) The pdfs for a $\mathcal{N}(0,1)$, $\mathcal{T}(\mu=0,\sigma=1,\nu=1)$, $\mathcal{T}(\mu=0,\sigma=1,\nu=2)$, and $\mathrm{Lap}(0,1/\sqrt{2})$. The mean is 0 and the variance is 1 for both the Gaussian and Laplace. When $\nu=1$, the Student is the same as the Cauchy, which does not have a well-defined mean and variance. (b) Log of these pdfs. Note that the Student distribution is not log-concave for any parameter value, unlike the Laplace distribution. Nevertheless, both are unimodal. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/student_laplace_pdf_plot.py
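
The claims in the caption can be spot-checked with the closed-form log densities. The sketch below covers the Gaussian, the Laplace with scale $b=1/\sqrt{2}$, and the $\nu=1$ Student (Cauchy) case only. The second-difference test is a hypothetical helper used here as a simple numerical probe of concavity, not a proof.

```python
import math

B = 1.0 / math.sqrt(2.0)  # Laplace scale from the caption; variance is 2 * B**2


def log_gauss(x):
    """Log pdf of N(0, 1)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * x * x


def log_laplace(x, b=B):
    """Log pdf of Lap(0, b)."""
    return -math.log(2.0 * b) - abs(x) / b


def log_cauchy(x):
    """Log pdf of the Student distribution with mu=0, sigma=1, nu=1."""
    return -math.log(math.pi * (1.0 + x * x))


def second_difference(f, x, h=1.0):
    """f(x-h) - 2 f(x) + f(x+h); a positive value rules out concavity near x."""
    return f(x - h) - 2.0 * f(x) + f(x + h)


print("Laplace variance:", 2.0 * B**2)
for name, f in (("gauss", log_gauss), ("laplace", log_laplace), ("cauchy", log_cauchy)):
    print(f"{name:8s} log pdf at x=4: {f(4.0):+.3f}  second difference at x=3: {second_difference(f, 3.0):+.3f}")
```

The Cauchy log pdf has a positive second difference in its tail, so it is not concave there, whereas the Laplace log pdf is linear on each side of the origin and the Gaussian log pdf is strictly concave.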

## Figure 3.9:

  Illustration of the effect of outliers on fitting Gaussian, Student and Laplace distributions. (a) No outliers (the Gaussian and Student curves are on top of each other). (b) With outliers. We see that the Gaussian is more affected by outliers than the Student and Laplace distributions. Adapted from Figure 2.16 of \citep{BishopBook}. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/robust_pdf_plot.py

## Figure 3.10:

  (a) Some beta distributions. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/beta_dist_plot.py

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/gamma_dist_plot.py

## Figure 3.11:

  Visualization of a 2d Gaussian density as a surface plot. (a) Distribution using a full covariance matrix can be oriented at any angle. (b) Distribution using a diagonal covariance matrix must be parallel to the axis. (c) Distribution using a spherical covariance matrix must have a symmetric shape. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot_2d.py

## Figure 3.12:

  Visualization of a 2d Gaussian density in terms of level sets of constant probability density. (a) A full covariance matrix has elliptical contours. (b) A diagonal covariance matrix is an **axis aligned** ellipse. (c) A spherical covariance matrix has a circular shape. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/gauss_plot_2d.py

## Figure 3.13:

  Illustration of data imputation using an MVN. (a) Visualization of the data matrix. Blank entries are missing (not observed). Red are positive, green are negative. Area of the square is proportional to the value. (This is known as a **Hinton diagram**, named after Geoff Hinton, a famous ML researcher.) (b) True data matrix (hidden). (c) Mean of the posterior predictive distribution, based on partially observed data in that row, using the true model parameters. 

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/gaussImputationDemo.m >> _

## Figure 3.14:

  Illustration of Bayesian inference for a 2d Gaussian random vector $\mathbf{z}$. (a) The data is generated from $\mathbf{y}_n \sim \mathcal{N}(\mathbf{z},\boldsymbol{\Sigma}_y)$, where $\mathbf{z}=[0.5, 0.5]^{\top}$ and $\boldsymbol{\Sigma}_y=0.1 [2, 1; 1, 1]$. We assume the sensor noise covariance $\boldsymbol{\Sigma}_y$ is known but $\mathbf{z}$ is unknown. The black cross represents $\mathbf{z}$. (b) The prior is $p(\mathbf{z}) = \mathcal{N}(\mathbf{z}|\boldsymbol{0},0.1 \mathbf{I}_2)$. (c) We show the posterior after 10 data points have been observed. 

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/gaussInferParamsMean2d.m >> _
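
For readers without Octave, the posterior in this setting can be sketched in plain Python. The code assumes the standard conjugate update for a Gaussian mean with known noise covariance, $\boldsymbol{\Sigma}_{\text{post}}^{-1}=\boldsymbol{\Sigma}_0^{-1}+N\boldsymbol{\Sigma}_y^{-1}$ and $\boldsymbol{\mu}_{\text{post}}=\boldsymbol{\Sigma}_{\text{post}}\left(N\boldsymbol{\Sigma}_y^{-1}\bar{\mathbf{y}}+\boldsymbol{\Sigma}_0^{-1}\boldsymbol{\mu}_0\right)$, with the parameter values taken from the caption. It is not a port of the demo; all function names, the random seed, and the simulated data are illustrative assumptions.

```python
import math
import random


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def sample_mvn2(mean, cov, rng):
    """Draw one 2d Gaussian sample using a hand-written Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mean[0] + l11 * e1, mean[1] + l21 * e1 + l22 * e2]


def posterior(ybar, n, prior_mean, prior_cov, noise_cov):
    """Posterior mean and covariance of z given the mean ybar of n observations."""
    prior_prec, noise_prec = inv2(prior_cov), inv2(noise_cov)
    post_prec = [[prior_prec[i][j] + n * noise_prec[i][j] for j in range(2)] for i in range(2)]
    post_cov = inv2(post_prec)
    a, b = matvec2(noise_prec, ybar), matvec2(prior_prec, prior_mean)
    post_mean = matvec2(post_cov, [n * a[0] + b[0], n * a[1] + b[1]])
    return post_mean, post_cov


z_true = [0.5, 0.5]
noise_cov = [[0.2, 0.1], [0.1, 0.1]]   # 0.1 * [2, 1; 1, 1]
prior_mean, prior_cov = [0.0, 0.0], [[0.1, 0.0], [0.0, 0.1]]

rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
ys = [sample_mvn2(z_true, noise_cov, rng) for _ in range(10)]
ybar = [sum(y[0] for y in ys) / len(ys), sum(y[1] for y in ys) / len(ys)]
mean, cov = posterior(ybar, len(ys), prior_mean, prior_cov, noise_cov)
print("posterior mean:", mean)
print("posterior covariance:", cov)
```

The posterior covariance depends only on the number of observations, not on their values, so it is already fixed once $N=10$ is chosen.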

## Figure 3.15:

  We observe $\mathbf{y}_1=(0,-1)$ (red cross) and $\mathbf{y}_2=(1,0)$ (green cross) and estimate $\mathbb{E}\left[\mathbf{z}|\mathbf{y}_1,\mathbf{y}_2\right]$ (black cross). (a) Equally reliable sensors, so the posterior mean estimate is in between the two circles. (b) Sensor 2 is more reliable, so the estimate shifts more towards the green circle. (c) Sensor 1 is more reliable in the vertical direction, Sensor 2 is more reliable in the horizontal direction. The estimate is an appropriate combination of the two measurements. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/sensor_fusion_2d.py
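
The three panels can be imitated with a precision-weighted average of the two observations. The sketch below assumes a flat (uninformative) prior on $\mathbf{z}$, so the posterior precision is the sum of the two sensor precisions. The covariance values are illustrative choices that match the qualitative description in the caption, not necessarily those used by the script, and `fuse` is a hypothetical helper name.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def fuse(y1, cov1, y2, cov2):
    """Posterior mean and covariance of z from two noisy readings, flat prior assumed."""
    p1, p2 = inv2(cov1), inv2(cov2)
    post_cov = inv2([[p1[i][j] + p2[i][j] for j in range(2)] for i in range(2)])
    w1, w2 = matvec2(p1, y1), matvec2(p2, y2)
    post_mean = matvec2(post_cov, [w1[0] + w2[0], w1[1] + w2[1]])
    return post_mean, post_cov


y1, y2 = [0.0, -1.0], [1.0, 0.0]

# Assumed sensor covariances for the three scenarios (illustrative values only).
scenarios = {
    "(a) equally reliable": ([[0.01, 0.0], [0.0, 0.01]], [[0.01, 0.0], [0.0, 0.01]]),
    "(b) sensor 2 more reliable": ([[0.05, 0.0], [0.0, 0.05]], [[0.01, 0.0], [0.0, 0.01]]),
    "(c) complementary directions": ([[0.1, 0.0], [0.0, 0.01]], [[0.01, 0.0], [0.0, 0.1]]),
}
for name, (cov1, cov2) in scenarios.items():
    mean, _ = fuse(y1, cov1, y2, cov2)
    print(f"{name}: posterior mean = ({mean[0]:.3f}, {mean[1]:.3f})")
```

In scenario (c) the horizontal coordinate is taken mostly from sensor 2 and the vertical coordinate mostly from sensor 1, which is the behaviour the caption describes.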

## Figure 3.16:

  A mixture of 3 Gaussians in 2d. (a) We show the contours of constant probability for each component in the mixture. (b) A surface plot of the overall density. Adapted from Figure 2.23 of \citep{BishopBook}. 

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixGaussPlotDemo.m >> _

## Figure 3.17:

  (a) Some data in 2d. (b) A possible clustering using $K=3$ clusters computed using a GMM. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/gmm_2d.py

## Figure 3.19:

  (a) Visualization of the first 25 digits from the **MNIST** dataset \citep{LeCun98,Yadav19}. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/mnist_viz_tf.py

In [None]:
!octave -W https://github.com/probml/pmtk3/blob/master/demos/mixBerMnistEM.m >> _

## Figure 3.20:

  Water sprinkler PGM with corresponding binary CPTs. T and F stand for true and false. See [sprinkler_pgm.py](https://github.com/probml/pyprobml/blob/master/scripts/sprinkler_pgm.py) for some code that implements inference in this model. 

In [None]:
%run https://github.com/probml/pyprobml/blob/master/scripts/sprinkler_pgm.py