In [None]:
# Copyright 2021 Google LLC
# Use of this source code is governed by an MIT-style
# license that can be found in the LICENSE file or at
# https://opensource.org/licenses/MIT.
# Notebook authors: Kevin P. Murphy (murphyk@gmail.com)
# and Mahmoud Soliman (mjs@aucegypt.edu)

# This notebook reproduces figures for chapter 13 from the book
# "Probabilistic Machine Learning: An Introduction"
# by Kevin Murphy (MIT Press, 2021).
# Book pdf is available from http://probml.ai

<a href="https://opensource.org/licenses/MIT" target="_parent"><img src="https://img.shields.io/github/license/probml/pyprobml"/></a>

<a href="https://colab.research.google.com/github/probml/pml-book/blob/main/pml1/figure_notebooks/chapter13_neural_networks_for_structured_data_figures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Figure 13.1:<a name='13.1'></a> <a name='xor'></a> 


(a) Illustration of the fact that the XOR function is not linearly separable, but can be separated by a two-layer model using Heaviside activation functions. Adapted from Figure 10.6 of <a href='#Geron2019'>[Aur19]</a>.  
Figure(s) generated by [xor_heaviside.py](https://github.com/probml/pyprobml/blob/master/scripts/xor_heaviside.py) 
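
As a minimal sketch of the idea in the caption, the two-layer Heaviside model can be written out by hand. The weights and thresholds below are hand-picked for illustration and are an assumption; they are not necessarily the ones used in `xor_heaviside.py`.

```python
def heaviside(a):
    """Step activation: 1 if a >= 0, else 0."""
    return 1 if a >= 0 else 0


def xor_mlp(x1, x2):
    # Hidden unit h1 fires when at least one input is on (logical OR).
    h1 = heaviside(x1 + x2 - 0.5)
    # Hidden unit h2 fires only when both inputs are on (logical AND).
    h2 = heaviside(x1 + x2 - 1.5)
    # The output fires when OR is on and AND is off, which is XOR.
    return heaviside(h1 - h2 - 0.5)


truth_table = {(x1, x2): xor_mlp(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
print(truth_table)
```

Each hidden unit is a linear classifier on its own; only their combination in the second layer separates the two classes.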

In [None]:
#@title Click me to run setup { display-mode: "form" }
try:
  if PYPROBML_SETUP_ALREADY_RUN:
    print('skipping setup')
except NameError:
  # First run: the flag is not defined yet, so do the one-time setup.
  PYPROBML_SETUP_ALREADY_RUN = True
  print('running setup...')
  !git clone --depth 1 https://github.com/probml/pyprobml  /pyprobml &> /dev/null 
  %cd -q /pyprobml/scripts
  %reload_ext autoreload 
  %autoreload 2
  !pip install superimport deimport -qqq
  import superimport
def try_deimport():
  # Best effort: print any error from deimport instead of raising it.
  try: 
    from deimport.deimport import deimport
    deimport(superimport)
  except Exception as e:
    print(e)
  print('finished!')

In [None]:
try_deimport()
%run -n xor_heaviside.py

## Figure 13.2:<a name='13.2'></a> <a name='activationFns2'></a> 


(a) Illustration of how the sigmoid function is approximately linear for inputs near 0, but saturates for large positive and negative inputs. Adapted from Figure 11.1 of <a href='#Geron2019'>[Aur19]</a>. (b) Plots of some neural network activation functions.  
Figure(s) generated by [activation_fun_plot.py](https://github.com/probml/pyprobml/blob/master/scripts/activation_fun_plot.py) 
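
To make panel (a) concrete, the following sketch compares the sigmoid with its first-order Taylor expansion around 0. The helper names and the evaluation points are illustrative assumptions and are not taken from `activation_fun_plot.py`.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def linear_approx(a):
    # First-order Taylor expansion of the sigmoid around a = 0.
    return 0.5 + a / 4.0


for a in (-6.0, -0.5, 0.0, 0.5, 6.0):
    print(f"a={a:+.1f}  sigmoid={sigmoid(a):.4f}  linear={linear_approx(a):.4f}")
```

Near 0 the two columns agree closely; at the extremes the sigmoid flattens out while the linear approximation keeps growing.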

In [None]:
try_deimport()
%run -n activation_fun_plot.py

## Figure 13.3:<a name='13.3'></a> <a name='mlp-playground'></a> 


 An MLP with 2 hidden layers applied to a set of 2d points from 2 classes, shown in the top left corner. The visualizations associated with each hidden unit show the decision boundary at that part of the network. The final output is shown on the right. The input is $\boldsymbol{x} \in \mathbb{R}^2$, the first layer activations are $\boldsymbol{z}_1 \in \mathbb{R}^4$, the second layer activations are $\boldsymbol{z}_2 \in \mathbb{R}^2$, and the final logit is $a_3 \in \mathbb{R}$, which is converted to a probability using the sigmoid function. This is a screenshot from the interactive demo at http://playground.tensorflow.org .

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.3.png" width="256"/>
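
The layer sizes in the caption (2, then 4, then 2, then 1) can be traced with a small forward pass. This is only a sketch: the random weights, the input point, and the tanh hidden activation are assumptions, since the caption does not specify them.

```python
import math
import random


def dense(x, W, b):
    """Affine layer: returns W x + b for a list-of-rows weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init(n_out, n_in, rng):
    """Random Gaussian weights and zero biases (illustrative only)."""
    W = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


rng = random.Random(0)
(W1, b1), (W2, b2), (W3, b3) = init(4, 2, rng), init(2, 4, rng), init(1, 2, rng)

x = [0.5, -1.0]                                  # input in R^2
z1 = [math.tanh(a) for a in dense(x, W1, b1)]    # first hidden layer, in R^4
z2 = [math.tanh(a) for a in dense(z1, W2, b2)]   # second hidden layer, in R^2
a3 = dense(z2, W3, b3)[0]                        # final logit, a scalar
p = 1.0 / (1.0 + math.exp(-a3))                  # sigmoid maps the logit to a probability
print(len(z1), len(z2), p)
```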

## Figure 13.4:<a name='13.4'></a> <a name='mlpMnist'></a> 


 Results of applying an MLP (with 2 hidden layers with 128 units and 1 output layer with 10 units) to some MNIST images (cherry-picked to include some errors). Red is incorrect, blue is correct. (a) After 1 epoch of training. (b) After 2 epochs. 

To reproduce this figure, click the open in colab button: <a href="https://colab.research.google.com/github/probml/probml-notebooks/blob/master/notebooks/mlp_mnist_tf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.4_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.4_B.png" width="256"/>
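
As a rough size check for the architecture in the caption, the sketch below counts weights and biases. It assumes that each 28x28 MNIST image is flattened to a 784-dimensional input and that both hidden layers have 128 units; the linked notebook may be configured differently.

```python
# Layer widths: flattened input, two hidden layers, 10-way output.
layer_sizes = [28 * 28, 128, 128, 10]

# Each dense layer has n_in * n_out weights plus n_out biases.
n_params = sum(
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
)
print(n_params)
```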

## Figure 13.5:<a name='13.5'></a> <a name='twoHeaded'></a> 


 Illustration of an MLP with a shared "backbone" and two output "heads", one for predicting the mean and one for predicting the variance. From https://brendanhasz.github.io/2019/07/23/bayesian-density-net.html . Used with kind permission of Brendan Hasz.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.5.png" width="256"/>
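
A shared backbone with a mean head and a variance head might look like the sketch below. The softplus link for the variance, the single tanh hidden layer, and all parameter values are assumptions chosen for illustration; they are not taken from the figure's source.

```python
import math


def softplus(a):
    """Smooth map from the reals to the positive reals."""
    return math.log1p(math.exp(a))


def two_headed_mlp(x, params):
    # Shared backbone: one tanh hidden layer applied to a scalar input.
    h = [math.tanh(w * x + b) for w, b in zip(params["w_h"], params["b_h"])]
    # Mean head: unconstrained linear readout.
    mu = sum(w * hi for w, hi in zip(params["w_mu"], h)) + params["b_mu"]
    # Variance head: linear readout passed through softplus to keep it positive.
    var = softplus(sum(w * hi for w, hi in zip(params["w_var"], h)) + params["b_var"])
    return mu, var


def gaussian_nll(y, mu, var):
    """Negative log likelihood of y under a Gaussian N(mu, var)."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


params = {  # hypothetical hand-set parameters, for illustration only
    "w_h": [1.0, -0.5], "b_h": [0.0, 0.1],
    "w_mu": [2.0, 1.0], "b_mu": 0.0,
    "w_var": [0.5, 0.5], "b_var": -1.0,
}
mu, var = two_headed_mlp(0.3, params)
print(mu, var, gaussian_nll(1.0, mu, var))
```

Training such a model by MLE amounts to minimizing the sum of `gaussian_nll` over the dataset with respect to all parameters, including those of the variance head.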

## Figure 13.6:<a name='13.6'></a> <a name='twoHeadedSineWaves'></a> 


 Illustration of predictions from an MLP fit using MLE to a 1d regression dataset with growing noise. (a) Output variance is input-dependent, as in <a href='#twoHeaded'>Figure 13.5</a>. (b) Mean is computed using the same model as in (a), but the output variance is treated as a fixed parameter $\sigma^2$, which is estimated by MLE after training, as in the book's section on the MLE of $\sigma^2$ for linear regression. 

To reproduce this figure, click the open in colab button: <a href="https://colab.research.google.com/github/probml/probml-notebooks/blob/master/notebooks/mlp_1d_regression_hetero_tfp.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.6_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.6_B.png" width="256"/>
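
For panel (b), the MLE of a constant noise variance under a Gaussian likelihood is the mean squared residual of the fitted mean. The sketch below uses made-up targets and predictions; the function name is hypothetical.

```python
def sigma2_mle(y_true, y_pred):
    """MLE of a constant Gaussian noise variance: the mean squared residual."""
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    return sum(r * r for r in residuals) / len(residuals)


y_true = [1.0, 2.0, 4.0, 8.0]   # made-up targets
y_pred = [1.0, 2.5, 3.0, 6.0]   # made-up predicted means
print(sigma2_mle(y_true, y_pred))
```

Because this single number is shared by all inputs, it cannot track noise that grows with the input, which is the contrast the two panels illustrate.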

## Figure 13.7:<a name='13.7'></a> <a name='reluPolytope2d'></a> 


 A decomposition of $\mathbb{R}^2$ into a finite set of linear decision regions produced by an MLP with ReLU activations with (a) one hidden layer of 25 hidden units and (b) two hidden layers. From Figure 1 of <a href='#Hein2019'>[MMJ19]</a>. Used with kind permission of Maksym Andriushchenko.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.7_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.7_B.png" width="256"/>
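
The piecewise-linear structure can be checked numerically on a tiny example. The sketch below uses a hypothetical one-hidden-layer ReLU network with 3 hidden units (not the 25-unit network of the figure) and hand-set weights: inside one region the set of active units is fixed, so the function is affine there.

```python
def relu(a):
    return max(0.0, a)


# Hypothetical one-hidden-layer network on R^2 with 3 hidden units.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
v = [1.0, -2.0, 0.5]


def preacts(x):
    return [row[0] * x[0] + row[1] * x[1] + bk for row, bk in zip(W, b)]


def f(x):
    return sum(vk * relu(a) for vk, a in zip(v, preacts(x)))


def pattern(x):
    """Which hidden units are active; constant inside one linear region."""
    return tuple(int(a > 0) for a in preacts(x))


def midpoint(p, q):
    return [(pi + qi) / 2.0 for pi, qi in zip(p, q)]


xa, xb, xc = [2.0, 3.0], [4.0, 1.0], [-1.0, 3.0]
print(pattern(xa), pattern(xb), pattern(xc))
# Same region: the value at the midpoint equals the average of the endpoint values.
print(f(midpoint(xa, xb)), (f(xa) + f(xb)) / 2.0)
# Different regions: the segment crosses a kink, so the two values differ.
print(f(midpoint(xa, xc)), (f(xa) + f(xc)) / 2.0)
```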

## Figure 13.8:<a name='13.8'></a> <a name='axons'></a> 


 Illustration of two neurons connected together in a "circuit". The output axon of the left neuron makes a synaptic connection with the dendrites of the cell on the right. Electrical charges, in the form of ion flows, allow the cells to communicate. From https://en.wikipedia.org/wiki/Neuron . Used with kind permission of Wikipedia author BruceBlaus.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.8.png" width="256"/>

## Figure 13.9:<a name='13.9'></a> <a name='DNN-size-vs-time'></a> 


 Plot of neural network sizes over time. Models 1, 2, 3 and 4 correspond to the perceptron <a href='#Rosenblatt58'>[Ros58]</a>, the adaptive linear unit <a href='#Widrow1960'>[BH60]</a>, the neocognitron <a href='#Fukushima1980'>[K80]</a>, and the first MLP trained by backprop <a href='#Rumelhart86'>[RHW86]</a>. The approximate numbers of neurons for some living organisms are shown on the right scale (the sponge has 0 neurons), based on https://en.wikipedia.org/wiki/List_of_animals_by_number_of_neurons . From Figure 1.11 of <a href='#GoodfellowBook'>[GBC16]</a>. Used with kind permission of Ian Goodfellow.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.9.png" width="256"/>

## Figure 13.10:<a name='13.10'></a> <a name='feedforward-graph'></a> 


 A simple linear-chain feedforward model with 4 layers. Here $\boldsymbol{x}$ is the input and $\boldsymbol{o}$ is the output. From <a href='#Blondel2020'>[Mat20]</a>.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.10.png" width="256"/>

## Figure 13.11:<a name='13.11'></a> <a name='computation-graph'></a> 


 An example of a computation graph with 2 (scalar) inputs and 1 (scalar) output. From <a href='#Blondel2020'>[Mat20]</a>.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.11.png" width="256"/>
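
To illustrate how a graph with 2 scalar inputs and 1 scalar output is differentiated, here is a hand-written reverse-mode pass. The function $f(x_1, x_2) = x_1 x_2 + \sin(x_1)$ is a hypothetical example and is not claimed to be the graph shown in the figure.

```python
import math


def f_and_grad(x1, x2):
    # Forward pass: evaluate and store every intermediate node.
    x3 = x1 * x2
    x4 = math.sin(x1)
    out = x3 + x4
    # Backward pass: propagate adjoints from the output back to the inputs.
    d_out = 1.0
    d_x3 = d_out                              # out = x3 + x4
    d_x4 = d_out
    d_x1 = d_x3 * x2 + d_x4 * math.cos(x1)    # x1 feeds both x3 and x4
    d_x2 = d_x3 * x1
    return out, (d_x1, d_x2)


value, (g1, g2) = f_and_grad(1.0, 2.0)
print(value, g1, g2)
```

A node with several children, such as `x1` here, sums the contributions arriving from each of them.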

## Figure 13.12:<a name='13.12'></a> <a name='backwardsDiff'></a> 


 Notation for automatic differentiation at node $j$ in a computation graph. From <a href='#Blondel2020'>[Mat20]</a>.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.12.png" width="256"/>

## Figure 13.13:<a name='13.13'></a> <a name='compGraphD2l'></a> 


 Computation graph for an MLP with input $\boldsymbol{x}$, hidden layer $\boldsymbol{h}$, output $\boldsymbol{o}$, loss function $L=\ell(\boldsymbol{o}, y)$, an $\ell_2$ regularizer $s$ on the weights, and total loss $J=L+s$. From Figure 4.7.1 of <a href='#dive'>[Zha+20]</a>. Used with kind permission of Aston Zhang.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.13.png" width="256"/>
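
The quantities named in the caption can be computed in a short forward pass. In this sketch the ReLU hidden activation, the squared-error loss, the regularization strength `lam`, and the weight values are all assumptions made for illustration.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sq_frobenius(W):
    return sum(w * w for row in W for w in row)


def forward(x, y, W1, W2, lam):
    h = [max(0.0, a) for a in matvec(W1, x)]                 # hidden layer (ReLU assumed)
    o = matvec(W2, h)                                        # output
    L = 0.5 * sum((oi - yi) ** 2 for oi, yi in zip(o, y))    # loss (squared error assumed)
    s = 0.5 * lam * (sq_frobenius(W1) + sq_frobenius(W2))    # l2 regularizer on the weights
    return L, s, L + s                                       # total loss J = L + s


W1 = [[1.0, -1.0], [0.5, 2.0]]   # hypothetical weights
W2 = [[1.0, 1.0]]
L, s, J = forward([1.0, 2.0], [3.0], W1, W2, lam=0.1)
print(L, s, J)
```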

## Figure 13.14:<a name='13.14'></a> <a name='activationWithGrad'></a> 


(a) Some popular activation functions. (b) Plot of their gradients. 

To reproduce this figure, click the open in colab button: <a href="https://colab.research.google.com/github/probml/probml-notebooks/blob/master/notebooks/activation_fun_deriv_torch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.14_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.14_B.png" width="256"/>
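
As a small numerical companion to panel (b), the sketch below evaluates the gradients of two common activations, the sigmoid and the ReLU. Which functions the linked notebook actually plots is not stated in the caption, so this choice is an assumption.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_grad(a):
    # Derivative of the sigmoid: s(a) * (1 - s(a)), which peaks at 0.25 when a = 0.
    s = sigmoid(a)
    return s * (1.0 - s)


def relu_grad(a):
    # Derivative of max(0, a); the value at exactly 0 is set to 0 by convention here.
    return 1.0 if a > 0 else 0.0


for a in (-10.0, 0.0, 10.0):
    print(a, sigmoid_grad(a), relu_grad(a))
```

The sigmoid gradient is close to zero for large inputs of either sign, whereas the ReLU gradient stays at 1 for all positive inputs.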

## Figure 13.15:<a name='13.15'></a> <a name='residualVanishing'></a> 


(a) Illustration of a residual block. (b) Illustration of why adding residual connections can help when training a very deep model. Adapted from Figure 14.16 of <a href='#Geron2019'>[Aur19]</a>.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.15_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.15_B.png" width="256"/>
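
One way to see the effect in panel (b) is a scalar toy model, sketched below under strong simplifying assumptions: every layer is the linear map $x \mapsto w x$ with the same small weight `w`, and a residual layer computes $x + w x$. The depth and the value of `w` are arbitrary.

```python
def chain_gradient(per_layer_grad, n_layers):
    """Derivative of the output with respect to the input: a product of per-layer derivatives."""
    g = 1.0
    for _ in range(n_layers):
        g *= per_layer_grad
    return g


w, depth = 0.01, 50
plain = chain_gradient(w, depth)           # plain layer x -> w * x has derivative w
resid = chain_gradient(1.0 + w, depth)     # residual layer x -> x + w * x has derivative 1 + w
print(plain, resid)
```

The identity path contributes a 1 to every factor, so the product no longer collapses towards zero as depth grows.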

## Figure 13.16:<a name='13.16'></a> <a name='multiGPU'></a> 


 Calculation of minibatch stochastic gradient using data parallelism and two GPUs. From Figure 12.5.2 of <a href='#dive'>[Zha+20]</a>. Used with kind permission of Aston Zhang.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.16.png" width="256"/>
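
The arithmetic behind data parallelism can be sketched without any GPU: split the minibatch into shards, compute a gradient per shard, and combine them with weights proportional to shard size. The scalar linear model, the data, and the two-way split below are made up for illustration.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error (1/n) * sum (w*x - y)^2 for a scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 8.0)]   # made-up (x, y) pairs
w = 0.5

# Each "device" gets half of the minibatch and computes its own gradient.
shard0, shard1 = batch[:2], batch[2:]
g0 = grad_mse(w, shard0)
g1 = grad_mse(w, shard1)

# Combining the shard gradients recovers the full-minibatch gradient.
g_parallel = (g0 * len(shard0) + g1 * len(shard1)) / len(batch)
g_full = grad_mse(w, batch)
print(g_parallel, g_full)
```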

## Figure 13.17:<a name='13.17'></a> <a name='sparseNnet'></a> 


(a) A deep but sparse neural network. The connections are pruned using $\ell _1$ regularization. At each level, nodes numbered 0 are clamped to 1, so their outgoing weights correspond to the offset/bias terms. (b) Predictions made by the model on the training set. 

To reproduce this figure, click the open in colab button: <a href="https://colab.research.google.com/github/probml/probml-notebooks/blob/master/notebooks/sparse_mlp.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.17_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.17_B.png" width="256"/>

## Figure 13.18:<a name='13.18'></a> <a name='dropout'></a> 


 Illustration of dropout. (a) A standard neural net with 2 hidden layers. (b) An example of a thinned net produced by applying dropout with $p_0=0.5$. Units that have been dropped out are marked with an x. From Figure 1 of <a href='#Srivastava2014'>[Nit+14]</a>. Used with kind permission of Geoff Hinton.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.18_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.18_B.png" width="256"/>
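
A thinned network like the one in panel (b) can be produced by sampling a binary mask over the units. The sketch below treats $p_0$ as the drop probability and rescales activations by the keep probability at test time; both the test-time rule and the function names are assumptions for illustration.

```python
import random


def dropout_train(h, p_drop, rng):
    """Zero each unit independently with probability p_drop."""
    mask = [0.0 if rng.random() < p_drop else 1.0 for _ in h]
    return [m * hi for m, hi in zip(mask, h)], mask


def dropout_test(h, p_drop):
    """At test time keep every unit but scale by the keep probability."""
    return [(1.0 - p_drop) * hi for hi in h]


rng = random.Random(0)
h = [1.0, 2.0, 3.0, 4.0]                  # made-up hidden activations
thinned, mask = dropout_train(h, 0.5, rng)
print(mask, thinned, dropout_test(h, 0.5))
```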

## Figure 13.19:<a name='13.19'></a> <a name='flatMinima'></a> 


 Flat vs sharp minima. From Figures 1 and 2 of <a href='#Hochreiter1997'>[SJ97]</a>. Used with kind permission of Jürgen Schmidhuber.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.19.png" width="256"/>

## Figure 13.20:<a name='13.20'></a> <a name='sgd-minima-unstable'></a> 


 Each curve shows how the loss varies across parameter values for a given minibatch. (a) A stable local minimum. (b) An unstable local minimum. 

To reproduce this figure, click the open in colab button: <a href="https://colab.research.google.com/github/probml/probml-notebooks/blob/master/notebooks/sgd_minima_variance.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.20_A.png" width="256"/>

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.20_B.png" width="256"/>

## Figure 13.21:<a name='13.21'></a> <a name='xorRBF'></a> 


(a) XOR truth table. (b) Fitting a linear logistic regression classifier using degree 10 polynomial expansion. (c) Same model, but using an RBF kernel with centroids specified by the 4 black crosses.  
Figure(s) generated by [logregXorDemo.py](https://github.com/probml/pyprobml/blob/master/scripts/logregXorDemo.py) 
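
The idea in panel (c) can be sketched as follows: after mapping each input to its RBF similarity with 4 centroids, a linear rule on those features separates XOR. Here the centroids are placed at the four XOR inputs, the bandwidth is 0.5, and the linear weights are set by hand rather than fitted by logistic regression; all of these are assumptions and may differ from `logregXorDemo.py`.

```python
import math

centroids = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]   # assumed positions
weights = [-1.0, 1.0, 1.0, -1.0]                               # hand-set, not fitted
sigma = 0.5                                                    # assumed bandwidth


def rbf_features(x):
    """Gaussian RBF similarity between x and each centroid."""
    return [
        math.exp(-((x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2) / (2.0 * sigma ** 2))
        for c in centroids
    ]


def predict(x):
    # A linear decision rule in feature space is nonlinear in input space.
    score = sum(w * phi for w, phi in zip(weights, rbf_features(x)))
    return int(score > 0)


preds = {c: predict(c) for c in centroids}
print(preds)
```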

In [None]:
try_deimport()
%run -n logregXorDemo.py

## Figure 13.22:<a name='13.22'></a> <a name='rbfDemo'></a> 


 Linear regression using 10 equally spaced RBF basis functions in 1d. Left column: fitted function. Middle column: basis functions evaluated on a grid. Right column: design matrix. Top to bottom we show different bandwidths for the kernel function: $\sigma =0.5, 10, 50$.  
Figure(s) generated by [linregRbfDemo.py](https://github.com/probml/pyprobml/blob/master/scripts/linregRbfDemo.py) 
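
The design matrices in the right column can be reproduced in spirit with the sketch below, which evaluates 10 equally spaced Gaussian RBF basis functions on a grid for the three bandwidths. The input range $[0, 20]$, the grid, and the exact kernel form $\exp(-(x-c)^2 / (2\sigma^2))$ are assumptions and may differ from `linregRbfDemo.py`.

```python
import math


def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def rbf_design_matrix(xs, centers, sigma):
    """Row i holds the 10 basis-function values at input xs[i]."""
    return [
        [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]
        for x in xs
    ]


xs = linspace(0.0, 20.0, 21)         # assumed input grid
centers = linspace(0.0, 20.0, 10)    # 10 equally spaced centers
for sigma in (0.5, 10.0, 50.0):
    Phi = rbf_design_matrix(xs, centers, sigma)
    row = Phi[0]
    print(sigma, round(min(row), 3), round(max(row), 3))
```

With a small bandwidth each row is nearly zero except near one center; with a large bandwidth all entries in a row are almost equal, so the columns become hard to tell apart.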

In [None]:
try_deimport()
%run -n linregRbfDemo.py

## Figure 13.23:<a name='13.23'></a> <a name='mixexp'></a> 


(a) Some data from a one-to-many function. (b) The responsibilities of each expert for the input domain. (c) Prediction of each expert. (d) Overall prediction. Mean is red cross, mode is black square. Adapted from Figures 5.20 and 5.21 of <a href='#BishopBook'>[Bis06]</a>.  
Figure(s) generated by [mixexpDemoOneToMany.py](https://github.com/probml/pyprobml/blob/master/scripts/mixexpDemoOneToMany.py) 
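
The difference between the two summaries in panel (d) can be sketched for a single input. The mean is the responsibility-weighted average of the expert predictions; the mode is approximated here by the prediction of the most responsible expert, which is a simplifying assumption (it ignores the expert variances). The numbers are made up and the helper names are hypothetical.

```python
def mixture_mean(resp, expert_means):
    """Responsibility-weighted average of the expert predictions."""
    return sum(r * m for r, m in zip(resp, expert_means))


def mixture_mode(resp, expert_means):
    """Approximate mode: the prediction of the most responsible expert."""
    k = max(range(len(resp)), key=lambda i: resp[i])
    return expert_means[k]


resp = [0.6, 0.4]            # made-up responsibilities at one input
expert_means = [0.2, 0.8]    # made-up expert predictions at that input
print(mixture_mean(resp, expert_means), mixture_mode(resp, expert_means))
```

For a one-to-many function the mean can land between the branches of the data, where no observations lie, while the mode stays on one branch.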

In [None]:
try_deimport()
%run -n mixexpDemoOneToMany.py

## Figure 13.24:<a name='13.24'></a> <a name='deepMOE'></a> 


 Deep MOE with $m$ experts, represented as a neural network. From Figure 1 of <a href='#Chazan2017'>[SJS17]</a>. Used with kind permission of Jacob Goldberger.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.24.png" width="256"/>

## Figure 13.25:<a name='13.25'></a> <a name='HMENN'></a> 


 A 2-level hierarchical mixture of experts as a neural network. The top gating network chooses between the left and right expert, shown by the large boxes; the left and right experts themselves choose between their left and right sub-experts.

<img src="https://raw.githubusercontent.com/probml/pml-book/main/pml1/figures/images/Figure_13.25.png" width="256"/>

## References:
<a name='Geron2019'>[Aur19]</a> A. Géron. "Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques for Building Intelligent Systems (2nd edition)". (2019). 

<a name='Widrow1960'>[BH60]</a> B. Widrow and M. E. Hoff. "Adaptive Switching Circuits". (1960). 

<a name='BishopBook'>[Bis06]</a> C. Bishop. "Pattern recognition and machine learning". (2006). 

<a name='GoodfellowBook'>[GBC16]</a> I. Goodfellow, Y. Bengio and A. Courville. "Deep Learning". (2016). 

<a name='Fukushima1980'>[K80]</a> K. Fukushima. "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". In: Biol. Cybern. (1980). 

<a name='Hein2019'>[MMJ19]</a> M. Hein, M. Andriushchenko and J. Bitterwolf. "Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem". (2019). 

<a name='Blondel2020'>[Mat20]</a> M. Blondel. "Automatic differentiation". (2020). 

<a name='Srivastava2014'>[Nit+14]</a> N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever and R. Salakhutdinov. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting". In: JMLR (2014). 

<a name='Rumelhart86'>[RHW86]</a> D. Rumelhart, G. Hinton and R. Williams. "Learning internal representations by error propagation". (1986). 

<a name='Rosenblatt58'>[Ros58]</a> F. Rosenblatt. "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain". In: Psychological Review (1958). 

<a name='Hochreiter1997'>[SJ97]</a> S. Hochreiter and J. Schmidhuber. "Flat minima". In: Neural Comput. (1997). 

<a name='Chazan2017'>[SJS17]</a> S. E. Chazan, J. Goldberger and S. Gannot. "Speech Enhancement using a Deep Mixture of Experts". (2017). arXiv: 1703.09302 

<a name='dive'>[Zha+20]</a> A. Zhang, Z. Lipton, M. Li and A. Smola. "Dive into deep learning". (2020). 

