Commit ce225df

DOC Restructure docs and fix some heading problems. Also include new NUTS scaling NB.
twiecki committed Oct 18, 2016
1 parent 370b5b9 commit ce225df
Showing 4 changed files with 43 additions and 36 deletions.
34 changes: 22 additions & 12 deletions docs/source/examples.rst
@@ -4,17 +4,26 @@
Examples
********

Howto
=====

.. toctree::
notebooks/BEST.ipynb
notebooks/posterior_predictive.ipynb
notebooks/NUTS_scaling_using_ADVI.ipynb
notebooks/howto_debugging.ipynb
notebooks/LKJ.ipynb

Applied
=======

.. toctree::
notebooks/BEST.ipynb
notebooks/stochastic_volatility.ipynb
notebooks/pmf-pymc.ipynb
notebooks/rugby_analytics.ipynb
notebooks/survival_analysis.ipynb
notebooks/posterior_predictive.ipynb
notebooks/GP-smoothing.ipynb
notebooks/howto_debugging.ipynb


GLM
===

@@ -28,6 +37,15 @@ GLM
notebooks/GLM-hierarchical.ipynb
notebooks/GLM-poisson-regression.ipynb

Mixture Models
==============

.. toctree::
notebooks/gaussian_mixture_model.ipynb
notebooks/marginalized_gaussian_mixture_model.ipynb
notebooks/gaussian-mixture-model-advi.ipynb
notebooks/dp_mix.ipynb

ADVI
====

@@ -37,12 +55,4 @@ ADVI
notebooks/lda-advi-aevb.ipynb
notebooks/bayesian_neural_network_advi.ipynb

Mixture Models
==============

.. toctree::
notebooks/gaussian_mixture_model.ipynb
notebooks/marginalized_gaussian_mixture_model.ipynb
notebooks/gaussian-mixture-model-advi.ipynb
notebooks/dp_mix.ipynb

8 changes: 3 additions & 5 deletions docs/source/notebooks/BEST.ipynb
@@ -16,7 +16,7 @@
"\n",
"The original pymc2 implementation was written by Andrew Straw and can be found here: https://github.com/strawlab/best\n",
"\n",
"Ported to PyMC3 by Thomas Wiecki (c) 2015, updated by Chris Fonnesbeck."
"Ported to PyMC3 by [Thomas Wiecki](https://twitter.com/twiecki) (c) 2015, updated by Chris Fonnesbeck."
]
},
{
@@ -108,10 +108,7 @@
"source": [
"The first step in a Bayesian approach to inference is to specify the full probability model that corresponds to the problem. For this example, Kruschke chooses a Student-t distribution to describe the distributions of the scores in each group. This choice adds robustness to the analysis, as a T distribution is less sensitive to outlier observations, relative to a normal distribution. The three-parameter Student-t distribution allows for the specification of a mean $\\mu$, a precision (inverse-variance) $\\lambda$ and a degrees-of-freedom parameter $\\nu$:\n",
"\n",
"$$ f(x|\\mu,\\lambda,\\nu) =\n",
" \\frac{\\Gamma(\\frac{\\nu + 1}{2})}{\\Gamma(\\frac{\\nu}{2})}\n",
" \\left(\\frac{\\lambda}{\\pi\\nu}\\right)^{\\frac{1}{2}}\n",
" \\left[1+\\frac{\\lambda(x-\\mu)^2}{\\nu}\\right]^{-\\frac{\\nu+1}{2}}$$\n",
"$$f(x|\\mu,\\lambda,\\nu) = \\frac{\\Gamma(\\frac{\\nu + 1}{2})}{\\Gamma(\\frac{\\nu}{2})} \\left(\\frac{\\lambda}{\\pi\\nu}\\right)^{\\frac{1}{2}} \\left[1+\\frac{\\lambda(x-\\mu)^2}{\\nu}\\right]^{-\\frac{\\nu+1}{2}}$$\n",
" \n",
"the degrees-of-freedom parameter essentially specifies the \"normality\" of the data, since larger values of $\\nu$ make the distribution converge to a normal distribution, while small values (close to zero) result in heavier tails.\n",
"\n",
@@ -485,6 +482,7 @@
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
26 changes: 11 additions & 15 deletions docs/source/notebooks/NUTS_scaling_using_ADVI.ipynb
@@ -6,14 +6,10 @@
"collapsed": true
},
"source": [
"##### PyMC3 Examples\n",
"\n",
"# NUTS scaling using ADVI Outputs\n",
"# NUTS scaling using ADVI\n",
"\n",
"#### A minimal reproducable example of using the stdevs of ADVI to set the scaling matrix of the NUTS sampler.\n",
"\n",
"\n",
"\n",
"I caught up with [Thomas Wiecki](https://twiecki.github.io) after his talk at [ODSC London](https://www.odsc.com/london) and he mentioned a potential speed increase for NUTS sampling by using ADVI outputs to set the covariance scaling matrix.\n",
"\n",
"This seems like a great idea and there's already a [good example in the docs](http://pymc-devs.github.io/pymc3/notebooks/stochastic_volatility.html#Fit-Model) but I wanted to try it myself, and get a feel for the speed increase.\n",
@@ -90,7 +86,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setup"
"## Setup"
]
},
{
@@ -161,7 +157,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Local Functions"
"### Local Functions"
]
},
{
@@ -215,7 +211,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Data"
"### Generate Data"
]
},
{
@@ -326,7 +322,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create and Run Linear Model"
"## Create and Run Linear Model"
]
},
{
@@ -364,7 +360,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Metropolis Sampling"
"### Metropolis Sampling"
]
},
{
@@ -471,7 +467,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## ADVI Estimation"
"### ADVI Estimation"
]
},
{
@@ -596,14 +592,14 @@
"collapsed": true
},
"source": [
"# Test NUTS Sampling"
"## Test NUTS Sampling"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. NUTS initialise MAP at test_point"
"### 1. NUTS initialise MAP at test_point"
]
},
{
@@ -699,7 +695,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. NUTS initialise MAP using ADVI mean"
"### 2. NUTS initialise MAP using ADVI mean"
]
},
{
@@ -795,7 +791,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. NUTS initialise MAP using ADVI mean and scale using ADVI stdevs"
"### 3. NUTS initialise MAP using ADVI mean and scale using ADVI stdevs"
]
},
{
11 changes: 7 additions & 4 deletions docs/source/notebooks/rugby_analytics.ipynb
@@ -15,11 +15,14 @@
"metadata": {},
"source": [
"I came across the following blog post on http://danielweitzenfeld.github.io/passtheroc/blog/2014/10/28/bayes-premier-league/\n",
"\n",
"* Based on the work of [Baio and Blangiardo](www.statistica.it/gianluca/Research/BaioBlangiardo.pdf)\n",
"\n",
"In this example, we're going to reproduce the first model described in the paper using PyMC3.\n",
"\n",
"Since I am a rugby fan I decide to apply the results of the paper Bayesian Football to the Six Nations.\n",
"Rugby is a physical sport popular worldwide.\n",
"\n",
"* Six Nations consists of Italy, Ireland, Scotland, England, France and Wales\n",
"* Game consists of scoring tries (similar to touch downs) or kicking the goal.\n",
"* Average player is something like 100kg and 1.82m tall.\n",
@@ -43,8 +46,7 @@
"Ireland are a stronger team than Italy for example - but by how much?\n",
"\n",
"Source for Results 2014 are Wikipedia.\n",
"I handcrafted these results\n",
"Small data\n",
"\n",
"* We want to infer a latent parameter - that is the 'strength' of a team based only on their **scoring intensity**, and all we have are their scores and results, we can't accurately measure the 'strength' of a team. \n",
"* Probabilistic Programming is a brilliant paradigm for modeling these **latent** parameters"
]
@@ -124,16 +126,16 @@
"metadata": {},
"source": [
"## What do we want to infer?\n",
"\n",
"* We want to infer the latent paremeters (every team's strength) that are generating the data we observe (the scorelines).\n",
"* Moreover, we know that the scorelines are a noisy measurement of team strength, so ideally, we want a model that makes it easy to quantify our uncertainty about the underlying strengths.\n",
"\n",
"* Often we don't know what the Bayesian Model is explicitly, so we have to 'estimate' the Bayesian Model'\n",
"* If we can't solve something, approximate it.\n",
"\n",
"* Markov-Chain Monte Carlo (MCMC) instead draws samples from the posterior.\n",
"* Fortunately, this algorithm can be applied to almost any model.\n",
"\n",
"## What do we want?\n",
"\n",
"* We want to quantify our uncertainty\n",
"* We want to also use this to generate a model\n",
"* We want the answers as distributions not point estimates"
@@ -144,6 +146,7 @@
"metadata": {},
"source": [
"## What assumptions do we know for our 'generative story'?\n",
"\n",
"* We know that the Six Nations in Rugby only has 6 teams - they each play each other once\n",
"* We have data from last year!\n",
"* We also know that in sports scoring is modelled as a Poisson distribution\n",
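The Poisson assumption in the last bullet, combined with the Baio and Blangiardo setup, amounts to a log-linear scoring intensity; the effect sizes below are made-up numbers for illustration, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up log-scale effects in the Baio and Blangiardo style
intercept, home_adv = 3.0, 0.1   # baseline log scoring rate, home advantage
att_home, def_away = 0.2, -0.1   # home team's attack, away team's defence

# Log-linear scoring intensity for the home team, then a Poisson score
theta_home = np.exp(intercept + home_adv + att_home + def_away)
home_score = rng.poisson(theta_home)
```

Inference then runs this generative story in reverse: given observed scores, recover the attack and defence strengths.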
