# Bonus Bonus
Make sliders that change the parameter values of the distributions in order to visualise their effects

* Bernoulli Distribution
* Beta Distribution
* Categorical Distribution
* Dirichlet Distribution
* Univariate Normal Distribution
* Normal-Scaled Inverse Gamma Distribution
* Multivariate Normal Distribution
* Normal Inverse Wishart Distribution

In [46]:
from scipy.stats import bernoulli, beta, dirichlet
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
import seaborn as sns
import numpy as np
import math
import matplotlib.pyplot as plt

# Bernoulli Distribution
* Discrete
* Models binary trials
* Describes the situation where there are only two possible outcomes
* $x \in \{0,1\}$
* Single parameter $\lambda$, which defines the probability of observing a success, $x=1$

Can be expressed as:
$$
\begin{array}{l}{\operatorname{Pr}(x=0)=1-\lambda} \\ {\operatorname{Pr}(x=1)=\lambda}\end{array}
$$

Or alternatively:
$$
\operatorname{Pr}(x)=\lambda^{x}(1-\lambda)^{1-x}
$$

Or:
$$
\operatorname{Pr}(x)=\operatorname{Bern}_{x}[\lambda]
$$
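
As a quick numerical check of the compact form above, the sketch below evaluates $\lambda^{x}(1-\lambda)^{1-x}$ directly and compares it with `scipy.stats.bernoulli.pmf`. The helper name `bern_pmf` and the value $\lambda = 0.3$ are illustrative assumptions, not part of the original notebook.

```python
from scipy.stats import bernoulli

def bern_pmf(x, lam):
    """Hypothetical helper: Pr(x) = lam**x * (1 - lam)**(1 - x) for x in {0, 1}."""
    return lam**x * (1 - lam)**(1 - x)

lam = 0.3  # illustrative value for the success probability
for x in (0, 1):
    # Both columns should agree: 1 - lam for x = 0 and lam for x = 1
    print(x, bern_pmf(x, lam), bernoulli.pmf(x, lam))
```

Setting $x=0$ leaves only the $(1-\lambda)$ factor and setting $x=1$ leaves only $\lambda$, which is why the single expression reproduces both cases.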

In [2]:
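def bern(𝜆):
    data_bern = bernoulli.rvs(size=1000,p=𝜆)
    # The data are discrete (0 or 1), so plot counts only; a KDE cannot be fitted when every sample is identical (𝜆 = 0 or 𝜆 = 1)
    ax = sns.distplot(data_bern,kde=False,color='crimson',hist_kws={"linewidth": 25,'alpha':1})
    ax.set(xlabel='Bernoulli', ylabel='Frequency')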

In [3]:
interact(bern, 𝜆=(0,1,0.01));


# Beta Distribution
* Continuous distribution defined on a single variable $\lambda$, where $\lambda \in [0,1]$
* Used for representing uncertainty in the parameter $\lambda$ of the Bernoulli distribution
* Has two parameters $(\alpha, \beta)$, whose values determine the expected value:
$$
\mathrm{E}[\lambda]=\alpha /(\alpha+\beta)
$$
* $\alpha, \beta \in (0,\infty)$

Beta distribution is represented as:
$$
\operatorname{Pr}(\lambda)=\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha] \Gamma[\beta]} \lambda^{\alpha-1}(1-\lambda)^{\beta-1}
$$
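
The density and the expected value above can be checked numerically. The following is a minimal sketch that implements the formula with `math.gamma` and compares it against `scipy.stats.beta`; the helper name `beta_pdf` and the parameter values are assumptions chosen for illustration.

```python
from math import gamma
from scipy.stats import beta

def beta_pdf(lam, a, b):
    """Hypothetical helper: Beta density evaluated directly from the formula."""
    coef = gamma(a + b) / (gamma(a) * gamma(b))
    return coef * lam**(a - 1) * (1 - lam)**(b - 1)

a, b = 2.0, 5.0  # illustrative parameter values
# Density at lambda = 0.25: hand-written formula vs. SciPy
print(beta_pdf(0.25, a, b), beta.pdf(0.25, a, b))
# Expected value: alpha / (alpha + beta) vs. SciPy
print(a / (a + b), beta.mean(a, b))
```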

In [4]:
def Beta(𝛼,𝛽):
    data_beta = beta.rvs(𝛼, 𝛽, size=1000)
    ax = sns.distplot(data_beta,kde=True,color='crimson',hist_kws={"linewidth": 25,'alpha':1})
    ax.set(xlabel='Beta Distribution', ylabel='Density')

In [5]:
interact(Beta, 𝛼=(0.0000001,10.0000001), 𝛽=(0.0000001,10.0000001));


# Categorical Distribution
* Discrete distribution that determines the probability of observing one of $K$ possible outcomes
* The probabilities of observing the $K$ outcomes are held in a $K \times 1$ parameter vector $\boldsymbol{\lambda} = [\lambda_1,\lambda_2,\ldots,\lambda_K]$, where $\lambda_k \in [0,1]$ and $\sum_{k=1}^{K} \lambda_k = 1$
* Can be visualised as a normalised histogram with $K$ bins, and can be written as:
$$
\operatorname{Pr}(x=k)=\lambda_{k}
$$
or:
$$
\operatorname{Pr}(x)=\operatorname{Cat}_{x}[\boldsymbol{\lambda}]
$$
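
To illustrate the "normalised histogram" view, the sketch below draws samples from a categorical distribution with NumPy and checks that the empirical frequencies approach $\boldsymbol{\lambda}$. The parameter vector, seed and sample size are illustrative assumptions; note that the code indexes outcomes from 0 rather than 1.

```python
import numpy as np

# Illustrative parameter vector with K = 3 outcomes; entries sum to 1
lam = np.array([0.2, 0.5, 0.3])

rng = np.random.default_rng(0)
# Draw outcomes 0..K-1 with probabilities lam
samples = rng.choice(len(lam), size=10_000, p=lam)

# Normalised histogram: the fraction of samples falling in each of the K bins
freq = np.bincount(samples, minlength=len(lam)) / samples.size
print(freq)
```

With more samples the printed frequencies should move closer to the entries of `lam`.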

In [6]:
dataset=sns.load_dataset('tips')
data = dataset['sex']
init_male = data[data=='Male'].count()
total = data.count()
init_female = data[data=='Female'].count()
print("Initially M:",init_male,"F:",init_female,"Total:",total)

Initially M: 157 F: 87 Total: 244


In [7]:
def categorical(i, male, female):
    # Set the number of men to `i`, moving the difference to or from the
    # women so that the overall head count stays fixed.
    female = female + (male - i)
    male = i
    mp = male/total
    fp = female/total
    plt.bar('Male',mp, color='g',label='Male')
    plt.bar('Female',fp, color='r', label='Female')
    plt.xlabel('Gender')
    plt.ylabel('Probability')
    plt.title('Categorical Data')
    plt.legend()
    plt.tight_layout()
    plt.show()
    
def draw_cat(num_men):
    categorical(num_men,init_male,init_female)

In [8]:
interact(draw_cat, num_men=(0,244,1));


# Dirichlet Distribution

In [41]:
'''Functions for drawing contours of Dirichlet distributions.'''

# Author: Thomas Boggs

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
from functools import reduce

_corners = np.array([[0, 0], [1, 0], [0.5, 0.75**0.5]])
_triangle = tri.Triangulation(_corners[:, 0], _corners[:, 1])
_midpoints = [(_corners[(i + 1) % 3] + _corners[(i + 2) % 3]) / 2.0 for i in range(3)]

def xy2bc(xy, tol=1.e-3):
    '''Converts 2D Cartesian coordinates to barycentric.

    Arguments:

        `xy`: A length-2 sequence containing the x and y value.
    '''
    s = [(_corners[i] - _midpoints[i]).dot(xy - _midpoints[i]) / 0.75 for i in range(3)]
    return np.clip(s, tol, 1.0 - tol)

class Dirichlet(object):
    def __init__(self, alpha):
        '''Creates Dirichlet distribution with parameter `alpha`.'''
        from math import gamma
        from operator import mul
        self._alpha = np.array(alpha)
        self._coef = gamma(np.sum(self._alpha)) / reduce(mul, [gamma(a) for a in self._alpha])
    def pdf(self, x):
        '''Returns pdf value for `x`.'''
        from operator import mul
        return self._coef * reduce(mul, [xx ** (aa - 1) for (xx, aa) in zip(x, self._alpha)])
    def sample(self, N):
        '''Generates a random sample of size `N`.'''
        return np.random.dirichlet(self._alpha, N)

def draw_pdf_contours(dist, border=False, nlevels=200, subdiv=8, **kwargs):
    '''Draws pdf contours over an equilateral triangle (2-simplex).

    Arguments:

        `dist`: A distribution instance with a `pdf` method.

        `border` (bool): If True, the simplex border is drawn.

        `nlevels` (int): Number of contours to draw.

        `subdiv` (int): Number of recursive mesh subdivisions to create.

        kwargs: Keyword args passed on to `plt.triplot`.
    '''
    from matplotlib import ticker, cm
    import math

    refiner = tri.UniformTriRefiner(_triangle)
    trimesh = refiner.refine_triangulation(subdiv=subdiv)
    pvals = [dist.pdf(xy2bc(xy)) for xy in zip(trimesh.x, trimesh.y)]

    plt.tricontourf(trimesh, pvals, nlevels, **kwargs)
    plt.axis('equal')
    plt.xlim(0, 1)
    plt.ylim(0, 0.75**0.5)
    plt.axis('off')
    if border is True:
        plt.triplot(_triangle, linewidth=1)

def plot_points(X, barycentric=True, border=True, **kwargs):
    '''Plots a set of points in the simplex.

    Arguments:

        `X` (ndarray): An Nx2 array (if in Cartesian coords) or Nx3 array
                       (if in barycentric coords) of points to plot.

        `barycentric` (bool): Indicates if `X` is in barycentric coords.

        `border` (bool): If True, the simplex border is drawn.

        kwargs: Keyword args passed on to `plt.plot`.
    '''
    if barycentric is True:
        X = X.dot(_corners)
    plt.plot(X[:, 0], X[:, 1], 'k.', ms=1, **kwargs)
    plt.axis('equal')
    plt.xlim(0, 1)
    plt.ylim(0, 0.75**0.5)
    plt.axis('off')
    if border is True:
        plt.triplot(_triangle, linewidth=1)


def draw_dirichlet(a,b,c):
    alphas= [a,b,c]
    dist = Dirichlet(alphas)
    draw_pdf_contours(dist)
    title = r'$\alpha$ = (%.3f, %.3f, %.3f)' % tuple(alphas)
    plt.title(title, fontdict={'fontsize': 8})
    plot_points(dist.sample(5000))


In [43]:
interact(draw_dirichlet, a=(0.0000001,15.0000001), b=(0.0000001,15.0000001),c=(0.0000001,15.0000001));


# Univariate Normal Distribution
* Defined on continuous values $x \in (-\infty,\infty)$
* Two parameters, mean $\mu$ and the variance $\sigma^2$
* Parameter $\mu$ can take any value and determines the position of the peak
* The parameter $\sigma^2$ takes only positive values and determines the width of the distribution.

Defined as:
$$
\operatorname{Pr}(x)=\frac{1}{\sqrt{2 \pi \sigma^{2}}} \exp \left[-0.5(x-\mu)^{2} / \sigma^{2}\right]
$$

Abbreviated as:
$$
\operatorname{Pr}(x)=\operatorname{Norm}_{x}\left[\mu, \sigma^{2}\right]
$$
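
The density can be evaluated directly from the formula and compared with SciPy. One detail worth keeping in mind: the formula is parameterised by the variance $\sigma^2$, whereas `scipy.stats.norm` and `np.random.normal` take the standard deviation as `scale`. The helper name `norm_pdf` and the values below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def norm_pdf(x, mu, var):
    """Hypothetical helper: univariate normal density in terms of the variance."""
    return np.exp(-0.5 * (x - mu)**2 / var) / np.sqrt(2 * np.pi * var)

mu, var = 1.0, 4.0            # illustrative mean and variance
x = np.linspace(-5, 7, 5)     # a few evaluation points

print(norm_pdf(x, mu, var))
# SciPy expects the standard deviation, so pass sqrt(var) as scale
print(norm.pdf(x, loc=mu, scale=np.sqrt(var)))
```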

In [60]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
import math

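def univariate_normal(𝜇, var):
    # np.random.normal expects the standard deviation as `scale`,
    # so convert the variance before sampling
    value = np.random.normal(loc=𝜇,scale=np.sqrt(var),size=1000)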
    sns.distplot(value)
    plt.xlabel("Value")
    plt.ylabel("Probability")
    plt.title("Univariate Normal Distribution")

In [64]:
interact(univariate_normal, 𝜇=(-100,100), var=(0.0000001,15.0000001));


# Normal-Scaled Inverse Gamma
* Defined over a pair of continuous values $\mu, \sigma^2$, the first of which can take any value, the second of which is constrained to be positive
* It can define a distribution over the mean and variance parameters of the normal distribution
* Four parameters $\alpha, \beta, \gamma, \delta$ where $\alpha, \beta, \gamma$ are positive real numbers but $\delta$ can take any value.
* Its pdf is:
$$
\operatorname{Pr}\left(\mu, \sigma^{2}\right)=\frac{\sqrt{\gamma}}{\sigma \sqrt{2 \pi}} \frac{\beta^{\alpha}}{\Gamma[\alpha]}\left(\frac{1}{\sigma^{2}}\right)^{\alpha+1} \exp \left[-\frac{2 \beta+\gamma(\delta-\mu)^{2}}{2 \sigma^{2}}\right]
$$

For short:
$$
\operatorname{Pr}\left(\mu, \sigma^{2}\right)=\text { NormInvGam }_{\mu, \sigma^{2}}[\alpha, \beta, \gamma, \delta]
$$
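
A minimal sketch of the density above, under stated assumptions: the function name `norm_inv_gam_pdf`, the argument name `gam` (for $\gamma$) and the parameter values are illustrative. As a sanity check, the formula factorises into a normal density over $\mu$ with mean $\delta$ and variance $\sigma^2/\gamma$, multiplied by an inverse gamma density over $\sigma^2$ with shape $\alpha$ and scale $\beta$, so the hand-written version can be compared with the product of the two SciPy densities.

```python
import numpy as np
from math import gamma as gamma_fn
from scipy.stats import norm, invgamma

def norm_inv_gam_pdf(mu, sigma2, alpha, beta, gam, delta):
    """Hypothetical helper: normal-scaled inverse gamma density at (mu, sigma2)."""
    return (np.sqrt(gam) / np.sqrt(2 * np.pi * sigma2)
            * beta**alpha / gamma_fn(alpha)
            * (1.0 / sigma2)**(alpha + 1)
            * np.exp(-(2 * beta + gam * (delta - mu)**2) / (2 * sigma2)))

# Illustrative parameter values and evaluation point
alpha, beta, gam, delta = 3.0, 2.0, 1.5, 0.0
mu, sigma2 = 0.5, 2.0

direct = norm_inv_gam_pdf(mu, sigma2, alpha, beta, gam, delta)
# Factorised form: Norm(mu; delta, sigma2 / gam) * InvGamma(sigma2; alpha, scale=beta)
factored = (norm.pdf(mu, loc=delta, scale=np.sqrt(sigma2 / gam))
            * invgamma.pdf(sigma2, alpha, scale=beta))
print(direct, factored)
```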

# Multivariate Normal Distribution

# Normal Inverse Wishart Distribution