## A Little More on PyMC3

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import theano
import pymc3 as pm
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_context('notebook')
np.random.seed(12345)
rc = {'xtick.labelsize': 10, 'ytick.labelsize': 10, 'axes.labelsize': 10, 'font.size': 10, 
      'legend.fontsize': 12.0, 'axes.titlesize': 10, "figure.figsize": [14, 6]}
sns.set(rc = rc)
sns.set_style("whitegrid")

## Model context

All variables we want in our model are defined inside a Model context, that is, within a `with pm.Model()` block.  
If you try to define a variable outside of a Model context, you will get an error.
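
To make the idea of a context concrete, here is a toy sketch of how a `with` block can keep track of the "current" model. `ToyModel` and `toy_variable` are hypothetical names invented for illustration; this is not PyMC3's actual implementation, only a simplified picture of the pattern.

```python
class ToyModel:
    """Toy stand-in for a model object (hypothetical, not PyMC3 code)."""
    _stack = []  # models currently open in a `with` block

    def __init__(self):
        self.variables = {}

    def __enter__(self):
        ToyModel._stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        ToyModel._stack.pop()
        return False  # do not swallow exceptions

    @classmethod
    def current(cls):
        if not cls._stack:
            raise TypeError("No model on context stack")
        return cls._stack[-1]


def toy_variable(name):
    """Register a named variable with whichever model is currently open."""
    model = ToyModel.current()
    model.variables[name] = object()
    return model.variables[name]


with ToyModel() as toy:
    toy_variable("param")      # registered on `toy`

registered = sorted(toy.variables)

try:
    toy_variable("orphan")     # no model is open here
    outside_error = None
except TypeError as err:
    outside_error = str(err)

print(registered, outside_error)
```

Inside the `with` block the variable is attached to the open model; outside it there is nothing to attach to, so the call fails.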

### Initial values
Every variable in the Model context has an initial value, called its test value. It is used as the starting point for sampling if no other start is specified.

In [2]:
with pm.Model() as model:
    param = pm.Exponential("param", lam = 1)
    data_gen = pm.Poisson("data_gen", mu = param)

In [5]:
param.tag.test_value

array(0.6931471824645996)

In [6]:
data_gen.tag.test_value

0
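
The two test values above are consistent with PyMC3 defaulting to the median for a continuous variable and the mode for a discrete one. That reading is an assumption here, not something the output alone proves. A minimal NumPy sketch that reproduces the numbers under that assumption:

```python
import numpy as np

lam = 1.0  # rate of the Exponential prior, as in the model above

# Median of an Exponential(lam) distribution: ln(2) / lam
exp_default = np.log(2) / lam

# Mode of a Poisson(mu) distribution for non-integer mu: floor(mu)
pois_default = int(np.floor(exp_default))

print(exp_default, pois_default)
```

The small difference in the trailing digits of the printed test value presumably comes from single-precision floats being used in the notebook.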

Supply an initial value with the *testval* argument. This may be necessary if you need a better starting point than the default.

In [10]:
with pm.Model() as model:
    param = pm.Exponential("param", lam = 1, testval = 0.4)
    
param.tag.test_value    

array(0.4)

Use the *shape* argument to create several variables at once, for example one per regression coefficient.

In [14]:
number_of_coeff = 5
with pm.Model() as model:
    betas = pm.Normal("betas", mu = 0, sd = 1, shape = number_of_coeff)

betas.tag.test_value    

array([ 0.,  0.,  0.,  0.,  0.])
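
A vector of coefficients is usually combined with a design matrix through a matrix-vector product. The NumPy sketch below uses made-up values (`X_demo`, `beta_demo`, and `intercept_demo` are hypothetical names and numbers) to show the shapes involved, and how elementwise multiplication gives a different result:

```python
import numpy as np

# Made-up design matrix: 4 observations, 2 predictors (illustrative values only)
X_demo = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0],
                   [7.0, 8.0]])
beta_demo = np.array([0.5, -1.0])  # one coefficient per column of X_demo
intercept_demo = 2.0

# Matrix-vector product: one mean value per observation
linear_mean = intercept_demo + X_demo.dot(beta_demo)

# Elementwise product broadcasts beta_demo across rows: one value per cell
elementwise = intercept_demo + beta_demo * X_demo

print(linear_mean.shape, elementwise.shape)
```

Only the first form yields one value per row, which is what a regression mean function needs.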

In [27]:
X = np.array(sns.load_dataset("tips")[["total_bill", "tip"]])
X

array([[ 16.99,   1.01],
       [ 10.34,   1.66],
       [ 21.01,   3.5 ],
       [ 23.68,   3.31],
       [ 24.59,   3.61],
       [ 25.29,   4.71],
       [  8.77,   2.  ],
       [ 26.88,   3.12],
       [ 15.04,   1.96],
       [ 14.78,   3.23],
       [ 10.27,   1.71],
       [ 35.26,   5.  ],
       [ 15.42,   1.57],
       [ 18.43,   3.  ],
       [ 14.83,   3.02],
       [ 21.58,   3.92],
       [ 10.33,   1.67],
       [ 16.29,   3.71],
       [ 16.97,   3.5 ],
       [ 20.65,   3.35],
       [ 17.92,   4.08],
       [ 20.29,   2.75],
       [ 15.77,   2.23],
       [ 39.42,   7.58],
       [ 19.82,   3.18],
       [ 17.81,   2.34],
       [ 13.37,   2.  ],
       [ 12.69,   2.  ],
       [ 21.7 ,   4.3 ],
       [ 19.65,   3.  ],
       [  9.55,   1.45],
       [ 18.35,   2.5 ],
       [ 15.06,   3.  ],
       [ 20.69,   2.45],
       [ 17.78,   3.27],
       [ 24.06,   3.6 ],
       [ 16.31,   2.  ],
       [ 16.93,   3.07],
       [ 18.69,   2.31],
       [ 31.27,   5.  ],


If we want a deterministic variable to be tracked during sampling, define it explicitly using the *Deterministic* constructor.

In [28]:
# Use the underlying NumPy array so the matrix product operates on plain numbers
X = sns.load_dataset("tips")[["total_bill", "tip"]].values
with pm.Model() as model:
    betas = pm.Normal("betas", mu = 0, sd = 1, shape = 2)
    sd_error = pm.HalfNormal("sd_error", sd = 1)
    intercept = pm.Normal("intercept", mu = 0, sd = 1)
    
    # specify the mean function: one value per row of X (intercept + X . betas)
    mean = pm.Deterministic("mean", intercept + pm.math.dot(X, betas))