In [7]:
import pymc3 as pm

# Some PyMC3 basics

A PyMC3 model is defined by relating unknown variables to observed data. Think of it as pairing the observed data with a proposed data-generating process that could have produced it.

## Model Context

PyMC3 uses Python's `with...as...:` syntax to create a model context. It's a bit like `using(...) {...}` in C#.

```
with EXPRESSION as BLAH:
    SOME CODE
```

...is roughly equivalent to...

```
manager = EXPRESSION
BLAH = manager.__enter__()
try:
    SOME CODE
finally:
    manager.__exit__(...)  # receives details of any exception raised in SOME CODE
```
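To watch that sequence happen, here is a minimal sketch using a hypothetical `Tracer` class (it is not part of PyMC3) that records each call. It shows that the name after `as` is bound to whatever `__enter__` returns, and that `__exit__` runs even when the body raises.

```python
class Tracer:
    """Hypothetical context manager that logs the calls made by `with`."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this return value is what `as` binds to

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on the way out, whether or not the body raised.
        if exc_type is None:
            self.events.append("exit")
        else:
            self.events.append("exit after " + exc_type.__name__)
        return False  # do not swallow the exception


with Tracer() as ok:
    ok.events.append("body")

failed = Tracer()
try:
    with failed:
        raise ValueError("boom")
except ValueError:
    pass

print(ok.events)      # ['enter', 'body', 'exit']
print(failed.events)  # ['enter', 'exit after ValueError']
```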

In PyMC3 any variables created in the context of a new model (`with pm.Model() as model:`) or an existing model (`with existing_model:`) are automatically associated with that model. You get an error if you create a PyMC3 variable outside the context of a model.

## Variables

PyMC3 variables have an initial value used as the starting point of the sampling. `tag.test_value` can be used to query the starting point. You can specify a particular starting value using the `testval` parameter...

In [41]:
with pm.Model() as model:
    var1 = pm.Exponential('var1', 3)
    var2 = pm.Exponential('var2', 3, testval=0.4)
    
print(var1.tag.test_value)
print(var2.tag.test_value)

0.2310490608215332
0.4


PyMC3 variables can be *Stochastic* or *Deterministic*...

* *Stochastic:* even if we know all of its input parameters we still don't know its value, i.e., it's still a random variable. For example, a Poisson random variable - even if we know the value of its parameter $\lambda$ it's still a random variable with a range of possible values. Other examples are Discrete Uniform and Exponential random variables (in fact, any of the probability distributions we've already discussed - and many we haven't).
* *Deterministic:* its value can be determined exactly if we know all of its input parameters.
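The distinction does not depend on PyMC3 at all. As a rough illustration using only Python's standard `random` module (the seed and the `delta` helper are arbitrary choices for this sketch): an Exponential draw keeps changing even though its rate is fixed, while a function of known inputs always gives the same answer.

```python
import random

rng = random.Random(0)  # seeded so the run is repeatable
rate = 3.0              # the parameter is fully known...

# ...but each draw from Exponential(rate) is still different: stochastic.
draws = [rng.expovariate(rate) for _ in range(3)]
print(draws)


def delta(x, y):
    # Deterministic: the same inputs always give the same output.
    return x - y


print(delta(draws[0], draws[1]) == delta(draws[0], draws[1]))  # True
```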

Stochastic variables are created with a name argument of type string and other parameters specific to each type of variable. For example...

`some_variable = pm.DiscreteUniform("discrete_uni_var", 0, 4)`

...where $0$ and $4$ are the lower & upper bound parameters for the discrete uniform distribution.

Note that the name string is used as a key for the [backend](https://docs.pymc.io/api/backends.html) and for display purposes in various plots, graphs & traces you can get out of PyMC3. It's usual to use the name of the object variable for the name argument - but it doesn't have to be!

For convenience, the `shape` parameter allows you to create an array of independent stochastic variables instead of creating arbitrary names and variables for each one. For example, instead of...


    beta_1 = pm.Uniform("beta_1", 0, 1)
    beta_2 = pm.Uniform("beta_2", 0, 1)
    ...
    
...use a single variable...

    betas = pm.Uniform("betas", 0, 1, shape=N)

In [44]:
with pm.Model() as model:
    betas = pm.Uniform("betas", 0, 1, shape=4)

betas.tag.test_value

array([0.5, 0.5, 0.5, 0.5])

### A tangent: Theano...

Before looking at how to create deterministic variables in PyMC3 it's useful to know a little bit about the [Theano library](http://deeplearning.net/software/theano/index.html) as PyMC3 makes heavy use of it and you often need some of its functionality when creating deterministic variables.

Theano separates the definition of mathematical expressions from their evaluation and, more crucially, allows an expression to be optimised before it is evaluated. Instead of evaluating expressions immediately it builds up a "compute graph" (think of it as a kind of lazy evaluation), and once it has the entire graph it can apply optimisations such as arithmetic simplification, compiling to C, or making use of the GPU.
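To make "define now, evaluate later" concrete, here is a toy sketch in plain Python. The `Node` class and `as_node` helper are hypothetical and are not Theano's API; the sketch only records operations and evaluates them on request, and it skips the optimisation step that Theano would perform in between.

```python
class Node:
    """Hypothetical symbolic expression: records an operation instead of running it."""

    def __init__(self, op, inputs=(), payload=None):
        self.op = op            # "input", "const", "add" or "mul"
        self.inputs = inputs    # child nodes for "add" / "mul"
        self.payload = payload  # input name, or constant value

    def __add__(self, other):
        return Node("add", (self, as_node(other)))

    def __mul__(self, other):
        return Node("mul", (self, as_node(other)))

    def evaluate(self, env):
        # Only now is any arithmetic actually performed.
        if self.op == "input":
            return env[self.payload]
        if self.op == "const":
            return self.payload
        left, right = (node.evaluate(env) for node in self.inputs)
        return left + right if self.op == "add" else left * right


def as_node(value):
    # Wrap plain numbers so they can sit in the graph next to symbolic inputs.
    return value if isinstance(value, Node) else Node("const", payload=value)


x = Node("input", payload="x")
expr = x * 2 + 1  # builds a graph; no numbers are involved yet

print(expr.op, [n.op for n in expr.inputs])  # add ['mul', 'const']
print(expr.evaluate({"x": 5}))               # 11
print(expr.evaluate({"x": 10}))              # 21
```

Because the operators are overloaded, `x * 2 + 1` reads like ordinary arithmetic even though it only builds a description of the calculation.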

### ...back to variables: creating deterministic variables

A deterministic variable is defined in a similar way to a stochastic variable, using the `Deterministic` class: pass in a name and the expression (often the result of calling a function) that defines how the value of the variable is calculated.

This is why you need to know about Theano. The functions you write to define a deterministic variable must use Theano operations to build up the symbolic graph. It's easy to forget this because many Theano operations look just like normal operators & function calls, but a function from any old library would try to evaluate its arguments immediately rather than add anything to the "compute graph".

In the example in the previous chapter we defined the deterministic variable using `lambda_t = pm.math.switch(tau > idx, e, l)`; the `pm.math.switch` function there comes from the Theano library.

Here's a very simple example:

In [10]:
def subtract(x, y):
    return x - y

with pm.Model() as model:
    stochastic_1 = pm.Uniform("U_1", 0, 1)
    stochastic_2 = pm.Uniform("U_2", 0, 1)
    
    det_1 = pm.Deterministic("Delta", subtract(stochastic_1, stochastic_2))

Note that the variables passed into the `subtract` function are PyMC3 stochastic variables, so the `-` operator inside it is a Theano operator.

In [29]:
import theano.tensor as tt

with pm.Model() as theano_test:
    p1 = pm.Uniform("p", 0, 1)
    p2 = 1 - p1
    # p = tt.stack([p1, p2])
    print(type(tt.stack([p1, p2])))
    #assignment = pm.Categorical("assignment", p1, p2)

<class 'theano.tensor.var.TensorVariable'>


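* `assignment = pm.Categorical("assignment", [p1, p2])` raises an error because `Categorical` needs its probabilities as a single Theano variable, not a Python list; `tt.stack([p1, p2])` produces one.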