# Getting Started with SYMPAIS
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ethanluoyc/sympais/blob/master/notebooks/getting_started.ipynb)

## Setup

In [None]:
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

### Install SYMPAIS

In [None]:
# (TODO(yl): Simplify when we make this public)
GIT_TOKEN = ""
if IN_COLAB:
    !pip install -U pip setuptools wheel
    if GIT_TOKEN:
        !pip install git+https://{GIT_TOKEN}@github.com/ethanluoyc/sympais.git#egg=sympais
    else:
        !pip install git+https://github.com/ethanluoyc/sympais.git#egg=sympais

### Download and install pre-built RealPaver v0.4

In [None]:
if IN_COLAB:
    !curl -L "https://drive.google.com/uc?export=download&id=1_Im0Ot5TjkzaWfid657AV_gyMpnPuVRa" -o realpaver
    !chmod u+x realpaver
    !cp realpaver /usr/local/bin    

In [None]:
import jax
import jax.numpy as jnp

from sympais import tasks
from sympais import methods
from sympais.methods import run_sympais, run_dmc
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib
import numpy as onp
import math

%load_ext autoreload
%autoreload 2
%matplotlib inline

## Load a task

In [None]:
task = tasks.Sphere(nd=3)

In [None]:
task.profile

In [None]:
task.constraints

In [None]:
task.domains

## Run DMC baseline

In [None]:
dmc_output = run_dmc(task, seed=0, num_samples=int(1e8))
print(dmc_output)

## Run SYMPAIS

In [None]:
sympais_output = run_sympais(
    task,
    key=jax.random.PRNGKey(0), 
    num_samples=int(1e6),
    num_proposals=100,
    tune=False, 
    init='realpaver', 
    num_warmup_steps=500,
    window_size=100
)
print(sympais_output)

## Create your own problem

In this section, we will show how to implement a new probabilistic analysis
task similar to the sphere task above.

A probabilistic analysis `Task` consists of an input `Profile` $p(\mathbf{x})$ and a list of constraints `cs`. A user creates a new `Task` either by calling the base class constructor directly or by subclassing the base class.

Consider a two-dimensional problem where we would like to know the probability that
the inputs $x \in [-10, 10]$ and $y \in [-10, 10]$ jointly lie in the interior of a two-dimensional _cube_. The set of constraints is
$$
\begin{align}
    x + y &\leq 1.0, \\   
    x + y &\geq -1.0, \\
    y - x &\geq -1.0, \\
    y - x &\leq 1.0.
\end{align}
$$
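
Geometrically, these four half-planes intersect in a square rotated by 45 degrees, with vertices at $(\pm 1, 0)$ and $(0, \pm 1)$, which is the set $|x| + |y| \leq 1$. The sketch below checks that equivalence numerically on a grid. The helper names `in_cube` and `in_l1_ball` are hypothetical and are not part of SYMPAIS.

```python
import itertools


def in_cube(x, y, b=1.0):
    """Return True when (x, y) satisfies all four linear constraints."""
    return (x + y <= b) and (x + y >= -b) and (y - x >= -b) and (y - x <= b)


def in_l1_ball(x, y, b=1.0):
    """Equivalent description of the same region: |x| + |y| <= b."""
    return abs(x) + abs(y) <= b


# Grid over [-2, 2]^2 with step 1/8 (values are exact in floating point).
grid = [i / 8 for i in range(-16, 17)]
agree = all(
    in_cube(x, y) == in_l1_ball(x, y) for x, y in itertools.product(grid, grid)
)
print(agree)
```

Because the region is bounded by $|x| + |y| \leq 1$, it lies well inside the domain $[-10, 10]^2$.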

First, let's import the modules used for defining tasks.

In [None]:
import sympy
from sympais import tasks
from sympais import profiles
from sympais import distributions as dist

### Independent profile

We will first show how to define a task when the input variables are _independent_. 
We use `Profile` for defining the input distribution and SymPy expressions for defining the constraints.

The `Profile` uses the following interface. To create a customized profile, the user needs to implement
the `profile.log_prob` and `profile.sample` methods. Note that, unlike NumPyro distributions, the samples
are represented as a dictionary mapping variable names to their values. This makes it easier to integrate
with a symbolic execution engine.

In [None]:
help(profiles.Profile)

When the input random variables are independent, we provide a convenience `IndependentProfile` class which allows you to specify the per-component distribution. `IndependentProfile` implements `sample` and `log_prob` by dispatching to the individual components and then aggregating the results.
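
To make the aggregation step concrete: for independent variables the joint log-density is the sum of the per-variable log-densities. The following is a minimal standard-library sketch of that idea, not the actual `IndependentProfile` implementation. The `components` dictionary, its parameter values, and both function names are hypothetical and chosen only for illustration.

```python
import math


def normal_log_prob(value, loc, scale):
    """Log-density of a univariate Normal(loc, scale) at `value`."""
    z = (value - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical per-variable (loc, scale) parameters, keyed by variable name.
components = {"x": (-2.0, 1.0), "y": (-2.0, 1.0)}


def independent_log_prob(sample):
    """Sum the per-variable log-densities of a dictionary-valued sample."""
    return sum(
        normal_log_prob(sample[name], loc, scale)
        for name, (loc, scale) in components.items()
    )


# A sample is a dictionary from variable names to values.
sample = {"x": -1.5, "y": -2.5}
print(independent_log_prob(sample))
```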

We are now ready to define a task for the `cube` problem. The code is shown below.

In [None]:
class IndependentCubeTask(tasks.Task):
    def __init__(self):
        profile = profiles.IndependentProfile({
            "x": dist.Normal(loc=-2, scale=1),
            "y": dist.Normal(loc=-2, scale=1)
        })
        domains = {"x": (-10., 10.), "y": (-10., 10.)}
        b = 1.0
        x = sympy.Symbol("x")
        y = sympy.Symbol("y")
        c1 = x + y <= b   # type: sympy.Expr
        c2 = x + y >= -b  # type: sympy.Expr
        c3 = y - x >= -b  # type: sympy.Expr
        c4 = y - x <= b  # type: sympy.Expr
        super().__init__(profile, [c1, c2, c3, c4], domains)

Let us create some helper functions for visualizing the profile and the constraints.

In [None]:
b = 1.

def f1(x):
    return b - x
def f2(x):
    return -b - x
def f3(x):
    return -b + x
def f4(x):
    return b + x

x = sympy.Symbol('x')
x1, = sympy.solve(f1(x)-f3(x))
x2, = sympy.solve(f1(x)-f4(x))
x3, = sympy.solve(f2(x)-f3(x))
x4, = sympy.solve(f2(x)-f4(x))

y1 = f1(x1)
y2 = f1(x2)
y3 = f2(x3)
y4 = f2(x4)

N = 200
X, Y = jnp.meshgrid(jnp.linspace(-4,4,N), jnp.linspace(-4, 4, N))
xr = jnp.linspace(-3, 3, 100)


def plot_constraints(ax):
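    # Mark the four vertices of the feasible region ('ko' draws point markers).
    ax.plot(x1, y1, 'ko', markersize=5)
    ax.plot(x2, y2, 'ko', markersize=5)
    ax.plot(x3, y3, 'ko', markersize=5)
    ax.plot(x4, y4, 'ko', markersize=5)
    # Shade the feasible region.
    ax.fill([x1,x2,x4,x3],[y1,y2,y4,y3],'gray', alpha=0.5)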

    y1r = f1(xr)
    y2r = f2(xr)
    y3r = f3(xr)
    y4r = f4(xr)
    ax.plot(xr, y1r, 'w--')
    ax.plot(xr, y2r, 'w--')
    ax.plot(xr, y3r, 'w--')
    ax.plot(xr, y4r, 'w--')

In [None]:
cube_task = IndependentCubeTask()

In [None]:
logp = cube_task.profile.log_prob(
        {'x': X.reshape(-1), "y": Y.reshape(-1)}).reshape((N, N))

fig, ax = plt.subplots(1, 1, figsize=(3,3))
ax.contourf(X, Y, logp, levels=20, cmap='Blues_r')

plot_constraints(ax)
ax.set(xlim=(-3,2), ylim=(-3,2), xlabel='$x$', ylabel='$y$');

### Correlated profile

In the general case, the inputs may be correlated. In this case, the user needs to provide a custom implementation
of `Profile`. We will show how to do this for the case where $x$ and $y$ are jointly Gaussian.

In [None]:
from numpyro import distributions as numpyro_dist

class CorrelatedProfile(profiles.Profile):
    def __init__(self):
        self._dist = numpyro_dist.MultivariateNormal(
            loc=jnp.array([-2, -2]), covariance_matrix=jnp.array([[1.0, 0.8], [0.8, 1.5]])
        )
    def sample(self, rng, sample_shape=()):
        samples = self._dist.sample(rng, sample_shape=sample_shape)
        # The leading ellipsis in the index keeps any batch dimensions intact.
        return {'x': samples[..., 0], 'y': samples[..., 1]}
    
    def log_prob(self, samples):
        samples = jnp.stack([samples['x'], samples['y']], -1)
        return self._dist.log_prob(samples)
    
class CorrelatedCubeTask(tasks.Task):
    def __init__(self):
        b = 1.0
        x = sympy.Symbol("x")
        y = sympy.Symbol("y")
        c1 = x + y <= b   # type: sympy.Expr
        c2 = x + y >= -b  # type: sympy.Expr
        c3 = y - x >= -b  # type: sympy.Expr
        c4 = y - x <= b  # type: sympy.Expr
        profile = CorrelatedProfile()
        domains = {"x": (-10., 10.), "y": (-10., 10.)}
        super().__init__(profile, [c1, c2, c3, c4], domains)
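
For intuition about why the correlated case needs a custom `log_prob`, the sketch below evaluates the same bivariate Gaussian log-density by hand with the standard library. The cross term in the quadratic form couples $x$ and $y$, so the density does not factor into per-variable terms. The names `LOC`, `COV` and `mvn2_log_prob` are hypothetical; only the mean and covariance values come from the cell above.

```python
import math

LOC = (-2.0, -2.0)
COV = ((1.0, 0.8), (0.8, 1.5))


def mvn2_log_prob(sample):
    """Log-density of a 2D Gaussian at a dictionary-valued sample."""
    dx = sample["x"] - LOC[0]
    dy = sample["y"] - LOC[1]
    (a, b), (_, d) = COV
    det = a * d - b * b
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    # The -2*b*dx*dy cross term is what couples x and y.
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return -0.5 * quad - 0.5 * math.log(det) - math.log(2.0 * math.pi)


# Correlation coefficient implied by the covariance matrix.
rho = COV[0][1] / math.sqrt(COV[0][0] * COV[1][1])
print(rho, mvn2_log_prob({"x": -2.0, "y": -2.0}))
```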

All of the benchmarks are defined similarly to the examples shown above.
If you are interested, check out the source code in `src/sympais/tasks` for more examples.

In [None]:
correlated_cube_task = CorrelatedCubeTask()
logp = correlated_cube_task.profile.log_prob(
        {'x': X.reshape(-1), "y": Y.reshape(-1)}).reshape((N, N))

fig, ax = plt.subplots(1, 1, figsize=(3,3))
ax.contourf(X, Y, logp, levels=20, cmap='Blues_r')
plot_constraints(ax)
ax.set(xlim=(-3,2), ylim=(-3,2), xlabel='$x$', ylabel='$y$');

### Run samplers

Now that we have our new task definitions, let's run DMC and SYMPAIS on these tasks.

In [None]:
dmc_output = run_dmc(correlated_cube_task, seed=0, num_samples=int(1e8), batch_size=int(1e6))
print(dmc_output)
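
As a rough cross-check, and assuming DMC stands for direct Monte Carlo, the estimate amounts to sampling from the input distribution and counting the fraction of samples that satisfy all constraints. The sketch below does this for the correlated cube problem using only the standard library. It is not the `run_dmc` implementation: the function name `direct_monte_carlo` is hypothetical, and any truncation of the Gaussian to the domains is ignored as an assumption.

```python
import math
import random


def direct_monte_carlo(num_samples, seed=0):
    """Estimate P(constraints hold) by sampling and counting hits."""
    rng = random.Random(seed)
    # Cholesky factor of the covariance [[1.0, 0.8], [0.8, 1.5]].
    l11 = 1.0
    l21 = 0.8
    l22 = math.sqrt(1.5 - 0.8 ** 2)
    hits = 0
    for _ in range(num_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Correlated Gaussian sample with mean (-2, -2).
        x = -2.0 + l11 * z1
        y = -2.0 + l21 * z1 + l22 * z2
        # The four constraints, written as |x + y| <= 1 and |y - x| <= 1.
        if abs(x + y) <= 1.0 and abs(y - x) <= 1.0:
            hits += 1
    p_hat = hits / num_samples
    # Standard error of a binomial proportion.
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / num_samples)
    return p_hat, std_err


p_hat, std_err = direct_monte_carlo(100_000)
print(p_hat, std_err)
```

The standard error shrinks only as the inverse square root of the number of samples, which is why the DMC run above uses a large `num_samples`.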

In [None]:
sympais_output = run_sympais(
    correlated_cube_task,
    key=jax.random.PRNGKey(0), 
    num_samples=int(1e6),
    num_proposals=100,
    tune=False, 
    init='realpaver', 
    num_warmup_steps=500,
    window_size=100
)
print(sympais_output)