In [24]:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from IPython.display import display
import datetime
init_notebook_mode(connected=True)
from kronecker import KroneckerSolver
from likelihoods import PoissonLike
import data_utils as sim
import numpy as np
from kernels import RBF
from grid_utils import fill_grid
from plotly import tools
from thinnedEvents_eager import run_thinnedEventsSolver, ThinnedEventsSampler
from matplotlib import pyplot as plt

## Models

Given a set of points $\{x_i\}_{i=1}^N$, we're interested in the following type of model:

$$f \sim \mathcal{GP}(\mu(x), K(x, x'))$$

$$y(x_i) \sim l(f(x_i))$$

where $\mathcal{GP}(\mu(x), K(x, x'))$ denotes a Gaussian process with mean function $\mu$ and covariance kernel $K$, and $l$ denotes some likelihood. We primarily work with grid-structured data (this will be relevant at the inference step).

This is what draws of $f$ and $y$ look like. Given $f$, we draw $y$ from a Poisson distribution:
$$y_i \sim \mathrm{Poisson}(\exp(f(x_i) + \epsilon))$$

where $\epsilon \sim \mathcal{N}(0, 1)$.
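As a rough, self-contained illustration of this generative model, the sketch below draws $f$ from a zero-mean GP on a small 2D grid and then draws Poisson counts. It does not use the notebook's `data_utils` helpers. The `rbf_kernel` function, its parametrization, the grid size, the per-point noise, and the jitter value are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X1, X2, variance=1.0, length_scale=1.0):
    """Squared-exponential kernel between two sets of points (one point per row).

    Assumed parametrization: variance * exp(-0.5 * ||x - x'||^2 / length_scale^2).
    """
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / length_scale ** 2)

rng = np.random.default_rng(0)

# A small 10 x 10 equispaced grid in 2D (100 points).
axis = np.arange(10, dtype=float)
X_demo = np.array([[a, b] for a in axis for b in axis])

# Draw f ~ GP(0, K) through a Cholesky factor. The small diagonal jitter
# keeps the kernel matrix numerically positive definite.
K_demo = rbf_kernel(X_demo, X_demo, variance=1.0, length_scale=3.0)
chol = np.linalg.cholesky(K_demo + 1e-6 * np.eye(len(X_demo)))
f_demo = chol @ rng.standard_normal(len(X_demo))

# Draw y_i ~ Poisson(exp(f_i + eps_i)), assuming independent eps_i ~ N(0, 1).
eps = rng.standard_normal(len(X_demo))
y_demo = rng.poisson(np.exp(f_demo + eps))
```

Adding jitter before factorizing is one common way to avoid positive-semidefiniteness problems when a smooth kernel is evaluated on closely spaced points.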

In [2]:
X = sim.sim_X_equispaced(D = 2, N_dim = 30)
f = sim.sim_f(X, k=RBF(variance=1.0, length_scale=30.))
y = sim.poisson_draw(f, 1.)

trace_func = go.Scatter3d(x = X[:,0], y = X[:,1], z=f, mode = 'markers', marker=dict(size = 2,))
trace_draws = go.Scatter3d(x = X[:,0], y = X[:,1], z=y, mode = 'markers', marker=dict(size = 2,))
fig = tools.make_subplots(rows=1, cols=2, specs=[[{'is_3d': True}, {'is_3d': True}]])
fig.append_trace(trace_func, 1, 1)
fig.append_trace(trace_draws, 1, 2)
iplot(fig)





covariance is not positive-semidefinite.



We also worked with continuous-time inhomogeneous Poisson processes. Given a set of event times $\{x_i\}_{i=1}^N$, we are interested in learning the intensity of an inhomogeneous Poisson process via a GP prior. We introduce a random scalar function $g$ with a Gaussian process prior and transform it into a random intensity function:

> $$\lambda(s) = \lambda^{*} \sigma(g(s))$$

where, following Adams et al., $\sigma$ is the logistic function and $\lambda^{*}$ is an upper bound on the intensity.

Given the times at which events occurred, we can learn a smooth intensity function over them. We also introduce a set of latent variables: the locations of the thinned events. These are candidate events that were rejected during thinning, so they tend to fall where the intensity is low and fewer events were observed.
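To make the roles of the observed and thinned events concrete, here is a minimal sketch of sampling by thinning. It assumes $\sigma$ is the logistic function (as in Adams et al.) and uses a fixed, hypothetical function `g_example` in place of a GP draw. The names `sample_events`, `lambda_star`, and `t_max` are illustrative and are not taken from `thinnedEvents_eager`.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sample_events(g, lambda_star, t_max, rng):
    """Sample event times on [0, t_max] with intensity lambda_star * sigmoid(g(s))."""
    # Candidates come from a homogeneous Poisson process with rate lambda_star.
    n_candidates = rng.poisson(lambda_star * t_max)
    candidates = np.sort(rng.uniform(0.0, t_max, size=n_candidates))
    # Keep each candidate with probability sigmoid(g(s)); the rest are thinned.
    keep = rng.uniform(size=n_candidates) < sigmoid(g(candidates))
    return candidates[keep], candidates[~keep]

def g_example(s):
    # Hypothetical fixed function standing in for a draw from the GP prior.
    return 2.0 * np.sin(s / 5.0)

rng = np.random.default_rng(0)
events, thinned = sample_events(g_example, lambda_star=2.0, t_max=50.0, rng=rng)
print(len(events), "events kept,", len(thinned), "candidates thinned")
```

In inference the thinned locations are unknown, which is why they are treated as latent variables rather than data.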

Below, we use the following intensity function to sample data:

> $$ \lambda_1(s) = 2 \exp\{-s/15\} + \exp\{-((s-25)/10)^2\}$$

In [31]:
kern = RBF(variance=1.0, length_scale=5.0)
def f(x):
    return 2*np.exp(-x/15) + np.exp(-((x-25)/10.0)**2)

sampler = ThinnedEventsSampler(f_lambda=f, kern=kern, measure=50, rate=2, dim=1, N_dim=100)
locations = sampler.S_k.flatten()

In [32]:
true_val = go.Scatter(x = sampler.S.flatten(), y = sampler.Z.flatten(), mode = 'markers', marker=dict(size = 3,))
init_val = go.Scatter(x = sampler.S_k.flatten(), y = sampler.G_k.flatten(), mode = 'markers', marker=dict(size = 3,))
fig = tools.make_subplots(rows=1, cols=2)
fig.append_trace(true_val, 1, 1)
fig.append_trace(init_val, 1, 2)
iplot(fig)




## Dataset

We worked with a dataset from the Federal Election Commission. The data includes information on individual contributions to political campaigns. Here is the spatial distribution of the number of donations to Hillary Clinton in the 2016 election cycle.

In [5]:
X_grid_fec = np.genfromtxt('data/X_grid.csv', delimiter = ',')
fec_counts = np.genfromtxt('data/y_hillary.csv', delimiter = ',')
obs_idx_fec = np.genfromtxt('data/obs_idx.csv', delimiter = ',', dtype = np.int32)
trace_fec_counts = go.Scatter3d(x = X_grid_fec[obs_idx_fec,1], y = X_grid_fec[obs_idx_fec,0], z=fec_counts, mode = 'markers', marker=dict(size = 2,))
iplot([trace_fec_counts])

We also worked with time series data. Here is data for all the days in 2015-2016 on which Facebook made a political contribution.

In [22]:
fb_dons = np.genfromtxt('data/facebook_donations.csv', delimiter = ',')
dates = np.array([datetime.datetime(2015, 1, 1) + datetime.timedelta(days=x) for x in range(0, 365*2)])
trace_fb_events = go.Scatter(x = dates, y = fb_dons[:,1], mode = 'markers', marker=dict(size = 3,))
iplot([trace_fb_events])

## Inference

We focused on developing general inference for the type of model described above. Our inference implementation takes the following inputs:

<br>

* any differentiable, log-concave likelihood function $l$ (below we use a Poisson)


* a kernel function $k$ that decomposes across dimensions as $k(x, x') = \prod_d k_d(x_d, x'_d)$


* grid data $X$, and observations $y$.
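The product-kernel requirement matters because, on a full grid, it makes the covariance matrix a Kronecker product of small per-dimension matrices. The sketch below checks this numerically for a squared-exponential kernel on a 5 x 3 grid. The helper `rbf_1d` and its parametrization are assumptions for illustration, not the notebook's `RBF` class.

```python
import numpy as np

def rbf_1d(x, variance=1.0, length_scale=1.0):
    """1D squared-exponential kernel matrix for the coordinates in x."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * diff ** 2 / length_scale ** 2)

x1 = np.linspace(0.0, 4.0, 5)   # grid coordinates along dimension 1
x2 = np.linspace(0.0, 2.0, 3)   # grid coordinates along dimension 2

# Full grid, ordered so that dimension 2 varies fastest (row-major).
grid = np.array([[a, b] for a in x1 for b in x2])

# Kernel evaluated directly on all 15 grid points.
sq_dists = np.sum((grid[:, None, :] - grid[None, :, :]) ** 2, axis=-1)
K_full = np.exp(-0.5 * sq_dists / 1.5 ** 2)

# The same matrix assembled from the two small per-dimension kernels.
K_kron = np.kron(rbf_1d(x1, length_scale=1.5), rbf_1d(x2, length_scale=1.5))

print(np.allclose(K_full, K_kron))
```

The ordering of the grid points has to match the ordering implied by `np.kron`, which is why the last dimension varies fastest here.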

<br>

By requiring $X$ to lie on a grid (or a partial grid, as we'll see), $K$ factors as a Kronecker product of small per-dimension kernel matrices, so we can avoid the costly inversion of an $n \times n$ covariance matrix that traditional GP inference requires.
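A sketch of where the savings come from: with Kronecker structure, matrix-vector products and certain linear solves only ever touch the small per-dimension factors. The functions `kron_matvec` and `kron_solve` below are hypothetical names for this illustration, and the solve covers only the simple case of a constant diagonal term. The notebook's `KroneckerSolver` handles a non-Gaussian likelihood, and its internals are not shown here.

```python
import numpy as np

def kron_matvec(K1, K2, v):
    """Compute (K1 kron K2) @ v without forming the Kronecker product."""
    n1, n2 = K1.shape[0], K2.shape[0]
    V = v.reshape(n1, n2)            # row-major reshape matches np.kron ordering
    return (K1 @ V @ K2.T).reshape(-1)

def kron_solve(K1, K2, v, noise):
    """Solve (K1 kron K2 + noise * I) x = v via per-dimension eigendecompositions."""
    lam1, Q1 = np.linalg.eigh(K1)
    lam2, Q2 = np.linalg.eigh(K2)
    # Rotate into the eigenbasis: (Q1 kron Q2)^T v.
    alpha = kron_matvec(Q1.T, Q2.T, v)
    # Divide by the eigenvalues of K1 kron K2 shifted by the noise term.
    alpha = alpha / (np.outer(lam1, lam2).reshape(-1) + noise)
    # Rotate back: (Q1 kron Q2) alpha.
    return kron_matvec(Q1, Q2, alpha)

# Small random symmetric positive definite factors for a sanity check.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3))
K1_demo = A @ A.T + np.eye(4)
K2_demo = B @ B.T + np.eye(3)
v_demo = rng.standard_normal(12)

dense = np.kron(K1_demo, K2_demo)
print(np.allclose(kron_matvec(K1_demo, K2_demo, v_demo), dense @ v_demo))
print(np.allclose(kron_solve(K1_demo, K2_demo, v_demo, 0.5),
                  np.linalg.solve(dense + 0.5 * np.eye(12), v_demo)))
```

For a grid with $n = n_1 n_2$ points, the eigendecompositions cost $O(n_1^3 + n_2^3)$ rather than the $O(n^3)$ of a dense solve.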

<br>

The inference procedure outputs a Laplace approximation of the posterior of $f$. First, let's try to recover the function from the simulated data on a grid.
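For intuition, the mode-finding step of a Laplace approximation can be sketched with dense linear algebra on a tiny problem. This is a generic Newton iteration with step halving for a Poisson likelihood with a log link. It assumes nothing about how `KroneckerSolver` is implemented, and the function name, the synthetic data, and the kernel settings are all made up for the example.

```python
import numpy as np

def laplace_mode_poisson(K, y, mu, n_iter=30):
    """Find the posterior mode of f for a GP prior and Poisson(exp(f)) likelihood."""
    n = len(y)
    a = np.zeros(n)          # parametrize f = mu + K @ a, so K is never inverted
    f = mu.copy()

    def objective(a, f):
        # Log posterior up to a constant: log-likelihood plus GP log-prior.
        return np.sum(y * f - np.exp(f)) - 0.5 * a @ (K @ a)

    for _ in range(n_iter):
        W = np.exp(f)                        # negative Hessian of the log-likelihood
        grad = y - np.exp(f)                 # gradient of the log-likelihood
        sqrt_W = np.sqrt(W)
        B = np.eye(n) + sqrt_W[:, None] * K * sqrt_W[None, :]
        b = W * (f - mu) + grad
        a_newton = b - sqrt_W * np.linalg.solve(B, sqrt_W * (K @ b))

        # Backtrack along the Newton direction until the objective improves.
        current, step, improved = objective(a, f), 1.0, False
        while step > 1e-4:
            a_try = a + step * (a_newton - a)
            f_try = mu + K @ a_try
            if objective(a_try, f_try) > current:
                improved = True
                break
            step *= 0.5
        if not improved:
            break                            # no further improvement: treat as converged
        a, f = a_try, f_try
    return f

# Synthetic 1D example (illustrative data, not from the notebook).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
K_demo = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 1.5 ** 2)
y_obs = rng.poisson(np.exp(1.0 + np.sin(x)))
mu_demo = np.full(len(x), np.log(y_obs.mean()))
f_hat = laplace_mode_poisson(K_demo, y_obs, mu_demo)
```

At the mode, $\hat{f} - \mu = K\,(y - \exp(\hat{f}))$, which gives a simple check on convergence. The Laplace approximation is then the Gaussian centered at $\hat{f}$ with precision $K^{-1} + W$.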

In [6]:
mu = tf.ones([X.shape[0]], tf.float32)*np.mean(np.log(y))
kern = RBF(variance=1.0, length_scale=30.)
likelihood = PoissonLike()
y_tf = tfe.Variable(y, dtype = tf.float32)

ks_sim = KroneckerSolver(mu, kern, likelihood, X, y_tf, verbose = True)
ks_sim.run(10)

trace_inferred = go.Scatter3d(x = X[:,0], y = X[:,1], z= np.array(ks_sim.f), mode = 'markers', marker=dict(size = 2,))
fig = tools.make_subplots(rows=1, cols=3, specs=[[{'is_3d': True}, {'is_3d': True}, {'is_3d': True}]])
fig.append_trace(trace_func, 1, 1)
fig.append_trace(trace_draws, 1, 2)
fig.append_trace(trace_inferred, 1, 3)
iplot(fig)

('Iteration: ', <tf.Variable 'Variable:0' shape=() dtype=int32, numpy=0>)
(' psi: ', <tf.Tensor: id=390918, shape=(), dtype=float32, numpy=-1820545.0>)
('step', 0.5)

('Iteration: ', <tf.Tensor: id=400671, shape=(), dtype=int32, numpy=1>)
(' psi: ', <tf.Tensor: id=400721, shape=(), dtype=float32, numpy=-1953004.9>)
('step', 2.0)

('Iteration: ', <tf.Tensor: id=424604, shape=(), dtype=int32, numpy=2>)
(' psi: ', <tf.Tensor: id=424654, shape=(), dtype=float32, numpy=-1969901.6>)
('step', 1.0)

('Iteration: ', <tf.Tensor: id=448136, shape=(), dtype=int32, numpy=3>)
(' psi: ', <tf.Tensor: id=448186, shape=(), dtype=float32, numpy=-1976942.8>)
('step', 1.0)

('Iteration: ', <tf.Tensor: id=468434, shape=(), dtype=int32, numpy=4>)
(' psi: ', <tf.Tensor: id=468484, shape=(), dtype=float32, numpy=-1977042.4>)
('step', 2.0)

('Iteration: ', <tf.Tensor: id=490123, shape=(), dtype=int32, numpy=5>)
(' psi: ', <tf.Tensor: id=490173, shape=(), dtype=float32, numpy=-1977044.9>)
('step', 0.5)

('Iterat

We can also do inference if we only observe part of the grid structure.
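The idea behind completing a partial grid can be sketched as follows: build the full Cartesian grid from the coordinates that appear in the data, record which rows were actually observed, and treat the remaining rows as unobserved placeholders. The function `fill_partial_grid` below is a hypothetical stand-in written for this illustration. The notebook's `fill_grid` may order points or fill missing values differently.

```python
import itertools
import numpy as np

def fill_partial_grid(X_part, y_part):
    """Embed partially observed grid points into the full Cartesian grid."""
    # Unique coordinates along each dimension define the full grid axes.
    axes = [np.unique(X_part[:, d]) for d in range(X_part.shape[1])]
    X_full = np.array(list(itertools.product(*axes)))

    # Map each observed point to its row in the full grid.
    row_of = {tuple(x): i for i, x in enumerate(X_full)}
    obs_idx = np.array([row_of[tuple(x)] for x in X_part])

    # Rows that were never observed get a NaN placeholder.
    imag_idx = np.setdiff1d(np.arange(len(X_full)), obs_idx)
    y_full = np.full(len(X_full), np.nan)
    y_full[obs_idx] = y_part
    return X_full, y_full, obs_idx, imag_idx

# Four observed points on what is implicitly a 3 x 2 grid.
X_obs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
y_obs = np.array([1.0, 2.0, 3.0, 4.0])
X_filled, y_filled, observed_rows, missing_rows = fill_partial_grid(X_obs, y_obs)
```

Only the observed rows contribute to the likelihood, while the full grid keeps the kernel matrix in Kronecker form.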

In [7]:
X_part, y_part = sim.rand_partial_grid(X, y, 0.1)
X_full, y_full, obs_idx, imag_idx = fill_grid(X_part, y_part)
y_tf = tfe.Variable(y_full[obs_idx], dtype = tf.float32)
mu = tf.ones([X_full.shape[0]], tf.float32)*np.mean(np.log(y_full[obs_idx]))
color = np.zeros(X_full.shape[0])
color[obs_idx] = 1.0

ks_part = KroneckerSolver(mu, RBF(variance=1.0, length_scale=30.0),
                     PoissonLike(), X_full, y_tf, obs_idx=obs_idx, verbose = True)
ks_part.run(10)

# Plot the same observed values that were passed to the solver (y_full[obs_idx]).
trace_partial_obs = go.Scatter3d(x = X_full[obs_idx, 0], y = X_full[obs_idx, 1], z= y_full[obs_idx], mode = 'markers', marker=dict(size = 2))
trace_partial_inf = go.Scatter3d(x = X_full[:, 0], y = X_full[:, 1], z= np.array(ks_part.f_pred), mode = 'markers', marker=dict(size = 2, color = color))

fig = tools.make_subplots(rows=1, cols=3, specs=[[{'is_3d': True}, {'is_3d': True}, {'is_3d': True}]])
fig.append_trace(trace_func, 1, 1)
fig.append_trace(trace_partial_obs, 1, 2)
fig.append_trace(trace_partial_inf, 1, 3)
iplot(fig)

('Iteration: ', <tf.Variable 'Variable:0' shape=() dtype=int32, numpy=0>)
(' psi: ', <tf.Tensor: id=582356, shape=(), dtype=float32, numpy=-222800.14>)
('step', 0.25)

('Iteration: ', <tf.Tensor: id=585126, shape=(), dtype=int32, numpy=1>)
(' psi: ', <tf.Tensor: id=585184, shape=(), dtype=float32, numpy=-256433.16>)
('step', 1.0)

('Iteration: ', <tf.Tensor: id=588614, shape=(), dtype=int32, numpy=2>)
(' psi: ', <tf.Tensor: id=588672, shape=(), dtype=float32, numpy=-261556.86>)
('step', 2.0)

('Iteration: ', <tf.Tensor: id=593128, shape=(), dtype=int32, numpy=3>)
(' psi: ', <tf.Tensor: id=593186, shape=(), dtype=float32, numpy=-261987.84>)
('step', 1.0)

('Iteration: ', <tf.Tensor: id=599317, shape=(), dtype=int32, numpy=4>)
(' psi: ', <tf.Tensor: id=599375, shape=(), dtype=float32, numpy=-262311.19>)
('step', 1.0)

('Iteration: ', <tf.Tensor: id=606090, shape=(), dtype=int32, numpy=5>)
(' psi: ', <tf.Tensor: id=606148, shape=(), dtype=float32, numpy=-262314.19>)
('step', 0.0)


In [37]:
sampler, s_i, val = run_thinnedEventsSolver()

inf_val = go.Scatter(x = s_i.flatten(), y = val.flatten(), mode = 'markers', marker=dict(size = 3,))
fig = tools.make_subplots(rows=1, cols=3)
fig.append_trace(true_val, 1, 1)
fig.append_trace(init_val, 1, 2)
fig.append_trace(inf_val, 1, 3)
iplot(fig)




Inference on spatial dataset:

In [8]:
mu = tf.ones([X_grid_fec.shape[0]], tf.float32)*np.mean(np.log(fec_counts))
ks_fec = KroneckerSolver(mu, RBF(variance=5.0, length_scale=3.),
                     PoissonLike(), X_grid_fec, tf.constant(fec_counts, tf.float32), obs_idx=obs_idx_fec, verbose = True)
ks_fec.run(5)
color = np.zeros(X_grid_fec.shape[0])
color[obs_idx_fec] = 1.0

fec_grid = go.Scatter(x = X_grid_fec[:,1], y = X_grid_fec[:,0], mode = 'markers', marker=dict(size = 2, color = color))
fec_func = go.Scatter3d(x = X_grid_fec[obs_idx_fec,1], y = X_grid_fec[obs_idx_fec,0], z=ks_fec.f_pred.numpy()[obs_idx_fec], mode = 'markers', marker=dict(size = 2))
fig = tools.make_subplots(rows=1, cols=3, specs=[[{'is_3d': False}, {'is_3d': True}, {'is_3d': True}]])
fig.append_trace(fec_grid, 1, 1)
fig.append_trace(trace_fec_counts, 1, 2)
fig.append_trace(fec_func, 1, 3)
iplot(fig)

('Iteration: ', <tf.Variable 'Variable:0' shape=() dtype=int32, numpy=0>)
(' psi: ', <tf.Tensor: id=614081, shape=(), dtype=float32, numpy=-15185813.0>)
('step', 0.00390625)

('Iteration: ', <tf.Tensor: id=651774, shape=(), dtype=int32, numpy=1>)
(' psi: ', <tf.Tensor: id=651831, shape=(), dtype=float32, numpy=-19119524.0>)
('step', 0.0625)

('Iteration: ', <tf.Tensor: id=694139, shape=(), dtype=int32, numpy=2>)
(' psi: ', <tf.Tensor: id=694196, shape=(), dtype=float32, numpy=-24446324.0>)
('step', 0.5)

('Iteration: ', <tf.Tensor: id=776085, shape=(), dtype=int32, numpy=3>)
(' psi: ', <tf.Tensor: id=776142, shape=(), dtype=float32, numpy=-25809258.0>)
('step', 2.0)

('Iteration: ', <tf.Tensor: id=890891, shape=(), dtype=int32, numpy=4>)
(' psi: ', <tf.Tensor: id=890948, shape=(), dtype=float32, numpy=-26246752.0>)
('step', 1.0)




Inference on Temporal Dataset

## Model Selection and Criticism

A key part of model selection in GP inference is choosing kernel hyperparameters. This is usually done by optimizing the marginal likelihood. We've implemented marginal likelihood calculations; optimization is still in the works.

By evaluating the marginal likelihood under different kernel hyperparameters, we could do hyperparameter selection by fitting a set of models and comparing them.
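As a sketch of what such a selection loop might look like, the example below scores a handful of candidate length scales by log marginal likelihood and keeps the best one. To stay self-contained it swaps in a GP with a Gaussian likelihood, whose marginal likelihood is available in closed form, rather than the Laplace-approximated Poisson marginal used in this notebook. The data, the noise level, and the candidate values are all invented for illustration.

```python
import numpy as np

def rbf_1d(x, variance, length_scale):
    """1D squared-exponential kernel matrix (assumed parametrization)."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * diff ** 2 / length_scale ** 2)

def log_marginal_gaussian(x, y, length_scale, noise=0.1):
    """Exact log marginal likelihood of a zero-mean GP with Gaussian noise."""
    K = rbf_1d(x, 1.0, length_scale) + noise * np.eye(len(x))
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(chol)))
            - 0.5 * len(x) * np.log(2 * np.pi))

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 10.0, 40)
y_train = np.sin(x_train) + 0.3 * rng.standard_normal(40)

# Score each candidate length scale and keep the one with the highest value.
candidates = [0.1, 0.5, 1.0, 2.0, 5.0]
scores = {ls: log_marginal_gaussian(x_train, y_train, ls) for ls in candidates}
best_length_scale = max(scores, key=scores.get)
print(best_length_scale)
```

With the notebook's solver, the analogous loop would construct one `KroneckerSolver` per candidate kernel, run it, and compare the values returned by `marginal()`, taking care over the sign convention of that value.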

In [9]:
ks_fec.marginal()

<tf.Tensor: id=1095405, shape=(), dtype=float32, numpy=26246700.0>

We can also simulate data from our learned model.

In [10]:
simulated_counts = np.random.poisson(np.exp(ks_fec.f_pred.numpy()))
fec_sim = go.Scatter3d(x = X_grid_fec[obs_idx_fec,1], y = X_grid_fec[obs_idx_fec,0], z=simulated_counts[obs_idx_fec], mode = 'markers', marker=dict(size = 2))

fig = tools.make_subplots(rows=1, cols=2, specs=[[{'is_3d': True}, {'is_3d': True}]])
fig.append_trace(trace_fec_counts, 1, 1)
fig.append_trace(fec_sim, 1, 2)
iplot(fig)
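Simulated counts like these can also be used for a simple predictive check: draw many replicate datasets from the fitted intensities and see where a summary statistic of the observed data falls among the replicates. The sketch below uses synthetic stand-ins (`f_hat` for the fitted latent function and `observed` for the counts) rather than the FEC data. The choice of the maximum count as the statistic is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the fitted latent function and the observed counts.
f_hat = rng.normal(loc=1.0, scale=0.5, size=200)
observed = rng.poisson(np.exp(f_hat))

# Replicate datasets drawn from Poisson(exp(f_hat)), one per row.
replicates = rng.poisson(np.exp(f_hat), size=(1000, f_hat.size))

# Compare the observed maximum count with its distribution under the model.
rep_max = replicates.max(axis=1)
p_value = np.mean(rep_max >= observed.max())
print(p_value)
```

A value very close to 0 or 1 would suggest that the model reproduces this feature of the data poorly.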




## Next steps and references

Here are some potential next steps for this work:

* Continuous space inhomogeneous point processes
* Marginal likelihood optimization
* Variance approximations
* Inference with inducing points
* Test on more complex custom likelihoods


Adams et al., Tractable Nonparametric Bayesian Inference in Poisson Processes with Gaussian Process Intensities, Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009.

Flaxman et al., Fast Kronecker Inference in Gaussian Processes with non-Gaussian Likelihoods, Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015.

Wilson and Nickisch, Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP), Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015.

Wilson, Dann, and Nickisch, Thoughts on Massively Scalable Gaussian Processes, 2015.