# Interacting Particle Markov Chain Monte Carlo
## Introduction & Context

This notebook presents the interacting particle markov chain monte carlo algorithm from http://proceedings.mlr.press/v48/rainforth16.html by Rainforth et al. (Proceedings of the 33rd International Conference on Machine Learning,). This algorithm is an extension of the family of particleMarkov chain Monte Carlo algorithms originally proposed in https://www.stats.ox.ac.uk/~doucet/andrieu_doucet_holenstein_PMCMC.pdf by Andrieu et al. .

### Mathmatical context

We will focus on a Markovian model even if the algorithm is not limited to this type of models.

We have $x_t$ the states of our model and $y_t$ our observation following:
- $ x_t | x_{1:t−1} ∼ f_t(x_t|x_{1:t−1}) $ the transition model 
- $ y_t | x_{1:t} ∼ g_t ( y_t | x_{1:t}) $ the observation model
- $ x_1 ∼ \mu(\cdot)$

Our goal is to compute expectation values with respect to the posterior distribution $ p(x_{1:T}|y_{1:T}) \propto \mu(x_1)\prod\limits_{t=2}^{T}f_t(x_t | x_{1:t-1}) \prod\limits_{t=1}^{T} g_t ( y_t | x_{1:t} ) $

### Difficulties
As explained in Andrieu et al., Sequential Monte Carlo (SMC) and Markov Chain Monte Carlo (MCMC) methods are often unreliable when the proposal distributions that are used to explore the space are poorly chosen and/or if highly correlated variables are updated independently. Particle Markov Chain Monte Carlo (PMCMC) try to solve theses issues using SMC methods to construct efficient proposals for the MCMC sampler. 

But SMC methods are affected by *path degeneracy*! And interacting particle Markov Chain Monte Carlo tries to solve it!


```TODO: Maybe describre more the path degeneracy issue```

Let's dive in the algorithm!

## iPMCMC
### Algorithm

To explore the iPMCMC algorithm we must first explore SMC and CSMC, as iPMCMC uses both to generate efficient proposals of $x_{1:T}$ to the MCMC sampler.

#### Sequential Monte Carlo

The main idea of SMC is to generate at each time step a new position for each particle in our system, but instead of generating the next position of a particle with the particle past we take the past of another particle with a discrete distribution over all particles based on the normalized weights of each particle. The final particle system approximate $p(x_{1:t}|y_{1:t})$

<img src="./images/algo1.PNG" width=500px></img>

#### Conditional Sequential Monte Carlo

CSMC is basically the same as SMC but with one particle with a fixed trajectory given by the user.

<img src="./images/algo2.PNG" width=500px></img>

#### Interaction Particle Markov Chain Monte Carlo

IPMCMC combines both previous algorithms as MCMC sampler.

<img src="./images/algo3.PNG" width=500px></img>

### Results

In [1]:
import numpy as np
from scipy.spatial.transform import Rotation as R
from tqdm import tqdm
from ipmcmc.generate_data import *
from ipmcmc.linear_gaussian_state_model import *
from ipmcmc.non_linear_gaussian_state_model import *
from ipmcmc.smc import *
from ipmcmc.csmc import *
from ipmcmc.ipmcmc import *
from ipmcmc.estimation import *    


We implemented the same models as those in the paper. Each one of them is implemented and generates observations and states in the two next cells

In [2]:
# 4.1. Linear Gaussian State Space Model
np.random.seed(420)
# Parameters
t_max = 50
n_particles = 100

r = R.from_rotvec(np.array([7*np.pi/10, 3*np.pi/10, np.pi/20]))
rotation_matrix = r.as_dcm()
scaling_matrix = 0.99*np.eye(3)
beta = np.random.dirichlet(np.ones(20)*0.2, 3).transpose()
alpha = scaling_matrix@rotation_matrix
t_max = 50
mu = np.array([0, 1, 1])
start_var = 0.1*np.eye(3)
omega = np.eye(3)
sigma = 0.1*np.eye(20)



l_transition_model = [LinearMu(default_mean=mu, default_cov=start_var)]+[LinearTransition(
    default_mean=np.zeros(3), default_cov=omega, default_alpha=alpha) for t in range(1, t_max)]
l_proposals = [LinearMu(default_mean=mu, default_cov=start_var)]+[LinearProposal(
    default_mean=np.zeros(3), default_cov=omega, default_alpha=alpha) for t in range(1, t_max)]
l_observation_model = [LinearObservation(default_mean=np.zeros(
    20), default_cov=sigma, default_beta=beta) for t in range(0, t_max)]

# If we want to change the parameters
assert np.all(np.linalg.eigvals(start_var) > 0)
assert np.all(np.linalg.eigvals(omega) > 0)
assert np.all(np.linalg.eigvals(sigma) > 0)

l_states, l_observations = linear_gaussian_state_space(
    t_max=t_max, mu=mu, start_var=start_var, transition_var=omega, noise_var=sigma,
    transition_coeffs=alpha, observation_coeffs=beta)

In [3]:
# 4.2. Nonlinear State Space Model
np.random.seed(420)
nl_mu = 0
start_std = np.sqrt(5)
omega = np.sqrt(10)
sigma = np.sqrt(10)

nl_transition_model = [NonLinearMu(default_mean=nl_mu, default_std=start_std)]+[NonLinearTransition(
    default_mean=0, default_std=omega) for t in range(1, t_max)]
nl_proposals = [NonLinearMu(default_mean=nl_mu, default_std=start_std)]+[
    NonLinearProposal(default_mean=0, default_std=omega) for t in range(1, t_max)]
nl_observation_model = [NonLinearObservation(
    default_mean=0, default_std=sigma) for t in range(0, t_max)]

nl_states, nl_observations = nonlinear_gaussian_state_space(
    t_max=t_max, mu=nl_mu, start_std=start_std, transition_std=omega, noise_std=sigma)

In [4]:
# ipmcmc run: works with both linear and non-linear models.
# It is pretty long to run, longer for the linear model which has 3-dimensional states.
# For the linear model, each MCMC step take approximately 90 secs, and 80 secs for 
# the non-linear, on our computers.

n_nodes = 32
n_conditional_nodes = 16
n_steps = 5

linear= True

if linear:
    print('init_conditional_traj')
    
    init_conditional_traj = np.zeros((n_conditional_nodes, t_max, len(mu)))
    for i in tqdm(range(n_conditional_nodes)):
        particles, _, _ = smc(l_observations, n_particles,
                              l_transition_model, l_proposals, l_observation_model)
        init_conditional_traj[i] = particles.mean(axis=1)

    print('running ipmcmc')
    particles, conditional_traj, weights, conditional_indices, zetas = ipmcmc(
        n_steps, n_nodes, n_conditional_nodes, l_observations, n_particles, init_conditional_traj,
        l_proposals, l_transition_model, l_observation_model)

else:
    print('init_conditional_traj')
    
    init_conditional_traj = np.zeros((n_conditional_nodes, t_max, 1))
    for i in tqdm(range(n_conditional_nodes)):
        particles, _, _ = smc(nl_observations, n_particles,
                              nl_transition_model, nl_proposals, nl_observation_model)
        init_conditional_traj[i] = particles.mean(axis=1)

    print('running ipmcmc')
    particles, conditional_traj, weights, conditional_indices, zetas = ipmcmc(
        n_steps, n_nodes, n_conditional_nodes, nl_observations, n_particles, init_conditional_traj,
        nl_proposals, nl_transition_model, nl_observation_model)
        

0%|          | 0/16 [00:00<?, ?it/s]init_conditional_traj
100%|██████████| 16/16 [06:16<00:00, 23.55s/it]
  0%|          | 0/5 [00:00<?, ?it/s]running ipmcmc
100%|██████████| 5/5 [1:12:38<00:00, 871.79s/it]


In [5]:
# Mean estimation for the linear model, using kalman filter and rts smoother as ground truth
# Make sure that the particles used are the one generated during a run of the ipmcmc sampler
# for the linear model

true_means, true_covs = compute_ground_truth(l_observations, mu, start_var, alpha, omega, beta, sigma)

rao_black_traj = rao_blackwellisation(particles, weights, zetas, n_conditional_nodes)

errors_function_of_mcmc_step = []
errors_function_of_state_step = []
for r in range(1, (n_steps+1)):
    errors_function_of_mcmc_step.append(compute_error(rao_black_traj, true_means, r))

for t in range(1, (t_max+1)):
    errors_function_of_state_step.append(compute_error(rao_black_traj, true_means, state_step=t))

In [6]:
errors_function_of_state_step

[0.6087836767532451,
 0.5797236469944885,
 0.43976157994280657,
 0.4519981890915921,
 0.4333007905070591,
 0.41271498054333183,
 0.46281021378480264,
 0.4989163472904759,
 0.5821235533863381,
 0.6952979627283103,
 0.7019233542381452,
 0.7665190133751794,
 0.9100403145157321,
 0.9387936982121174,
 0.9794673887990303,
 0.9920256986806754,
 0.9848671010138201,
 0.9881986657650579,
 1.0472151292068037,
 1.1001705060323324,
 1.0916033431488004,
 1.0801021462430807,
 1.135001721442927,
 1.1287864544056663,
 1.1012485992555356,
 1.1274594152998894,
 1.1337308648896427,
 1.1333330692468964,
 1.1399780871465364,
 1.1501291023291726,
 1.1425205380709555,
 1.163171664362532,
 1.21262166996278,
 1.2346160643823145,
 1.2653439087371197,
 1.3321834923212623,
 1.3739945605789396,
 1.457498045729405,
 1.5078898947854302,
 1.5561381229550661,
 1.61044061145952,
 1.6731106710149282,
 1.735444370482412,
 1.7931720832529445,
 1.8600898588744086,
 1.912391701009612,
 1.9696206951344657,
 2.0197200056022386

In [8]:
particles.shape

(5, 32, 50, 100, 3)