## Step 2: Dimension reduction

In this step, we reduce the dimension of the interpolated trajectories using Probabilistic PCA (PPCA). PPCA assumes the model $x \sim wz + \text{noise}$ and places prior distributions $p(\theta)$ over $\theta = (z, w, \text{noise})$, which together with the likelihood define the joint distribution $p(x,\theta)$. We implemented the ADVI algorithm (Automatic Differentiation Variational Inference) and use it to fit the parameters $(\mu, \omega)$ of a variational distribution $q$ that approximates the posterior $p(\theta | x)$. This makes it possible to project $x$ onto a lower-dimensional representation $z$.
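To make the model $x \sim wz + \text{noise}$ concrete, here is a minimal NumPy sketch that draws synthetic data from it. The standard normal priors on $w$ and $z$ and the fixed `noise_std` are assumptions made for illustration only; the priors actually used in `src.advi_fcts` may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes borrowed from this notebook; any values would do for the sketch.
data_dim, latent_dim, num_datapoints = 100, 11, 195
noise_std = 0.1  # hypothetical fixed noise scale (assumption)

# Assumed priors: standard normal entries for both w and z.
w = rng.normal(size=(data_dim, latent_dim))        # loading matrix
z = rng.normal(size=(latent_dim, num_datapoints))  # one latent column per data point

# Generative model: x = w z + noise, one column per data point.
noise = noise_std * rng.normal(size=(data_dim, num_datapoints))
x = w @ z + noise

print(x.shape)  # (100, 195)
```

Without the noise term, every column of `x` lies in the 11-dimensional column space of `w`, which is why recovering `z` amounts to dimension reduction.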

### 0- Imports

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf  # explicit import: `tf` is used below to build the dataset tensor

from src.advi_fcts import * 
from src.df_processing import * 

import warnings
warnings.filterwarnings('ignore')

### 1- Initialization

#### Load dataset and extract trajectories

In [2]:
# Select the dataset size (i.e. the number of trajectories)
nb_points = 195

# Load dataset
x = pd.read_csv('df/interpolation/interpolation_'+str(nb_points)+'.csv')

# Extract trajectories
dataset = extract_traj(x)

# Reshape trajectories
reshaped = np.array([i.reshape(-1) for i in dataset])

# Convert to tensor (tensorflow)
dataset = tf.cast(tf.transpose(tf.convert_to_tensor(reshaped)), tf.float32)
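The reshaping above can be checked on synthetic data. The sketch below uses NumPy only and assumes that each trajectory is an array of 50 $(x, y)$ points with shape `(50, 2)`; the actual output of `extract_traj` is not shown here, so that shape is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: each trajectory is an array of 50 (x, y) points, shape (50, 2).
trajectories = [rng.normal(size=(50, 2)) for _ in range(195)]

# Flatten each trajectory into a vector of R^100: (x0, y0, x1, y1, ...).
flat = np.array([t.reshape(-1) for t in trajectories])  # shape (195, 100)

# Transpose so that each column holds one trajectory, then cast to float32.
data = flat.T.astype(np.float32)                         # shape (100, 195)

print(flat.shape, data.shape)  # (195, 100) (100, 195)
```

With this layout, the first axis is the data dimension and the second axis indexes the trajectories, which matches how `data_dim` and `num_datapoints` are read from `dataset.shape` later on.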

#### Parameter initialization

In [3]:
### DIMENSIONS ###

# Number of data points (trajectories)
num_datapoints = dataset.shape[1]  # equal to nb_points

# Dimension of each trajectory: 50 (x, y) points flattened into R^100
data_dim = dataset.shape[0]

# Reduced (latent) dimension; 11 is taken from the article's results
latent_dim = 11

In [4]:
### ADVI PARAMETERS ###

# Number of samples for Monte Carlo integration
nb_samples = 30

# Learning rate for step-size computation
lr = 0.1
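To illustrate what these two parameters control, here is a minimal ADVI sketch on a one-dimensional toy model (not the PPCA model of this notebook). It uses `nb_samples` reparameterised draws to estimate the ELBO gradient and `lr` as the base rate of the adaptive step-size sequence from the ADVI paper. The function names, the toy model and the constants `alpha` and `tau` are assumptions for illustration; the internals of `ADVI_algorithm` may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (assumption): x_i ~ N(theta, 1) with prior theta ~ N(0, 1).
x = rng.normal(loc=2.0, scale=1.0, size=50)

def grad_log_joint(theta):
    """d/dtheta log p(x, theta) for the toy model, vectorised over theta."""
    return np.sum(x) - (len(x) + 1) * theta

def elbo_gradients(mu, omega, nb_samples):
    """Monte Carlo gradients of the ELBO for q = N(mu, exp(omega)^2)."""
    eps = rng.normal(size=nb_samples)
    theta = mu + np.exp(omega) * eps                      # reparameterised draws from q
    g = grad_log_joint(theta)
    grad_mu = np.mean(g)
    grad_omega = np.mean(g * eps * np.exp(omega)) + 1.0   # +1 comes from the entropy of q
    return np.array([grad_mu, grad_omega])

def run_advi(nb_samples=30, lr=0.1, n_iter=3000, alpha=0.1, tau=1.0):
    """Stochastic gradient ascent on the ELBO with an adaptive step size."""
    params = np.zeros(2)  # (mu, omega)
    s = None              # running average of squared gradients
    for k in range(1, n_iter + 1):
        grad = elbo_gradients(params[0], params[1], nb_samples)
        s = grad**2 if s is None else alpha * grad**2 + (1 - alpha) * s
        step = lr * k ** (-0.5 + 1e-16) / (tau + np.sqrt(s))
        params = params + step * grad
    return params

mu, omega = run_advi()

# The toy model is conjugate, so the exact posterior is available for comparison.
post_mean = np.sum(x) / (len(x) + 1)
post_std = 1 / np.sqrt(len(x) + 1)
print(f"mu={mu:.3f} (exact {post_mean:.3f}), "
      f"sigma={np.exp(omega):.3f} (exact {post_std:.3f})")
```

A larger `nb_samples` lowers the variance of each gradient estimate at a higher cost per iteration, while `lr` scales every step of the sequence.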

In [5]:
### MODEL DECLARATION ###

advi_model = ADVI_algorithm(data_dim, latent_dim, num_datapoints, dataset, nb_samples, lr)

### 3- Run ADVI model

In [6]:
mu, omega = advi_model.run_ADVI()

10 147988.0
20 274656.38
30 103773.31
40 -48501.375
50 -76832.81
60 -37223.016
70 30343.953
80 -11266.078
90 6467.875
100 974.77344
110 193.23438
120 -1655.9062
130 -594.46094
140 -115.39844
150 1021.09375
160 -78.25781
170 -1208.5078
180 -1126.0625
190 -668.5078
200 -109.33594
210 275.67188
220 -502.32812
230 -1274.2891
240 153.32031
250 -75.796875
260 -67.81641
270 -225.1211
280 28.867188
290 -254.76562
300 -49.98828
310 331.96094
320 -291.7422
330 -282.26953
340 -947.97656
350 -298.375
360 -1073.0977
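The values logged every 10 iterations are noisy because they come from Monte Carlo estimates, so the raw trace is hard to read. One option is to smooth it with a moving average before judging convergence, as in this sketch. The helper `moving_average` is hypothetical and not part of `src.advi_fcts`; the input values are the last ten entries of the log above.

```python
import numpy as np

def moving_average(values, window):
    """Mean over a sliding window; returns len(values) - window + 1 points."""
    values = np.asarray(values, dtype=float)
    return np.convolve(values, np.ones(window) / window, mode="valid")

# Last ten values from the log above (iterations 270 to 360).
logged = [-225.1211, 28.867188, -254.76562, -49.98828, 331.96094,
          -291.7422, -282.26953, -947.97656, -298.375, -1073.0977]

smoothed = moving_average(logged, window=5)
print(smoothed.round(1))
```

The same helper could be applied to the full `elbo_evol` series once it has been converted to a NumPy array.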


### 4- Save results

In [8]:
# mu 
pd.DataFrame(advi_model.mu.numpy()).to_csv('df/results/mu_'+str(nb_points)+'.csv',index=False)

# omega 
pd.DataFrame(advi_model.omega.numpy()).to_csv('df/results/omega_'+str(nb_points)+'.csv',index=False) 

# ELBO evolution 
pd.DataFrame(np.array([i.numpy() for i in advi_model.elbo_evol])).to_csv('df/results/elbo_evol_'+str(nb_points)+'.csv')