In [1]:
import numpy as np
import pandas as pd
from LPCA import LogisticPCA

We create a toy binary dataset in which each of the four possible points appears with a chosen empirical frequency:
P(0, 1) = 0.4, P(1, 1) = 0.3, P(1, 0) = 0.2, P(0, 0) = 0.1

In [2]:
matrix = np.array([[0, 1],
                   [0, 1],
                   [0, 1],
                   [0, 1],
                   [1, 1], 
                   [1, 1], 
                   [1, 1],
                   [1, 0],
                   [1, 0],
                   [0, 0]])
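
As a quick sanity check, the row frequencies of the matrix can be compared against the stated probabilities. This is a minimal sketch using only NumPy; the names `toy` and `empirical` are illustrative and not part of the LPCA package.

```python
import numpy as np

# Same ten observations as the matrix above, written compactly.
toy = np.array([[0, 1]] * 4 + [[1, 1]] * 3 + [[1, 0]] * 2 + [[0, 0]])

# Count how often each distinct row occurs, then convert counts to frequencies.
points, counts = np.unique(toy, axis=0, return_counts=True)
empirical = {
    tuple(int(v) for v in point): int(count) / len(toy)
    for point, count in zip(points, counts)
}

for point, freq in sorted(empirical.items()):
    print(point, freq)
```

Each frequency should equal the corresponding P(x1, x2) listed above, since the dataset has exactly ten rows.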

We can now train a Logistic PCA dimension reducer on this data, reducing it from two dimensions to one.

In [3]:
lpca = LogisticPCA(m=6, k=1, verbose=True)
lpca.fit(matrix)

Iteration: 0
Percent of Deviance Explained: 62.359%
Log Likelihood: -4.69

Reached Convergence on Iteration #12
Training Complete. Converged Reached: True
Percent of Deviance Explained: 65.12607499916703 %
Total Training Time: 0.0024847984313964844


After fitting, we can call transform to map a new matrix of observations to natural parameters on the lower-dimensional subspace. Taking the sigmoid of these natural parameters converts the reduced values from natural-parameter space into probabilities on the scale of the original features (though still at the lower dimension).

In [4]:
matrix = np.array([[0, 1], 
                   [1, 1],
                   [1, 0],
                   [0, 0]])

transformed = lpca.transform(matrix)
sig = lpca.sigmoid(transformed)
print(sig)

[[0.99752738]
 [0.46535796]
 [0.00247262]
 [0.52580888]]
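
The implementation of `lpca.sigmoid` is not shown here, so the following is only a sketch that assumes it is the standard logistic function. The helper names `logistic` and `logit` are hypothetical. Inverting the printed values recovers the natural parameters they imply.

```python
import numpy as np

def logistic(theta):
    """Standard logistic function: maps natural parameters to values in (0, 1)."""
    theta = np.asarray(theta, dtype=float)
    return 1.0 / (1.0 + np.exp(-theta))

def logit(p):
    """Inverse of the logistic function: maps values in (0, 1) to natural parameters."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

# Values copied from the printed output above.
printed = np.array([0.99752738, 0.46535796, 0.00247262, 0.52580888])

# Natural parameters implied by those values, assuming a standard logistic sigmoid.
theta = logit(printed)
print(theta)
```

Under that assumption, the two extreme entries correspond to natural parameters of roughly +6 and -6, which lines up with the `m=6` passed to the constructor. How the library actually uses `m` internally is not confirmed by this notebook.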


We can reconstruct the original data using the reconstruct method, which takes the reduced natural parameters and converts them into natural parameters in the original two dimensions. Again, we take the sigmoid of the outputs to bring them back to the feature space.

In [5]:
reconstructed = lpca.reconstruct(transformed)
sig = lpca.sigmoid(reconstructed)
print(sig)

[[0.00653723 0.99752738]
 [0.75905435 0.73243713]
 [0.99752738 0.00798803]
 [0.72589093 0.7641857 ]]


We can see that most observations were reconstructed fairly well, except for (0, 0), which comes back as roughly (0.73, 0.76) and is clearly wrong. This is because in our toy example the model has very few degrees of freedom, and the input (0, 0) carried little weight in training since it appears only once in ten observations. On larger datasets, the model generally has more degrees of freedom with which to capture the relationships in the data, and can be expected to explain more of the deviance.
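
One way to make "reconstructed fairly well" concrete is to threshold the reconstructed probabilities and compare them with the inputs. The sketch below hard-codes the values printed above so that it runs without the LPCA package; the 0.5 threshold and the variable names are assumptions made for illustration.

```python
import numpy as np

# The four input points passed to transform.
original = np.array([[0, 1],
                     [1, 1],
                     [1, 0],
                     [0, 0]])

# Reconstructed probabilities copied from the printed output above.
recon_prob = np.array([[0.00653723, 0.99752738],
                       [0.75905435, 0.73243713],
                       [0.99752738, 0.00798803],
                       [0.72589093, 0.7641857]])

# Round each probability to a binary prediction (assumed threshold of 0.5).
recon_binary = (recon_prob >= 0.5).astype(int)

# A row counts as recovered only if both features match the input.
row_ok = (recon_binary == original).all(axis=1)

for point, pred, ok in zip(original, recon_binary, row_ok):
    print(tuple(point.tolist()), "->", tuple(pred.tolist()), "ok" if ok else "wrong")

print("rows recovered:", int(row_ok.sum()), "of", len(row_ok))
```

With these values, only the (0, 0) row fails the check, since both of its reconstructed probabilities fall above 0.5.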