Potential bug with (cmd)stan(py)’s optimisation process running a probabilistic PCA #664

Closed
InfProbSciX opened this issue Mar 26, 2023 · 4 comments

Comments

InfProbSciX commented Mar 26, 2023

Posting here as the Stan forums post didn't get any replies. I'd like to understand why I'm seeing the results that I'm seeing when running a PCA with Stan on some simulated data.


The data: nine different MNIST digits, each rotated a number of times (the issue below also occurs with other datasets).


The model: Y | X ~ N(0, XX^T + vI)
(this is matrix normal notation for every column of Y being independently drawn from a zero mean multivariate normal with a dot product covariance function applied to the latents).

The model famously corresponds to probabilistic PCA (the maximum likelihood solution for $\mathbf{X} | \mathbf{Y}$ is attained at the scaled eigenvectors of the sample covariance).
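As a quick illustration (a minimal NumPy sketch, not from the original post), the closed-form solution referred to above can be computed by an eigendecomposition of the n × n sample covariance of the data, with the top-q eigenvectors scaled by the square roots of the eigenvalues minus the noise variance; `ppca_ml_latents` and the exact normalization of S are this sketch's assumptions:

```python
import numpy as np

def ppca_ml_latents(Y, q, sigma_sq):
    """Scaled-eigenvector ML latents for the dual-PPCA model above."""
    n, d = Y.shape
    S = Y @ Y.T / d                          # n x n sample covariance across data dims
    lam, U = np.linalg.eigh(S)               # eigenvalues in ascending order
    lam, U = lam[::-1][:q], U[:, ::-1][:, :q]  # keep the top-q pairs
    return U * np.sqrt(np.maximum(lam - sigma_sq, 0.0))

rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 10))
X = ppca_ml_latents(Y, q=2, sigma_sq=0.1)    # shape (50, 2)
```

The columns of the result are orthogonal by construction, which is why any solution Stan finds should agree with this one only up to a rotation.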


The problem: if I use unconstrained latents, I get a different solution than if I use constrained latents. This is very surprising. Is it caused by numerical instabilities? If so, why?

If I set the line (in my parameters block below) as follows:

    matrix<lower=-100, upper=100>[n, q] X;  // latents

I obtain the following latents (plot omitted), which look incorrect.
If instead the line is changed to:

    matrix[n, q] X;  // latents

the result becomes (plot omitted):

which is absolutely fine: it's a rotation of the eigenvectors, and since the posterior is invariant to rotations, this makes sense.
I don't understand this behaviour, given my assumption that Stan places an improper uniform prior on the constrained parameters.
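The rotation invariance mentioned above is easy to verify numerically (a small sketch, not part of the original issue): for any orthogonal R, the Gram matrix of the latents, and hence the likelihood, is unchanged.

```python
import numpy as np

# For any orthogonal R, (X R)(X R)^T = X R R^T X^T = X X^T,
# so the covariance X X^T + vI (and the likelihood) is unchanged.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose((X @ R) @ (X @ R).T, X @ X.T)
```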

What’s more, using a BridgeStan model with the same constrained parameters and optimizing with scipy’s optimize works and produces the correct solution (not the weird axis-aligned one). Could someone shed some light on what’s going on?

I’m using cmdstan 2.31.0.

My code (a minimal reproducing example) is as follows:

import torch
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from cmdstanpy import CmdStanModel
from torchvision.transforms.functional import rotate
from tensorflow.keras.datasets.mnist import load_data

plt.ion(); plt.style.use('seaborn-pastel')
np.random.seed(42)

with open('dr.stan', 'w') as f: f.write("""
data {
    int n;  // num data
    int d;  // num data dims
    int q;  // num latents
    matrix[n, d] Y;  // data
}
transformed data {
    vector[d] w = rep_vector(1.0, d);
}
parameters {
    matrix<lower=-100, upper=100>[n, q] X;  // latents
    real<lower=1e-6, upper=1> sigma_sq;
}
model {
    Y' ~ multi_gp(add_diag(X * X', sigma_sq), w);
}
""")

def plot(X):
    plot_df = pd.DataFrame(dict(x=X[:, 0], y=X[:, 1], hue=c.astype(str)))
    plot_df = plot_df.set_index('hue').sort_index().reset_index()
    sns.scatterplot(data=plot_df, x='x', y='y', hue='hue', palette='Paired')

if __name__ == '__main__':

    (_, _), (Y, c) = load_data()

    c = np.arange(10)
    Y = np.concatenate([Y[[np.where(c == lb)[0][0]], ...] for lb in c], axis=0)
    Ys = []
    for Yi in Y:
        for angle in np.linspace(0, 350, 25):
            Ys.append(rotate(torch.tensor(Yi)[None, ...], angle))
    Y = torch.cat(Ys, axis=0).numpy().reshape(-1, 28**2)/255
    c = np.repeat(c, 25)

    Y += np.random.normal(0, 0.05, size=Y.shape)
    Y -= Y.mean(axis=0)
    Y -= Y.mean(axis=1)[..., None]

    (n, d), q = Y.shape, 2

    pd.Series(dict(
        Y=Y, n=n, d=d, q=2,
    )).to_json('data.json')

    model = CmdStanModel(stan_file='dr.stan')
    fit = model.optimize(data='data.json', show_console=True, iter=int(1e5), seed=42)

    X = fit.stan_variable('X')
    plot(X)
WardBrian (Member) commented:
> What’s more is that, using a bridgestan model that uses the constrained parameters, and using scipy’s optimize to optimize the parameters works and generates the correct solution

Just to check, is this bridgestan version being done with Jacobian=True?

InfProbSciX (Author) commented Mar 26, 2023:

I didn't change that flag. What I did was something like the code below (it isn't an MWE; I can provide one if you'd like):

import bridgestan as bs
from scipy.optimize import minimize
bs.set_bridgestan_path('/Users/aditya/github/bridgestan')

bs_model = bs.StanModel.from_stan_file("dr.stan", "data.json")

minimize(
    lambda x: -bs_model.log_density(x),
    np.hstack([X.reshape(-1), 0.0]),
)

WardBrian (Member) commented Mar 26, 2023:

Stan's optimization by default does not apply Jacobian adjustments (this will be configurable in the next release: #649), but the default in BridgeStan is to apply them.

I suspect this is the cause of the confusion, since the statement "Stan places an improper uniform prior on the constrained parameters" is only true in the presence of the Jacobian adjustment for the change of variables.

What happens if you set jacobian=False in your bridgestan example?
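For concreteness, here is a minimal NumPy sketch (not Stan's actual implementation) of the scaled-logit transform Stan documents for a `<lower=L, upper=U>` parameter, together with the log-Jacobian term that optimization omits by default:

```python
import numpy as np

def constrain(u, L, U):
    """Map an unconstrained value u to the interval (L, U) via a scaled logit."""
    inv_logit = 1.0 / (1.0 + np.exp(-u))
    return L + (U - L) * inv_logit

def log_jacobian(u, L, U):
    """log |dx/du| for the transform above."""
    inv_logit = 1.0 / (1.0 + np.exp(-u))
    return np.log(U - L) + np.log(inv_logit) + np.log(1.0 - inv_logit)

# With jacobian=True the optimizer targets log p(x) + log|dx/du|
# (a MAP in the unconstrained space); with jacobian=False it targets
# log p(x) alone, i.e. plain (penalized) ML over the constrained values.
```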

InfProbSciX (Author) commented:

Damn, great catch. I had assumed that Stan adds the Jacobians whenever its automatic transforms are used. I tried your suggestion with BridgeStan, and it does indeed produce results quite similar to cmdstanpy's.

I'm curious, does this mean that

real<lower=0, upper=1> x;
...
x ~ uniform(0, 1);

doesn't currently place a uniform density over x? (I'd guess that a constant is added to the target here, but no Jacobian term.) It's a very subtle point, but very interesting, as constraints do weird things to posteriors in terms of sparsity, etc.
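A toy sketch of that subtlety (illustrative only, not using Stan): with x = inv_logit(u) and x ~ uniform(0, 1), the target without the Jacobian is constant in u, while adding the log-Jacobian makes the unconstrained density logistic, peaked at u = 0 (i.e. x = 0.5):

```python
import numpy as np

def target(u, jacobian):
    x = 1.0 / (1.0 + np.exp(-u))              # x = inv_logit(u) in (0, 1)
    lp = 0.0                                  # log uniform(0,1) density: a constant
    if jacobian:
        lp += np.log(x) + np.log(1.0 - x)     # log |dx/du|
    return lp

us = np.linspace(-3, 3, 7)
flat = [target(u, jacobian=False) for u in us]   # identical everywhere
bumpy = [target(u, jacobian=True) for u in us]   # maximized at u = 0
```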
