Copyright 2023-2023 Lawrence Livermore National Security, LLC and other MuyGPyS
Project Developers. See the top-level COPYRIGHT file for details.

SPDX-License-Identifier: MIT

# Shear Kernel 2x3 Investigation

This notebook demonstrates how to use the specialized lensing shear kernel (hard-coded to RBF at the moment).
In particular, it investigates differences between the 2x3 kernel and the 3x3 variant in predictions of the convergence parameter $\kappa$; the 2x3 predictions appear to carry an additive offset that we do not yet understand.

⚠️ _Note that this is still an experimental feature._ ⚠️

In [None]:
import copy
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm
from matplotlib.colors import LogNorm, SymLogNorm

from MuyGPyS._test.shear import (
    conventional_Kout,
    conventional_mean,
    conventional_variance,
    conventional_shear,
    targets_from_GP,
)
from MuyGPyS.gp import MuyGPS
from MuyGPyS.gp.deformation import DifferenceIsotropy, F2
from MuyGPyS.gp.hyperparameter import Parameter
from MuyGPyS.gp.kernels.experimental import ShearKernel, ShearKernel2in3out
from MuyGPyS.neighbors import NN_Wrapper
from MuyGPyS.gp.noise import HomoscedasticNoise

We will set a random seed here for consistency when building docs.
In practice we would not fix a seed.

In [None]:
np.random.seed(2)

In [None]:
my_cmap = copy.copy(plt.get_cmap('viridis'))  # cm.get_cmap was removed in newer matplotlib
my_cmap.set_bad("white")
# my_sym_cmap = copy.copy(cm.get_cmap('coolwarm'))
# my_sym_cmap.set_bad((0, 0, 0))

## Data preparation

Here we simulate some simple data from a GP prior using the 3x3 shear kernel.

In [None]:
n = 25  # number of galaxies on a side
xmin = 0
xmax = 1
ymin = 0
ymax = 1

xx = np.linspace(xmin, xmax, n)
yy = np.linspace(ymin, ymax, n)

x, y = np.meshgrid(xx, yy)
features = np.vstack((x.flatten(), y.flatten())).T
data_count = features.shape[0]

Set the noise prior and the kernel length scale.

In [None]:
noise_prior = 1e-4
length_scale = 0.05

Define the target matrices by sampling from the GP.
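
As a rough illustration of what sampling from a GP prior involves, the sketch below draws one realization from a zero-mean GP with a plain scalar RBF kernel, using a Cholesky factor of the covariance. This is not the implementation of `targets_from_GP`, which uses the 3x3 shear kernel; the helper names `rbf_kernel` and `sample_gp_prior` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Scalar RBF kernel between the rows of a and b (illustrative stand-in)."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def sample_gp_prior(points, length_scale, noise, rng):
    """Draw one sample from N(0, K + noise * I) via a Cholesky factor."""
    K = rbf_kernel(points, points, length_scale)
    chol = np.linalg.cholesky(K + noise * np.eye(len(points)))
    return chol @ rng.standard_normal(len(points))


demo_rng = np.random.default_rng(0)
side = np.linspace(0, 1, 5)
gx, gy = np.meshgrid(side, side)
demo_points = np.vstack((gx.flatten(), gy.flatten())).T
demo_sample = sample_gp_prior(demo_points, 0.05, 1e-4, demo_rng)
print(demo_sample.shape)  # (25,)
```

The shear kernel produces three correlated responses per point rather than one, but the sampling step has the same structure.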

In [None]:
targets = targets_from_GP(features, n, length_scale, noise_prior)

Here we create a train/test split in the dataset.
Modify `train_ratio` to set the fraction of the data used for training; the remainder is held out for testing.
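
The split shuffles the indices, cuts them into equal-length intervals, and picks one training point from each interval. A compact sketch of that idea using a single NumPy `Generator` follows; the function name `interval_split` is hypothetical and is not part of MuyGPyS.

```python
import numpy as np


def interval_split(count, ratio, rng):
    """Return a boolean training mask with one pick per shuffled interval."""
    picks = int(count * ratio)  # number of training points
    width = count // picks      # length of each interval
    shuffled = rng.permutation(count)
    mask = np.zeros(count, dtype=bool)
    for i in range(picks):
        # the intervals are disjoint, so every pick is a distinct index
        mask[rng.choice(shuffled[i * width : (i + 1) * width])] = True
    return mask


demo_mask = interval_split(100, 0.25, np.random.default_rng(1))
print(np.count_nonzero(demo_mask), np.count_nonzero(~demo_mask))  # 25 75
```

Drawing both the permutation and the picks from one generator makes the split depend on a single seed.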

In [None]:
train_ratio = 0.2

In [None]:
rng = np.random.default_rng(seed=1)
interval_count = int(data_count * train_ratio)
interval = int(data_count / interval_count)
sfl = rng.permutation(np.arange(data_count))
train_mask = np.zeros(data_count, dtype=bool)
for i in range(interval_count):
    idx = np.random.choice(sfl[i * interval : (i + 1) * interval])
    train_mask[idx] = True
test_mask = np.invert(train_mask)
train_count = np.count_nonzero(train_mask)
test_count = np.count_nonzero(test_mask)

In [None]:
train_targets = targets[train_mask, :]
test_targets = targets[test_mask, :]
train_features = features[train_mask, :]
test_features = features[test_mask, :]

Let's visualize the train/test datasets.

In [None]:
def make_im(vec, mask):
    ret = np.zeros(len(mask))
    ret[mask] = vec
    ret[np.invert(mask)] = -np.inf
    return ret.reshape(n, n)

In [None]:
fig, ax = plt.subplots(2, 3,figsize = (10,7))
ax[0, 0].imshow(make_im(train_targets[:,0], train_mask))
ax[0, 0].set_ylabel("train", fontsize = 15)
ax[0, 0].set_title(r"$\kappa$", fontsize = 15)
ax[1, 0].imshow(make_im(test_targets[:,0], test_mask))
ax[1, 0].set_ylabel("test", fontsize = 15)
ax[0, 1].imshow(make_im(train_targets[:,1], train_mask))
ax[0, 1].set_title("g1", fontsize = 15)
ax[1, 1].imshow(make_im(test_targets[:,1], test_mask))
ax[0, 2].imshow(make_im(train_targets[:,2], train_mask))
ax[0, 2].set_title("g2", fontsize = 15)
ax[1, 2].imshow(make_im(test_targets[:,2], test_mask))
plt.show()

## 3x3 Matrices

Flatten the targets into stacked vectors ordered as all $\kappa$ values, then all `g1` values, then all `g2` values.
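
A toy example of the `swapaxes`/`reshape` idiom used in the next cell shows the resulting layout. The three rows are made-up observations with columns ($\kappa$, `g1`, `g2`).

```python
import numpy as np

# Three hypothetical observations with columns (kappa, g1, g2).
demo_targets = np.array([
    [0.1, 1.1, 2.1],
    [0.2, 1.2, 2.2],
    [0.3, 1.3, 2.3],
])
demo_flat = demo_targets.swapaxes(0, 1).reshape(3 * demo_targets.shape[0])
print(demo_flat)  # [0.1 0.2 0.3 1.1 1.2 1.3 2.1 2.2 2.3]
```

The first third of the flattened vector therefore holds the $\kappa$ values, which is what makes slicing them away by index possible later on.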

In [None]:
train_targets_33 = train_targets.swapaxes(0, 1).reshape(3 * train_count)
test_targets_33 = test_targets.swapaxes(0, 1).reshape(3 * test_count)

We construct this model only to obtain the `Kout` prior term from its kernel function.

In [None]:
shear_model = MuyGPS(
        kernel=ShearKernel(
            deformation=DifferenceIsotropy(
                F2,
                length_scale=Parameter(length_scale),
            ),
        ),
        noise = HomoscedasticNoise(1e-4),
)

Realize the 3x3 kernel matrices.

In [None]:
Kin_33 = conventional_shear(train_features, train_features, length_scale=length_scale)
Kcross_33 = conventional_shear(test_features, train_features, length_scale=length_scale)
Kout_33 = conventional_Kout(shear_model.kernel, test_count)

In [None]:
print(f"shapes of 3x3 matrices:")
print(f"\tKout: {Kout_33.shape}")
print(f"\tKcross: {Kcross_33.shape}")
print(f"\tKin: {Kin_33.shape}")
print(f"\ttrain targets: {train_targets_33.shape}")
print(f"\ttest targets: {test_targets_33.shape}")

## 2x3 Matrices

Here we explore the 2in3out variant of the shear kernel, which trains on observations only of `g1` and `g2`, but predicts onto all three covariates.
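
For orientation, the textbook GP posterior is $\mu = K_{cross} (K_{in} + \varepsilon I)^{-1} y$ with covariance $\Sigma = K_{out} - K_{cross} (K_{in} + \varepsilon I)^{-1} K_{cross}^\top$. The sketch below implements these formulas for generic matrices. It assumes that the `conventional_mean` and `conventional_variance` helpers evaluate this standard form, which is not shown in this notebook; the names `posterior_mean` and `posterior_covariance` are hypothetical.

```python
import numpy as np


def posterior_mean(Kin, Kcross, y, noise):
    """Kcross (Kin + noise * I)^{-1} y."""
    return Kcross @ np.linalg.solve(Kin + noise * np.eye(Kin.shape[0]), y)


def posterior_covariance(Kin, Kcross, Kout, noise):
    """Kout - Kcross (Kin + noise * I)^{-1} Kcross^T."""
    solved = np.linalg.solve(Kin + noise * np.eye(Kin.shape[0]), Kcross.T)
    return Kout - Kcross @ solved


# Toy check with a scalar RBF kernel on five 1-D points: predicting at the
# training points with tiny noise should nearly reproduce the observations.
pts = np.linspace(0, 1, 5)[:, None]
K_demo = np.exp(-((pts - pts.T) ** 2) / (2 * 0.1**2))
y_demo = np.sin(2 * np.pi * pts[:, 0])
mu_demo = posterior_mean(K_demo, K_demo, y_demo, 1e-8)
cov_demo = posterior_covariance(K_demo, K_demo, K_demo, 1e-8)
print(np.allclose(mu_demo, y_demo, atol=1e-6))  # True
```

In the 2x3 case the same formulas apply with a smaller `Kin` and a narrower `Kcross`, while `Kout` keeps its 3x3 form.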

In [None]:
mean_33 = conventional_mean(
    Kin_33, Kcross_33, train_targets_33, noise_prior
)
covariance_33 = conventional_variance(
    Kin_33, Kcross_33, Kout_33, noise_prior
)
diag_variance_33 = np.diag(covariance_33)
ci_analytic_33 = np.sqrt(diag_variance_33) * 1.96
ci_analytic_33 = ci_analytic_33.reshape(test_count, 3)
coverage_analytic_33 = (
    np.count_nonzero(
        np.abs(test_targets - mean_33) < ci_analytic_33, axis=0
    ) / test_count
)

Here we delete the rows and columns of the 3x3 kernel matrices that correspond to $\kappa$ observations to produce the 2x3 matrices.
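
To make the block structure concrete, the toy example below labels every entry of a stacked matrix with its (row block, column block) pair, where block 0 stands for $\kappa$, 1 for `g1`, and 2 for `g2`, and then applies the same slicing as the next cell. The size `m` is a hypothetical number of training points.

```python
import numpy as np

m = 2  # hypothetical number of training points
# Entry value 10 * row_block + col_block, with blocks 0 (kappa), 1 (g1), 2 (g2).
blocks = np.arange(3).repeat(m)
block_labels = 10 * blocks[:, None] + blocks[None, :]
reduced = block_labels[m:, m:]  # drop the kappa rows and columns
print(block_labels.shape, reduced.shape)  # (6, 6) (4, 4)
print(np.unique(reduced))  # [11 12 21 22]
```

Only the `g1` and `g2` blocks survive, so the reduced system conditions on shear observations alone.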

In [None]:
Kin_23 = Kin_33[train_count:, train_count:]
Kcross_23 = Kcross_33[:, train_count:]
train_targets_23 = train_targets_33[train_count:] 
test_targets_23 = test_targets_33[test_count:]

In [None]:
print(f"shapes of 2x3 matrices:")
print(f"\tKout: {Kout_33.shape}")  # we still use the 3x3 Kout prior, since we are predicting a 3-dimensional response
print(f"\tKcross: {Kcross_23.shape}")
print(f"\tKin: {Kin_23.shape}")
print(f"\ttrain targets: {train_targets_23.shape}")
print(f"\ttest targets: {test_targets_23.shape}")

In [None]:
mean_23 = conventional_mean(
    Kin_23, Kcross_23, train_targets_23, noise_prior
)
covariance_23 = conventional_variance(
    Kin_23, Kcross_23, Kout_33, noise_prior
)
diag_variance_23 = np.diag(covariance_23)
ci_23 = np.sqrt(diag_variance_23) * 1.96
ci_23 = ci_23.reshape(test_count, 3)
coverage_23 = (
    np.count_nonzero(
        np.abs(test_targets - mean_23) < ci_23, axis=0
    ) / test_count
)

In [None]:
print(
    mean_33.shape, covariance_33.shape, diag_variance_33.shape
)
print(
    mean_23.shape, covariance_23.shape, diag_variance_23.shape
)

## Plot the mean comparison

In [None]:
def show_im(vec, mask, ax):
    mat = make_im(vec, mask)  # already reshaped to (n, n)
    im = ax.imshow(mat, norm=LogNorm(), cmap=my_cmap)
    # attach the colorbar to the figure that owns `ax`, not to a global `fig`
    ax.figure.colorbar(im, ax=ax)

def compare_means(truth, first, second, fname, sname, fontsize=12, all_colorbar=False):
    f_residual = np.abs(truth - first) + 1e-15
    s_residual = np.abs(truth - second) + 1e-15
    fs_residual = np.abs(first - second) + 1e-15

    fig, ax = plt.subplots(6, 3, figsize = (10, 18))
    
    for axis_set in ax:
        for axis in axis_set:
            axis.set_xticks([])
            axis.set_yticks([])

    ax[0, 0].set_title(r"$\kappa$")
    ax[0, 1].set_title("g1")
    ax[0, 2].set_title("g2")
    ax[0, 0].set_ylabel("Truth", fontsize=fontsize)
    ax[1, 0].set_ylabel(f"{fname} Mean", fontsize=fontsize)
    ax[2, 0].set_ylabel(f"|truth - {fname}|", fontsize=fontsize)
    ax[3, 0].set_ylabel(f"{sname} Mean", fontsize=fontsize)
    ax[4, 0].set_ylabel(f"|truth - {sname}|", fontsize=fontsize)
    ax[5, 0].set_ylabel(f"|{fname} - {sname}|", fontsize=fontsize)

    # truth
    im00 = ax[0, 0].imshow(make_im(truth[:,0], test_mask))
    im01 = ax[0, 1].imshow(make_im(truth[:,1], test_mask))
    im02 = ax[0, 2].imshow(make_im(truth[:,2], test_mask))
    if all_colorbar is True:
        fig.colorbar(im00, ax=ax[0, 0])
        fig.colorbar(im01, ax=ax[0, 1])
        fig.colorbar(im02, ax=ax[0, 2])

    # first model
    im10 = ax[1, 0].imshow(make_im(first[:,0], test_mask))
    im11 = ax[1, 1].imshow(make_im(first[:,1], test_mask))
    im12 = ax[1, 2].imshow(make_im(first[:,2], test_mask))
    if all_colorbar is True:
        fig.colorbar(im10, ax=ax[1, 0])
        fig.colorbar(im11, ax=ax[1, 1])
        fig.colorbar(im12, ax=ax[1, 2])

    # first model residual
    show_im(f_residual[:,0], test_mask, ax=ax[2, 0])
    show_im(f_residual[:,1], test_mask, ax=ax[2, 1])
    show_im(f_residual[:,2], test_mask, ax=ax[2, 2])

    # second model
    im30 = ax[3, 0].imshow(make_im(second[:,0], test_mask))
    im31 = ax[3, 1].imshow(make_im(second[:,1], test_mask))
    im32 = ax[3, 2].imshow(make_im(second[:,2], test_mask))
    if all_colorbar is True:
        fig.colorbar(im30, ax=ax[3, 0])
        fig.colorbar(im31, ax=ax[3, 1])
        fig.colorbar(im32, ax=ax[3, 2])

    # second model residual
    show_im(s_residual[:, 0], test_mask, ax=ax[4, 0])
    show_im(s_residual[:, 1], test_mask, ax=ax[4, 1])
    show_im(s_residual[:, 2], test_mask, ax=ax[4, 2])

    # residual between the two models
    show_im(fs_residual[:, 0], test_mask, ax=ax[5, 0])
    show_im(fs_residual[:, 1], test_mask, ax=ax[5, 1])
    show_im(fs_residual[:, 2], test_mask, ax=ax[5, 2])

    plt.show()

In [None]:
compare_means(test_targets, mean_23, mean_33, "2x3 Model", "3x3 Model", all_colorbar=True)

Note that the 2x3 residual in the $\kappa$ response appears to be nearly constant.
We hypothesize that the posterior mean returned by the 2x3 solution carries an additive offset in the $\kappa$ response.

In [None]:
offset = np.mean(mean_23, axis=0) - np.mean(mean_33, axis=0)

In [None]:
offset

Here we compare the 2x3 posterior mean after subtracting this offset:
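
The offset estimate is a difference of column means. On synthetic stand-in arrays, where one column is shifted by an exact constant, the same expression recovers that constant and the subtraction removes it; the arrays below are made up for illustration and are not notebook outputs.

```python
import numpy as np

demo_rng = np.random.default_rng(0)
reference = demo_rng.standard_normal((100, 3))  # stand-in for the 3x3 mean
shift = np.array([0.5, 0.0, 0.0])               # constant offset in column 0 only
shifted = reference + shift                     # stand-in for the 2x3 mean

estimated = np.mean(shifted, axis=0) - np.mean(reference, axis=0)
print(np.allclose(estimated, shift))                # True
print(np.allclose(shifted - estimated, reference))  # True
```

In the notebook the difference between the two means is only approximately constant, so some residual is expected to remain after the subtraction.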

In [None]:
compare_means(test_targets, mean_23 - offset, mean_33, "2x3 Model", "3x3 Model", all_colorbar=True)

The corrected 2x3 mean in the $\kappa$ response is still further from the truth than the 3x3 mean, but its residual is no longer nearly constant, which suggests that the scalar offset accounts for most of the original residual.
Of course, the way I've set this offset is synthetic: it relies on the 3x3 posterior mean, which requires access to $\kappa$ targets in the training data.
How do we identify and subtract this mean term in general?

Moreover, different random samples appear to produce different offsets (try rerunning with a different random seed).
Is there perhaps no closed form expression of this offset?
Is there a disciplined way to deal with this?
Do we have enough information about $\kappa$ to do anything?
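
One observation that may bear on these questions, offered as a hedged aside: under the standard weak-lensing definitions $\kappa = (\psi_{xx} + \psi_{yy})/2$, $\gamma_1 = (\psi_{xx} - \psi_{yy})/2$, and $\gamma_2 = \psi_{xy}$ for a lensing potential $\psi$, adding $c(x^2 + y^2)/2$ to $\psi$ shifts $\kappa$ by the constant $c$ and leaves both shear components unchanged. Whether this explains the offset seen here is not established, and the sketch assumes that the shear kernel follows these definitions. The check below verifies the identity with finite differences; all names are hypothetical.

```python
import numpy as np


def lensing_fields(psi, h):
    """kappa, gamma1, gamma2 from a potential on a grid with spacing h."""
    psi_y, psi_x = np.gradient(psi, h)      # axis 0 is y, axis 1 is x
    psi_xy, psi_xx = np.gradient(psi_x, h)
    psi_yy, _ = np.gradient(psi_y, h)
    return (psi_xx + psi_yy) / 2, (psi_xx - psi_yy) / 2, psi_xy


side = np.linspace(0, 1, 21)
h = side[1] - side[0]
gx, gy = np.meshgrid(side, side)
psi = np.sin(3 * gx) * np.cos(2 * gy)      # arbitrary smooth potential
c = 0.7
psi_sheet = psi + c * (gx**2 + gy**2) / 2  # add a uniform sheet term

inner = (slice(2, -2), slice(2, -2))       # skip one-sided edge stencils
k0, a0, b0 = lensing_fields(psi, h)
k1, a1, b1 = lensing_fields(psi_sheet, h)
print(np.allclose((k1 - k0)[inner], c))             # True
print(np.allclose((a1 - a0)[inner], 0, atol=1e-8))  # True
print(np.allclose((b1 - b0)[inner], 0, atol=1e-8))  # True
```

If the kernel does follow these definitions, shear observations alone carry no information about a constant term in $\kappa$, which would be consistent with an offset that varies from sample to sample.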