The objective of this notebook is to test the least-squares fit and the HMC sampling, verifying the model's predictive power on synthetic data generated from known VSH coefficients. 
The script generates synthetic data with a sample size of 100,000. The number of VSH coefficients depends on the desired value of $l_{max}$ (in our case $l_{max} = 2$, i.e. 16 VSH coefficients in total). 
-	We randomly drew the VSH coefficients from a uniform distribution with amplitude 0.01. 
-	We also drew the angles RA $\in [0, 2\pi]$ and Dec $\in [-1,1]$ from uniform distributions, so that the positions are uniform on the sphere.
-	Then, using the functions [`basis_vectors`](src/models/vsh_model.py) and [`model_vsh`](src/models/vsh_model.py), we converted the generated RA and Dec angles into proper motions.
-	We then drew the uncertainties on the proper motions from a Gaussian distribution centred on 0.0 with a noise level (standard deviation) of 0.03.
-	There is no correlation between the proper motions in the synthetic dataset.
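
The sampling steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's `generate_synthetic_data` script: the helper names `draw_sky_positions` and `draw_noise`, the seeds, and the reading of Dec $\in [-1, 1]$ as a uniform draw in $\sin(\mathrm{Dec})$ (the usual way to obtain points uniform on the sphere) are all assumptions.

```python
import math
import random

def draw_sky_positions(n, seed=0):
    """Draw n (RA, Dec) pairs uniformly on the sphere (hypothetical helper)."""
    rng = random.Random(seed)
    ra = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    # Assumption: draw sin(Dec) uniformly in [-1, 1], then map back to an angle.
    dec = [math.asin(rng.uniform(-1.0, 1.0)) for _ in range(n)]
    return ra, dec

def draw_noise(n, std=0.03, seed=1):
    """Zero-mean Gaussian noise with the stated standard deviation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]

ra, dec = draw_sky_positions(1000)
noise = draw_noise(1000)
```

Drawing Dec itself uniformly would instead over-populate the poles, which is why the draw is made in $\sin(\mathrm{Dec})$ in this sketch.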

Below we import the required packages and functions from `src`.

In [1]:
from iminuit import Minuit # to perform least square
from src.models.configuration import*
from src.data.data_utils import*
from src.models.vsh_model import*
import jax.numpy as jnp
from jax import jit, vmap
from functools import partial, lru_cache



The cell below will generate and store the synthetic data along with the randomly generated VSH coefficients (true values).

In [None]:
from src.data import generate_synthetic_data
generate_synthetic_data # Generate synthetic data

Load synthetic data and true VSH coefficient values. 

In [2]:
import pandas as pd
import numpy as np

df_synthetic = pd.read_csv('synthetic_data/synthetic_vsh_data.csv') # load synthetic data
true_coeff = np.load('synthetic_data/theta_true.npy') # load true VSH coefficients
angles_gen, obs_gen, error_gen = config_data(df_synthetic) # configure the data for input

# Test Least Square Fit \& HMC Sampling on the Synthetic Dataset

- First, by fitting the least square (see the function `toy_least_square` in [`src.models.vsh_model.py`](src/models/vsh_model.py)) with `iminuit`.
- Additionally, by testing both the least-squares fit and HMC on the universal VSH model and least-squares function (respectively `model_vsh` and `least_square` in [`src.models.vsh_model.py`](src/models/vsh_model.py)).

Note that the "toy model" (by toy model I mean a static function designed for $l=1$) only works for $l_{max} = 1$; recall that the synthetic data was generated with $l_{max} = 2$.
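
For reference, the parameter counts quoted in this notebook (6 for $l_{max}=1$, 16 for $l_{max}=2$) follow from counting real parameters per degree: each $l$ contributes one real $m=0$ coefficient plus a real and an imaginary part for each $m=1,\dots,l$, for both the toroidal and the spheroidal family. The helper below is a hypothetical stand-in written for illustration; it is not the project's `count_vsh_coeffs`, whose implementation is not shown here.

```python
def n_real_vsh_params(lmax):
    """Number of real VSH parameters up to degree lmax (hypothetical helper)."""
    # Per degree l: 1 (m = 0) + 2 * l (real and imaginary parts for m = 1..l),
    # counted twice for the toroidal (t) and spheroidal (s) families.
    return sum(2 * (2 * l + 1) for l in range(1, lmax + 1))

for lmax in (1, 2, 3):
    print(lmax, n_real_vsh_params(lmax))
```

Under this counting the toy model exposes 6 free parameters, the $l_{max}=2$ fit has 16 and the $l_{max}=3$ fit has 30.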

In [3]:
# Bind fixed arguments into a new function
bound_least_square = partial(toy_least_square, angles_gen, obs_gen, error_gen) 

# Now Minuit only sees the 6 free parameters
m_toy = Minuit(bound_least_square,
           t_10=0.0, t_11r=0.0, t_11i=0.0,
           s_10=0.0, s_11r=0.0, s_11i=0.0)

m_toy.errordef=Minuit.LEAST_SQUARES

m_toy.migrad()

print('Toy Model Result l = 1:')
theta_fit_toy = jnp.array([m_toy.values[k] for k in m_toy.parameters])
print("Fitted parameters values:")
print(theta_fit_toy)
print("True values:")
print(true_coeff[:count_vsh_coeffs(1)])

Toy Model Result l = 1:
Fitted parameters values:
[ 0.03217079  0.01450661  0.0113407   0.1311283   0.02000312 -0.06413274]
True values:
[ 0.05372004  0.05742959 -0.02012502 -0.00375978  0.00838665 -0.04013964]
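
For uncorrelated uncertainties, the quantity minimised by a least-squares fit of this kind is the usual chi-square, $\chi^2 = \sum_i (d_i - m_i)^2 / \sigma_i^2$. The sketch below shows that form on made-up numbers; it is an assumption about the general shape of the objective, not a copy of `toy_least_square` or `least_square`, and the function name `chi_square` is illustrative.

```python
def chi_square(obs, model, sigma):
    """Sum of squared, uncertainty-weighted residuals (uncorrelated errors)."""
    return sum(((o - m) / s) ** 2 for o, m, s in zip(obs, model, sigma))

# Made-up data, for illustration only.
obs = [0.10, -0.05, 0.02]
sigma = [0.03, 0.03, 0.03]

perfect = chi_square(obs, obs, sigma)                  # model equals the data
offset = chi_square(obs, [0.07, -0.05, 0.02], sigma)   # one residual of 1 sigma
```

Setting `errordef = Minuit.LEAST_SQUARES` tells `iminuit` that a unit increase of this function above its minimum corresponds to a 1-sigma parameter uncertainty.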


"Universal" (by universal I mean for any desired values of $l_{max}$), least square.

In [4]:
lmax = 1
total_params = count_vsh_coeffs(lmax) 

# Flat vector theta: [t10, ..., t_lmaxm, s10, ..., s_lmaxm]
theta_init = jnp.zeros(total_params)

# Fix everything except theta
def least_square_wrapper(*theta_flat):
    theta = jnp.array(theta_flat)  # reconstructs the vector from scalars
    return least_square(angles_gen, obs_gen, error_gen, theta, lmax=lmax, grid=False)

m1 = Minuit(least_square_wrapper, *theta_init)

m1.errordef = Minuit.LEAST_SQUARES

m1.migrad()

print('Complete least square result l = 1:')
theta_fit_1 = jnp.array([m1.values[k] for k in m1.parameters])
print("Fitted parameters values:")
print(theta_fit_1)
print("True values:")
print(true_coeff[:count_vsh_coeffs(1)])

Complete least square result l = 1:
Fitted parameters values:
[ 0.03216825  0.13111432  0.01450038  0.01133812  0.02000075 -0.0641298 ]
True values:
[ 0.05372004  0.05742959 -0.02012502 -0.00375978  0.00838665 -0.04013964]


Least square fit for $l=2$

In [5]:
lmax = 2
total_params = count_vsh_coeffs(lmax) 

# Flat vector theta: [t10, ..., t_lmaxm, s10, ..., s_lmaxm]
theta_init = jnp.zeros(total_params)

# Fix everything except theta
def least_square_wrapper(*theta_flat):
    theta = jnp.array(theta_flat)  # reconstructs the vector from scalars
    return least_square(angles_gen, obs_gen, error_gen, theta, lmax=lmax, grid=False)

m2 = Minuit(least_square_wrapper, *theta_init)

m2.errordef = Minuit.LEAST_SQUARES

m2.migrad()

print('Complete least square result l = 2:')
theta_fit_2 = jnp.array([m2.values[k] for k in m2.parameters])
print("Fitted parameters values:")
print(theta_fit_2)
print("True values:")
print(true_coeff[:count_vsh_coeffs(2)])

Complete least square result l = 2:
Fitted parameters values:
[ 0.05237787  0.05555838 -0.02015737 -0.00370113  0.01000114 -0.0395992
 -0.01779413  0.02249806  0.03067211 -0.03782388  0.05861543 -0.05849521
  0.01581841  0.0072174   0.0466077   0.05248069]
True values:
[ 0.05372004  0.05742959 -0.02012502 -0.00375978  0.00838665 -0.04013964
 -0.02277664  0.02273766  0.02961199 -0.03947825  0.05824246 -0.05696608
  0.01680502  0.0075229   0.04790566  0.0521445 ]
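
A quick way to quantify the agreement in the printout above is the largest absolute deviation between fitted and true coefficients. The snippet below simply re-types the printed numbers; the variable names are illustrative.

```python
# Values copied from the l = 2 printout above.
fit_l2 = [0.05237787, 0.05555838, -0.02015737, -0.00370113,
          0.01000114, -0.0395992, -0.01779413, 0.02249806,
          0.03067211, -0.03782388, 0.05861543, -0.05849521,
          0.01581841, 0.0072174, 0.0466077, 0.05248069]
true_l2 = [0.05372004, 0.05742959, -0.02012502, -0.00375978,
           0.00838665, -0.04013964, -0.02277664, 0.02273766,
           0.02961199, -0.03947825, 0.05824246, -0.05696608,
           0.01680502, 0.0075229, 0.04790566, 0.0521445]

# Element-wise absolute deviation and the position of the worst one.
abs_dev = [abs(f - t) for f, t in zip(fit_l2, true_l2)]
worst = max(range(len(abs_dev)), key=abs_dev.__getitem__)
print(f"largest |fit - true| = {abs_dev[worst]:.5f} at index {worst}")
```

For these numbers the largest deviation is about 0.005, at index 6; every other coefficient is recovered to better than 0.002.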


Least square fit for $l=3$.

In [6]:
lmax = 3
total_params = count_vsh_coeffs(lmax) 

# Flat vector theta: [t10, ..., t_lmaxm, s10, ..., s_lmaxm]
theta_init = jnp.zeros(total_params)

# Fix everything except theta
def least_square_wrapper(*theta_flat):
    theta = jnp.array(theta_flat)  # reconstructs the vector from scalars
    return least_square(angles_gen, obs_gen, error_gen, theta, lmax=lmax, grid=False)

m3 = Minuit(least_square_wrapper, *theta_init)

m3.errordef = Minuit.LEAST_SQUARES

m3.migrad()

print('Complete least square result l = 3:')
theta_fit_3 = jnp.array([m3.values[k] for k in m3.parameters])
# Only the first 16 coefficients (l <= 2) are compared with the true values
print("Fitted parameters values:")
print(theta_fit_3[:count_vsh_coeffs(2)])
print("True values:")
print(true_coeff[:count_vsh_coeffs(2)])

Complete least square result l = 3:
Fitted parameters values:
[ 0.03014502  0.03355306 -0.0007573   0.01433174  0.01033522 -0.01194067
 -0.01777831  0.00241366  0.02045461 -0.04025296  0.04289113 -0.06602865
  0.00887562  0.01604122  0.03837589  0.01503773]
True values:
[ 0.05372004  0.05742959 -0.02012502 -0.00375978  0.00838665 -0.04013964
 -0.02277664  0.02273766  0.02961199 -0.03947825  0.05824246 -0.05696608
  0.01680502  0.0075229   0.04790566  0.0521445 ]


In [None]:
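theta_list = [theta_fit_toy, theta_fit_1, theta_fit_2, theta_fit_3]

s_val = []
for i, theta in enumerate(theta_list):
    if i == 0:
        # Toy model: Minuit parameter order is t_10, t_11r, t_11i, s_10, s_11r, s_11i
        s_10, s_11r, s_11i = theta[3], theta[4], theta[5]
    else:
        # Universal model: s_10, s_11r, s_11i sit at indices 1, 4, 5
        s_10, s_11r, s_11i = theta[1], theta[4], theta[5]
    s_val.append(np.array([s_10, s_11r, s_11i]))

lsq_result = np.array(s_val)
np.save('lsq_result_deleteme.npy', lsq_result)  # reloaded in the HMC section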

# Test HMC Sampling on Synthetic Data

In [1]:
import jax
import jax.numpy as jnp
from jax import jit
from numpyro.infer import NUTS, MCMC
from numpyro.diagnostics import summary
import numpyro
import numpyro.distributions as dist
import gc
from src.models.vsh_model import*
from src.models.configuration import*



In [2]:
import pandas as pd
import numpy as np

df_synthetic = pd.read_csv('synthetic_data/synthetic_vsh_data.csv') # load synthetic data
true_coeff = np.load('synthetic_data/theta_true.npy') # load true VSH coefficients
angles_gen, obs_gen, error_gen = config_data(df_synthetic) # configure the data for input
lsq_result = np.load('lsq_result_deleteme.npy')

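Define two HMC models, which differ only in the prior placed on each VSH coefficient $\theta_i$:
- $\theta_i \sim$ Uniform $(-0.2, 0.2)$
- $\theta_i \sim$ Normal $(0.0, 0.2)$, i.e. mean 0.0 and standard deviation 0.2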
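
Both models target the same unnormalised log posterior, $\log p(\theta \mid d) = \log p(\theta) - \tfrac{1}{2}\chi^2(\theta) + \text{const}$, which is what the `numpyro.factor("likelihood", -0.5*chi2_val)` call in the next cell encodes. The plain-Python sketch below spells this out for the two priors; the function names and the one-datum toy chi-square are illustrative assumptions, not part of `src`.

```python
import math

def log_uniform_prior(theta, limit=0.2):
    """Log density of independent Uniform(-limit, limit) priors."""
    if any(abs(t) > limit for t in theta):
        return -math.inf  # outside the prior support
    return -len(theta) * math.log(2.0 * limit)

def log_normal_prior(theta, std=0.2):
    """Log density of independent Normal(0, std) priors."""
    norm = -0.5 * math.log(2.0 * math.pi * std ** 2)
    return sum(norm - 0.5 * (t / std) ** 2 for t in theta)

def log_posterior(theta, chi2, log_prior):
    """Unnormalised log posterior: log prior plus -0.5 * chi^2."""
    return log_prior(theta) - 0.5 * chi2(theta)

def toy_chi2(theta):
    """Toy chi^2 for a single datum 0.05 +/- 0.03 (made-up numbers)."""
    return ((0.05 - theta[0]) / 0.03) ** 2

lp_uniform = log_posterior([0.05], toy_chi2, log_uniform_prior)
lp_normal = log_posterior([0.05], toy_chi2, log_normal_prior)
```

Inside the interval $(-0.2, 0.2)$ the uniform prior only adds a constant, so any difference between the two posteriors comes from the Gaussian prior pulling the coefficients slightly towards zero.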

In [3]:
def chi2_jit(angles, obs, error, theta, lmax):
    return least_square(angles, obs, error, theta, lmax=lmax, grid=False)
chi2_jit = jit(chi2_jit, static_argnames=['lmax'])

# Define Model with Uniform prior
def model_w_uni_prior(angles, obs, error, limit = 0.2, lmax = 3):
    total_params = count_vsh_coeffs(lmax)
    # Prior on all VSH coefficients (both toroidal and spheroidal)
    theta = numpyro.sample("theta", dist.Uniform(-limit, limit).expand([total_params]))
    # Least-squares residuals: we assume Gaussian-distributed residuals
    chi2_val = chi2_jit(angles, obs, error, theta, lmax=lmax)

    # The log-likelihood is proportional to -0.5*chi^2
    numpyro.factor("likelihood", -0.5*chi2_val)

# Define Model with Gaussian Prior
def model_w_norm_prior(angles, obs, error, std = 0.2, lmax = 3):

    total_params = count_vsh_coeffs(lmax)
    # Prior on all VSH coefficients (both toroidal and spheroidal)
    theta = numpyro.sample("theta", dist.Normal(0., std).expand([total_params]))
    # Least-squares residuals: we assume Gaussian-distributed residuals
    chi2_val = chi2_jit(angles, obs, error, theta, lmax=lmax)

    # The log-likelihood is proportional to -0.5*chi^2
    numpyro.factor("likelihood", -0.5*chi2_val)


n_s = 100 # number of posterior samples per chain
n_warmup = 2000 # number of warmup steps per chain
n_chains = 2 # number of chains

HMC sampling with uniform prior.

- $l_{max} = 2$ :

In [4]:
rng_key = jax.random.key(0)

kernel_uni = NUTS(model_w_uni_prior, target_accept_prob=0.75) # target mean acceptance probability for step-size adaptation (a target, not a hard cap)
mcmc_uni = MCMC(kernel_uni, num_warmup=n_warmup, num_samples=n_s, num_chains=n_chains, chain_method='sequential', progress_bar=True)
mcmc_uni.run(rng_key, angles = angles_gen, obs = obs_gen, error = error_gen, lmax=2)
ps_w_uni_prior = mcmc_uni.get_samples()

diagnostics = summary(mcmc_uni.get_samples(group_by_chain=True))
divergences = mcmc_uni.get_extra_fields()["diverging"]  # shape: (num_samples * num_chains,)
num_divergences = divergences.sum()
print("Number of divergences:", num_divergences)

del mcmc_uni
gc.collect()
jax.clear_caches()

sample: 100%|██████████| 2100/2100 [01:22<00:00, 25.58it/s, 31 steps of size 1.11e-01. acc. prob=0.90] 
sample: 100%|██████████| 2100/2100 [01:23<00:00, 25.03it/s, 63 steps of size 1.05e-01. acc. prob=0.91] 


Number of divergences: 0
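
Zero divergences is a useful check, but with two chains it is also common to look at the Gelman-Rubin statistic $\hat{R}$, which `numpyro.diagnostics.summary` reports under the key `r_hat`. As a rough illustration of what that number measures, here is a basic (non-split) version in plain Python on made-up chains; the function name `gelman_rubin` and the test chains are assumptions for illustration and do not reproduce NumPyro's exact implementation.

```python
import random

def gelman_rubin(chains):
    """Basic (non-split) R-hat for a list of equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # W: mean within-chain variance; B: between-chain variance (scaled by n).
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    b = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5

rng = random.Random(0)
# Two chains sampling the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(2)]
# Two chains stuck around different values: R-hat should be well above 1.
stuck = [[rng.gauss(0.0, 1.0) for _ in range(500)],
         [rng.gauss(3.0, 1.0) for _ in range(500)]]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

Values close to 1 indicate that the chains agree with each other; values well above 1 indicate that they have not converged to the same distribution.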


HMC sampling with Gaussian prior.

- $l_{max} = 2$ :

In [6]:
rng_key = jax.random.key(0)

kernel_norm = NUTS(model_w_norm_prior, target_accept_prob=0.75)
mcmc_norm = MCMC(kernel_norm, num_warmup=n_warmup, num_samples=n_s, num_chains=n_chains, progress_bar=True)
mcmc_norm.run(rng_key, angles = angles_gen, obs = obs_gen, error = error_gen, lmax=2)

ps_w_norm_prior2 = mcmc_norm.get_samples()

diagnostics = summary(mcmc_norm.get_samples(group_by_chain=True))
divergences = mcmc_norm.get_extra_fields()["diverging"]  # shape: (num_samples * num_chains,)
num_divergences = divergences.sum()
print("Number of divergences:", num_divergences)

del mcmc_norm
gc.collect()
jax.clear_caches()

sample: 100%|██████████| 2100/2100 [01:19<00:00, 26.49it/s, 15 steps of size 6.24e-02. acc. prob=0.87] 
sample: 100%|██████████| 2100/2100 [01:09<00:00, 30.04it/s, 63 steps of size 5.75e-02. acc. prob=0.90] 


Number of divergences: 0


In [7]:
sample_uni_l2 = jnp.mean(ps_w_uni_prior['theta'], axis=0)
error_uni_l2 = jnp.std(ps_w_uni_prior['theta'], axis=0)


sample_norm_l2 = jnp.mean(ps_w_norm_prior2['theta'], axis=0)
error_norm_l2 = jnp.std(ps_w_norm_prior2['theta'], axis=0)


# Collecting Results

We are going to collect the above results in a DataFrame format and display them. For the general objective of this project, we only need to consider the spheroidal VSH coefficients of the dipole ($l=1$), i.e. $s_{10}$, $s_{11}^{\Re}$ and $s_{11}^{\Im}$, hence these are the ones we are going to display.

In [13]:
collect = {
    'True VSH Values': [true_coeff[1], true_coeff[4], true_coeff[5]],
    'Least Square Toy Model': [lsq_result[0][0], lsq_result[0][1], lsq_result[0][2]],
    'Least Square l = 1' : [lsq_result[1][0], lsq_result[1][1], lsq_result[1][2]],
    'Least Square l = 2' : [lsq_result[2][0], lsq_result[2][1], lsq_result[2][2]],
    'Least Square l = 3' : [lsq_result[3][0], lsq_result[3][1], lsq_result[3][2]],
    'HMC w uniform (l = 2)': [f'{sample_uni_l2[1]:.6f}+/-{error_uni_l2[1]:.6f}', f'{sample_uni_l2[4]:.6f}+/-{error_uni_l2[4]:.6f}', f'{sample_uni_l2[5]:.6f}+/-{error_uni_l2[5]:.6f}'],
    'HMC w normal (l = 2)': [f'{sample_norm_l2[1]:.6f}+/-{error_norm_l2[1]:.6f}', f'{sample_norm_l2[4]:.6f}+/-{error_norm_l2[4]:.6f}', f'{sample_norm_l2[5]:.6f}+/-{error_norm_l2[5]:.6f}']
    }
results = pd.DataFrame(data=collect, index = ['s_10', 's_11r', 's_11i'])
results

| | True VSH Values | Least Square Toy Model | Least Square l = 1 | Least Square l = 2 | Least Square l = 3 | HMC w uniform (l = 2) | HMC w normal (l = 2) |
|---|---|---|---|---|---|---|---|
| s_10 | 0.05743 | 0.131128 | 0.131114 | 0.055558 | 0.033553 | 0.055663+/-0.000858 | 0.055714+/-0.000981 |
| s_11r | 0.008387 | 0.020003 | 0.020001 | 0.010001 | 0.010335 | 0.010170+/-0.001390 | 0.010037+/-0.001133 |
| s_11i | -0.04014 | -0.064133 | -0.06413 | -0.039599 | -0.011941 | -0.039626+/-0.000353 | -0.039627+/-0.000321 |
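
One compact way to read the HMC columns is the "pull", i.e. the difference between the posterior mean and the true value in units of the posterior standard deviation. The snippet below re-types the uniform-prior numbers from the table; the dictionary layout and variable names are illustrative.

```python
# (true value, posterior mean, posterior std) for the uniform-prior run,
# copied from the results table.
rows = {
    "s_10": (0.05743, 0.055663, 0.000858),
    "s_11r": (0.008387, 0.010170, 0.001390),
    "s_11i": (-0.04014, -0.039626, 0.000353),
}

# Pull = (posterior mean - truth) / posterior std.
pulls = {name: (mean - truth) / std for name, (truth, mean, std) in rows.items()}
for name, pull in pulls.items():
    print(f"{name}: {pull:+.2f} sigma")
```

For these numbers the pulls come out at roughly -2.1, +1.3 and +1.5. Since the standard deviations are estimated from only 100 samples per chain, they are themselves somewhat noisy.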


The above results show that the two HMC runs provide consistent results even with different priors: for all three dipole coefficients the posterior means agree to well within the quoted standard deviations. This is very important! Furthermore, recall that the synthetic data was generated with a quadrupole setting, i.e. $l_{max} = 2$, so of course the least-squares fit and the HMC sampling work best with the quadrupole setting. Nevertheless, I consider the least-squares fit performance acceptable even with the other VSH settings. I did not bother providing an example in higher dimension (larger $l_{max}$) for the HMC sampling, as it is time-consuming and our objective is not to focus on synthetic data; this was only a matter of showing model consistency.