# Recovering Gauss coefficients by HMC sampling

In this notebook, we compute the posterior distribution for the geomagnetic problem using the Hamiltonian Monte Carlo (HMC) algorithm.

# 0. Python packages and figure embellishments

In [None]:
# Some Python packages.
import magnetic  # local module magnetic.py
import random
import time
import numpy as np
import matplotlib.pyplot as plt

# Set some parameters to make plots nicer.
plt.rcParams["font.family"] = "serif"
plt.rcParams.update({'font.size': 25})

# Set specific random seed to make simulations comparable.
np.random.seed(0)

# 1. Input parameters

We place 20 observation points at random colatitudes and longitudes on the surface of the Earth.

In [None]:
# Observation points.
theta_obs=np.pi*np.random.rand(20)
phi_obs=2.0*np.pi*np.random.rand(20)

Maximum spherical harmonic degree of the Gauss coefficients used to compute both the artificial data and the trial data.

In [None]:
# Maximum degree.
ell_max=2

HMC sampling parameters.

In [None]:
# Number of leap-frog steps.
Nt=8

# Leap-frog time increment.
dt=150.0

# Number of HMC samples.
N=10000

# 2. Initialisations

Read the Gauss coefficients from the IGRF13 model. These serve as the ground-truth parameters that we try to estimate.

In [None]:
# Read Gauss coefficients from IGRF13.
g_igrf13,h_igrf13=magnetic.read_coefficients(verbose=False)

Nc=np.shape(g_igrf13)[0]

To accelerate the evaluation of the forward model, we precompute the Schmidt quasi-normalised associated Legendre functions.

In [None]:
Pnmi=magnetic.Pnmi(theta_obs,ell_max)
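
For orientation, the Schmidt quasi-normalised functions have simple closed forms up to degree 2, which is the value of *ell_max* used here. The sketch below is a hypothetical stand-alone helper, not the implementation in *magnetic.py*, and its return format (a dictionary keyed by degree and order) is an assumption made for illustration only. It also checks the Schmidt normalisation numerically.

```python
import numpy as np

def schmidt_legendre_deg2(theta):
    """Closed-form Schmidt quasi-normalised P_n^m(cos(theta)) for n <= 2.

    Hypothetical helper for illustration; returns a dict keyed by (n, m).
    """
    theta = np.asarray(theta, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    return {
        (0, 0): np.ones_like(theta),
        (1, 0): c,
        (1, 1): s,
        (2, 0): 0.5 * (3.0 * c**2 - 1.0),
        (2, 1): np.sqrt(3.0) * s * c,
        (2, 2): 0.5 * np.sqrt(3.0) * s**2,
    }

# Check the normalisation with a midpoint rule: the integral over theta of
# P_n^m(cos(theta))^2 * sin(theta) should equal 2 * (2 - delta_m0) / (2n + 1).
n_quad = 20000
theta_mid = (np.arange(n_quad) + 0.5) * np.pi / n_quad
P = schmidt_legendre_deg2(theta_mid)
norms = {nm: np.sum(f**2 * np.sin(theta_mid)) * np.pi / n_quad
         for nm, f in P.items()}
```

Because these functions depend only on the colatitudes of the observation points, they can be evaluated once and reused for every trial model.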

Compute artificial observations from the chosen Gauss coefficients, and synthesise the magnetic field on a regular grid for plotting.

In [None]:
# Compute the magnetic field values for the observation points.
d_obs=magnetic.B(phi_obs,theta_obs,g_igrf13,h_igrf13,Pnmi,ell_max=ell_max)

# Compute magnetic field for longitude and colatitude arrays.
theta=np.arange(0.0,np.pi,0.05)
phi=np.arange(0.0,2.0*np.pi,0.05)

d_plot=magnetic.B_field(phi,theta,g_igrf13,h_igrf13,ell_max=ell_max)
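
To illustrate what such a forward model computes, the following sketch evaluates the radial field of the degree-1 (dipole) terms at the reference radius. It assumes the common convention B_r = -dV/dr with Schmidt quasi-normalised functions. The sign convention and interface of `magnetic.B` may differ, the function name is hypothetical, and the coefficient values are round illustrative numbers rather than IGRF13 values.

```python
import numpy as np

def radial_dipole_field(phi, theta, g10, g11, h11):
    """Radial field of the degree-1 terms at the reference radius r = a.

    Assumes B_r = -dV/dr, which gives for each degree n and order m the term
    (n + 1) * [g_n^m cos(m phi) + h_n^m sin(m phi)] * P_n^m(cos(theta)).
    """
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return 2.0 * (g10 * np.cos(theta)
                  + (g11 * np.cos(phi) + h11 * np.sin(phi)) * np.sin(theta))

# Illustrative coefficients in nT (round numbers, not the IGRF13 values).
g10, g11, h11 = -30000.0, -1500.0, 5000.0

# At the pole only the axial dipole contributes; at the equator (phi = 0)
# only the g11 term contributes.
b_north_pole = radial_dipole_field(0.0, 0.0, g10, g11, h11)
b_equator = radial_dipole_field(0.0, 0.5 * np.pi, g10, g11, h11)
```

The field is linear in the Gauss coefficients, which is why the posterior for Gaussian errors is itself Gaussian in this problem.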

Plot the ground-truth magnetic field and the observation points.

In [None]:
# Plot radial component of the magnetic field.
lon,colat=np.meshgrid(phi,theta)

plt.subplots(1, figsize=(22,10))
plt.gca().invert_yaxis()
plt.pcolor(180.0*lon/np.pi,180.0*colat/np.pi,d_plot, cmap='Greys')
plt.colorbar()
plt.contour(180.0*lon/np.pi,180.0*colat/np.pi,d_plot, colors='k')
plt.plot(180.0*phi_obs/np.pi,180.0*theta_obs/np.pi,'ro',markersize=10)
plt.grid()
plt.xlabel('longitude [°]',labelpad=15)
plt.ylabel('colatitude [°]',labelpad=15)
plt.title('magnetic field, radial component',pad=20)
plt.show()

# 3. Sampling

Before sampling the posterior distribution, we need to choose an initial position for the random walker. It can be drawn entirely at random or chosen using prior knowledge of plausible parameter values. The performance of the sampler, in particular the length of the burn-in phase, depends on how well the initial position is chosen.

In [None]:
# Initial model vector. We place g coefficients into m[0,:,:] and h coefficients into m[1,:,:].
m=np.zeros((2,Nc,Nc))

# Selection of initial model parameters near the ground-truth values.
for i in range(0,ell_max+1):
    for j in range(0,i+1):
        m[0,i,j]=g_igrf13[i,j]+200.0*np.random.randn()
        m[1,i,j]=h_igrf13[i,j]+200.0*np.random.randn()

# Evaluate initial probability density.
rho=magnetic.log_posterior(d_obs,phi_obs,theta_obs,m[0,:,:],m[1,:,:],Pnmi,ell_max)
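
HMC needs both the log-posterior and its gradient. As a minimal sketch of what these two quantities look like for a linear forward model, the example below assumes Gaussian data errors and a flat prior. The actual definitions in *magnetic.py* may differ, and the function names, the random matrix `G` standing in for the forward operator, and the value of `sigma_d` are all hypothetical. A finite-difference check confirms that the gradient matches the log-posterior.

```python
import numpy as np

def log_posterior_linear(m, G, d_obs, sigma_d):
    """Gaussian log-likelihood with a flat prior, up to an additive constant."""
    residual = G @ m - d_obs
    return -0.5 * np.dot(residual, residual) / sigma_d**2

def grad_log_posterior_linear(m, G, d_obs, sigma_d):
    """Gradient of log_posterior_linear with respect to m."""
    return -G.T @ (G @ m - d_obs) / sigma_d**2

rng = np.random.default_rng(0)
G = rng.standard_normal((20, 5))   # stand-in for the linear forward operator
m_true = rng.standard_normal(5)
d_obs = G @ m_true                 # noise-free artificial data
sigma_d = 1.0                      # assumed error standard deviation

# Central finite-difference check of the gradient at a random point.
m0 = rng.standard_normal(5)
eps = 1e-5
grad_fd = np.array([
    (log_posterior_linear(m0 + eps * e, G, d_obs, sigma_d)
     - log_posterior_linear(m0 - eps * e, G, d_obs, sigma_d)) / (2.0 * eps)
    for e in np.eye(5)
])
grad_exact = grad_log_posterior_linear(m0, G, d_obs, sigma_d)
```

A check of this kind is worth running whenever the forward model changes, since an inconsistent gradient degrades the leap-frog trajectories and lowers the acceptance rate.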

In [None]:
E=0.0

# Add energy contributions from included degrees.
for i in range(0,ell_max+1):
    for j in range(0,i+1):
        E+=(m[0,i,j]**2+m[1,i,j]**2)

To avoid excessive storage requirements, we store the samples of only two model parameters. The corresponding lists and the counter of accepted moves are initialised below.

In [None]:
# Initialise number of accepted models.
accept=0

# Initialise arrays for the collection of samples.
s1=[]
s2=[]

In [None]:
t1=time.time()

# Loop over samples.
for it in range(0,N):
    
    # Choose a random momentum vector from the standard normal distribution.
    p=np.random.randn(2*Nc*Nc).reshape(2,Nc,Nc)
    
    # Evaluate energies.
    U=-magnetic.log_posterior(d_obs,phi_obs,theta_obs,m[0,:,:],m[1,:,:],Pnmi,ell_max)
    K=0.5*np.dot(p.flatten(),p.flatten())
    H=U+K
    
    # Leap-frog iteration. ====================================================
    
    m_prop=m.copy()
    p_prop=p.copy()
    
    J=-magnetic.grad_posterior(d_obs,phi_obs,theta_obs,m[0,:,:],m[1,:,:],Pnmi,ell_max)
    
    for k in range(Nt):
        
        p_prop=p_prop-0.5*dt*J
        m_prop=m_prop+dt*p_prop
        J=-magnetic.grad_posterior(d_obs,phi_obs,theta_obs,m_prop[0,:,:],m_prop[1,:,:],Pnmi,ell_max)
        p_prop=p_prop-0.5*dt*J
        
    # =========================================================================
    
    # Evaluate new energies.
    U_prop=-magnetic.log_posterior(d_obs,phi_obs,theta_obs,m_prop[0,:,:],m_prop[1,:,:],Pnmi,ell_max)
    K_prop=0.5*np.dot(p_prop.flatten(),p_prop.flatten())
    H_prop=U_prop+K_prop
    
    # Evaluate HMC Metropolis rule in logarithmic form.
    r=np.minimum(0.0,H-H_prop)
    if r>=np.log(np.random.random(1)):
        # Make move to proposed position.
        m=m_prop.copy()
        H=H_prop
        # Increase number of accepted models.
        accept+=1
    
    # Collect the samples.
    # Here you may change the model parameters that are being considered.
    s1.append(m[0,2,1])
    s2.append(m[0,1,1])
    
t2=time.time()
print('elapsed time: %f s' % (t2-t1))
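
The same sampling loop can be tried in isolation on a problem with a known answer. The sketch below applies it to a two-dimensional Gaussian with a prescribed covariance matrix, using a unit mass matrix as above. The function name `hmc_sample`, the target distribution, and the tuning values are hypothetical choices for illustration and are unrelated to the geomagnetic problem.

```python
import numpy as np

def hmc_sample(U, grad_U, m0, dt, Nt, N, rng):
    """Minimal HMC sampler with unit mass matrix.

    U is the negative log-density and grad_U its gradient.
    Returns the samples and the acceptance rate.
    """
    m = np.array(m0, dtype=float)
    samples = np.empty((N, m.size))
    accept = 0
    for it in range(N):
        # Random momentum and initial total energy.
        p = rng.standard_normal(m.size)
        H = U(m) + 0.5 * np.dot(p, p)

        # Leap-frog integration of Hamilton's equations.
        m_prop, p_prop = m.copy(), p.copy()
        J = grad_U(m_prop)
        for _ in range(Nt):
            p_prop = p_prop - 0.5 * dt * J
            m_prop = m_prop + dt * p_prop
            J = grad_U(m_prop)
            p_prop = p_prop - 0.5 * dt * J

        H_prop = U(m_prop) + 0.5 * np.dot(p_prop, p_prop)

        # Metropolis rule in logarithmic form.
        if np.log(rng.random()) <= min(0.0, H - H_prop):
            m = m_prop
            accept += 1
        samples[it] = m
    return samples, accept / N

# Toy target: zero-mean Gaussian with known covariance C.
C = np.array([[1.0, 0.6], [0.6, 2.0]])
C_inv = np.linalg.inv(C)

def U(m):
    return 0.5 * m @ C_inv @ m

def grad_U(m):
    return C_inv @ m

rng = np.random.default_rng(0)
samples, rate = hmc_sample(U, grad_U, m0=[0.0, 0.0], dt=0.3, Nt=5, N=5000, rng=rng)
C_est = np.cov(samples.T)
```

Comparing `C_est` with `C` gives a quick sanity check of the sampler before it is applied to a problem where the posterior is not known in advance.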


# 4. Output and analysis

Following the sampling, we plot the results and perform some analyses. We start with the acceptance rate.

In [None]:
# Acceptance rate.
print('acceptance rate: %f ' % (accept/N))

Trace plots of the two selected model parameters. For a well-mixed chain, each trace should look like a hairy caterpillar: rapid fluctuations around a stable mean, without long-term drift.

In [None]:
# Trace plots.
plt.figure(figsize=(15,8))
plt.plot(s1,'k',linewidth=2)
plt.xlim([0,N])
plt.grid()
plt.xlabel('samples',labelpad=15)
plt.title('trace plot parameter 1',pad=15)
plt.show()

plt.figure(figsize=(15,8))
plt.plot(s2,'k',linewidth=2)
plt.xlim([0,N])
plt.grid()
plt.xlabel('samples',labelpad=15)
plt.title('trace plot parameter 2',pad=15)
plt.show()

Auto-correlation functions and the effective sample size derived from them.

In [None]:
# Auto-correlations.
cc1=np.correlate(s1-np.mean(s1),s1-np.mean(s1),'full')/np.sum((s1-np.mean(s1))**2)
cc1=cc1[N-1:]

cc2=np.correlate(s2-np.mean(s2),s2-np.mean(s2),'full')/np.sum((s2-np.mean(s2))**2)
cc2=cc2[N-1:]

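# Estimate of the effective sample size (Gelman et al., 2013):
# Neff = N / (1 + 2 * sum of auto-correlations at lags >= 1).
# Lag 0 is excluded, and the sum stops at the first lag where two
# successive auto-correlations no longer add up to a positive value.
Neff1=0.0
for i in range(1,N-1):
    if (cc1[i]+cc1[i+1]<=0.0):
        break
    Neff1+=cc1[i]

Neff1=N/(1.0+2.0*Neff1)
print('effective sample size (parameter 1): %f' % Neff1)

Neff2=0.0
for i in range(1,N-1):
    if (cc2[i]+cc2[i+1]<=0.0):
        break
    Neff2+=cc2[i]

Neff2=N/(1.0+2.0*Neff2)
print('effective sample size (parameter 2): %f' % Neff2)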

# Plot autocorrelation function.
plt.figure(figsize=(15,8))
plt.plot(cc1[0:N],'k',linewidth=2)
plt.xlabel('lag [samples]',labelpad=15)
plt.xlim([0,N])
plt.title('auto-correlation (parameter 1)',pad=15)
plt.grid()
plt.show()

plt.figure(figsize=(15,8))
plt.plot(cc2[0:N],'k',linewidth=2)
plt.xlabel('lag [samples]',labelpad=15)
plt.xlim([0,N])
plt.title('auto-correlation (parameter 2)',pad=15)
plt.grid()
plt.show()
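
The effective sample size estimate can be checked on a chain with a known auto-correlation. The sketch below wraps the estimate in a hypothetical helper function, using a simple truncation rule that stops the sum at the first lag where two successive auto-correlations are no longer positive in sum. It is applied to a synthetic AR(1) chain, for which the theoretical effective sample fraction is (1 - a) / (1 + a). The chain and its parameters are illustrative assumptions, not output of the HMC sampler.

```python
import numpy as np

def effective_sample_size(x):
    """ESS = n / (1 + 2 * sum of auto-correlations at lags >= 1).

    The sum is truncated at the first lag where two successive
    auto-correlations add up to a non-positive value.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    acf = np.correlate(xc, xc, 'full')[n - 1:] / np.sum(xc**2)
    tau = 0.0
    for lag in range(1, n - 1):
        if acf[lag] + acf[lag + 1] <= 0.0:
            break
        tau += acf[lag]
    return n / (1.0 + 2.0 * tau)

# Synthetic AR(1) chain x[t] = a * x[t-1] + noise with unit variance.
rng = np.random.default_rng(1)
a, n = 0.5, 10000
x = np.empty(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = a * x[t - 1] + np.sqrt(1.0 - a**2) * rng.standard_normal()

ess_fraction = effective_sample_size(x) / n
ess_fraction_theory = (1.0 - a) / (1.0 + a)
```

The estimated and theoretical fractions should agree to within the statistical noise of the auto-correlation estimate, which makes this a convenient test before interpreting the values obtained for the HMC samples.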

1-D marginals of the two selected model parameters.

In [None]:
plt.figure(figsize=(10,10))
n, bins, patches = plt.hist(s1, 20, density=True, facecolor='k', alpha=1.0)
plt.xlabel('parameter 1',labelpad=15)
plt.title('1-D marginal (parameter 1)',pad=15)
plt.grid()
plt.show()

plt.figure(figsize=(10,10))
n, bins, patches = plt.hist(s2, 20, density=True, facecolor='k', alpha=1.0)
plt.xlabel('parameter 2',labelpad=15)
plt.title('1-D marginal (parameter 2)',pad=15)
plt.grid()
plt.show()

2-D marginal of the selected model parameters.

In [None]:
plt.figure(figsize=(10,10))
plt.hist2d(s1, s2, bins=20, density=True, cmap='Greys')
plt.xlabel('parameter 1',labelpad=15)
plt.ylabel('parameter 2',labelpad=15)
plt.title('2-D marginal (parameters 1 and 2)',pad=15)
plt.xlim([1500.0,4500.0])
plt.ylim([-3000.0,0.0])
plt.grid()
plt.show()

# 5. Exercises

**Exercise 1**: Include actual random errors in the artificial observed data *d_obs*. Make sure that these are consistent with the observational error standard deviation *sigma_D* in *magnetic.py*. Repeat the HMC sampling for a range of different standard deviations. Describe your observations.

**Exercise 2**: Keeping the leap-frog time increment fixed at *dt=150*, explore the relation between the number of leap-frog steps *Nt*, the acceptance rate, and the effective sample size. Choose *Nt* values of 1, 4, 8, 12 and 16. Explain your observations.

**Exercise 3**: What is the maximum effective sample fraction that you can achieve by tuning *dt* and *Nt*? How does this compare to the optimal tuning of MALA and the Metropolis-Hastings algorithm, where *sigma* is the only tuning parameter?

**Exercise 4**: Increase the maximum harmonic degree, *ell_max*, to 13, which is the maximum degree included in the IGRF13 model of the current geomagnetic field. How do the number of leap-frog steps *Nt* and the time increment *dt* need to be adjusted in order to achieve an acceptance rate of around 50 %?

**Exercise 5**: Return to the MALA notebook and also set *ell_max* to 13. Tune the parameter *sigma* such that the acceptance rate is also around 50 %. Compare the trace plots for the HMC and MALA samplers. How do the two algorithms explore model space? Explain your observation.