# Heavy-tailed Markov Chain Tutorial

This notebook demonstrates the knockoff construction for a heavy-tailed Markov chain in which each variable has t-distributed tails. The model is as follows:
$$X_1=\sqrt{\frac{\nu-2}\nu}Z_1, \quad X_{j+1}=\rho_j X_j + \sqrt{1-\rho_j^2}\sqrt{\frac{\nu-2}\nu}Z_{j+1}, \quad Z_j\stackrel{i.i.d.}{\sim} t_\nu$$
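for $j=1,\dots,p-1$. Since $\mathrm{Var}(t_\nu)=\nu/(\nu-2)$ for $\nu>2$, the factor $\sqrt{(\nu-2)/\nu}$ rescales each innovation to unit variance, so every $X_j$ has variance 1 and $\rho_j$ is the correlation between $X_j$ and $X_{j+1}$.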
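As a quick sanity check on the model, the sketch below simulates the chain with NumPy and looks at the empirical variances and the lag-one correlation. The helper name `sample_t_chain` and the variable `X_demo` are illustrative and are not part of `t_core`.

```python
import numpy as np

def sample_t_chain(n, rhos, df, rng):
    """Draw n independent rows from the heavy-tailed Markov chain."""
    rhos = np.asarray(rhos, dtype=float)
    p = len(rhos) + 1
    # Rescale the t innovations to unit variance: Var(t_df) = df / (df - 2).
    Z = rng.standard_t(df, size=(n, p)) * np.sqrt((df - 2) / df)
    X = np.empty((n, p))
    X[:, 0] = Z[:, 0]
    for j in range(1, p):
        X[:, j] = rhos[j - 1] * X[:, j - 1] + np.sqrt(1 - rhos[j - 1] ** 2) * Z[:, j]
    return X

rng = np.random.default_rng(0)
X_demo = sample_t_chain(20000, [0.6] * 9, df=5, rng=rng)

# Each column should have variance close to 1, and adjacent columns
# should have correlation close to the corresponding rho (0.6 here).
print(X_demo.var(axis=0).round(2))
print(np.corrcoef(X_demo[:, 0], X_demo[:, 1])[0, 1].round(2))
```

Because the tails are heavy, the empirical variances converge slowly, so some visible fluctuation around 1 is expected even at this sample size.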

## Multiple-try Metropolis

We demonstrate the Multiple-try Metropolis proposals below.

In [1]:
import math
import numpy as np
from scipy.stats import t

%run ../heavy-tailed-t/t_core #load the functions for the t-MC experiment

In [2]:
#simulation parameters
df_t = 5 # degrees of freedom of the t-distribution
p = 50 # dimension of the random vector X
numsamples = 100 # number of samples to generate knockoffs 
rhos = [0.6] * (p-1) # the correlations

#algorithm parameters
halfnumtry = 1 # m, half the number of candidates
stepsize = 1.5 # step size, in units of 1/sqrt((Sigma^{-1})_{jj})

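We first compute the proposal scaling for each variable. Recall that the recommended scaling for the proposal for variable $j$ is $1/\sqrt{(\Sigma^{-1})_{jj}}$, where $\Sigma$ is the covariance matrix of $X$. For the Markov chain, this quantity has a closed form in terms of the $\rho_j$.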

In [3]:
#build the proposal grid: offsets k*stepsize*sds[i] for k = -halfnumtry, ..., halfnumtry
quantile_x = np.zeros([p,2*halfnumtry+1])
sds = [0]*p
sds[0] = math.sqrt(1 - rhos[0]**2)
for i in range(1, p - 1):
    sds[i] = math.sqrt((1 - rhos[i-1]**2)*(1 - rhos[i]**2)/
                       (1 - rhos[i-1]**2 * rhos[i]**2))
sds[p - 1] = math.sqrt(1 - rhos[p - 2]**2)
for i in range(p):
    quantile_x[i] = [x*sds[i]*stepsize for x in list(
        range(-halfnumtry, halfnumtry + 1))]
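The closed-form values in `sds` can be checked against a direct matrix inversion. The sketch below builds the chain's covariance matrix, whose $(i,j)$ entry is the product of the $\rho$'s between positions $i$ and $j$, and compares $1/\sqrt{(\Sigma^{-1})_{jj}}$ with the closed form. The names `chain_covariance`, `closed_form_sds`, and `demo_rhos` are illustrative and do not come from `t_core`.

```python
import numpy as np

def chain_covariance(rhos):
    """Covariance of the unit-variance chain: products of adjacent correlations."""
    p = len(rhos) + 1
    Sigma = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            Sigma[i, j] = Sigma[j, i] = Sigma[i, j - 1] * rhos[j - 1]
    return Sigma

def closed_form_sds(rhos):
    """Closed form of 1/sqrt((Sigma^{-1})_{jj}), mirroring the cell above."""
    p = len(rhos) + 1
    sds = np.empty(p)
    sds[0] = np.sqrt(1 - rhos[0] ** 2)
    for i in range(1, p - 1):
        sds[i] = np.sqrt((1 - rhos[i - 1] ** 2) * (1 - rhos[i] ** 2)
                         / (1 - rhos[i - 1] ** 2 * rhos[i] ** 2))
    sds[p - 1] = np.sqrt(1 - rhos[p - 2] ** 2)
    return sds

demo_rhos = [0.6, 0.3, 0.8, 0.5]  # arbitrary non-constant correlations
Sigma_demo = chain_covariance(demo_rhos)
sds_inverse = 1 / np.sqrt(np.diag(np.linalg.inv(Sigma_demo)))
sds_closed = closed_form_sds(demo_rhos)

# The two computations should agree up to floating-point error.
print(np.max(np.abs(sds_closed - sds_inverse)))
```

The closed form avoids inverting a $p \times p$ matrix, which is why the notebook uses it directly.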

Next, we sample observations from the Markov Chain and generate knockoffs with the MTM technique.

In [4]:
#generate the data
bigmatrix = np.zeros([numsamples, 2*p]) # store simulation data
for i in range(numsamples):
    #sample from the Markov Chain
    bigmatrix[i, 0] = t.rvs(df=df_t)*math.sqrt((df_t - 2)/df_t)
    for j in range(1, p):
        bigmatrix[i, j] = math.sqrt(1 - rhos[j - 1]**2)*t.rvs(df=df_t)* \
        math.sqrt((df_t - 2)/df_t) + rhos[j - 1]*bigmatrix[i,j - 1]
    
    #sample the knockoff
    bigmatrix[i, p:(2*p)] = SCEP_MH_MC(bigmatrix[i, 0:p], 0.999, 
                                       quantile_x, rhos, df_t)

np.shape(bigmatrix)

(100, 100)

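We can evaluate the quality of these knockoffs by computing, for each variable $j$, the correlation across samples between $X_{i,j}$ and $\tilde{X}_{i,j}$, and then averaging over $j$. Lower values indicate better knockoffs.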

In [5]:
cors = [np.corrcoef(bigmatrix[:, j], bigmatrix[:, j + p])[0, 1]
        for j in range(p)]
np.mean(cors)

0.6626815538939757
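The same diagnostic can be computed without a Python loop. The sketch below is one possible vectorized version, checked against the `np.corrcoef` loop on synthetic data. The helper `mean_pairwise_correlation` and the arrays `X_syn` and `K_syn` are illustrative stand-ins, not real knockoffs.

```python
import numpy as np

def mean_pairwise_correlation(X, X_tilde):
    """Average over columns j of corr(X[:, j], X_tilde[:, j])."""
    Xc = X - X.mean(axis=0)
    Kc = X_tilde - X_tilde.mean(axis=0)
    num = (Xc * Kc).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0) * (Kc ** 2).sum(axis=0))
    return float(np.mean(num / den))

# Synthetic stand-in data: K_syn is a noisy copy of X_syn, not a valid knockoff.
rng = np.random.default_rng(1)
X_syn = rng.standard_normal((200, 8))
K_syn = 0.5 * X_syn + rng.standard_normal((200, 8))

fast = mean_pairwise_correlation(X_syn, K_syn)
slow = float(np.mean([np.corrcoef(X_syn[:, j], K_syn[:, j])[0, 1]
                      for j in range(X_syn.shape[1])]))
print(abs(fast - slow))
```

With the notebook's data, the call would be `mean_pairwise_correlation(bigmatrix[:, :p], bigmatrix[:, p:])`.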