We simulate $Y^e = \beta_1 X_1^e + \beta_2 X_2^e + \epsilon^e$ for an environment $e \in E$. Consider two environments $r, \, f \in E$. Assume that both $(X_1^r , X_2^r)$ and $(X_1^f , X_2^f)$ have independent $Pa(1)$ margins, that is, $\mathbb{P}(X_i^e > x) = 1/x$ for $x > 1$ and $i = 1, 2$. Suppose that both $\epsilon^r$ and $\epsilon^f$ follow a $Pa(2)$ distribution, the first with scale parameter $1$ and the second with scale parameter $2$, so that $\mathbb{P}(\epsilon^r > x) = x^{-2}$ for $x > 1$ and $\mathbb{P}(\epsilon^f > x) = (2/x)^{2}$ for $x > 2$. Note that in this setting the covariates $\boldsymbol{X}$ have heavier tails than the noise $\epsilon$. Note also that, across the two environments $r, \, f$, the noise has comparable tails: the same shape parameter, with only the scale differing.
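As a quick sanity check of this setup, the sketch below draws from the covariate and noise distributions and compares empirical tail probabilities with the Pareto survival functions implied by the stated shape and scale parameters. It relies on the fact that NumPy's `pareto(a)` sampler returns Lomax (Pareto II) draws, so adding 1 gives a classical Pareto with scale 1. The helper name `empirical_survival`, the seed, and the sample size are illustrative choices, not part of the original experiment.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
n = 200_000                     # illustrative sample size


def empirical_survival(sample, x):
    """Fraction of the sample strictly above x."""
    return float(np.mean(sample > x))


# Generator.pareto(a) draws Lomax variates; 1 + Lomax is Pareto with scale 1.
x_cov = 1 + rng.pareto(1, n)        # Pa(1), scale 1: P(X > x) = 1/x
eps_r = 1 + rng.pareto(2, n)        # Pa(2), scale 1: P(eps > x) = x**-2
eps_f = 2 * (1 + rng.pareto(2, n))  # Pa(2), scale 2: P(eps > x) = (2/x)**2

# Pairs of (empirical, theoretical) tail probabilities at a few thresholds.
checks = {
    "covariate, x=10": (empirical_survival(x_cov, 10), 1 / 10),
    "eps_r, x=4": (empirical_survival(eps_r, 4), 4 ** -2),
    "eps_f, x=4": (empirical_survival(eps_f, 4), (2 / 4) ** 2),
}
for name, (emp, theo) in checks.items():
    print(f"{name}: empirical {emp:.4f}, theoretical {theo:.4f}")
```

The empirical and theoretical values should agree up to sampling error, which confirms that the `1 + pareto(...)` construction used in the simulation matches the distributions described in the text.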

In [1]:
import numpy as np
from sklearn import linear_model

In [2]:

N = 10000
Beta1 = 1 
Beta2 = 2

### Environment r
X1_r = 1 + np.random.pareto(1, N)  
X2_r = 1 + np.random.pareto(1, N) 
epsilon_r = 1 + np.random.pareto(2, N) 
Y_r = Beta1 * X1_r + Beta2 * X2_r + epsilon_r 
covariates_r  = np.column_stack((X1_r, X2_r))
### Environment f
X1_f = 1 +  np.random.pareto(1, N)  
X2_f = 1 + np.random.pareto(1, N) 
epsilon_f = (np.random.pareto(2, N) + 1 ) * 2
Y_f = Beta1 * X1_f + Beta2 * X2_f + epsilon_f 
covariates_f  = np.column_stack((X1_f, X2_f))

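### For environment r
# Ordinary least squares fit of Y on (X1, X2).
regr = linear_model.LinearRegression()
regr.fit(covariates_r , Y_r)
# Fitted values use the slopes only: the fitted intercept stays in the residuals,
# which shifts them by a constant and does not change their tail shape.
Y_r_hat =  (covariates_r * regr.coef_).sum(axis=1)
resid_r = Y_r - Y_r_hat
### For environment f
regr = linear_model.LinearRegression()
regr.fit(covariates_f , Y_f)
Y_f_hat =  (covariates_f * regr.coef_).sum(axis=1)
resid_f = Y_f - Y_f_hat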

In the following, we test whether the residuals $\hat{\epsilon}_r$ and $\hat{\epsilon}_f$ have the same shape coefficient (tail index), denoted $\alpha_r$ and $\alpha_f$ respectively. The hypotheses of the test are:
$$
H_0 : \alpha_r = \alpha_f,
\qquad 
H_1 : \alpha_r \neq \alpha_f
$$

To do so, we use the likelihood ratio test proposed in Worms, J., & Worms, R. (2015).
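As a complementary check alongside the test, one can look at a point estimate of each tail index. The sketch below uses the classical Hill estimator, which is a different and simpler tool than the cited test and is not the content of `statistic_fun`. The function name `hill_tail_index` is hypothetical, and reading a tail fraction of `0.95` as "use the observations above the empirical 0.95 quantile" is an assumption.

```python
import numpy as np


def hill_tail_index(sample, tail_fraction=0.95):
    """Hill estimate of the tail index alpha, computed from the observations
    above the empirical `tail_fraction` quantile (assumed threshold rule)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Number of upper order statistics used, kept within [1, n - 1].
    k = min(max(1, int(round(n * (1 - tail_fraction)))), n - 1)
    threshold = x[n - k - 1]  # (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("threshold must be positive to take logarithms")
    log_excess = np.log(x[n - k:] / threshold)
    return 1.0 / log_excess.mean()


# Illustration on exact Pa(2) samples with scales 1 and 2 (not the residuals).
rng = np.random.default_rng(1)  # illustrative seed
eps_r = 1 + rng.pareto(2, 100_000)
eps_f = 2 * (1 + rng.pareto(2, 100_000))

alpha_r = hill_tail_index(eps_r)
alpha_f = hill_tail_index(eps_f)
print(f"alpha_r = {alpha_r:.3f}, alpha_f = {alpha_f:.3f}")
```

Because the estimator only depends on ratios of observations to the threshold, rescaling a sample leaves the estimate unchanged. This is why two noise distributions that differ only in scale are expected to be compatible with $H_0$.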

In [4]:
from statistic import statistic_fun

data = [resid_r , resid_f]
extr_tail_fraction = [0.95 , 0.95]

value_stat = statistic_fun(data , extr_tail_fraction)

print("The value of the statistic for the residuals of the two environments is given by ", value_stat)

The value of the statistic for the residuals of the two environments is given by  0.054581479237945194


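At the $0.05$ significance level (confidence level $0.95$), the critical value is the $0.95$ quantile of the $\chi^2$ distribution with one degree of freedom, $\chi^2_{0.95} = 3.84$. Since the observed statistic, approximately $0.055$, is well below $3.84$, we do not reject $H_0$.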
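The decision rule can be reproduced numerically. The sketch below assumes, consistently with the critical value $3.84$, that the statistic is compared with a $\chi^2$ distribution with one degree of freedom. It uses only the Python standard library, through the identity that the square of a standard normal variable is $\chi^2(1)$ distributed. The variable names are illustrative.

```python
from math import erfc, sqrt
from statistics import NormalDist

value_stat = 0.054581479237945194  # statistic reported above
alpha_level = 0.05                 # significance level

# If Z ~ N(0, 1) then Z**2 ~ chi2(1), so the chi2(1) quantile of order
# 1 - alpha_level is the squared normal quantile of order 1 - alpha_level / 2.
critical_value = NormalDist().inv_cdf(1 - alpha_level / 2) ** 2

# P(chi2(1) > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
p_value = erfc(sqrt(value_stat / 2))

reject_h0 = value_stat > critical_value
print(f"critical value: {critical_value:.2f}")
print(f"p-value: {p_value:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Reporting the p-value next to the critical value shows how far the observed statistic is from the rejection region, rather than only the binary decision.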