# Validation of statistical tests

We would like to validate that our test settings are correct.

We prepare data for which we know beforehand whether $X = Y$ or $X \neq Y$, and then run the tests on it. If a test's result agrees with the known truth, we regard the test as valid.
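
The comparison between a test decision and the known truth can be sketched as below. This is only an illustration, not the library's code: the function name `judge_test` is hypothetical, and we assume that "error-1" means a type-I error (the test rejects although X = Y) and "error-2" means a type-II error (the test does not reject although X != Y).

```python
def judge_test(p_value: float, is_same_truth: bool, threshold: float = 0.05) -> str:
    """Compare a test decision with the ground truth.

    Assumption: "error-1" is a type-I error (X = Y wrongly rejected) and
    "error-2" is a type-II error (X != Y not detected).
    """
    # The test keeps the null hypothesis X = Y when the p-value is not below the threshold.
    is_same_test = p_value >= threshold
    if is_same_test == is_same_truth:
        return "pass"
    return "error-1" if is_same_truth else "error-2"


print(judge_test(0.93, is_same_truth=True))   # pass
print(judge_test(0.01, is_same_truth=True))   # error-1
print(judge_test(0.26, is_same_truth=False))  # error-2
```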

This notebook shows examples of how to validate a statistical test.

In [1]:
import sys
sys.path.append("../")
sys.path.append(".")

In [2]:
from model_criticism_mmd import ModelTrainerTorchBackend, MMD, TwoSampleDataSet
from model_criticism_mmd import kernels_torch
from model_criticism_mmd import PermutationTest, SelectionKernels
from model_criticism_mmd.models.static import DEFAULT_DEVICE
from model_criticism_mmd.supports.evaluate_stats_tests import StatsTestEvaluator, TestResultGroupsFormatter



In [3]:
import torch
import numpy as np
import tqdm
import typing
%matplotlib inline
import matplotlib.pyplot as plt

In [4]:
N_DATA_SIZE = 500
N_FEATURE = 100
NOISE_MU_X = 0
NOISE_SIGMA_X = 0.5
NOISE_MU_Y = 0
NOISE_SIGMA_Y = 0.5
THRESHOLD_P_VALUE = 0.05

# The number of epochs should normally be > 500. A small value is used here for the example.
num_epochs_selection = 50
# The number of permutations should normally be > 500. A small value is used here for the example.
n_permutation_test = 100
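
To see why a larger number of permutations is preferable, the following sketch computes how coarse a permutation p-value is for a given number of permutations. It assumes the common `(1 + count) / (1 + n)` estimator, which may differ from what the library uses. The helper names are hypothetical.

```python
import math


def smallest_p_value(n_permutations: int) -> float:
    # With the (1 + count) / (1 + n) estimator, count = 0 gives the minimum p-value.
    return 1.0 / (1 + n_permutations)


def monte_carlo_std_error(p: float, n_permutations: int) -> float:
    # Binomial standard error of a p-value estimated from n random permutations.
    return math.sqrt(p * (1.0 - p) / n_permutations)


for n in (100, 500):
    print(n, round(smallest_p_value(n), 4), round(monte_carlo_std_error(0.05, n), 4))
```

Near the 0.05 threshold, the standard error falls from about 0.022 with 100 permutations to about 0.010 with 500, so decisions close to the threshold are less noisy.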

In [5]:
device_obj = torch.device('cpu')

In [6]:
x_train = torch.tensor(np.random.normal(NOISE_MU_X, NOISE_SIGMA_X, (N_DATA_SIZE, N_FEATURE)))
x_eval = [torch.tensor(np.random.normal(NOISE_MU_X, NOISE_SIGMA_X, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]
# Y drawn from the same (normal) distribution as X.
y_train_same = torch.tensor(np.random.normal(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE)))
y_eval_same = [torch.tensor(np.random.normal(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]
# Y drawn from a different (Laplace) distribution.
y_train_diff = torch.tensor(np.random.laplace(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE)))
y_eval_diff = [torch.tensor(np.random.laplace(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]
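
The normal and Laplace samples share the same location (0) and scale (0.5), yet they are different distributions. For a Laplace distribution with scale $b$ the variance is $2b^2 = 0.5$, while the normal distribution with $\sigma = 0.5$ has variance $0.25$. The sketch below checks this empirically. It uses only the standard library so that it runs without the notebook's dependencies, and it samples the Laplace distribution as the difference of two exponential variables.

```python
import random
import statistics

random.seed(0)
SCALE = 0.5
N_SAMPLES = 20000

# Normal(0, 0.5): variance is SCALE ** 2 = 0.25.
normal = [random.gauss(0.0, SCALE) for _ in range(N_SAMPLES)]

# Laplace(0, 0.5): the difference of two i.i.d. exponentials with mean SCALE.
# Its variance is 2 * SCALE ** 2 = 0.5.
laplace = [
    random.expovariate(1.0 / SCALE) - random.expovariate(1.0 / SCALE)
    for _ in range(N_SAMPLES)
]

var_normal = statistics.pvariance(normal)
var_laplace = statistics.pvariance(laplace)
print(f"normal variance  ~ {var_normal:.3f}")
print(f"laplace variance ~ {var_laplace:.3f}")
```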

In [7]:
# A value of -1.0 (lengthscale, or log_sigma for the RBF kernel) selects the "median heuristic".
rbf_kernel = kernels_torch.BasicRBFKernelFunction(device_obj=device_obj, log_sigma=-1.0)
matern_0_5 = kernels_torch.MaternKernelFunction(nu=0.5, device_obj=device_obj, lengthscale=-1.0)
matern_1_5 = kernels_torch.MaternKernelFunction(nu=1.5, device_obj=device_obj, lengthscale=-1.0)
matern_2_5 = kernels_torch.MaternKernelFunction(nu=2.5, device_obj=device_obj, lengthscale=-1.0)

# Each tuple is (initial-scale, kernel-function). If initial-scale is None, the scale is initialized randomly.
kernels_optimization = [(None, rbf_kernel), (None, matern_0_5), (None, matern_1_5), (None, matern_2_5)]
kernels_non_optimization = [rbf_kernel, matern_2_5]
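
As a reference for the "median heuristic" mentioned in the comment above, here is a minimal sketch of one common form of it: the kernel length scale is set to the median of all pairwise Euclidean distances. The library may use a variant (for example, distances over the pooled X and Y, squared distances, or a log-scale value), so treat the function `median_heuristic` as a hypothetical illustration.

```python
import math
import statistics
from itertools import combinations


def median_heuristic(samples):
    """Median of the pairwise Euclidean distances between all rows of `samples`."""
    distances = [math.dist(a, b) for a, b in combinations(samples, 2)]
    return statistics.median(distances)


# Three 2-D points with pairwise distances 5, 10 and 5.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(median_heuristic(points))
```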

`StatsTestEvaluator` runs all of the following steps automatically:

1. Optimization of the kernels.
2. Permutation tests.
3. A check of whether each test result matches our expectation.
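
To make steps 2 and 3 concrete, here is a minimal, self-contained sketch of a permutation test on a biased MMD² estimate. It is not the library's implementation: it skips the kernel optimization step, uses a fixed RBF kernel on 1-D data, and the names `rbf`, `mmd2` and `permutation_p_value` are hypothetical.

```python
import math
import random


def rbf(a: float, b: float, sigma: float = 1.0) -> float:
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def mmd2(gram, idx_x, idx_y) -> float:
    """Biased MMD^2 estimate from a precomputed Gram matrix."""
    def mean_k(rows, cols):
        return sum(gram[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
    return mean_k(idx_x, idx_x) + mean_k(idx_y, idx_y) - 2.0 * mean_k(idx_x, idx_y)


def permutation_p_value(x, y, n_permutations: int = 200, seed: int = 0) -> float:
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    # The Gram matrix is computed once; each permutation only re-indexes it.
    gram = [[rbf(a, b) for b in pooled] for a in pooled]
    indices = list(range(len(pooled)))
    n_x = len(x)
    observed = mmd2(gram, indices[:n_x], indices[n_x:])
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)  # random reassignment of points to the two groups
        if mmd2(gram, indices[:n_x], indices[n_x:]) >= observed:
            count += 1
    return (1 + count) / (1 + n_permutations)


data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 0.5) for _ in range(30)]
y_same = [data_rng.gauss(0.0, 0.5) for _ in range(30)]
y_diff = [data_rng.gauss(3.0, 0.5) for _ in range(30)]  # clearly shifted mean

p_same = permutation_p_value(x, y_same)
p_diff = permutation_p_value(x, y_diff)
print(f"p-value, X = Y : {p_same:.3f}")
print(f"p-value, X != Y: {p_diff:.3f}")
```

With a threshold of 0.05, the shifted sample should be rejected. For the sample from the same distribution, a p-value above the threshold is expected in roughly 95% of runs.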

In [8]:
test_eval = StatsTestEvaluator(candidate_kernels=kernels_optimization, 
                               kernels_no_optimization=kernels_non_optimization, 
                               device_obj=device_obj, 
                               num_epochs=num_epochs_selection, 
                               n_permutation_test=n_permutation_test)

At least one of the pairs `(y_train_same, y_eval_same)` or `(y_train_diff, y_eval_diff)` must be given.

In [9]:
stats_tests = test_eval.interface(code_approach='tests', 
                                  x_train=x_train,
                                  y_train_same=y_train_same,
                                  y_train_diff=y_train_diff,
                                  seq_x_eval=x_eval,
                                  seq_y_eval_same=y_eval_same,
                                  seq_y_eval_diff=y_eval_diff)

2021-08-26 14:01:27,499 - model_criticism_mmd.logger_unit - INFO - Set the initial scales value
  scales = torch.tensor(init_scale.clone().detach().cpu(), requires_grad=True, device=self.device_obj)
2021-08-26 14:01:27,504 - model_criticism_mmd.logger_unit - INFO - Getting median initial sigma value...
2021-08-26 14:01:27,593 - model_criticism_mmd.logger_unit - INFO - initial by median-heuristics 1.78 with is_log=True
2021-08-26 14:01:27,644 - model_criticism_mmd.logger_unit - INFO - Validation at 0. MMD^2 = 0.011146119449893632, ratio = [14.00113678] obj = [-2.63913852]
2021-08-26 14:01:28,391 - model_criticism_mmd.logger_unit - INFO -      5: [avg train] MMD^2 0.0054434180513573405 obj [-3.97209434] val-MMD^2 0.01120307338267923 val-ratio [13.54671787] val-obj [-2.60614429]  elapsed: 0.0
2021-08-26 14:01:29,713 - model_criticism_mmd.logger_unit - INFO -     25: [avg train] MMD^2 0.005681154364915841 obj [-4.01937891] val-MMD^2 0.01195289760922058 val-ratio [13.92881736] val-obj [-2.6

`TestResultGroupsFormatter` is a class that formats test results in a readable form.

In [10]:
test_formatter = TestResultGroupsFormatter(stats_tests)
df_results = test_formatter.format_result_table()
df_results_summary = test_formatter.format_result_summary_table()

`format_result_summary_table()` shows the test results for both the X=Y and X!=Y cases.

In [11]:
df_results_summary

| | test-key | X=Y_total | X=Y_pass | X=Y_error-1 | X=Y_error-2 | X!=Y_total | X!=Y_pass | X!=Y_error-1 | X=!Y_error-2 | kernel | length_scale | is_optimization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tests-BasicRBFKernelFunction-False | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | BasicRBFKernelFunction | 1.7842152488771497 | False |
| 1 | tests-BasicRBFKernelFunction-True | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | BasicRBFKernelFunction | 1.7842152488771497 | True |
| 2 | tests-MaternKernelFunction-nu=0.5-True | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | MaternKernelFunction-nu=0.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True |
| 3 | tests-MaternKernelFunction-nu=1.5-True | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | MaternKernelFunction-nu=1.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True |
| 4 | tests-MaternKernelFunction-nu=2.5-False | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | MaternKernelFunction-nu=2.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | False |
| 5 | tests-MaternKernelFunction-nu=2.5-True | 3 | 3 | 0 | 0 | 3 | 3 | 0 | 0 | MaternKernelFunction-nu=2.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True |
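
The summary can be read as a count of outcomes per test key, split by the ground truth. The sketch below reproduces that kind of aggregation on a few made-up records. The record layout, the key `"rbf-optimized"` and the function `summarize` are hypothetical, and they are not taken from the formatter's internals.

```python
from collections import Counter

# Hypothetical per-test records: (test_key, is_same_truth, outcome).
records = [
    ("rbf-optimized", True, "pass"),
    ("rbf-optimized", True, "pass"),
    ("rbf-optimized", False, "pass"),
    ("rbf-optimized", False, "error-2"),
]


def summarize(records):
    """Count totals and outcomes per test key, separately for X=Y and X!=Y."""
    counts = Counter()
    for test_key, is_same_truth, outcome in records:
        group = "X=Y" if is_same_truth else "X!=Y"
        counts[(test_key, f"{group}_total")] += 1
        counts[(test_key, f"{group}_{outcome}")] += 1
    return counts


summary = summarize(records)
for (test_key, column), value in sorted(summary.items()):
    print(test_key, column, value)
```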


`format_result_table()` shows the details of each test result.

In [12]:
df_results

| | codename_experiment | kernel | kernel_parameter | is_optimized | test_result | p_value | is_same_distribution_truth | is_same_distribution_test | ratio |
|---|---|---|---|---|---|---|---|---|---|
| 0 | tests | MaternKernelFunction-nu=0.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.93 | True | True | 39.38828 |
| 1 | tests | MaternKernelFunction-nu=1.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.9 | True | True | 24.767485 |
| 2 | tests | MaternKernelFunction-nu=2.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.96 | True | True | 21.734753 |
| 3 | tests | BasicRBFKernelFunction | 1.7842152488771497 | True | pass | 0.94 | True | True | 14.868143 |
| 4 | tests | BasicRBFKernelFunction | 1.7842152488771497 | False | pass | 0.86 | True | True | |
| 5 | tests | MaternKernelFunction-nu=2.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | False | pass | 0.93 | True | True | |
| 6 | tests | MaternKernelFunction-nu=0.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.26 | True | True | 39.38828 |
| 7 | tests | MaternKernelFunction-nu=1.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.25 | True | True | 24.767485 |
| 8 | tests | MaternKernelFunction-nu=2.5 | `[[tensor(5.0075, grad_fn=<UnbindBackward>)]]` | True | pass | 0.14 | True | True | 21.734753 |
| 9 | tests | BasicRBFKernelFunction | 1.7842152488771497 | True | pass | 0.17 | True | True | 14.868143 |
