# Validation of statistical tests

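We want to validate that our test settings are correct.

We prepare data for which the truth is known in advance: pairs with $X = Y$ (same distribution) and pairs with $X \neq Y$ (different distributions). We then run the tests on both. If a test's result agrees with the known truth, we regard the test setting as valid.

This notebook shows an example of how to validate a statistical test.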
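
The validity rule above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name `is_test_valid` and the strict `<` comparison against the threshold are assumptions made for the example.

```python
def is_test_valid(p_value: float, samples_differ: bool, threshold: float = 0.05) -> bool:
    """Return True when the test decision agrees with the known truth.

    Hypothetical helper for illustration; it is not part of model_criticism_mmd.
    """
    # Reject the null hypothesis (X = Y) when the p-value falls below the threshold.
    rejected = p_value < threshold
    # The test is valid when it rejects exactly in the cases where X != Y.
    return rejected == samples_differ


print(is_test_valid(0.40, samples_differ=False))  # True: no rejection, and X = Y
print(is_test_valid(0.01, samples_differ=True))   # True: rejection, and X != Y
print(is_test_valid(0.01, samples_differ=False))  # False: rejection, but X = Y
```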

In [10]:
import sys
sys.path.append("../")
sys.path.append(".")

In [11]:
from model_criticism_mmd import ModelTrainerTorchBackend, MMD, TwoSampleDataSet
from model_criticism_mmd import kernels_torch
from model_criticism_mmd import PermutationTest, SelectionKernels
from model_criticism_mmd.models.static import DEFAULT_DEVICE
from model_criticism_mmd.supports.evaluate_stats_tests import StatsTestEvaluator, TestResultGroupsFormatter

In [12]:
import torch
import numpy as np
import tqdm
import typing
%matplotlib inline
import matplotlib.pyplot as plt

In [13]:
N_DATA_SIZE = 500
N_FEATURE = 100
NOISE_MU_X = 0
NOISE_SIGMA_X = 0.5
NOISE_MU_Y = 0
NOISE_SIGMA_Y = 0.5
THRESHOLD_P_VALUE = 0.05

# The number of epochs should normally be > 500. A small value is used here for the example.
num_epochs_selection = 50
# The number of permutations should normally be > 500. A small value is used here for the example.
n_permutation_test = 100

In [14]:
device_obj = DEFAULT_DEVICE

In [15]:
x_train = torch.tensor(np.random.normal(NOISE_MU_X, NOISE_SIGMA_X, (N_DATA_SIZE, N_FEATURE)))
x_eval = [torch.tensor(np.random.normal(NOISE_MU_X, NOISE_SIGMA_X, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]
# Y "same": drawn from a normal distribution, like X.
y_train_same = torch.tensor(np.random.normal(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE)))
y_eval_same = [torch.tensor(np.random.normal(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]
# Y "diff": drawn from a Laplace distribution with the same location and scale parameters.
y_train_diff = torch.tensor(np.random.laplace(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE)))
y_eval_diff = [torch.tensor(np.random.laplace(NOISE_MU_Y, NOISE_SIGMA_Y, (N_DATA_SIZE, N_FEATURE))) for _ in range(3)]

In [16]:
# lengthscale=-1.0 is "median heuristic"
rbf_kernel = kernels_torch.BasicRBFKernelFunction(device_obj=device_obj, log_sigma=-1.0)
matern_0_5 = kernels_torch.MaternKernelFunction(nu=0.5, device_obj=device_obj, lengthscale=-1.0)
matern_1_5 = kernels_torch.MaternKernelFunction(nu=1.5, device_obj=device_obj, lengthscale=-1.0)
matern_2_5 = kernels_torch.MaternKernelFunction(nu=2.5, device_obj=device_obj, lengthscale=-1.0)

# Each tuple is (initial-scales, kernel-function). If initial-scales is None, the scales are initialized randomly.
kernels_optimization = [(None, rbf_kernel), (None, matern_0_5), (None, matern_1_5), (None, matern_2_5)]
kernels_non_optimization = [rbf_kernel, matern_2_5]
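
For readers unfamiliar with the "median heuristic" mentioned in the cell above, here is a minimal sketch of one common form of it: the kernel length scale is set to the median Euclidean distance between sample pairs. The function `median_heuristic` is hypothetical, and whether the library uses cross pairs, pooled pairs, or a log transform of the result is an assumption not confirmed here.

```python
import math
import statistics
from itertools import product


def median_heuristic(xs, ys):
    """Median Euclidean distance over all cross pairs (x, y).

    Illustrative only; the library's own definition may differ.
    """
    distances = [math.dist(x, y) for x, y in product(xs, ys)]
    return statistics.median(distances)


xs = [(0.0, 0.0), (1.0, 0.0)]
ys = [(0.0, 3.0), (4.0, 0.0)]
# Pairwise distances are 3, 4, sqrt(10) and 3, so the median is (3 + sqrt(10)) / 2.
sigma = median_heuristic(xs, ys)
print(round(sigma, 3))  # 3.081
```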

`StatsTestEvaluator` runs all of the following operations automatically:

1. Optimizing the kernels.
2. Running the permutation tests.
3. Deciding whether each test result matches our expectation.
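
To make step 2 concrete, here is a minimal, pure-Python sketch of a permutation test on a squared-MMD statistic. It is a sketch under stated assumptions, not the library's `PermutationTest`: it uses 1-D samples, a fixed RBF bandwidth, the biased MMD estimator, and the common `(count + 1) / (n + 1)` p-value convention. All function names are hypothetical.

```python
import math
import random


def rbf(a, b, sigma=1.0):
    """RBF kernel for scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))


def mmd2_biased(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD for 1-D samples."""
    def mean_kernel(us, vs):
        return sum(rbf(u, v, sigma) for u in us for v in vs) / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, n_permutations=200, seed=0):
    """Estimate a p-value by reshuffling the pooled sample."""
    rng = random.Random(seed)
    observed = mmd2_biased(xs, ys)
    pooled = list(xs) + list(ys)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Split the shuffled pool into two groups of the original sizes.
        stat = mmd2_biased(pooled[:len(xs)], pooled[len(xs):])
        if stat >= observed:
            count += 1
    # The +1 terms keep the estimated p-value strictly positive.
    return (count + 1) / (n_permutations + 1)


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(30)]
y_same = [rng.gauss(0.0, 1.0) for _ in range(30)]
y_diff = [rng.gauss(3.0, 1.0) for _ in range(30)]  # mean shifted by 3

p_same = permutation_p_value(x, y_same)
p_diff = permutation_p_value(x, y_diff)
print(f"p-value (same distribution): {p_same:.3f}")
print(f"p-value (shifted distribution): {p_diff:.3f}")
```

With a shift this large, the p-value for the shifted sample should fall well below 0.05. The p-value for the same-distribution sample varies with the random draw.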

In [17]:
test_eval = StatsTestEvaluator(candidate_kernels=kernels_optimization, 
                               kernels_no_optimization=kernels_non_optimization, 
                               device_obj=device_obj, 
                               num_epochs=num_epochs_selection, 
                               n_permutation_test=n_permutation_test)

At least one of the pairs `(y_train_same, y_eval_same)` or `(y_train_diff, y_eval_diff)` must be given. This example passes both.

In [18]:
stats_tests = test_eval.interface(code_approach='tests', 
                                  x_train=x_train,
                                  y_train_same=y_train_same,
                                  y_train_diff=y_train_diff,
                                  seq_x_eval=x_eval,
                                  seq_y_eval_same=y_eval_same,
                                  seq_y_eval_diff=y_eval_diff)

2021-08-27 13:12:12,140 - model_criticism_mmd.logger_unit - INFO - Set the initial scales value
  scales = torch.tensor(init_scale.clone().detach().cpu(), requires_grad=True, device=self.device_obj)
2021-08-27 13:12:12,141 - model_criticism_mmd.logger_unit - INFO - Getting median initial sigma value...
2021-08-27 13:12:12,176 - model_criticism_mmd.logger_unit - INFO - initial by median-heuristics 1.78 with is_log=True
2021-08-27 13:12:12,181 - model_criticism_mmd.logger_unit - INFO - Validation at 0. MMD^2 = 0.010267131798815776, ratio = [74.07250009] obj = [-4.30504434]
2021-08-27 13:12:12,337 - model_criticism_mmd.logger_unit - INFO -      5: [avg train] MMD^2 0.004867938004671057 obj [-3.84460782] val-MMD^2 0.010358365632511246 val-ratio [77.99871712] val-obj [-4.35669238]  elapsed: 0.0
2021-08-27 13:12:12,734 - model_criticism_mmd.logger_unit - INFO -     25: [avg train] MMD^2 0.005438670627568143 obj [-3.94215541] val-MMD^2 0.01096682363017909 val-ratio [109.6682363] val-obj [-4.6

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

`TestResultGroupsFormatter` is a class that formats the test results into readable tables.

In [None]:
test_formatter = TestResultGroupsFormatter(stats_tests)
df_results = test_formatter.format_result_table()
df_results_summary = test_formatter.format_result_summary_table()

`format_result_summary_table()` shows a summary of the test results for both cases, $X = Y$ and $X \neq Y$.

In [None]:
df_results_summary

`format_result_table()` shows the detailed test results.

In [None]:
df_results