Below, we demonstrate distribution calibration of probabilistic predictions over a continuous output.

First, we import the necessary modules. For this demo, we use the [California Housing Dataset](https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html).

In [None]:
import numpy as np  # np.linspace is used in the evaluation code
import torch
import sys
sys.path.append('../..')

from torchuq.transform.distcal_continuous import *

from torchuq.evaluate.distribution_cal import *
from torchuq.dataset.regression import *
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_absolute_error
from torchuq.evaluate import quantile as q_eval
from matplotlib import pyplot as plt

#uci_dataset = ["wine", "crime", "naval", "protein", "superconductivity"]

subset_uci = ["cal_housing"]



Below, we use Bayesian Ridge Regression as the base model. For each data point it predicts a probabilistic outcome, represented by the mean and standard deviation of a Gaussian outcome distribution.
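To make concrete what it means for such Gaussian predictions to be miscalibrated, here is a minimal sketch on synthetic data (not the housing data). It uses the probability integral transform (PIT): if the predicted Gaussians are calibrated, a central 90% interval should contain roughly 90% of the targets. The helpers `pit_values` and `central_coverage` are hypothetical names introduced only for this illustration and are not part of torchuq.

```python
import torch
from torch.distributions import Normal

def pit_values(mean, std, y):
    """CDF of each predicted Gaussian evaluated at its observed target."""
    return Normal(mean, std).cdf(y)

def central_coverage(pit, level=0.9):
    """Fraction of targets that fall inside the central `level` interval."""
    lo, hi = (1 - level) / 2, (1 + level) / 2
    return ((pit >= lo) & (pit <= hi)).float().mean().item()

torch.manual_seed(0)
n = 20000
mean = torch.zeros(n)
y = mean + 2.0 * torch.randn(n)  # synthetic targets, true noise std is 2

# Predictions with the correct std versus an overconfident (too small) std
pit_good = pit_values(mean, torch.full((n,), 2.0), y)
pit_over = pit_values(mean, torch.full((n,), 0.5), y)

print(f"90% interval coverage, correct std:       {central_coverage(pit_good):.3f}")
print(f"90% interval coverage, overconfident std: {central_coverage(pit_over):.3f}")
```

The overconfident predictions cover far fewer than 90% of the targets, which is the kind of gap a recalibrator is meant to close.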

We use an object of the **DistCalibrator** class to train a recalibrator that takes the probabilistic outcome from the base model and outputs a recalibrated distribution, parameterized by a fixed number of equispaced quantiles. In this example, we use 20 equispaced quantiles to featurize the outcome distribution.
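As a rough illustration of what featurizing a Gaussian by equispaced quantiles can look like, the sketch below evaluates the Gaussian inverse CDF at evenly spaced levels. The helper `normal_to_quantiles` and the choice of midpoint levels (0.025, 0.075, ..., 0.975 for 20 quantiles) are assumptions made for this example; the levels used internally by `convert_normal_to_quantiles` in torchuq may differ.

```python
import torch
from torch.distributions import Normal

def normal_to_quantiles(mean, std, num_quantiles):
    """Hypothetical helper: one row of quantiles per Gaussian prediction.

    Assumes midpoint levels (i + 0.5) / num_quantiles so that no level is
    exactly 0 or 1, where the Gaussian quantile would be infinite.
    """
    levels = (torch.arange(num_quantiles, dtype=mean.dtype) + 0.5) / num_quantiles
    z = Normal(0.0, 1.0).icdf(levels)  # standard normal quantiles, shape [num_quantiles]
    # Shift and scale: quantile of N(mean, std) is mean + std * z
    return mean.reshape(-1, 1) + std.reshape(-1, 1) * z.reshape(1, -1)

mean = torch.tensor([0.0, 10.0])
std = torch.tensor([1.0, 2.0])
quantiles = normal_to_quantiles(mean, std, 20)
print(quantiles.shape)  # one row per prediction, one column per quantile level
```

Each row is increasing and symmetric around its mean, so a wider predicted standard deviation shows up as more widely spread quantile values.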

We use an independent calibration dataset to train the DistCalibrator. We evaluate the quality of the probabilistic uncertainty using the check score and the calibration score, as defined [here](https://arxiv.org/pdf/2112.07184).
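For intuition, the sketch below implements one common form of each metric on synthetic data: the check score as the mean pinball loss over quantile levels, and the calibration score as the mean squared gap between each nominal level and the empirical fraction of targets below the predicted quantile. The function names, the uniform weighting, and the midpoint levels are assumptions for illustration; the exact definitions in the linked paper and in torchuq may differ.

```python
import torch
from torch.distributions import Normal

def check_score(quantiles, y, levels):
    """Mean pinball (check) loss over samples and quantile levels."""
    diff = y.reshape(-1, 1) - quantiles  # shape [n, k]
    return torch.maximum(levels * diff, (levels - 1) * diff).mean().item()

def calibration_score(quantiles, y, levels):
    """Mean squared gap between nominal levels and empirical frequencies."""
    empirical = (y.reshape(-1, 1) <= quantiles).float().mean(dim=0)
    return ((empirical - levels) ** 2).mean().item()

torch.manual_seed(0)
n, k = 5000, 20
levels = (torch.arange(k, dtype=torch.float32) + 0.5) / k  # assumed midpoint levels
y = torch.randn(n)                                         # synthetic targets ~ N(0, 1)
z = Normal(0.0, 1.0).icdf(levels)

q_good = torch.zeros(n, 1) + 1.0 * z    # quantiles of the true distribution
q_narrow = torch.zeros(n, 1) + 0.3 * z  # overconfident quantiles

for label, q in [("well calibrated", q_good), ("overconfident", q_narrow)]:
    print(f"{label}: check={check_score(q, y, levels):.4f}, "
          f"calibration={calibration_score(q, y, levels):.4f}")
```

Lower is better for both scores: the calibration score measures only whether the quantile levels match observed frequencies, while the check score also rewards sharp, accurate quantiles.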

In [None]:
# Number of evaluation buckets
num_buckets=20

for name in subset_uci:
	# 60% Train, 20% Calibration, 20% Test dataset
	dataset = get_regression_datasets(name, val_fraction=0.2, test_fraction=0.2, split_seed=0, normalize=True, verbose=True)
	
	train_dataset, cal_dataset, test_dataset = dataset
	X_train, y_train = train_dataset[:][0], train_dataset[:][1]
	X_cal, y_cal = cal_dataset[:][0], cal_dataset[:][1]
	X_test, y_test = test_dataset[:][0], test_dataset[:][1]
	
	# Fit Bayesian Ridge Regression; for each data point it predicts a Gaussian outcome parameterized by its mean and std deviation
	reg = BayesianRidge().fit(X_train, y_train)
	print(f"Coeff of determination (R^2) on Train: {reg.score(X_train, y_train):.2}")
	print(f"Coeff of determination (R^2) on Test: {reg.score(X_test, y_test):.2}")
	


	# Predict mean and std deviation of the outcome distribution on the calibration and test datasets 
	mean_cal, std_dev_cal = reg.predict(X_cal.numpy(), return_std=True)
	mean_cal, std_dev_cal = torch.Tensor(mean_cal), torch.Tensor(std_dev_cal)

	mean_test, std_dev_test = reg.predict(X_test.numpy(), return_std=True)
	mean_test, std_dev_test = torch.Tensor(mean_test), torch.Tensor(std_dev_test)

	params_cal = torch.cat((mean_cal.reshape(-1, 1), std_dev_cal.reshape(-1, 1)), axis=1)
	params_test = torch.cat((mean_test.reshape(-1, 1), std_dev_test.reshape(-1, 1)), axis=1)

	# Convert probabilistic predictions to quantiles
	quantiles_cal = convert_normal_to_quantiles(mean_cal, std_dev_cal, num_buckets)
	quantiles_test = convert_normal_to_quantiles(mean_test, std_dev_test, num_buckets)

	

	# Use the DistCalibrator class and train it on the calibration dataset
	# Here, the recalibrator uses a fixed number of equispaced quantiles as featurization of the probabilistic outcome
	calibrator = DistCalibrator(num_buckets = num_buckets, quantile_input=True, verbose=True)
	calibrator.train(quantiles_cal, torch.Tensor(y_cal))

	# Use the code below instead if you featurize the Gaussian outcome by its parameters (mean and std deviation)
	# calibrator = DistCalibrator(quantile_input=False, verbose=True)
	# calibrator.train(params_cal, torch.Tensor(y_cal))

	# Evaluation: compare check scores and weighted calibration scores before and after recalibration
	print("="*25)
	check_score_before, check_score_after = comparison_quantile_check_score(quantiles_cal, torch.Tensor(y_cal), np.linspace(0, 1, num_buckets), model=calibrator)

	print(f"[Calibration Split] Check score before calibration={check_score_before}, Check score after calibration={check_score_after}")


	cal_score_before, cal_score_after = comparison_quantile_calibration_scores(quantiles_cal, torch.Tensor(y_cal), np.linspace(0, 1, num_buckets), model=calibrator)

	print(f"[Calibration Split] Calibration score before calibration={cal_score_before}, Calibration score after calibration={cal_score_after}")

	print("="*25)

	check_score_before, check_score_after = comparison_quantile_check_score(quantiles_test, torch.Tensor(y_test), np.linspace(0, 1, num_buckets), model=calibrator)

	print(f"[Test Split] Check score before calibration={check_score_before}, Check score after calibration={check_score_after}")

	cal_score_before, cal_score_after = comparison_quantile_calibration_scores(quantiles_test, torch.Tensor(y_test), np.linspace(0, 1, num_buckets), model=calibrator)
	
	print(f"[Test Split] Calibration score before calibration={cal_score_before}, Calibration score after calibration={cal_score_after}")
	print("="*25)



Loading dataset cal_housing....
Splitting into train/val/test with 12384/4128/4128 samples
Done loading dataset cal_housing
Coeff of determination (R^2) on Train: 0.61
Coeff of determination (R^2) on Test: 0.6


100%|██████████| 1500/1500 [01:51<00:00, 13.48it/s]


[Calibration Split] Check score before calibration=3.1939690113067627, Check score after calibration=2.9653549194335938
[Calibration Split] Calibration score before calibration=0.04242164668846845, Calibration score after calibration=0.009161099773619456
[Test Split] Check score before calibration=3.199695587158203, Check score after calibration=2.962188243865967
[Test Split] Calibration score before calibration=0.03905351046967815, Calibration score after calibration=0.009016458873366034
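To put the test-split numbers printed above in perspective, the short sketch below computes the relative reduction in each score. The values are copied from the output above; `relative_reduction` is a hypothetical helper introduced only for this arithmetic.

```python
def relative_reduction(before, after):
    """Fraction by which a lower-is-better score decreased."""
    return (before - after) / before

# Test-split scores copied from the printed output above
check_before, check_after = 3.199695587158203, 2.962188243865967
cal_before, cal_after = 0.03905351046967815, 0.009016458873366034

check_gain = relative_reduction(check_before, check_after)
cal_gain = relative_reduction(cal_before, cal_after)

print(f"Check score reduced by {check_gain:.1%}")
print(f"Calibration score reduced by {cal_gain:.1%}")
```

In this run the check score falls by roughly 7% and the calibration score by roughly 77% on the test split, and the calibration-split numbers are similar.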
