# Using GaussianProcessIntervalArithmetic (GPIA)

If you haven't already, read the README file before working through this notebook.  It gives a brief introduction to interval arithmetic and the motivation for implementing Gaussian process regression with it.

In this notebook we use GPIA to obtain interval arithmetic predictions from a Gaussian process.  We compare the output of GPIA to the output of the Gaussian process regressor in scikit-learn (which uses ordinary floating point and does not track intervals).  As expected, the output of the scikit-learn regressor lies in the rigorous interval produced by GPIA for every test point.
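
As a small illustration of what "rigorous interval" means, the sketch below adds two floating point numbers and encloses the result in an interval whose endpoints are pushed outward by one unit in the last place. It is not taken from GPIA; the helper names `point` and `add_outward` are hypothetical, and it assumes Python 3.9 or later for `math.nextafter`.

```python
import math
from fractions import Fraction

def point(x):
    """Represent a single float as a degenerate interval (lo, hi)."""
    return (x, x)

def add_outward(a, b):
    """Add two intervals, nudging each endpoint outward by one ulp.

    Floating point addition rounds to the nearest float, so stepping one
    float down (lower endpoint) and one float up (upper endpoint) is
    enough to enclose the exact sum.
    """
    lo = math.nextafter(a[0] + b[0], -math.inf)
    hi = math.nextafter(a[1] + b[1], math.inf)
    return (lo, hi)

enclosure = add_outward(point(0.1), point(0.2))

# Exact rational sum of the two floats actually stored for 0.1 and 0.2
exact = Fraction(0.1) + Fraction(0.2)

print(0.1 + 0.2)   # ordinary floating point result
print(enclosure)   # interval guaranteed to contain the exact sum
print(Fraction(enclosure[0]) <= exact <= Fraction(enclosure[1]))
```

An interval arithmetic library applies this kind of outward rounding to every operation, so the final interval is guaranteed to contain the exact result of the whole computation.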

In [7]:
# importing Gaussian Process Interval Arithmetic Regressor
from GPIA import *

# Import sklearn GPR to compare
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# for drawing points to test algorithms
import random

## Function Approximation

Gaussian processes (GPs) are a type of machine learning model.  In this notebook we use a GP to approximate the function

$$f(x,y,z) = (3xy, 5z^2)$$

There is nothing special about this function; we just need something for our model to learn.  We randomly sample inputs from a box-shaped domain and evaluate the function at them to produce the training data for the GP.


In [31]:
# function to approximate
def f(x,y,z):
    return [3 * x * y, 5 * z**2]

# randomly sampling function

# Set a seed for reproducibility (optional)
random.seed(42)

# parameters for region of analysis
lower_bounds = [0,0,0]
upper_bounds = [1, 2, 3]

# Generate training data
num_points = 200
X_train = [[random.uniform(lower_bounds[i],upper_bounds[i]) for i in range(len(lower_bounds))] for _ in range(num_points)]
# Evaluate f at each training input (the original referenced an undefined name X)
Y_train = [f(*x) for x in X_train]

## Training the GPs

In [35]:
# setting up GP algorithms
# set length scale 
tau = 5
# set noise variance
sig_sq = 1

# interval arithmetic:
gpia = GPIA_Regressor(tau = tau, sig_sq = sig_sq)
gpia.fit(X_train, Y_train)

# sklearn:
kernel = RBF(length_scale = tau, length_scale_bounds = "fixed")
gp = GaussianProcessRegressor(kernel=kernel, alpha = sig_sq)
gp.fit(X_train, Y_train)
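
For reference, with a fixed RBF kernel and the noise term `alpha`, the mean prediction returned by scikit-learn's regressor (with its default zero prior mean) is $K_*(K + \alpha I)^{-1} Y$. The sketch below writes that formula out in plain NumPy floating point. It is an illustration only: the function names `rbf_kernel` and `posterior_mean` are hypothetical, and it is an assumption, not something shown in this notebook, that GPIA evaluates the same expression with interval operations.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    """RBF kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 * length_scale^2))."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))

def posterior_mean(X_train, Y_train, X_new, length_scale, noise_var):
    """GP posterior mean K_* (K + noise_var * I)^{-1} Y with a zero prior mean."""
    K = rbf_kernel(X_train, X_train, length_scale)
    K_star = rbf_kernel(X_new, X_train, length_scale)
    Y = np.asarray(Y_train, dtype=float)
    weights = np.linalg.solve(K + noise_var * np.eye(len(K)), Y)
    return K_star @ weights

# Tiny check: one training point, predicted at that same point.
# Here K = [[1]] and K_* = [[1]], so the mean is Y / (1 + noise_var).
demo = posterior_mean([[0.0, 0.0, 0.0]], [[2.0, 4.0]], [[0.0, 0.0, 0.0]],
                      length_scale=5, noise_var=1)
print(demo)  # [[1. 2.]]
```

Every step here (the exponential, the linear solve, the matrix product) introduces floating point rounding, which is the error an interval arithmetic implementation is meant to bound.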

## Gathering Test Data

In [48]:
# Generate test data: random points in the same domain as the training data
num_test_points = 3
X_test = [[random.uniform(lower_bounds[i],upper_bounds[i]) for i in range(len(lower_bounds))] for _ in range(num_test_points)]


## Comparison of scikit-learn and interval arithmetic

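We compare the output of GPIA and scikit-learn's GP regression.  Notice that for each test point, the output of sklearn's predictor is a pair of scalars, one for each component of $f$.  The output of GPIA, on the other hand, is a pair of intervals; each of these intervals contains the corresponding scalar from the sklearn pair.  Keep in mind that NumPy prints the sklearn values rounded to eight decimal places, so the printed digits can differ from the interval endpoints in the last displayed place.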

In [49]:
for x in X_test:
    print('**************************************')
    print('')
    
    # sklearn
    skoutput = gp.predict([x])
    print('sklearn output: %s' %skoutput)
    print('')
    
    # interval arithmetic
    inoutput = gpia.predict(x)
    print('Interval Arithmetic output: %s' %inoutput)
    print('                ')

    print('**************************************')


**************************************

sklearn output: [[ 1.63792083 22.49981561]]

Interval Arithmetic output: [interval([1.6379208278699449, 1.6379208278700257]), interval([22.49981561138933, 22.499815611390073])]
                
**************************************
**************************************

sklearn output: [[ 2.65095585 20.05278707]]

Interval Arithmetic output: [interval([2.6509558510214055, 2.650955851021488]), interval([20.05278706973838, 20.052787069739125])]
                
**************************************
**************************************

sklearn output: [[ 2.72121346 12.13876642]]

Interval Arithmetic output: [interval([2.72121345711083, 2.7212134571109106]), interval([12.1387664244903, 12.138766424490962])]
                
**************************************
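
The printed results can be checked mechanically. The sketch below copies the numbers from the output above into plain tuples (it does not use GPIA's interval type) and tests that each printed sklearn value agrees with the corresponding interval up to half a unit in the eighth decimal place, which is the precision of the printed sklearn values. The names `consistent` and `PRINT_TOL` are hypothetical.

```python
# Values copied from the printed output above.
sklearn_printed = [
    (1.63792083, 22.49981561),
    (2.65095585, 20.05278707),
    (2.72121346, 12.13876642),
]
gpia_intervals = [
    ((1.6379208278699449, 1.6379208278700257), (22.49981561138933, 22.499815611390073)),
    ((2.6509558510214055, 2.650955851021488), (20.05278706973838, 20.052787069739125)),
    ((2.72121345711083, 2.7212134571109106), (12.1387664244903, 12.138766424490962)),
]

PRINT_TOL = 0.5e-8  # half a unit in the 8th decimal place

def consistent(value, interval, tol=PRINT_TOL):
    """True if value lies in the interval, allowing for print rounding."""
    lo, hi = interval
    return lo - tol <= value <= hi + tol

all_consistent = all(
    consistent(value, interval)
    for values, intervals in zip(sklearn_printed, gpia_intervals)
    for value, interval in zip(values, intervals)
)
max_width = max(hi - lo for intervals in gpia_intervals for lo, hi in intervals)

print('all printed values consistent with intervals: %s' % all_consistent)
print('widest interval: %s' % max_width)
```

The widest interval here is narrower than `1e-12`, so for this problem the rigorous bounds cost very little in precision.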
