<img src="./static/AI_background_v4_4_960px.jpg" width=300/>

# AI and Safety

A tutorial on the Design-of-experiments (DOE) methods presented in DNV's [AI and safety](https://ai-and-safety.dnvgl.com/#sec-doe) position paper, implemented under the [RaPiD-models research project](https://rapid-models.dnvgl.com/).

## Import packages

In [12]:
# Load the autoreload extension
%load_ext autoreload

# Autoreload reloads modules before executing code
# 0: disable
# 1: reload modules imported with %aimport
# 2: reload all modules, except those excluded by %aimport
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [2]:
import numpy as np
import scipy as sp
import plotly
import plotly.graph_objs as go

# Import Gaussian process from scikit-learn (https://scikit-learn.org/)
import sklearn.gaussian_process as sklgp

# Import rapid_models DOE packages
import rapid_models.doe as doe
import rapid_models.doe.adaptive_learning as doe_al

## Define black-box function
This black-box function simulates an expensive or slow process whose response we want to explore.

In [3]:
def black_box_func(x,y):
    """A non-linear "black box" function used to 
    illustrate aspects of Design-of-experiments
    
    """
    return 0.4*np.sin(x*6)*np.sin(2*y)+x*y+0.05*np.sin(7*x)+0.1*x

## One-at-a-time (oat) exploration
Use the one-at-a-time / full factorial function to probe the black-box function at 15 locations (3 in x times 5 in y). We explore the range $x,y\in[0,1]$ and set the lower and upper bounds (`LBs` and `UBs`) of the points to 0.1 and 0.9.



In [4]:
# Use the One-at-a-time function doe.fullfact_with_bounds() to probe the black box function at 15 locations (3 in x times 5 in y)
LBs=[0.1,0.1]
UBs=[0.9,0.9]
N_xi=[3,5]
X_oat=np.asarray(doe.fullfact_with_bounds(LBs, UBs, N_xi))
X_oat=X_oat[X_oat[:,0].argsort()]
y_oat=black_box_func(X_oat[:,0],X_oat[:,1])
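For readers without `rapid_models` installed, the grid above can be approximated in plain NumPy. This is a minimal sketch, assuming that `doe.fullfact_with_bounds()` places evenly spaced levels between the lower and upper bound in each dimension. The helper name `fullfact_sketch` is hypothetical, and the row ordering may differ from the library's.

```python
import numpy as np

def fullfact_sketch(lbs, ubs, n_levels):
    """Hypothetical stand-in: evenly spaced full factorial grid within bounds."""
    # one evenly spaced axis per dimension
    axes = [np.linspace(lb, ub, n) for lb, ub, n in zip(lbs, ubs, n_levels)]
    # all combinations of the axis levels, one row per design point
    grid = np.meshgrid(*axes, indexing="ij")
    return np.column_stack([g.ravel() for g in grid])

X_grid = fullfact_sketch([0.1, 0.1], [0.9, 0.9], [3, 5])
print(X_grid.shape)  # (15, 2)
```

With 3 levels in x and 5 in y, the points sit at x = 0.1, 0.5, 0.9 and y = 0.1, 0.3, 0.5, 0.7, 0.9.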

In [5]:
# make x,y grid for plotting surfaces
mx, my = np.mgrid[0:1:20j,0:1:20j]
# calculate true function for surface plotting
Xsurf=np.vstack([mx.flatten(),my.flatten()]).T
# calculate true surface
mz=black_box_func(mx.flatten(), my.flatten()).reshape(mx.shape)


# Plot the result
fig = go.FigureWidget()

fig.add_surface(x=mx, y=my, z=mz, name='true function',
                colorscale='Gray', opacity=0.5, showscale=False,
               colorbar={'title':'std','len':0.5, 'thickness':20})
fig.add_scatter3d(x=X_oat[:,0], y=X_oat[:,1], z=y_oat, mode='markers',
                  marker={'symbol':'circle', 'color':'black', 'size':3},
                  name='OAT obs', showlegend=True)

fig.update_layout(scene={'xaxis':{'range':[0,1]},
                         'yaxis':{'range':[0,1]},
                         'zaxis':{'range':[0,1]},
                         'aspectratio':{'x':1,'y':1,'z':1},
                         'camera':{
                            'eye':{'x':-0.7, 'y':-2., 'z':0.3}}
                        })

fig.show()

## Train a Gaussian Process regression model
Use scikit-learn's Gaussian Process (GP) regressor to train a GP model based on the above observations

[https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html#sklearn.gaussian_process.GaussianProcessRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html#sklearn.gaussian_process.GaussianProcessRegressor)

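and select a kernel with a prior isotropic length scale of one third of the input range (i.e. 1/3). The cell below uses an RBF kernel and keeps the Matérn 5/2 kernel, documented at the following link, as a commented-out alternative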

[https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Matern.html#sklearn.gaussian_process.kernels.Matern](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Matern.html#sklearn.gaussian_process.kernels.Matern)

In [6]:
length_scale=1/3
# RBF kernel; Matérn 5/2 alternative: sklgp.kernels.Matern(length_scale=length_scale, nu=5/2)
kernel = sklgp.kernels.RBF(length_scale=length_scale)
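To see how the two kernels mentioned here differ, the sketch below evaluates both correlation functions at a few distances, using the standard closed forms for the RBF and Matérn 5/2 kernels with unit variance. The function names `rbf` and `matern52` are illustrative and not part of any library.

```python
import numpy as np

def rbf(r, length_scale):
    """Squared-exponential (RBF) correlation at distance r."""
    return np.exp(-0.5 * (r / length_scale) ** 2)

def matern52(r, length_scale):
    """Matérn correlation with nu = 5/2 at distance r."""
    s = np.sqrt(5.0) * r / length_scale
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

r = np.linspace(0.0, 1.0, 5)
for ri, kr, km in zip(r, rbf(r, 1 / 3), matern52(r, 1 / 3)):
    print(f"r={ri:.2f}  RBF={kr:.3f}  Matern52={km:.3f}")
```

Both correlations equal 1 at zero distance and decay as points move apart. The RBF kernel yields infinitely differentiable sample paths, whereas Matérn 5/2 sample paths are only twice differentiable, which is often considered a less restrictive smoothness assumption.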

In [7]:
def train_gp(X_train, y_train, kernel, noise=1e-10, n_restart_optimizer=5, random_state=42):
    """
    Train a scikit-learn Gaussian process model (gpm)
    
    **Parameters:**
    X_train: training input data, array-like of shape (n_samples, n_dimensions)
    y_train: training output data, array-like of shape (n_samples)
    kernel:  GP kernel
    noise:   Interpreted as the variance of additional Gaussian measurement noise on the training observations.
             Value will be added to the diagonal of the kernel matrix
    n_restart_optimizer: Number of restarts of the optimizer
    random_state: Seed for the random number generator used to draw the starting points of the optimizer restarts. Pass an int for reproducible results across multiple function calls.
    """
   
    gpm = sklgp.GaussianProcessRegressor(alpha=noise, kernel=kernel, n_restarts_optimizer=n_restart_optimizer, random_state=random_state)
    gpm.fit(X_train, y_train)
    return gpm
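The role of the `noise` term can be illustrated with a small NumPy sketch of the GP posterior. It assumes a zero-mean GP with a fixed RBF kernel and, unlike `GaussianProcessRegressor.fit`, does no hyperparameter optimization. The names `rbf_kernel` and `gp_posterior` are hypothetical helpers.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1 / 3):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_train, y_train, X_new, noise=1e-10):
    """Zero-mean GP posterior mean and std with fixed kernel hyperparameters."""
    # the noise variance is added to the diagonal of the training kernel matrix
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_new)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    # the prior variance is 1 for the RBF kernel; subtract what the data explain
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

X_demo = np.array([[0.1, 0.1], [0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.9, 0.9]])
y_demo = X_demo[:, 0] * X_demo[:, 1]
mean_demo, std_demo = gp_posterior(X_demo, y_demo, X_demo)
```

With a noise variance as small as `1e-10`, the posterior mean essentially interpolates the training observations and the predicted standard deviation at those points is close to zero.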
  

In [8]:
# Train gpm on oat data
gpm_oat = train_gp(X_oat, y_oat, kernel)

# Predict surface based on fitted GP
mu_oat,std_oat = gpm_oat.predict(Xsurf, return_std=True)

# Plot the result
fig = go.FigureWidget()

fig.add_surface(x=mx, y=my, z=mu_oat.reshape(mx.shape), name='GP oat',
                colorscale='Reds', surfacecolor=std_oat.reshape(mx.shape), opacity=0.5, showscale=True,
               colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_scatter3d(x=X_oat[:,0], y=X_oat[:,1], z=y_oat, mode='markers',
                  marker={'symbol':'circle', 'color':'red', 'size':3},
                  name='OAT obs', showlegend=True)

# update layout to set view
fig.update_layout(
                  scene={'xaxis':{'range':[0,1]},
                         'yaxis':{'range':[0,1]},
                         'zaxis':{'range':[0,1]},
                         'aspectratio':{'x':1,'y':1,'z':1},
                         'camera':{
                            'eye':{'x':-0.7, 'y':-2., 'z':0.3}}
                        })

fig.show()


lbfgs failed to converge (status=2):
ABNORMAL_TERMINATION_IN_LNSRCH.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html



## Latin-Hypercube
Use the rapid_models.doe package to create a Latin hypercube sample set of 15 samples in 2 dimensions. (The lhs function is based on the pyDOE2 package; the methods are documented at [https://pythonhosted.org/pyDOE/](https://pythonhosted.org/pyDOE/).)

In [9]:
LBs=[0.0,0.0]
UBs=[1.0,1.0]
X_lhs = doe.lhs_with_bounds(2, 15, LBs, UBs)
y_lhs = black_box_func(X_lhs[:,0],X_lhs[:,1])
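The idea behind Latin hypercube sampling can be sketched in plain NumPy: each dimension is split into as many equal-width strata as there are samples, and every stratum is hit exactly once. This is an illustrative sketch only. `lhs_sketch` is a hypothetical helper whose argument order mirrors the call above, and the library may apply additional options (such as centering or maximin criteria) that are not shown here.

```python
import numpy as np

def lhs_sketch(n_dim, n_samples, lbs, ubs, seed=None):
    """Hypothetical stand-in: basic Latin hypercube sample within bounds."""
    rng = np.random.default_rng(seed)
    # one uniform draw inside each of n_samples equal-width strata, per dimension
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dim))) / n_samples
    # shuffle the strata independently in each dimension
    for j in range(n_dim):
        u[:, j] = rng.permutation(u[:, j])
    lbs, ubs = np.asarray(lbs, dtype=float), np.asarray(ubs, dtype=float)
    # scale the unit-cube sample to the requested bounds
    return lbs + u * (ubs - lbs)

X_lhs_demo = lhs_sketch(2, 15, [0.0, 0.0], [1.0, 1.0], seed=0)
```

Compared with the 3 x 5 grid, which probes only 3 distinct x values and 5 distinct y values, this design yields 15 distinct values along each axis for the same number of function evaluations.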

In [10]:
# Train gpm on lhs data
gpm_lhs = train_gp(X_lhs, y_lhs, kernel)

# Predict surface based on fitted GP
mu_lhs,std_lhs = gpm_lhs.predict(Xsurf, return_std=True)

# Plot the result
fig = go.FigureWidget()

fig.add_surface(x=mx, y=my, z=mu_oat.reshape(mx.shape), name='GP oat',
                colorscale='Reds', surfacecolor=std_oat.reshape(mx.shape), opacity=0.5, showscale=False,
               colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_surface(x=mx, y=my, z=mu_lhs.reshape(mx.shape), name='GP lhs',
                colorscale='Oranges', surfacecolor=std_lhs.reshape(mx.shape), opacity=0.5, showscale=True,
               colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_scatter3d(x=X_oat[:,0], y=X_oat[:,1], z=y_oat, mode='markers',
                  marker={'symbol':'circle', 'color':'red', 'size':2},
                  name='OAT obs', showlegend=True)

fig.add_scatter3d(x=X_lhs[:,0], y=X_lhs[:,1], z=y_lhs, mode='markers',
                  marker={'symbol':'circle', 'color':'orange', 'size':3},
                  name='LHS obs', showlegend=True)

# update layout to set view
fig.update_layout(
                  scene={'xaxis':{'range':[0,1]},
                         'yaxis':{'range':[0,1]},
                         'zaxis':{'range':[0,1]},
                         'aspectratio':{'x':1,'y':1,'z':1},
                         'camera':{
                            'eye':{'x':-0.7, 'y':-2., 'z':0.3}}
                        })

fig.show()

## Adaptive learning - reducing local uncertainty

In [11]:
n_init=5
X_init=doe.lhs_with_bounds(2, n_init, LBs, UBs)
y_init=black_box_func(X_init[:,0],X_init[:,1])

# train initial GP to initial observations
gpm_al1 = train_gp(X_init, y_init, kernel)

# large sample to explore objective function
X = doe.lhs_with_bounds(2, 1000, LBs, UBs)
    
n_AL_iterations=10
X_al1=None
y_al1=None
for q in range(n_AL_iterations):
      
    # predict output for the large sample to explore the objective function
    mu,std = gpm_al1.predict(X, return_std=True) 
    
    # select the index ixs of the point with the highest predicted std
    ixs, _ = doe_al.AL_McKay92_idx(std, nNew=1)
    
    # add the selected X[ixs] to X_al1
    if X_al1 is None:
        X_al1=X[ixs]
    else:
        X_al1=np.concatenate([X_al1, X[ixs]])
    
    # observe black-box-function at selected input
    if y_al1 is None:
        y_al1 = black_box_func(X[ixs,0],X[ixs,1])
    else:
        y_al1 = np.concatenate([y_al1,black_box_func(X[ixs,0],X[ixs,1])])
        
    # train model on initial and adaptively selected inputs
    gpm_al1 = train_gp(np.concatenate([X_init, X_al1]), np.concatenate([y_init, y_al1]), kernel)

IndexError: tuple index out of range
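The selection step in the loop above picks the candidate with the highest predicted standard deviation. A minimal NumPy sketch of that rule is given below. `select_max_std` is a hypothetical helper and is not the `doe_al.AL_McKay92_idx` implementation, whose return values and argument handling are not shown in this notebook.

```python
import numpy as np

def select_max_std(std, n_new=1):
    """Hypothetical helper: indices of the n_new candidates with the largest std."""
    order = np.argsort(std)[::-1]  # candidates sorted by decreasing uncertainty
    return order[:n_new]

std_demo = np.array([0.10, 0.50, 0.30, 0.05])
X_cand = np.array([[0.0, 0.0], [0.2, 0.8], [0.6, 0.4], [1.0, 1.0]])
ixs_demo = select_max_std(std_demo, n_new=1)
X_new = X_cand[ixs_demo]  # 2-D array of shape (1, 2)
```

Returning an index array rather than a scalar keeps `X_cand[ixs_demo]` two-dimensional, so it can be passed to `np.concatenate` together with the earlier observations.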

In [None]:
# Predict surface based on fitted GP
mu_al1,std_al1 = gpm_al1.predict(Xsurf, return_std=True)

# Plot the result
fig = go.FigureWidget()

# fig.add_surface(x=mx, y=my, z=mu_oat.reshape(mx.shape), name='GP oat',
#                 colorscale='Reds', surfacecolor=std_oat.reshape(mx.shape), opacity=0.5, showscale=False,
#                colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_surface(x=mx, y=my, z=mu_lhs.reshape(mx.shape), name='GP lhs',
                colorscale='Oranges', surfacecolor=std_lhs.reshape(mx.shape), opacity=0.5, showscale=False,
               colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_surface(x=mx, y=my, z=mu_al1.reshape(mx.shape), name='GP al1',
                colorscale='Blues', surfacecolor=std_al1.reshape(mx.shape), opacity=0.5, showscale=True,
               colorbar={'title':'std','len':0.5, 'thickness':20})

# fig.add_scatter3d(x=X_oat[:,0], y=X_oat[:,1], z=y_oat, mode='markers',
#                   marker={'symbol':'circle', 'color':'red', 'size':2},
#                   name='OAT obs', showlegend=True)

fig.add_scatter3d(x=X_lhs[:,0], y=X_lhs[:,1], z=y_lhs, mode='markers',
                  marker={'symbol':'circle', 'color':'orange', 'size':2},
                  name='LHS obs', showlegend=True)

fig.add_scatter3d(x=X_init[:,0], y=X_init[:,1], z=y_init, mode='markers',
                  marker={'symbol':'circle', 'color':'black', 'size':3},
                  name='LHS init obs', showlegend=True)

fig.add_scatter3d(x=X_al1[:,0], y=X_al1[:,1], z=y_al1, mode='markers',
                  marker={'symbol':'circle', 'color':'blue', 'size':3},
                  name='al1 obs', showlegend=True)


# update layout to set view
fig.update_layout(
                  scene={'xaxis':{'range':[0,1]},
                         'yaxis':{'range':[0,1]},
                         'zaxis':{'range':[0,1]},
                         'aspectratio':{'x':1,'y':1,'z':1},
                         'camera':{
                            'eye':{'x':-0.7, 'y':-2., 'z':0.3}}
                        })

fig.show()

## Adaptive learning - reducing global uncertainty

In [None]:
# train initial GP to initial observations
gpm_al2 = train_gp(X_init, y_init, kernel)
ker_func = gpm_al2.kernel.__call__ #sklgp.kernels.RBF(length_scale=0.1)

# large sample to explore objective function
X = doe.lhs_with_bounds(2, 1000, LBs, UBs)
  
n_AL_iterations=10
X_al2=None
y_al2=None
for q in range(n_AL_iterations):
      
    # predict output for the large sample to explore the objective function
    mu,std = gpm_al2.predict(X, return_std=True) 
    
    # select the index ixs of the point expected to reduce the global (overall) predictive uncertainty the most
    ixs, _ = doe_al.AL_Cohn96_idx(kernel_fn=ker_func, X_train=gpm_al2.X_train_, X_lhs=X)
    
    # add the selected X[ixs] to X_al2
    if X_al2 is None:
        X_al2=X[ixs]
    else:
        X_al2=np.concatenate([X_al2, X[ixs]])
    
    # observe black-box-function at selected input
    if y_al2 is None:
        y_al2 = black_box_func(X[ixs,0],X[ixs,1])
    else:
        y_al2 = np.concatenate([y_al2,black_box_func(X[ixs,0],X[ixs,1])])
        
        
    # train model on initial and adaptively selected inputs
    gpm_al2 = train_gp(np.concatenate([X_init, X_al2]), np.concatenate([y_init, y_al2]), kernel)


Size of X_lhs might not be sufficiently large. s=10 samples is small compared to number of dimensions n=2
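A global criterion differs from the maximum-std rule in that it scores each candidate by how much observing it would lower the predictive variance over the whole input space, not only at the candidate itself. The sketch below illustrates this idea for a GP with a fixed RBF kernel, using the candidate set itself as the set of reference points. It is an assumption that `doe_al.AL_Cohn96_idx` follows the same principle; `select_max_variance_reduction` is a hypothetical helper and not the library's implementation.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1 / 3):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def select_max_variance_reduction(X_train, X_cand, noise=1e-10):
    """Hypothetical helper: candidate whose observation would reduce the
    summed predictive variance over all candidates the most."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_c = rbf_kernel(X_train, X_cand)
    # posterior covariance between all pairs of candidate points
    post_cov = rbf_kernel(X_cand, X_cand) - K_c.T @ np.linalg.solve(K, K_c)
    post_var = np.clip(np.diag(post_cov), 0.0, None) + noise
    # observing candidate c lowers the variance at point r by cov(r, c)^2 / var(c)
    reduction = (post_cov ** 2).sum(axis=0) / post_var
    return int(np.argmax(reduction)), reduction

rng = np.random.default_rng(0)
X_train_demo = np.array([[0.05, 0.05], [0.10, 0.15]])
X_cand_demo = rng.random((50, 2))
ix_demo, reduction_demo = select_max_variance_reduction(X_train_demo, X_cand_demo)
```

Because each candidate is scored by its covariance with every other candidate, points in the interior of an unexplored region tend to be preferred over points at the boundary, where the maximum-std rule often places its samples.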



In [None]:
# Predict surface based on fitted GP
mu_al2,std_al2 = gpm_al2.predict(Xsurf, return_std=True)

# Plot the result
fig = go.FigureWidget()

fig.add_surface(x=mx, y=my, z=mz, name='true function',
                colorscale='Gray', opacity=0.5, showscale=False,
               colorbar={'title':'std','len':0.5, 'thickness':20})

# fig.add_surface(x=mx, y=my, z=mu_oat.reshape(mx.shape), name='GP oat',
#                 colorscale='Oranges', surfacecolor=std_oat.reshape(mx.shape), opacity=0.5, showscale=False,
#                colorbar={'title':'std','len':0.5, 'thickness':20})

# fig.add_surface(x=mx, y=my, z=mu_lhs.reshape(mx.shape), name='GP lhs',
#                 colorscale='Blues', surfacecolor=std_lhs.reshape(mx.shape), opacity=0.5, showscale=True,
#                colorbar={'title':'std','len':0.5, 'thickness':20})

fig.add_surface(x=mx, y=my, z=mu_al2.reshape(mx.shape), name='GP al2',
                colorscale='Purples', surfacecolor=std_al2.reshape(mx.shape), opacity=0.5, showscale=True,
               colorbar={'title':'std','len':0.5, 'thickness':20})

# fig.add_scatter3d(x=X_oat[:,0], y=X_oat[:,1], z=y_oat, mode='markers',
#                   marker={'symbol':'circle', 'color':'black', 'size':2},
#                   name='OAT obs', showlegend=True)

# fig.add_scatter3d(x=X_lhs[:,0], y=X_lhs[:,1], z=y_lhs, mode='markers',
#                   marker={'symbol':'circle', 'color':'blue', 'size':2},
#                   name='LHS obs', showlegend=True)

fig.add_scatter3d(x=X_init[:,0], y=X_init[:,1], z=y_init, mode='markers',
                  marker={'symbol':'circle', 'color':'black', 'size':3},
                  name='LHS init obs', showlegend=True)

fig.add_scatter3d(x=X_al2[:,0], y=X_al2[:,1], z=y_al2, mode='markers',
                  marker={'symbol':'circle', 'color':'purple', 'size':3},
                  name='al2 obs', showlegend=True)

# update layout to set view
fig.update_layout(
                  scene={'xaxis':{'range':[0,1]},
                         'yaxis':{'range':[0,1]},
                         #'zaxis':{'range':[0,1]},
                         'aspectratio':{'x':1,'y':1,'z':1.2},
                         'camera':{
                            'eye':{'x':-0.7, 'y':-2., 'z':0.3}}
                        })

fig.show()

In [None]:
X_lhs.shape

(15, 2)