### Polynomial Chaos Expansion example: Ishigami function (3 random inputs, scalar output)


Authors: Dimitrios Loukrezis \ 
Date: May 6 2021

In this example, we approximate the well-known Ishigami function with a total-degree Polynomial Chaos Expansion.

We start with the necessary imports.

In [1]:
import math
import numpy as np
from UQpy.distributions import Uniform, JointIndependent
from UQpy.surrogates import *

The selected optimizer method does not support bounds and thus will be ignored.


We then define the Ishigami function, which reads:
$$f(x_1, x_2, x_3) = \sin(x_1) + a \sin^2(x_2) + b x_3^4 \sin(x_1).$$

In [2]:
# function to be approximated
def ishigami(xx):
    """Ishigami function"""
    a = 7
    b = 0.1
    term1 = np.sin(xx[0])
    term2 = a * np.sin(xx[1])**2
    term3 = b * xx[2]**4 * np.sin(xx[0])
    return term1 + term2 + term3
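The function above evaluates one input point at a time. As a minimal sketch, assuming the inputs are stored as an array of shape `(n_samples, 3)`, the same formula can be evaluated for all samples at once. The name `ishigami_vec` and the keyword arguments `a` and `b` are illustrative and not part of the original notebook.

```python
import numpy as np

def ishigami_vec(xx, a=7.0, b=0.1):
    """Evaluate the Ishigami function on an (n_samples, 3) array at once."""
    xx = np.atleast_2d(xx)
    x1, x2, x3 = xx[:, 0], xx[:, 1], xx[:, 2]
    return np.sin(x1) + a * np.sin(x2)**2 + b * x3**4 * np.sin(x1)

# compare against a point-by-point evaluation on a few random inputs
rng = np.random.default_rng(0)
xx = rng.uniform(-np.pi, np.pi, size=(5, 3))
yy_loop = np.array([
    np.sin(x[0]) + 7 * np.sin(x[1])**2 + 0.1 * x[2]**4 * np.sin(x[0])
    for x in xx
])
yy_vec = ishigami_vec(xx)
print(np.allclose(yy_loop, yy_vec))
```

For large validation sets, the vectorized form avoids a Python-level loop over samples.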

The Ishigami function has three random inputs, which are uniformly distributed in $[-\pi, \pi]$. Moreover, the input random variables are mutually independent, which simplifies the construction of the joint distribution. Let's define the corresponding distributions. 

In [3]:
# input distributions
dist1 = Uniform(loc=-np.pi, scale=2*np.pi)    
dist2 = Uniform(loc=-np.pi, scale=2*np.pi)
dist3 = Uniform(loc=-np.pi, scale=2*np.pi)    
marg = [dist1, dist2, dist3]
joint = JointIndependent(marginals=marg)
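The `loc` and `scale` arguments can be easy to misread. The sketch below assumes that `Uniform` follows the `scipy.stats` convention, in which the support is $[\text{loc}, \text{loc} + \text{scale}]$. It uses plain NumPy only (no UQpy calls) to show how three independent standard uniforms map to $[-\pi, \pi]$.

```python
import numpy as np

loc, scale = -np.pi, 2 * np.pi      # same values as in the cell above
lower, upper = loc, loc + scale     # support under the loc/scale convention

rng = np.random.default_rng(42)
u = rng.random((1000, 3))           # independent standard uniforms in [0, 1)
xx = loc + scale * u                # one column per input, mapped to [-pi, pi)

print(lower, upper)
print(xx.shape, xx.min() >= lower, xx.max() <= upper)
```

Because the inputs are independent, the joint sample is simply the three marginal samples placed side by side as columns.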

We now define our PCE. The only thing we need is the joint distribution.

In [4]:
# define PCE metamodel
pce_metamodel = PolyChaosExp(joint)

We must now select a polynomial basis. Here we opt for a total-degree (TD) basis: every univariate polynomial has degree at most $P$, and every multivariate polynomial has a total degree (the sum of the degrees of its univariate factors) of at most $P$. The size of the basis is then given by 
$$\frac{(N+P)!}{N! P!},$$
where $N$ is the number of random inputs (here, $N=3$). For $N=3$ and $P=6$, this gives $9!/(3!\,6!) = 84$ polynomials.

In [5]:
# maximum polynomial degree
P = 6  

# construct total-degree polynomial basis
construct_td_basis(pce_metamodel, P)

# check the size of the basis
print('Size of PCE basis:', pce_metamodel.n_polys)

Size of PCE basis: 84
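The basis size can be cross-checked without UQpy. The sketch below enumerates all multi-indices whose entries sum to at most $P$ and compares the count with the binomial formula. The helper `total_degree_multi_indices` is a hypothetical name introduced for illustration, not a UQpy function.

```python
import itertools
import math

def total_degree_multi_indices(n_inputs, max_degree):
    """All tuples of univariate degrees whose sum is at most max_degree."""
    return [
        idx
        for idx in itertools.product(range(max_degree + 1), repeat=n_inputs)
        if sum(idx) <= max_degree
    ]

N, P = 3, 6
indices = total_degree_multi_indices(N, P)
print(len(indices), math.comb(N + P, P))  # 84 84
```

Each multi-index corresponds to one multivariate polynomial, formed as the product of univariate polynomials with the listed degrees.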


We must now compute the PCE coefficients. For that, we first need a training sample of input random variable realizations and the corresponding model outputs. Together, these two data sets form what is known as an "experimental design". It is generally advisable for the experimental design to contain 2 to 10 times as many data points as there are PCE polynomials.

In [6]:
# create training data
sample_size = int(pce_metamodel.n_polys*5)
print('Size of experimental design:', sample_size)

# realizations of random inputs
xx_train = joint.rvs(sample_size)
# corresponding model outputs
yy_train = np.array([ishigami(x) for x in xx_train])

Size of experimental design: 420


We now fit the PCE coefficients by solving a regression problem. There are multiple ways to do this, e.g. least squares regression, ridge regression, LASSO regression, etc. Here we opt for the _np.linalg.lstsq_ method, which is based on the _dgelsd_ solver of LAPACK.

In [7]:
# fit model
fit_lstsq(pce_metamodel, xx_train, yy_train)
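To make the regression step concrete, here is a minimal NumPy-only sketch of a least-squares PCE fit. It is not UQpy's internal implementation. It assumes orthonormal Legendre polynomials (the usual choice for uniform inputs) with inputs rescaled from $[-\pi, \pi]$ to $[-1, 1]$, and the helper names `ishigami` and `design_matrix` are illustrative.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def ishigami(xx, a=7.0, b=0.1):
    """Vectorized Ishigami function for an (n_samples, 3) input array."""
    return (np.sin(xx[:, 0]) + a * np.sin(xx[:, 1])**2
            + b * xx[:, 2]**4 * np.sin(xx[:, 0]))

def design_matrix(xx, multi_indices):
    """Evaluate the orthonormal Legendre basis at inputs in [-pi, pi]^N."""
    zz = xx / np.pi  # rescale to [-1, 1]
    max_deg = max(max(idx) for idx in multi_indices)
    # uni[k] holds the normalized degree-k polynomial at every sample and input
    uni = np.stack([
        np.sqrt(2 * k + 1) * legendre.legval(zz, np.eye(max_deg + 1)[k])
        for k in range(max_deg + 1)
    ])
    # each column is a product of univariate polynomials, one per input
    cols = [
        np.prod([uni[k, :, j] for j, k in enumerate(idx)], axis=0)
        for idx in multi_indices
    ]
    return np.column_stack(cols)

N, P = 3, 6
multi_indices = [
    idx for idx in itertools.product(range(P + 1), repeat=N) if sum(idx) <= P
]

# experimental design with 5 samples per basis polynomial
rng = np.random.default_rng(1)
n_train = 5 * len(multi_indices)
xx_train = rng.uniform(-np.pi, np.pi, size=(n_train, N))
yy_train = ishigami(xx_train)

# solve the overdetermined system A c = y in the least-squares sense
A = design_matrix(xx_train, multi_indices)
coeffs, *_ = np.linalg.lstsq(A, yy_train, rcond=None)

# with an orthonormal basis, mean and variance follow from the coefficients
mean_est = coeffs[0]
var_est = np.sum(coeffs[1:]**2)
print(A.shape, mean_est, var_est)
```

With an orthonormal basis, the coefficient of the constant polynomial is the mean estimate, and the sum of the squared remaining coefficients is the variance estimate.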

By simply post-processing the PCE's terms, we obtain estimates of the mean and variance of the model output.

In [8]:
mean_est = pce_mean(pce_metamodel)
var_est = pce_variance(pce_metamodel)
print('PCE mean estimate:', mean_est)
print('PCE variance estimate:', var_est)

PCE mean estimate: [3.48873447]
PCE variance estimate: [14.01257008]
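These estimates can be compared against the closed-form moments of the Ishigami function for inputs uniform on $[-\pi, \pi]$, namely a mean of $a/2$ and a variance of $a^2/8 + b\pi^4/5 + b^2\pi^8/18 + 1/2$. The sketch below evaluates both for $a=7$, $b=0.1$ and reports the relative error of the PCE estimates printed above, which are hard-coded here for convenience.

```python
import numpy as np

a, b = 7.0, 0.1

# closed-form moments of the Ishigami function
mean_exact = a / 2
var_exact = a**2 / 8 + b * np.pi**4 / 5 + b**2 * np.pi**8 / 18 + 0.5

# PCE estimates copied from the output above
mean_pce, var_pce = 3.48873447, 14.01257008

rel_err_mean = abs(mean_pce - mean_exact) / mean_exact
rel_err_var = abs(var_pce - var_exact) / var_exact
print('Exact mean:', mean_exact, ' relative error of PCE:', rel_err_mean)
print('Exact variance:', var_exact, ' relative error of PCE:', rel_err_var)
```

The exact variance is approximately 13.84, so the degree-6 PCE slightly overestimates it.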


Similarly to the mean and variance estimates, we can very simply estimate the Sobol sensitivity indices, which quantify the importance of the input random variables in terms of impact on the model output.

In [9]:
sobol_first = pce_sobol_first(pce_metamodel)
sobol_total = pce_sobol_total(pce_metamodel)
print('First-order Sobol indices:')
print(sobol_first)
print('Total-order Sobol indices:')
print(sobol_total)

First-order Sobol indices:
[[0.31391502]
 [0.43182407]
 [0.00050566]]
Total-order Sobol indices:
[[0.56487066]
 [0.44163132]
 [0.25151004]]
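The post-processing behind such indices can be sketched as follows, assuming an orthonormal basis so that each squared coefficient is a variance contribution. The first-order index of input $i$ sums the terms that depend on input $i$ alone, and the total-order index sums all terms that involve input $i$. The function `sobol_from_pce` and the small two-input expansion are hypothetical and serve only as an illustration.

```python
import numpy as np

def sobol_from_pce(multi_indices, coeffs):
    """First- and total-order Sobol indices from orthonormal PCE coefficients."""
    multi_indices = np.asarray(multi_indices)
    coeffs = np.asarray(coeffs, dtype=float)
    active = multi_indices > 0            # inputs each term depends on
    n_active = active.sum(axis=1)         # number of active inputs per term
    variance = np.sum(coeffs[n_active > 0]**2)

    n_inputs = multi_indices.shape[1]
    first = np.zeros(n_inputs)
    total = np.zeros(n_inputs)
    for i in range(n_inputs):
        only_i = active[:, i] & (n_active == 1)   # terms in input i alone
        first[i] = np.sum(coeffs[only_i]**2) / variance
        total[i] = np.sum(coeffs[active[:, i]]**2) / variance
    return first, total

# toy expansion with two inputs: constant, x1 only, x2 only, interaction
multi_indices = [(0, 0), (1, 0), (0, 1), (1, 1)]
coeffs = [1.0, 2.0, 1.0, 1.0]
first, total = sobol_from_pce(multi_indices, coeffs)
print('First-order:', first)
print('Total-order:', total)
```

In this toy case the variance is $4 + 1 + 1 = 6$, so the first-order indices are $4/6$ and $1/6$ and the total-order indices are $5/6$ and $2/6$. For the Ishigami function with $a=7$, $b=0.1$, the closed-form first-order indices are approximately $(0.314, 0.442, 0)$ and the total-order indices approximately $(0.558, 0.442, 0.244)$, which the PCE estimates above match closely.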


The PCE should become increasingly accurate as the maximum polynomial degree $P$ increases. We test this by computing the mean absolute error (MAE) between the PCE's predictions and the true model outputs on a validation sample of $10^5$ data points.
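Before running such a sweep, it helps to see how quickly the cost grows with the degree. The sketch below tabulates the basis size and the corresponding training set size, assuming the same factor of 5 samples per polynomial used earlier and degrees 0 to 15.

```python
import math

N = 3              # number of random inputs
oversampling = 5   # training points per basis polynomial

sizes = []
for degree in range(16):
    n_polys = math.comb(N + degree, degree)   # total-degree basis size
    sizes.append((degree, n_polys, oversampling * n_polys))

for degree, n_polys, n_train in sizes:
    print(f'degree {degree:2d}: {n_polys:4d} polynomials, {n_train:5d} training points')
```

At degree 15 the basis already contains 816 polynomials, so the regression uses 4080 model evaluations.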

In [10]:
# validation data sets
np.random.seed(999) # fix random seed for reproducibility
n_samples_val = 100000
xx_val = joint.rvs(n_samples_val)
yy_val = np.array([ishigami(x) for x in xx_val])

mae = [] # to hold MAE for increasing polynomial degree
for degree in range(16):
    # define PCE
    pce_metamodel = PolyChaosExp(joint)
    
    # build PCE basis
    construct_td_basis(pce_metamodel, degree)
    
    # create training data
    np.random.seed(1) # fix random seed for reproducibility
    sample_size = int(pce_metamodel.n_polys*5)
    xx_train = joint.rvs(sample_size)
    yy_train = np.array([ishigami(x) for x in xx_train])
    
    # fit PCE coefficients
    fit_lstsq(pce_metamodel, xx_train, yy_train)
    
    # compute mean absolute validation error
    yy_val_pce = pce_metamodel.predict(xx_val).flatten()
    errors = np.abs(yy_val.flatten() - yy_val_pce)
    mae.append(np.mean(errors))
    
    print('Polynomial degree:', degree)
    print('Mean absolute error:', mae[-1])
    print(' ')


Polynomial degree: 0
Mean absolute error: 3.5092014513593974
 
Polynomial degree: 1
Mean absolute error: 2.9190940130421446
 
Polynomial degree: 2
Mean absolute error: 2.881426057510191
 
Polynomial degree: 3
Mean absolute error: 2.490517639729513
 
Polynomial degree: 4
Mean absolute error: 1.6837629839128996
 
Polynomial degree: 5
Mean absolute error: 1.4288047926039877
 
Polynomial degree: 6
Mean absolute error: 0.47225689340879734
 
Polynomial degree: 7
Mean absolute error: 0.3275845356142154
 
Polynomial degree: 8
Mean absolute error: 0.068521038112655
 
Polynomial degree: 9
Mean absolute error: 0.04424218398187098
 
Polynomial degree: 10
Mean absolute error: 0.0049732041095257515
 
Polynomial degree: 11
Mean absolute error: 0.0038931663654605785
 
Polynomial degree: 12
Mean absolute error: 0.00025446069874273823
 
Polynomial degree: 13
Mean absolute error: 0.0002543042852670587
 
Polynomial degree: 14
Mean absolute error: 1.118914790000318e-05
 
Polynomial degree: 15
Mean absolute