# Expected Improvement {#sec-expected-improvement}

This chapter describes, analyzes, and compares different infill criteria. An infill criterion defines how the next point $x_{n+1}$ is selected from the surrogate model $S$. Expected improvement (EI) is a popular infill criterion in Bayesian optimization.
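For a minimization problem, expected improvement has a well-known closed form when the surrogate's prediction at $x$ is Gaussian with mean $\hat{y}(x)$ and standard deviation $s(x)$: $EI(x) = (y_{\min} - \hat{y}(x))\,\Phi(z) + s(x)\,\phi(z)$ with $z = (y_{\min} - \hat{y}(x))/s(x)$, where $y_{\min}$ is the best value observed so far. The following is a minimal sketch of that formula using only the standard library. The function name `expected_improvement` is illustrative and is not part of `spotPython`.

```python
from math import erf, exp, pi, sqrt


def expected_improvement(mean: float, std: float, y_min: float) -> float:
    """Closed-form EI for minimization under a Gaussian prediction (sketch)."""
    if std <= 0.0:
        # Without predictive uncertainty, EI reduces to the plain improvement.
        return max(y_min - mean, 0.0)
    z = (y_min - mean) / std
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))    # standard normal CDF, Phi(z)
    pdf = exp(-0.5 * z * z) / sqrt(2.0 * pi)  # standard normal PDF, phi(z)
    return (y_min - mean) * cdf + std * pdf


# A point predicted slightly worse than the incumbent, but with large
# uncertainty, versus a point predicted slightly better with small uncertainty.
ei_uncertain = expected_improvement(mean=0.1, std=1.0, y_min=0.0)
ei_certain = expected_improvement(mean=-0.05, std=0.01, y_min=0.0)
print(ei_uncertain, ei_certain)
```

The first term rewards points whose predicted value lies below the incumbent (exploitation), the second term rewards points with large predictive uncertainty (exploration).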

## Example: `Spot` and the 1-dim Sphere Function


In [1]:
import numpy as np
from math import inf
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.utils.init import fun_control_init, surrogate_control_init, design_control_init
import matplotlib.pyplot as plt





Seed set to 123


### The Objective Function: 1-dim Sphere

* The `spotPython` package provides several classes of objective functions.
* We will use an analytical objective function, i.e., a function that can be described by a (closed) formula:
   $$f(x) = x^2 $$


In [2]:
fun = analytical().fun_sphere

* The size of the `lower` bound vector determines the problem dimension.
* Here we will use `np.array([-1])`, i.e., a one-dim function.
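To make the two points above concrete, the following sketch shows a plain NumPy stand-in for the sphere function and how the problem dimension follows from the length of `lower`. The helper `sphere` is an assumption for illustration; it is not the implementation behind `analytical().fun_sphere`.

```python
import numpy as np


def sphere(X: np.ndarray) -> np.ndarray:
    """Sum of squares per row; X has shape (n_points, k)."""
    X = np.atleast_2d(X)
    return np.sum(X**2, axis=1)


lower = np.array([-1])
upper = np.array([1])
k = lower.size  # problem dimension, here 1

# Three candidate points inside the box [lower, upper], one point per row.
X = np.array([[-1.0], [0.0], [0.5]])
y = sphere(X)
print(k, y)
```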

:::{.callout-note}
#### TensorBoard

Similar to the one-dimensional case, which was introduced in Section @sec-visualizing-tensorboard-01, we can use TensorBoard to monitor the progress of the optimization. We will use the same code, only the prefix is different:


In [3]:
from spotPython.utils.init import fun_control_init
PREFIX = "07_Y"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    fun_evals = 25,
    lower = np.array([-1]),
    upper = np.array([1]),
    tolerance_x = np.sqrt(np.spacing(1)),)
design_control = design_control_init(init_size=10)

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_Y_s7-Precision-Tower-7910_2024-01-24_22-15-18 for SummaryWriter()


:::


In [4]:
spot_1 = spot.Spot(
            fun=fun,
            fun_control=fun_control,
            design_control=design_control)
spot_1.run()

spotPython tuning: 2.1780933519077765e-08 [####------] 44.00% 


spotPython tuning: 2.1780933519077765e-08 [#####-----] 48.00% 


spotPython tuning: 2.1641353222272515e-08 [#####-----] 52.00% 


spotPython tuning: 1.1394886784872778e-08 [######----] 56.00% 


spotPython tuning: 3.7010904275056666e-10 [######----] 60.00% 


spotPython tuning: 3.7010904275056666e-10 [######----] 64.00% 


spotPython tuning: 3.7010904275056666e-10 [#######---] 68.00% 


spotPython tuning: 3.7010904275056666e-10 [#######---] 72.00% 


spotPython tuning: 3.7010904275056666e-10 [########--] 76.00% 


spotPython tuning: 3.7010904275056666e-10 [########--] 80.00% 


spotPython tuning: 3.7010904275056666e-10 [########--] 84.00% 


spotPython tuning: 3.7010904275056666e-10 [#########-] 88.00% 


spotPython tuning: 4.9233547910699585e-11 [#########-] 92.00% 


spotPython tuning: 4.9233547910699585e-11 [##########] 96.00% 


spotPython tuning: 4.9233547910699585e-11 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f0415cb49d0>

### Results


In [5]:
spot_1.print_results()

min y: 4.9233547910699585e-11
x0: -7.016662163072951e-06


[['x0', -7.016662163072951e-06]]

In [6]:
spot_1.plot_progress(log_y=True)

<Figure size 2700x1800 with 1 Axes>

![TensorBoard visualization of the spotPython optimization process and the surrogate model.](figures_static/07_tensorboard_Y.png){width="100%"}

## Same, but with EI as `infill_criterion`


In [7]:
PREFIX = "07_EI_ISO"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    lower = np.array([-1]),
    upper = np.array([1]),
    fun_evals = 25,
    tolerance_x = np.sqrt(np.spacing(1)),
    infill_criterion = "ei")

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_EI_ISO_s7-Precision-Tower-7910_2024-01-24_22-15-20 for SummaryWriter()


In [8]:
spot_1_ei = spot.Spot(fun=fun,
                     fun_control=fun_control)
spot_1_ei.run()

spotPython tuning: 1.1650633358670164e-08 [####------] 44.00% 


spotPython tuning: 1.1650633358670164e-08 [#####-----] 48.00% 


spotPython tuning: 1.1650633358670164e-08 [#####-----] 52.00% 


spotPython tuning: 1.1650633358670164e-08 [######----] 56.00% 


spotPython tuning: 2.1953599655748327e-10 [######----] 60.00% 


spotPython tuning: 2.1953599655748327e-10 [######----] 64.00% 


spotPython tuning: 2.1953599655748327e-10 [#######---] 68.00% 


spotPython tuning: 2.1953599655748327e-10 [#######---] 72.00% 


spotPython tuning: 2.1953599655748327e-10 [########--] 76.00% 


spotPython tuning: 2.1953599655748327e-10 [########--] 80.00% 


spotPython tuning: 2.1953599655748327e-10 [########--] 84.00% 


spotPython tuning: 2.1953599655748327e-10 [#########-] 88.00% 


spotPython tuning: 2.1953599655748327e-10 [#########-] 92.00% 


spotPython tuning: 2.1953599655748327e-10 [##########] 96.00% 


spotPython tuning: 2.1953599655748327e-10 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f0415402c50>

In [9]:
spot_1_ei.plot_progress(log_y=True)

<Figure size 2700x1800 with 1 Axes>

In [10]:
spot_1_ei.print_results()

min y: 2.1953599655748327e-10
x0: 1.481674716520071e-05


[['x0', 1.481674716520071e-05]]

![TensorBoard visualization of the spotPython optimization process and the surrogate model. Expected improvement, isotropic Kriging.](figures_static/07_tensorboard_EI_ISO.png){width="100%"}


## Non-isotropic Kriging


In [11]:
PREFIX = "07_EI_NONISO"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    lower = np.array([-1, -1]),
    upper = np.array([1, 1]),
    fun_evals = 25,
    tolerance_x = np.sqrt(np.spacing(1)),
    infill_criterion = "ei")
surrogate_control = surrogate_control_init(
    n_theta=2,
    noise=False,
    )

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_EI_NONISO_s7-Precision-Tower-7910_2024-01-24_22-15-23 for SummaryWriter()


In [12]:
spot_2_ei_noniso = spot.Spot(fun=fun,
                   fun_control=fun_control,
                   surrogate_control=surrogate_control)
spot_2_ei_noniso.run()

spotPython tuning: 1.8810726155905e-05 [####------] 44.00% 


spotPython tuning: 1.8810726155905e-05 [#####-----] 48.00% 


spotPython tuning: 1.8810726155905e-05 [#####-----] 52.00% 


spotPython tuning: 1.0504409769682476e-05 [######----] 56.00% 


spotPython tuning: 1.0504409769682476e-05 [######----] 60.00% 


spotPython tuning: 1.8936524499518082e-07 [######----] 64.00% 


spotPython tuning: 1.8936524499518082e-07 [#######---] 68.00% 


spotPython tuning: 1.8936524499518082e-07 [#######---] 72.00% 


spotPython tuning: 1.8936524499518082e-07 [########--] 76.00% 


spotPython tuning: 1.8936524499518082e-07 [########--] 80.00% 


spotPython tuning: 1.8936524499518082e-07 [########--] 84.00% 


spotPython tuning: 1.8936524499518082e-07 [#########-] 88.00% 


spotPython tuning: 1.8936524499518082e-07 [#########-] 92.00% 


spotPython tuning: 1.8936524499518082e-07 [##########] 96.00% 


spotPython tuning: 1.8936524499518082e-07 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f04155749d0>

In [13]:
spot_2_ei_noniso.plot_progress(log_y=True)

<Figure size 2700x1800 with 1 Axes>

In [14]:
spot_2_ei_noniso.print_results()

min y: 1.8936524499518082e-07
x0: -0.0003023141167718133
x1: 0.00031300386546440485


[['x0', -0.0003023141167718133], ['x1', 0.00031300386546440485]]

In [15]:
spot_2_ei_noniso.surrogate.plot()

<Figure size 2700x1800 with 6 Axes>

![TensorBoard visualization of the spotPython optimization process and the surrogate model. Expected improvement, non-isotropic Kriging.](figures_static/07_tensorboard_EI_NONISO.png){width="100%"}
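The difference between the isotropic and the non-isotropic model lies in the correlation function. In a common Kriging formulation, the correlation between two points is $\exp(-\sum_j \theta_j (x_j - x'_j)^2)$. An isotropic model shares one $\theta$ across all dimensions, whereas a setting such as `n_theta=2` allows one $\theta_j$ per dimension. The sketch below illustrates this with plain NumPy. The function `gauss_corr` and the numeric values of `theta` are illustrative assumptions and are not taken from the `spotPython` implementation.

```python
import numpy as np


def gauss_corr(x, x_prime, theta):
    """Gaussian correlation exp(-sum_j theta_j * (x_j - x'_j)^2)."""
    x, x_prime = np.asarray(x, float), np.asarray(x_prime, float)
    theta = np.broadcast_to(np.asarray(theta, float), x.shape)
    return float(np.exp(-np.sum(theta * (x - x_prime) ** 2)))


origin = [0.0, 0.0]
step_x0 = [0.5, 0.0]  # move along the first coordinate only
step_x1 = [0.0, 0.5]  # move along the second coordinate only

# Isotropic: one theta for all dimensions, so both moves look the same.
iso = (gauss_corr(origin, step_x0, 1.0), gauss_corr(origin, step_x1, 1.0))

# Non-isotropic: the model may treat x0 as far more "active" than x1.
aniso = (gauss_corr(origin, step_x0, [10.0, 0.1]),
         gauss_corr(origin, step_x1, [10.0, 0.1]))
print(iso, aniso)
```

A large $\theta_j$ means the correlation decays quickly along dimension $j$, i.e., the response is assumed to vary strongly with that variable.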


## Using `sklearn` Surrogates

### The spot Loop

The `spot` loop consists of the following steps:

1. Init: Build initial design $X$
2. Evaluate initial design on real objective $f$: $y = f(X)$
3. Build surrogate: $S = S(X,y)$
4. Optimize on surrogate: $X_0 =  \text{optimize}(S)$
5. Evaluate on real objective: $y_0 = f(X_0)$
6. Impute (Infill) new points: $X = X \cup X_0$, $y = y \cup y_0$.
7. Go to 3.
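The steps above can be sketched in a few lines of NumPy. This is a toy illustration under strong simplifications, not the `spotPython` implementation: the initial design is plain random sampling, the surrogate is a quadratic polynomial fitted with `np.polyfit`, and the surrogate is optimized by evaluating it on a fixed grid of candidates.

```python
import numpy as np


def f(X):
    """Real (expensive) objective; here the 1-dim sphere function."""
    return np.sum(np.atleast_2d(X) ** 2, axis=1)


rng = np.random.default_rng(123)
lower, upper = -1.0, 1.0

# 1. Init: build an initial design X (plain random sampling in this sketch).
X = rng.uniform(lower, upper, size=(5, 1))
# 2. Evaluate the initial design on the real objective.
y = f(X)

candidates = np.linspace(lower, upper, 201)
for _ in range(5):
    # 3. Build surrogate S(X, y): a quadratic polynomial as a toy stand-in.
    coeffs = np.polyfit(X.ravel(), y, deg=2)
    # 4. Optimize on the surrogate: take the candidate with the lowest prediction.
    x0 = candidates[np.argmin(np.polyval(coeffs, candidates))]
    X0 = np.array([[x0]])
    # 5. Evaluate the new point on the real objective.
    y0 = f(X0)
    # 6. Infill: append the new point to the design.
    X = np.vstack([X, X0])
    y = np.concatenate([y, y0])
    # 7. Go to 3 (next loop iteration).

print(X.shape, y.min())
```

Step 4 uses the predicted value itself as the infill criterion. Replacing the `argmin` of the prediction by the `argmax` of an expected-improvement score would require a surrogate that also reports its uncertainty.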

The following figure, taken from the description of the `R` implementation of SPOT, illustrates the `spot` loop:

![Visual representation of the model based search with SPOT. Taken from: Bartz-Beielstein, T., and Zaefferer, M. Hyperparameter tuning approaches. In Hyperparameter Tuning for Machine and Deep Learning with R - A Practical Guide, E. Bartz, T. Bartz-Beielstein, M. Zaefferer, and O. Mersmann, Eds. Springer, 2022, ch. 4, pp. 67–114. ](figures_static/spotModel.png)

### spot: The Initial Model

#### Example: Modifying the initial design size

This is the example "Modifying the initial design size" from Chapter 4.5.1 in [bart21i].


In [16]:
spot_ei = spot.Spot(fun=fun,
                fun_control=fun_control_init(
                lower = np.array([-1,-1]),
                upper= np.array([1,1])), 
                design_control = design_control_init(init_size=5))
spot_ei.run()

Seed set to 123


spotPython tuning: 0.1377171893465624 [####------] 40.00% 


spotPython tuning: 0.00877036247095888 [#####-----] 46.67% 


spotPython tuning: 0.002830524927647587 [#####-----] 53.33% 


spotPython tuning: 0.0008144824805376925 [######----] 60.00% 


spotPython tuning: 0.00036503097442588706 [#######---] 66.67% 


spotPython tuning: 0.0003586477903084947 [#######---] 73.33% 


spotPython tuning: 0.0003586477903084947 [########--] 80.00% 


spotPython tuning: 0.00031830063136968596 [#########-] 86.67% 


spotPython tuning: 0.00027128673233310445 [#########-] 93.33% 


spotPython tuning: 0.00015101212476377582 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f0415574110>

In [17]:
spot_ei.plot_progress()

<Figure size 2700x1800 with 1 Axes>

In [18]:
np.min(spot_1.y), np.min(spot_ei.y)

(4.9233547910699585e-11, 0.00015101212476377582)

### Init: Build Initial Design


In [19]:
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.fun.objectivefunctions import analytical
gen = spacefilling(2)
rng = np.random.RandomState(1)
lower = np.array([-5,-0])
upper = np.array([10,15])
fun = analytical().fun_branin

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)

[[ 8.97647221 13.41926847]
 [ 0.66946019  1.22344228]
 [ 5.23614115 13.78185824]
 [ 5.6149825  11.5851384 ]
 [-1.72963184  1.66516096]
 [-4.26945568  7.1325531 ]
 [ 1.26363761 10.17935555]
 [ 2.88779942  8.05508969]
 [-3.39111089  4.15213772]
 [ 7.30131231  5.22275244]]
[128.95676449  31.73474356 172.89678121 126.71295908  64.34349975
  70.16178611  48.71407916  31.77322887  76.91788181  30.69410529]


In [20]:
S = Kriging(name='kriging',  seed=123)
S.fit(X, y)
S.plot()

<Figure size 2700x1800 with 6 Axes>

In [21]:
gen = spacefilling(2, seed=123)
X0 = gen.scipy_lhd(3)
gen = spacefilling(2, seed=345)
X1 = gen.scipy_lhd(3)
X2 = gen.scipy_lhd(3)
gen = spacefilling(2, seed=123)
X3 = gen.scipy_lhd(3)
X0, X1, X2, X3

(array([[0.77254938, 0.31539299],
        [0.59321338, 0.93854273],
        [0.27469803, 0.3959685 ]]),
 array([[0.78373509, 0.86811887],
        [0.06692621, 0.6058029 ],
        [0.41374778, 0.00525456]]),
 array([[0.121357  , 0.69043832],
        [0.41906219, 0.32838498],
        [0.86742658, 0.52910374]]),
 array([[0.77254938, 0.31539299],
        [0.59321338, 0.93854273],
        [0.27469803, 0.3959685 ]]))
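The output above shows that two generators created with the same seed produce the same design (`X0` and `X3`), while successive calls on one generator produce different designs (`X1` and `X2`). The following sketch reproduces this behaviour with a hand-written Latin hypercube sampler. The helper `latin_hypercube` is hypothetical and is not the code behind `scipy_lhd`; it only illustrates the idea that each dimension is split into `n` strata and every stratum receives exactly one point.

```python
import numpy as np


def latin_hypercube(n, k, seed=None):
    """n points in [0, 1)^k with exactly one point per stratum and dimension."""
    rng = np.random.default_rng(seed)
    X = np.empty((n, k))
    for j in range(k):
        # Shuffle the n strata, then place one uniform draw inside each stratum.
        strata = rng.permutation(n)
        X[:, j] = (strata + rng.uniform(size=n)) / n
    return X


A = latin_hypercube(3, 2, seed=123)
B = latin_hypercube(3, 2, seed=345)
C = latin_hypercube(3, 2, seed=123)
print(np.array_equal(A, C), np.array_equal(A, B))
```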

### Evaluate

### Build Surrogate

### A Simple Predictor

The code below shows how to use a simple model for prediction.

* Assume that only two (very costly) measurements are available:

  1. $f(0) = 0.5$
  2. $f(2) = 2.5$

* We are interested in the value at $x_0 = 1$, i.e., $f(x_0 = 1)$, but cannot run an additional, third experiment.


In [22]:
from sklearn import linear_model
X = np.array([[0], [2]])
y = np.array([0.5, 2.5])
S_lm = linear_model.LinearRegression()
S_lm = S_lm.fit(X, y)
X0 = np.array([[1]])
y0 = S_lm.predict(X0)
print(y0)

[1.5]


* Central idea:
  * Evaluating the surrogate model `S_lm` is much cheaper and/or much faster than running the real-world experiment $f$.
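The prediction of the linear surrogate can be checked by hand: with only two measurements, the fitted line passes exactly through both points, so the prediction at $x_0 = 1$ is the linear interpolation between them. The variable names below are chosen for illustration only.

```python
# Two known measurements (x, f(x)).
x_a, y_a = 0.0, 0.5
x_b, y_b = 2.0, 2.5

# Straight line through both points: y = intercept + slope * x.
slope = (y_b - y_a) / (x_b - x_a)
intercept = y_a - slope * x_a

x0 = 1.0
y0 = intercept + slope * x0
print(slope, intercept, y0)  # prints 1.0 0.5 1.5
```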

## Gaussian Process Regression: Basic Introductory Example

This example is taken from [scikit-learn](https://scikit-learn.org/stable/auto_examples/gaussian_process/plot_gpr_noisy_targets.html). A Gaussian process with an RBF kernel is fitted to six randomly chosen training points of $f(x) = x \sin(x)$; fitting the model optimizes the hyperparameters of the kernel. The fitted model is then used to compute the mean prediction over the full dataset and to plot the 95% confidence interval.


In [23]:
import numpy as np
import matplotlib.pyplot as plt
import math as m
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))
rng = np.random.RandomState(1)
training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]

kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
gaussian_process = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
gaussian_process.fit(X_train, y_train)
gaussian_process.kernel_

mean_prediction, std_prediction = gaussian_process.predict(X, return_std=True)

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("sk-learn Version: Gaussian process regression on noise-free dataset")

<Figure size 1650x1050 with 1 Axes>
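To make the mean prediction and the confidence band less of a black box, the following sketch computes the posterior of a zero-mean Gaussian process directly with NumPy. It rests on simplifying assumptions that differ from the scikit-learn call above: the length scale and the prior variance are fixed to 1 instead of being optimized, and the six training locations are chosen by hand. The helper `rbf` is an illustrative name.

```python
import numpy as np


def rbf(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq_dist = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)


def f(X):
    return np.squeeze(X * np.sin(X))


X_train = np.array([0.5, 2.0, 4.0, 5.5, 7.5, 9.5]).reshape(-1, 1)
y_train = f(X_train)
X = np.linspace(0, 10, 200).reshape(-1, 1)

# Posterior of a zero-mean GP with unit prior variance and fixed length scale.
K = rbf(X_train, X_train) + 1e-10 * np.eye(len(X_train))  # jitter for stability
K_s = rbf(X_train, X)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
mean = K_s.T @ alpha
v = np.linalg.solve(L, K_s)
std = np.sqrt(np.clip(1.0 - np.sum(v**2, axis=0), 0.0, None))

# Pointwise 95% band, as in the plots of this section.
lower_band = mean - 1.96 * std
upper_band = mean + 1.96 * std
print(float(np.max(upper_band - lower_band)))
```

The band collapses at the training points and widens between them, which is the uncertainty information that expected improvement exploits.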

In [24]:
from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.RandomState(1)
X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))
training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]


S = Kriging(name='kriging',  seed=123, log_level=50, cod_type="norm")
S.fit(X_train, y_train)

mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

std_prediction

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("spotPython Version: Gaussian process regression on noise-free dataset")

<Figure size 1650x1050 with 1 Axes>

## The Surrogate: Using scikit-learn models

The default surrogate is the internal `kriging` model.


In [25]:
S_0 = Kriging(name='kriging', seed=123)

Models from `scikit-learn` can be selected, e.g., Gaussian Process:


In [26]:
# Needed for the sklearn surrogates:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn import linear_model
from sklearn import tree
import pandas as pd

In [27]:
kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
S_GP = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)

* Many other `scikit-learn` models can be selected as well:


In [28]:
S_Tree = DecisionTreeRegressor(random_state=0)
S_LM = linear_model.LinearRegression()
S_Ridge = linear_model.Ridge()
S_RF = RandomForestRegressor(max_depth=2, random_state=0) 

* The scikit-learn GP model `S_GP` is selected.


In [29]:
S = S_GP

In [30]:
isinstance(S, GaussianProcessRegressor)


True
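The scikit-learn regressors listed above share the same two methods, `fit(X, y)` and `predict(X)`, which is what makes them interchangeable as surrogates. The toy class below mimics that interface with a nearest-neighbour rule. `NearestNeighbourSurrogate` is a hypothetical name, and whether `spotPython` accepts such a hand-written class in place of a scikit-learn estimator is an assumption that is not verified here.

```python
import numpy as np


class NearestNeighbourSurrogate:
    """Toy surrogate with a scikit-learn style fit/predict interface."""

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y, dtype=float)
        return self  # returning self allows S = S.fit(X, y), as in scikit-learn

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared distances between every query row and every training row.
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        return self.y_[d2.argmin(axis=1)]


S_nn = NearestNeighbourSurrogate().fit(np.array([[0.0], [2.0]]), np.array([0.5, 2.5]))
print(S_nn.predict(np.array([[0.4], [1.9]])))
```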

In [31]:
from spotPython.fun.objectivefunctions import analytical
fun = analytical().fun_branin
fun_control = fun_control_init(
    lower = np.array([-5,-0]),
    upper = np.array([10,15]),
    fun_evals = 15)    
design_control = design_control_init(init_size=5)
spot_GP = spot.Spot(fun=fun, 
                    fun_control=fun_control,
                    surrogate=S, 
                    design_control=design_control)
spot_GP.run()

Seed set to 123


spotPython tuning: 24.51465459019188 [####------] 40.00% 


spotPython tuning: 11.00311039380438 [#####-----] 46.67% 


spotPython tuning: 11.00311039380438 [#####-----] 53.33% 


spotPython tuning: 7.28117907433701 [######----] 60.00% 


spotPython tuning: 7.28117907433701 [#######---] 66.67% 


spotPython tuning: 7.28117907433701 [#######---] 73.33% 


spotPython tuning: 2.9518930509970556 [########--] 80.00% 


spotPython tuning: 2.9518930509970556 [#########-] 86.67% 


spotPython tuning: 2.104968905163581 [#########-] 93.33% 



lbfgs failed to converge (status=2):
ABNORMAL_TERMINATION_IN_LNSRCH.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html



spotPython tuning: 1.9431617098702745 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f04140a1010>

In [32]:
spot_GP.y

array([ 69.32459936, 152.38491454, 107.92560483,  24.51465459,
        76.73500031,  86.30426421,  11.00311039,  16.11744711,
         7.28117907,  21.82319987,  10.96088904,   2.95189305,
         3.02909296,   2.10496891,   1.94316171])

In [33]:
spot_GP.plot_progress()

<Figure size 2700x1800 with 1 Axes>

In [34]:
spot_GP.print_results()

min y: 1.9431617098702745
x0: 10.0
x1: 2.9983689059684346


[['x0', 10.0], ['x1', 2.9983689059684346]]

## Additional Examples


In [35]:
# Needed for the sklearn surrogates:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn import linear_model
from sklearn import tree
import pandas as pd

In [36]:
kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
S_GP = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)

In [37]:
from spotPython.build.kriging import Kriging
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot

S_K = Kriging(name='kriging',
              seed=123,
              log_level=50,
              infill_criterion = "y",
              n_theta=1,
              noise=False,
              cod_type="norm")
fun = analytical().fun_sphere

fun_control = fun_control_init(
    lower = np.array([-1,-1]),
    upper = np.array([1,1]),
    fun_evals = 25)

spot_S_K = spot.Spot(fun=fun,
                     fun_control=fun_control,
                     surrogate=S_K,
                     design_control=design_control,
                     surrogate_control=surrogate_control)
spot_S_K.run()

Seed set to 123


spotPython tuning: 0.1377171884648105 [##--------] 24.00% 


spotPython tuning: 0.008765732583944055 [###-------] 28.00% 


spotPython tuning: 0.0028341771875529855 [###-------] 32.00% 


spotPython tuning: 0.0008167444555924514 [####------] 36.00% 


spotPython tuning: 0.00036643805492765077 [####------] 40.00% 


spotPython tuning: 0.00035717508948430733 [####------] 44.00% 


spotPython tuning: 0.00035717508948430733 [#####-----] 48.00% 


spotPython tuning: 0.00032755842800122754 [#####-----] 52.00% 


spotPython tuning: 0.00026655125927622475 [######----] 56.00% 


spotPython tuning: 0.00013086843795281645 [######----] 60.00% 


spotPython tuning: 1.1731665741417858e-05 [######----] 64.00% 


spotPython tuning: 1.6356997057537843e-06 [#######---] 68.00% 


spotPython tuning: 9.161096221284059e-07 [#######---] 72.00% 


spotPython tuning: 4.4455659914875556e-07 [########--] 76.00% 


spotPython tuning: 3.9440213569993096e-07 [########--] 80.00% 


spotPython tuning: 2.0152094007295912e-07 [########--] 84.00% 


spotPython tuning: 1.702407685252356e-07 [#########-] 88.00% 


spotPython tuning: 1.702407685252356e-07 [#########-] 92.00% 


spotPython tuning: 1.702407685252356e-07 [##########] 96.00% 


spotPython tuning: 1.702407685252356e-07 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x7f0402be6c90>

In [38]:
spot_S_K.plot_progress(log_y=True)

<Figure size 2700x1800 with 1 Axes>

In [39]:
spot_S_K.surrogate.plot()

<Figure size 2700x1800 with 6 Axes>

In [40]:
spot_S_K.print_results()

min y: 1.702407685252356e-07
x0: 0.000310978691516528
x1: 0.0002711697290405102


[['x0', 0.000310978691516528], ['x1', 0.0002711697290405102]]

### Optimize on Surrogate

### Evaluate on Real Objective

### Impute / Infill new Points

## Tests


In [41]:
import numpy as np
from spotPython.spot import spot
from spotPython.fun.objectivefunctions import analytical

fun_sphere = analytical().fun_sphere

fun_control = fun_control_init(
                    lower=np.array([-1, -1]),
                    upper=np.array([1, 1]),
                    n_points = 2)
spot_1 = spot.Spot(
    fun=fun_sphere,
    fun_control=fun_control,
)

# (S-2) Initial Design:
spot_1.X = spot_1.design.scipy_lhd(
    spot_1.design_control["init_size"], lower=spot_1.lower, upper=spot_1.upper
)
print(spot_1.X)

# (S-3): Eval initial design:
spot_1.y = spot_1.fun(spot_1.X)
print(spot_1.y)

spot_1.fit_surrogate()
X0 = spot_1.suggest_new_X()
print(X0)
assert X0.size == spot_1.n_points * spot_1.k

Seed set to 123


[[ 0.86352963  0.7892358 ]
 [-0.24407197 -0.83687436]
 [ 0.36481882  0.8375811 ]
 [ 0.415331    0.54468512]
 [-0.56395091 -0.77797854]
 [-0.90259409 -0.04899292]
 [-0.16484832  0.35724741]
 [ 0.05170659  0.07401196]
 [-0.78548145 -0.44638164]
 [ 0.64017497 -0.30363301]]
[1.36857656 0.75992983 0.83463487 0.46918172 0.92329124 0.8170764
 0.15480068 0.00815134 0.81623768 0.502017  ]


[[0.00156835 0.00406697]
 [0.00160809 0.00398348]]


## EI: The Famous Schonlau Example


In [42]:
X_train0 = np.array([1, 2, 3, 4, 12]).reshape(-1,1)
X_train = np.linspace(start=0, stop=10, num=5).reshape(-1, 1)

In [43]:
from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt

X_train = np.array([1., 2., 3., 4., 12.]).reshape(-1,1)
y_train = np.array([0., -1.75, -2, -0.5, 5.])

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False, cod_type="norm")
S.fit(X_train, y_train)

X = np.linspace(start=0, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
if True:
    plt.fill_between(
        X.ravel(),
        mean_prediction - 2 * std_prediction,
        mean_prediction + 2 * std_prediction,
        alpha=0.5,
        label=r"95% confidence interval",
    )
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on noise-free dataset")

<Figure size 1650x1050 with 1 Axes>

In [44]:
#plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
# plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, -ei, label="Expected Improvement")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("Expected improvement")
_ = plt.title("Expected improvement for the Schonlau example")

<Figure size 1650x1050 with 1 Axes>

In [45]:
S.log

{'negLnLike': array([1.20788205]),
 'theta': array([-0.99002536]),
 'p': [],
 'Lambda': []}

## EI: The Forrester Example


In [46]:
from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot

# exact x locations are unknown:
X_train = np.array([0.0, 0.175, 0.225, 0.3, 0.35, 0.375, 0.5,1]).reshape(-1,1)

fun = analytical().fun_forrester
fun_control = fun_control_init(
    PREFIX="07_EI_FORRESTER",
    sigma=1.0,
    seed=123,)
y_train = fun(X_train, fun_control=fun_control)

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False, cod_type="norm")
S.fit(X_train, y_train)

X = np.linspace(start=0, stop=1, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
if True:
    plt.fill_between(
        X.ravel(),
        mean_prediction - 2 * std_prediction,
        mean_prediction + 2 * std_prediction,
        alpha=0.5,
        label=r"95% confidence interval",
    )
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on noise-free dataset")

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_EI_FORRESTER_s7-Precision-Tower-7910_2024-01-24_22-15-53 for SummaryWriter()


<Figure size 1650x1050 with 1 Axes>

In [47]:
#plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
# plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, -ei, label="Expected Improvement")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("Expected improvement")
_ = plt.title("Expected improvement for the Forrester example")

<Figure size 1650x1050 with 1 Axes>

## Noise


In [48]:
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_sphere
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=2.0,
    seed=123,)
X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

#plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Sphere: Gaussian process regression on noisy dataset")

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_Y_s7-Precision-Tower-7910_2024-01-24_22-15-54 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[-1.57464135 16.13714981  2.77008442 93.14904827 71.59322218 14.28895359
 15.9770567  12.96468767 39.82265329 59.88028242]


<Figure size 1650x1050 with 1 Axes>

In [49]:
S.log

{'negLnLike': array([26.18505386]),
 'theta': array([-1.10547478]),
 'p': [],
 'Lambda': []}

In [50]:
S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

#plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Sphere: Gaussian process regression with nugget on noisy dataset")

<Figure size 1650x1050 with 1 Axes>

In [51]:
S.log

{'negLnLike': array([21.82059177]),
 'theta': array([-2.96941225]),
 'p': [],
 'Lambda': array([4.290475e-05])}
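The effect of a nugget can be illustrated independently of `spotPython`. In a common regression-Kriging formulation, a positive constant (called `Lambda` in the log above) is added to the diagonal of the correlation matrix, which turns the interpolating predictor into a regressing one that no longer passes through every noisy observation. The sketch below uses fixed hyperparameters; the helper `gp_mean` and the values of `theta` and `nugget` are illustrative assumptions, not values estimated by `Kriging`.

```python
import numpy as np


def gp_mean(X_train, y_train, X_new, theta=0.1, nugget=0.0):
    """Kriging-style mean prediction with Gaussian correlation and optional nugget."""
    def corr(A, B):
        return np.exp(-theta * (A[:, None] - B[None, :]) ** 2)

    n = len(X_train)
    # The tiny constant 1e-10 only stabilizes the linear solves.
    Psi = corr(X_train, X_train) + (nugget + 1e-10) * np.eye(n)
    one = np.ones(n)
    # Generalized least-squares estimate of the constant trend mu.
    mu = (one @ np.linalg.solve(Psi, y_train)) / (one @ np.linalg.solve(Psi, one))
    weights = np.linalg.solve(Psi, y_train - mu)
    return mu + corr(X_new, X_train) @ weights


rng = np.random.default_rng(123)
X_train = np.linspace(-10, 10, 10)
y_train = X_train**2 + rng.normal(scale=2.0, size=X_train.size)  # noisy sphere

fit_interp = gp_mean(X_train, y_train, X_train, nugget=0.0)
fit_regress = gp_mean(X_train, y_train, X_train, nugget=0.1)
err_interp = np.max(np.abs(fit_interp - y_train))
err_regress = np.max(np.abs(fit_regress - y_train))
print(err_interp, err_regress)
```

Without a nugget the prediction reproduces the noisy observations almost exactly, whereas with a nugget the residuals at the training points are clearly non-zero, i.e., the model smooths the data.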

## Cubic Function


In [52]:
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_cubed
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=10.0,
    seed=123,)

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Cubed: Gaussian process regression on noisy dataset")

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_Y_s7-Precision-Tower-7910_2024-01-24_22-15-55 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[ 2.56406437e-01 -6.93071067e+01 -8.56027124e-02  8.93405931e+02
 -5.82561927e+02 -4.76028022e+01  7.16445311e+01  4.09512920e+01
  2.57319028e+02 -4.70871982e+02]


<Figure size 1650x1050 with 1 Axes>

In [53]:
S = Kriging(name='kriging',  seed=123, log_level=0, n_theta=1, noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Cubed: Gaussian process with nugget regression on noisy dataset")

<Figure size 1650x1050 with 1 Axes>

In [54]:
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_runge
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=0.25,
    seed=123,)

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on noisy dataset")

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/07_Y_s7-Precision-Tower-7910_2024-01-24_22-15-56 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[0.712453   0.05595118 0.83735691 0.0106654  0.01413372 0.07074765
 0.05479457 0.07763503 0.02412205 0.01625354]


<Figure size 1650x1050 with 1 Axes>

In [55]:
S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression with nugget on noisy dataset")

<Figure size 1650x1050 with 1 Axes>

## Modifying Lambda Search Space


In [56]:
S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True,
            min_Lambda=0.1,
            max_Lambda=10)
S.fit(X_train, y_train)

print(f"Lambda: {S.Lambda}")

Lambda: 0.1


In [57]:
X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
#plt.plot(X, ei, label="Expected Improvement")
plt.plot(X_axis, mean_prediction, label="mue")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression with nugget on noisy dataset. Modified Lambda search space.")

<Figure size 1650x1050 with 1 Axes>

## Factors


In [58]:
["num"] * 3

['num', 'num', 'num']

In [59]:
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.fun.objectivefunctions import analytical
import numpy as np

In [60]:
gen = spacefilling(2)
n = 30
rng = np.random.RandomState(1)
lower = np.array([-5,-0])
upper = np.array([10,15])
fun = analytical().fun_branin_factor
#fun = analytical(sigma=0).fun_sphere

X0 = gen.scipy_lhd(n, lower=lower, upper = upper)
X1 = np.random.randint(low=1, high=3, size=(n,))
X = np.c_[X0, X1]
y = fun(X)
S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=3, noise=False, var_type=["num", "num", "num"])
S.fit(X, y)
Sf = Kriging(name='kriging',  seed=123, log_level=50, n_theta=3, noise=False, var_type=["num", "num", "factor"])
Sf.fit(X, y)
n = 50
X0 = gen.scipy_lhd(n, lower=lower, upper = upper)
X1 = np.random.randint(low=1, high=3, size=(n,))
X = np.c_[X0, X1]
y = fun(X)
s=np.sum(np.abs(S.predict(X)[0] - y))
sf=np.sum(np.abs(Sf.predict(X)[0] - y))
sf - s

-55.495136103768345

In [61]:
# vars(S)

In [62]:
# vars(Sf)