# sequentPSS (version 0.1.3)

A sequential parameter space search (SPS) method based on sensitivity analysis.
- install:

!pip install sequentPSS

The SPS algorithm consists of a preprocessing stage and a sequential calibration stage; validation is optional. Throughout, the k parameters are denoted X = {X1, ..., Xk} and the d outcomes are denoted Y = {Y1, ..., Yd}.
import sequentPSS as sqp
# set parameter spaces
x1_list = [1,2,3,4,5]
x2_list = [1,2,3,4,5]
x3_list = [1,2,3,4,5]
# set hyperparameters
M = 150
k = 3
# --- run simulations M(2k+2) times with random parameter values ---
multi_simul_df = sqp.multiple_simple_simulation(x1_list, x2_list, x3_list, M, k)
multi_simul_df.head()

The resulting DataFrame holds the simulation results, with three parameters (x1, x2, x3) and three simulation outcomes (y1, y2, y3).
In the preprocessing step, the calibration criterion RMSEsel is determined as illustrated in Algorithm 1. In process (1), a value x is drawn uniformly at random for each Xi; the drawn values are combined into a parameter set and RMSEtem is computed for that set. This is repeated for M(2k+2) iterations, as outlined in Equation 1.
RMSE, a widely used metric for model calibration, is employed here to measure the discrepancy between simulated outcomes and observed data. For each Yj, the threshold RMSEsel is the upper limit of the acceptable RMSE values across the sampled parameter combinations, and the leniency index μ controls how strict that limit is: with μ = 0.1, RMSEsel is set so that only the lowest 10% of all RMSE values fall below it. Setting μ too low can lead to overfitting, while a higher value introduces additional uncertainty.
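To make Algorithm 1 concrete, here is a minimal sketch of the preprocessing criterion, assuming a hypothetical user-supplied `run_model(**params)` simulation function and an observed series `obs`; the package itself performs this step through `prep1_criterion` below.

```python
import numpy as np

def rmse(sim, obs):
    """Root-mean-square error between simulated and observed series."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return np.sqrt(np.mean((sim - obs) ** 2))

def preprocess_criterion(param_spaces, run_model, obs, M, k, mu=0.1, seed=0):
    """Draw M(2k+2) random parameter combinations, compute RMSE_tem for each,
    and return the mu-quantile of all RMSEs as RMSE_sel (Algorithm 1)."""
    rng = np.random.default_rng(seed)
    rmse_values = []
    for _ in range(M * (2 * k + 2)):
        # process (1): one value per parameter, drawn uniformly at random
        combo = {name: rng.choice(space) for name, space in param_spaces.items()}
        rmse_values.append(rmse(run_model(**combo), obs))
    return float(np.quantile(rmse_values, mu))  # lowest mu fraction counts as acceptable
```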
# --- preprocessing 1: determining a criterion for calibration ---
O_list = [sqp.O1, sqp.O2, sqp.O3] # observed data (O1, O2, O3) as a list
u = 0.1 # leniency index (μ)
rmse_sel_df, multi_simul_df_rmse_sel = sqp.prep1_criterion(O_list, multi_simul_df, u, k)
# now, we have the rmse_sel for all O (observed data O1, O2, O3 corresponding to y1, y2, y3).
rmse_sel_df

Algorithm 2 details the procedure for ordering j and i before calibration. It reuses the simulations from Algorithm 1: the data generated from X to Y feed processes (2) and (3).
In process (2), c(Yj) is the proportion of the n sampled cases in which RMSEtem falls below RMSEsel. A large c(Yj) indicates a broad parameter space suited to calibration, so j is arranged in descending order of c(Yj) for the subsequent calibration phases.
In process (3), the first-order sensitivity index of each Xi with respect to Yj (denoted Sji) is computed and sorted in descending order. This index gauges how much Xi alone contributes to the variance of Yj. A parameter with a very low sensitivity index does not notably influence the outcome variance and can be skipped during calibration; calibration therefore starts with the most influential parameters.
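As a rough illustration of the two orderings (not the package's internal implementation of `sorting_Y`/`sorting_X`), c(Yj) can be computed as a simple proportion and the first-order indices obtained with SALib's RBD-FAST analyzer; the DataFrame layout and column names below are assumptions.

```python
import numpy as np
from SALib.analyze import rbd_fast

def outcome_order(rmse_df, rmse_sel):
    """Process (2): c(Yj) = share of sampled combinations whose RMSE_tem beats
    RMSE_sel; outcomes are then calibrated in descending order of c(Yj).
    rmse_df has one RMSE column per outcome; rmse_sel maps columns to thresholds."""
    c = {col: float((rmse_df[col] < thr).mean()) for col, thr in rmse_sel.items()}
    return sorted(c, key=c.get, reverse=True)

def parameter_order(problem, simul_df, y_col):
    """Process (3): first-order (RBD-FAST) sensitivity S_ji of each Xi for one Yj;
    parameters are calibrated in descending order of S_ji."""
    X = simul_df[problem['names']].to_numpy(dtype=float)
    Y = simul_df[y_col].to_numpy(dtype=float)
    S1 = rbd_fast.analyze(problem, X, Y)['S1']
    return [name for _, name in sorted(zip(S1, problem['names']), reverse=True)]
```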
# --- preprocessing 2: sorting Y for calibration
y_seq_df = sqp.sorting_Y(multi_simul_df_rmse_sel)
y_seq_df

# --- preprocessing 3: sorting X based on sensitivity analysis for calibration ---
problem = {
    'num_vars': 3,
    'names': ['x1', 'x2', 'x3'],
    'bounds': [[1, 5],
               [1, 5],
               [1, 5]]
}
x_seq_df = sqp.sorting_X(problem, multi_simul_df_rmse_sel, SA = 'RBD-FAST')
x_seq_df

Algorithm 3 details sequential calibration using the Y and X orderings from Algorithm 2. In process (4), γ stores a selected parameter combination: a fixed value v from X1 is combined with random draws from the other X sets. The RMSE of each γ combination is recorded in R, and if it falls below the RMSEsel threshold, γ is also added to C. This is repeated M times.
Subsequently, the parameter space of X1 is reduced by eliminating v when too many combinations involving v fail the RMSEsel threshold: if the number of occurrences of v in C is below τ·M, v is removed from X1. The user-defined tolerance index τ dictates how strongly the parameter space is reduced. A high τ yields aggressive reduction and stringent calibration, while a low τ retains more values of v in C and makes the reduction less efficient. The same process then repeats for X2 using the already reduced X1 space, and so on, ultimately yielding the condensed parameter spaces for all X sets, as displayed in Figure 1. If additional Y outcomes remain, the loop restarts with the shrunk X parameter spaces.
After sequential calibration has been completed for all Y, the refined parameter combinations are the final output. The uncertainty index U helps identify the optimal parameter set, with lower values indicating higher trustworthiness. Following Equation 4, R contains all RMSE outcomes for a parameter set, while C contains only those falling below RMSEsel; the ratio of |C| to |R| therefore measures the reliability of that set, and subtracting it from 1 gives the uncertainty U. For instance, if a parameter set yields an acceptable RMSE in 7 out of 10 runs, its reliability is 0.7 and its uncertainty is 0.3.
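The reduction step and the uncertainty index can be sketched as follows, reusing the hypothetical `rmse` and `run_model` helpers from the first sketch; `reduce_parameter_space` is an illustrative name, not the package API (the package exposes `fix_param_simple_simulation` and `seqCalibration`, used below).

```python
import numpy as np

def reduce_parameter_space(fix_name, fix_space, other_spaces, run_model, obs,
                           rmse_sel, M=100, tau=0.2, rng=None):
    """Process (4): for each candidate value v of the fixed parameter, run M
    simulations with the remaining parameters drawn at random. R is the set of
    all M RMSEs, C the subset below rmse_sel; v is kept only if |C| >= tau * M.
    Returns the reduced space and the reliability |C|/|R| (= 1 - U) per value."""
    rng = rng or np.random.default_rng()
    kept, reliability = [], {}
    for v in fix_space:
        hits = 0
        for _ in range(M):
            combo = {name: rng.choice(space) for name, space in other_spaces.items()}
            combo[fix_name] = v
            if rmse(run_model(**combo), obs) < rmse_sel:
                hits += 1                  # this gamma enters C
        reliability[v] = hits / M          # |C| / |R|
        if hits >= tau * M:                # tolerance test
            kept.append(v)                 # keep v in the parameter space
    return kept, reliability
```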
# -- now we need to run sequential calibration with the previous sequence of y and x (y1 -> y3 -> y2 / x3 -> x2 -> x1) --
# First round of y1: fix x3
x1_list = [1,2,3,4,5]
x2_list = [1,2,3,4,5]
x3_list = [1,2,3,4,5]
fix_x3_y1_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x3', M = 100) # fix x3: fix each x3 value one by one and run 100 simulations per value
x3_list, result_df = sqp.seqCalibration(fix_x = 'x3', fix_y = 'y1', rmse_sel = 401.295316, simul_result_df = fix_x3_y1_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x3 parameter space:', x3_list)

reliability of 'x3' for 'y1' (1 - uncertainty degree): {3: 0.59, 4: 0.91, 5: 1.0}
updated x3 parameter space: [3, 4, 5]
Sequential calibration proceeds in the sorted order of y and x. The first step fixes x3 and calibrates against y1, using the RMSEsel value for y1 (401.295316, computed against the matching observed data O1) together with the tolerance index t.
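The remaining rounds below repeat the same two calls for each (y, x) pair in the sorted order. For reference, the whole sequence could also be expressed as a loop; the only assumption here is gathering the RMSEsel values reported by preprocessing 1 into a plain dict.

```python
# RMSEsel per outcome, as computed in preprocessing 1 (values shown in this example)
rmse_sel_by_y = {'y1': 401.295316, 'y3': 3.176924, 'y2': 50.487752}
spaces = {'x1': x1_list, 'x2': x2_list, 'x3': x3_list}

for fix_y in ['y1', 'y3', 'y2']:          # sorted outcome order from preprocessing 2
    for fix_x in ['x3', 'x2', 'x1']:      # sorted parameter order from preprocessing 3
        simul_df = sqp.fix_param_simple_simulation(
            spaces['x1'], spaces['x2'], spaces['x3'], fix_x=fix_x, M=100)
        spaces[fix_x], _ = sqp.seqCalibration(
            fix_x=fix_x, fix_y=fix_y, rmse_sel=rmse_sel_by_y[fix_y],
            simul_result_df=simul_df, O_list=O_list, t=0.2, df_return=True)
        print('updated', fix_x, 'space after calibrating on', fix_y, ':', spaces[fix_x])
```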
# Second round of y1: fix x2
fix_x2_y1_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x2', M = 100) # fix x2: fix each x2 value one by one and run 100 simulations per value
x2_list, result_df = sqp.seqCalibration(fix_x = 'x2', fix_y = 'y1', rmse_sel = 401.295316, simul_result_df = fix_x2_y1_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x2 parameter space:', x2_list)

reliability of 'x2' for 'y1' (1 - uncertainty degree): {1: 0.93, 2: 0.88, 3: 0.79, 4: 0.79, 5: 0.58}
updated x2 parameter space: [1, 2, 3, 4, 5]
# Third round of y1: fix x1
fix_x1_y1_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x1', M = 100) # fix x1: fix each x1 value one by one and run 100 simulations per value
x1_list, result_df = sqp.seqCalibration(fix_x = 'x1', fix_y = 'y1', rmse_sel = 401.295316, simul_result_df = fix_x1_y1_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x1 parameter space:', x1_list)

reliability of 'x1' for 'y1' (1 - uncertainty degree): {1: 0.726, 2: 0.869, 4: 0.909, 3: 0.85, 5: 0.729}
updated x1 parameter space: [1, 2, 3, 4, 5]
# First round of y3: fix x3
fix_x3_y3_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x3', M = 100) # fix x3: fix each x3 value one by one and run 100 simulations per value
x3_list, result_df = sqp.seqCalibration(fix_x = 'x3', fix_y = 'y3', rmse_sel = 3.176924, simul_result_df = fix_x3_y3_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x3 parameter space:', x3_list)

reliability of 'x3' for 'y3' (1 - uncertainty degree): {4: 0.41, 5: 0.62}
updated x3 parameter space: [4, 5]
# Second round of y3: fix x2
fix_x2_y3_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x2', M = 100) # fix x2: fix each x2 value one by one and run 100 simulations per value
x2_list, result_df = sqp.seqCalibration(fix_x = 'x2', fix_y = 'y3', rmse_sel = 3.176924, simul_result_df = fix_x2_y3_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x2 parameter space:', x2_list)

reliability of 'x2' for 'y3' (1 - uncertainty degree): {1: 0.689, 5: 0.531, 4: 0.515, 3: 0.657, 2: 0.478}
updated x2 parameter space: [1, 2, 3, 4, 5]
# Third round of y3: fix x1
fix_x1_y3_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x1', M = 100) # fix x1: fix each x1 value one by one and run 100 simulations per value
x1_list, result_df = sqp.seqCalibration(fix_x = 'x1', fix_y = 'y3', rmse_sel = 3.176924, simul_result_df = fix_x1_y3_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x1 parameter space:', x1_list)

reliability of 'x1' for 'y3' (1 - uncertainty degree): {1: 0.67, 2: 0.43, 3: 0.68, 4: 0.65, 5: 0.56}
updated x1 parameter space: [1, 2, 3, 4, 5]
# First round of y2: fix x3
fix_x3_y2_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x3', M = 100) # fix x3: fix each x3 value one by one and run 100 simulations per value
x3_list, result_df = sqp.seqCalibration(fix_x = 'x3', fix_y = 'y2', rmse_sel = 50.487752, simul_result_df = fix_x3_y2_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x3 parameter space:', x3_list)

reliability of 'x3' for 'y2' (1 - uncertainty degree): {5: 0.678, 4: 0.429}
updated x3 parameter space: [4, 5]
# Second round of y2: fix x2
fix_x2_y2_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x2', M = 100) # fix x2: fix each x2 value one by one and run 100 simulations per value
x2_list, result_df = sqp.seqCalibration(fix_x = 'x2', fix_y = 'y2', rmse_sel = 50.487752, simul_result_df = fix_x2_y2_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x2 parameter space:', x2_list)

reliability of 'x2' for 'y2' (1 - uncertainty degree): {3: 0.25, 1: 0.421, 4: 0.396, 2: 0.333}
updated x2 parameter space: [1, 2, 3, 4]
# Third round of y2: fix x1
fix_x1_y2_simul_result_df = sqp.fix_param_simple_simulation(x1_list, x2_list, x3_list, fix_x = 'x1', M = 100) # fix x1: fix each x1 value one by one and run 100 simulations per value
x1_list, result_df = sqp.seqCalibration(fix_x = 'x1', fix_y = 'y2', rmse_sel = 50.487752, simul_result_df = fix_x1_y2_simul_result_df, O_list = O_list, t = 0.2, df_return = True)
print('updated x1 parameter space:', x1_list)

reliability of 'x1' for 'y2' (1 - uncertainty degree): {3: 0.443, 1: 0.34, 2: 0.634, 4: 0.541}
updated x1 parameter space: [1, 2, 3, 4]
The calibration results, matching the printed outputs above, are as follows:
- Calibration based on y1 in round 1 led to the following outcomes:
  - x1: [1,2,3,4,5] -> [1,2,3,4,5]
  - x2: [1,2,3,4,5] -> [1,2,3,4,5]
  - x3: [1,2,3,4,5] -> [3,4,5]
- Calibration based on y3 in round 2 led to the following outcomes:
  - x1: [1,2,3,4,5] -> [1,2,3,4,5]
  - x2: [1,2,3,4,5] -> [1,2,3,4,5]
  - x3: [3,4,5] -> [4,5]
- Calibration based on y2 in round 3 led to the following outcomes:
  - x1: [1,2,3,4,5] -> [1,2,3,4]
  - x2: [1,2,3,4,5] -> [1,2,3,4]
  - x3: [4,5] -> [4,5]
Moongi Choi, Andrew Crooks, Neng Wan, Simon Brewer, Thomas J. Cova & Alexander Hohl (2024) Addressing equifinality in agent-based modeling: a sequential parameter space search method based on sensitivity analysis, International Journal of Geographical Information Science, DOI: 10.1080/13658816.2024.2331536
- Author: Moongi Choi
- Email: u1316663@utah.edu







