VARS (Variogram Analysis of Response Surfaces) is a sensitivity analysis method that uses star-based sampling and variogram analysis to estimate total-order sensitivity indices (ST). This implementation demonstrates VARS as an efficient alternative for high-dimensional sensitivity analysis with limited computational budgets.
The VARS algorithm consists of two main components: star-based sampling and sensitivity index calculation through variogram analysis.
function vars_matrices_new(star_centers::Int, d::Int, h::Float64)
# Generate star centers using Sobol sequences
sampler = SobolSample(Shift())
centers = QuasiMonteCarlo.generate_design_matrices(star_centers, zeros(d), ones(d), sampler, 1)
# For each star center, generate trajectory points
for star in 1:star_centers
center = centers[:, star]
# Add center point
push!(point_vectors, center)
# Generate trajectory points along each dimension
for dim in 1:d
c_dim = center[dim]
# Create trajectory: c_dim ± k*h for k = 1,2,3...
traj_values = filter(x -> x != c_dim, unique(vcat(
c_dim % h:h:1.0, # Forward trajectory
c_dim % h:-h:0.0 # Backward trajectory
)))
for traj_val in traj_values
new_point = copy(center)
new_point[dim] = clamp(traj_val, 0.0, 0.9999)
push!(point_vectors, new_point)
end
end
end
endfunction gsa_vars_new(Y::Vector{Float64}, info::Vector, star_centers::Int, d::Int)
# Calculate global variance from center points
VY = var(Y[center_indices])
# For each dimension, compute variogram and covariogram
for dim in 1:d
variogram_sum = 0.0
covariogram_sum = 0.0
for star in 1:star_centers
# Get trajectory points for this dimension and star
dim_values = Y[trajectory_indices]
pairs = collect(combinations(dim_values, 2))
# Compute variogram: γ = 0.5 * E[(Y(x₁) - Y(x₂))²]
variogram_i = 0.5 * mean((p1 .- p2).^2)
# Compute covariogram: C = Cov(Y(x₁), Y(x₂))
covariogram_i = cov(p1, p2)
variogram_sum += variogram_i
covariogram_sum += covariogram_i
end
# Total-order index: ST_i = (γ + C) / Var(Y)
Ti[dim] = (variogram_sum + covariogram_sum) / (stars_with_data * VY)
end
endThe key challenge was accurately estimating the total number of function evaluations for VARS to match other methods' computational budgets. Unlike methods with fixed sampling patterns, the VARS sample count depends on trajectory generation which varies with star placement and step size.
function estimate_actual_vars_cost(r::Int, d::Int, h::Float64)
# Sample test centers to estimate average cost per star
test_centers = QuasiMonteCarlo.generate_design_matrices(min(10, r), zeros(d), ones(d), sampler, 1)
total_points = 0
for center in test_centers
points_for_this_center = 1 # center point
for dim in 1:d
# Simulate actual trajectory generation
c_dim = center[dim]
traj_values = filter(x -> x != c_dim, unique(vcat(
c_dim % h:h:1.0, c_dim % h:-h:0.0
)))
points_for_this_center += length(traj_values)
end
total_points += points_for_this_center
end
return round(Int, r * (total_points / size(test_centers, 2)))
endfunction find_vars_params_robust(target_cost::Int, d::Int)
h_options = [0.05, 0.1, 0.15, 0.2] # Prioritize smaller h for better accuracy
for h_val in h_options
for r_val in
estimated_cost = estimate_actual_vars_cost(r_val, d, h_val)
if estimated_cost > target_cost * 1.2 continue end # Allow 20% overage
diff = abs(estimated_cost - target_cost)
# Prefer smaller h values: penalty for larger h
score = diff + (h_val - 0.05) * 1000
if score < best_params.diff
best_params = (r=r_val, h=h_val, cost=estimated_cost, diff=score)
end
end
end
endThe parameter selection strategy prioritizes smaller h values (0.05-0.1) for better accuracy while allowing up to 20% computational budget overage. Boundary effects significantly impact actual sample counts, making empirical cost estimation essential for fair method comparison.
Based on comprehensive comparison across dimensions d = 3, 10, 20, 40 and computational budgets of 5,000, 10,000, and 25,000 function evaluations, the following performance characteristics emerged:
-
Computational Efficiency: VARS achieves excellent performance with approximately 50% of the target computational budget due to its efficient star-based sampling strategy. The method consistently provides competitive accuracy across all tested dimensions while requiring fewer function evaluations than traditional approaches.
-
Dimensional Scaling: VARS demonstrates robust scaling properties, maintaining consistent performance as dimensionality increases. In low dimensions (
d=3), VARS is competitive but eFAST shows slightly better performance. However, in medium dimensions (d=10), VARS demonstrates strong performance often outperforming the Sobol method. Most notably, in high dimensions (d=20, 40), VARS particularly excels, maintaining accuracy while other methods struggle with computational demands. -
Error Performance: Across all test cases, VARS maintains error rates typically between 0.01-0.10 even in high-dimensional scenarios (
d=40), demonstrating remarkable consistency and reliability for sensitivity analysis applications.
| Method | Sensitivity Indices | Optimal Use Case | Computational Efficiency |
|---|---|---|---|
| VARS | ST only | High dimensions, limited budget | Excellent |
| PCE (degree 2) | S1, S2, ST | Complete sensitivity analysis | Good |
| eFAST | ST, partial S1 | Low-medium dimensions | Good |
| Sobol | S1, S2, ST | Standard benchmark | Poor scaling |
-
High-Dimensional Performance: VARS emerges as the optimal choice for high-dimensional problems where computational resources are limited. The method maintains consistent performance as dimensionality increases, making it particularly valuable for screening applications in complex parameter spaces.
-
Complete Sensitivity Analysis: When comprehensive sensitivity decomposition is required (
S1,S2, andSTindices), PCE degree 2 remains the optimal choice. This approach provides full sensitivity decomposition with reasonable computational cost and enables deeper uncertainty quantification workflows. -
Computational Resource Allocation: The analysis reveals distinct computational footprints for each method. VARS typically uses 2,500-13,000 samples for 5,000-25,000 computational budgets, while PCE uses the full computational budget (
2×Nsamples). eFAST uses the full budget (N×dsamples) but struggles in high dimensions, and Sobol uses the full budget (N×(d+2)samples) with poor dimensional scaling.
-
Screening Applications: Use VARS for initial parameter screening in high-dimensional spaces (
d ≥ 10) where computational resources are constrained. The method provides reliable total-order sensitivity estimates that identify the most influential parameters efficiently. -
Detailed Analysis: Apply PCE degree 2 for comprehensive analysis of important parameters identified through VARS screening. This combination provides both computational efficiency and complete sensitivity information.
-
Hybrid Workflows: Combine VARS and PCE for computationally efficient comprehensive sensitivity analysis. Begin with VARS for broad parameter screening, then apply PCE to the reduced parameter set identified as influential.
The VARS implementation demonstrates significant advancement in sensitivity analysis for high-dimensional problems. The method fills a crucial gap in the sensitivity analysis toolkit by providing efficient, reliable total-order sensitivity estimates where traditional methods become computationally prohibitive. The robust parameter selection and accurate cost estimation ensure fair comparison with existing methods while maintaining the computational advantages that make VARS particularly attractive for resource-constrained applications.
The distributed computing implementation using Julia's Distributed.jl package enables efficient parallel execution across multiple worker processes, significantly reducing computational time for the multiple-repeat statistical analysis framework employed in method comparison.