<!-- File automatically generated using DocOnce (https://github.com/doconce/doconce/):
doconce format ipynb interactive_gstar_function_v2.do.txt --encoding=utf-8 --ipynb_admon=hrule --ipynb_disable_mpl_inline --ipynb_cite=latex-plain -->

# Sensitivity indices for Sobol's $G^{*}$ function

**Leif Rune Hellevik**

# Setup

In [1]:
# --- cell: install_chaospy ---
# @title Install chaospy (Colab-friendly)

try:
    import chaospy as cp
    import numpoly
    import numpy as np
    print("chaospy is already installed.")
except ImportError:
    # Install chaospy from PyPI. This pulls in numpoly automatically.
    %pip install chaospy==4.3.21 --no-cache-dir
    import chaospy as cp
    import numpoly
    import numpy as np

print("numpy  :", np.__version__)
print("numpoly:", numpoly.__version__)
print("chaospy:", cp.__version__)

chaospy is already installed.
numpy  : 2.2.6
numpoly: 1.3.6
chaospy: 4.3.20


In [2]:
# --- cell: repo_setup ---
# @title Repo sync and environment setup

import os
import sys
import subprocess
from pathlib import Path

IN_COLAB = "google.colab" in sys.modules
REMOTE = "https://github.com/lrhgit/uqsa2025.git"
REPO_PATH_COLAB = Path("/content/uqsa2025")

if IN_COLAB:
    if not REPO_PATH_COLAB.exists():
        print("Cloning repository...")
        subprocess.run(
            ["git", "clone", REMOTE, str(REPO_PATH_COLAB)],
            check=True
        )
    else:
        print("Updating existing repository...")
        subprocess.run(
            ["git", "-C", str(REPO_PATH_COLAB), "pull"],
            check=True
        )
    os.chdir(REPO_PATH_COLAB)

# --- Find repo root (works locally + in Colab) ---
cwd = Path.cwd().resolve()
repo_root = next(
    (p for p in [cwd] + list(cwd.parents) if (p / ".git").exists()),
    cwd
)

PY_SRC = repo_root / "python_source"
if PY_SRC.exists() and str(PY_SRC) not in sys.path:
    sys.path.insert(0, str(PY_SRC))

print("CWD:", Path.cwd())
print("repo_root:", repo_root)
print("python_source exists:", PY_SRC.exists())
print("python_source in sys.path:", str(PY_SRC) in sys.path)

CWD: /Users/leifh/git/uqsa2025
repo_root: /Users/leifh/git/uqsa2025
python_source exists: True
python_source in sys.path: True


In [3]:
# --- cell: layout_and_numpy_patch ---
# @title Layout fix, imports, and NumPy compatibility patch

import warnings
warnings.filterwarnings("ignore")

from IPython.display import HTML

HTML("""
<style>
div.cell.code_cell, div.output {
    max-width: 100% !important;
}
</style>
""")

import numpy as np
import matplotlib.pyplot as plt
import chaospy as cp
import numpoly
import pandas as pd

import ipywidgets as widgets
from IPython.display import display, clear_output
from plot_sobol import plot_sobol_bars


# Pretty-print helpers (used across notebooks)
from pretty_printing import section_title, pretty_table, pretty_print_sobol_mc


# --- NumPy reshape compatibility patch for numpoly ---
_old_reshape = np.reshape

def _reshape_compat(a, *args, **kwargs):
    newshape = None
    if "newshape" in kwargs:
        newshape = kwargs.pop("newshape")
    if "shape" in kwargs and newshape is None:
        newshape = kwargs.pop("shape")
    if newshape is not None:
        return _old_reshape(a, newshape, *args, **kwargs)
    return _old_reshape(a, *args, **kwargs)

np.reshape = _reshape_compat
print("✓ numpy.reshape patched for numpoly compatibility")

✓ numpy.reshape patched for numpoly compatibility


# Introduction

The Sobol' $G^{*}$ function is a canonical benchmark for variance-based
global sensitivity analysis, introduced to study factor importance,
interactions, and effective dimensionality under controlled
conditions. Analytical Sobol' indices are available, making the
function particularly suitable for validation and interpretation.

In this notebook, the G* function is used as an interactive test case
to illustrate the meaning of first-order and total-effect indices in
the sense of Saltelli et al. The focus is on understanding sensitivity
measures, not on numerical optimisation or software details.

Concepts, notation, and interpretation of Sobol' indices are
introduced in `sensitivity_introduction.ipynb`, which should be read
first.

Benchmark functions are central to sensitivity analysis methodology,
as emphasised in recent function datasets for benchmarking sensitivity
analysis methods. The Sobol' $G^{*}$ function extends the original
$G$-function by introducing shape and shift parameters, allowing
systematic exploration of nonlinearity and interaction effects while
retaining analytical reference solutions.

Monte Carlo estimates shown here serve as a transparent baseline for
comparison. Sampling strategies and estimator design are discussed in
more detail in monte_carlo.ipynb.

# Sobol's $G^{*}$ function

The Sobol' $G^{*}$ function has the mathematical representation:

$$
Y=G(X) =  G(X_1, X_2,\ldots,X_k,a_1, a_2,\ldots,a_k)  = \prod_{i=1}^{k} g_i 
$$

where each factor $g_i$ is given by:
$$
g_i = \frac{(1+\alpha_i) |2 \left (X_i+ \delta_i - I(X_i+\delta_i) \right ) -1 |^{\alpha_i}+a_i}{1+{a}_i} 
$$

and all the input factors $X_i$ are assumed to be uniformly
distributed on the interval $[0,1]$, the coefficients $a_i$ are
assumed to be non-negative real numbers ($a_i \geq 0$), $\delta_i \in
[0,1]$, and $\alpha_i > 0$. Finally, $I(X_i+\delta_i)$ denotes the
integer part of $X_i+\delta_i$. Note that for $\alpha_i=1$ and
$\delta_i=0$, $G^{*}$ reduces to the $G$-function, a simpler
member of the benchmark functions [[1]](#Azzini_func_2022). The $\alpha_i$ and $\delta_i$ are
curvature and shift parameters, respectively.
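
The reduction to the $G$-function can be checked numerically. The sketch below is a plain NumPy illustration, separate from the notebook's own implementation; the names `g_star` and `g_classic` are chosen here for illustration, and `g_classic` assumes the usual form $(|4X_i-2|+a_i)/(1+a_i)$ of the $G$-function factor.

```python
import numpy as np

def g_star(x, a, alpha, delta):
    # One factor of G*: shift by delta, keep the fractional part, apply curvature alpha
    z = x + delta
    frac = z - np.floor(z)
    return ((1 + alpha) * np.abs(2 * frac - 1) ** alpha + a) / (1 + a)

def g_classic(x, a):
    # One factor of the original G-function (assumed standard form)
    return (np.abs(4 * x - 2) + a) / (1 + a)

x = np.linspace(0.0, 1.0, 11)

# With alpha = 1 and delta = 0 the two factors coincide on [0, 1]
same = np.allclose(g_star(x, 0.5, 1.0, 0.0), g_classic(x, 0.5))
print("G* factor equals G factor for alpha=1, delta=0:", same)
```

Any other choice of `alpha` or `delta` changes the shape or location of the kink, which is what makes $G^{*}$ the more general test function.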

The number of factors *k* can be varied as the reader pleases, but the
minimum number to produce a meaningful inference is set at three.

As you will be able to explore below, the sensitivity $S_i$ of $G^{*}$
with respect to a specific input factor $X_i$ depends on the
value of the corresponding coefficient $a_i$: small values of $a_i$
(e.g. $a_i=0$) yield a high $S_i$, meaning that
$X_i$ contributes strongly to the variance (uncertainty) of
$G^{*}$.

We have implemented Sobol's $G^{*}$ function in the code cell below:

In [4]:
# model function
import numpy as np
from numba import jit

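@jit(nopython=True)
def g(Xj, aj, alphaj, deltaj):
    # Single factor g_j: shift X_j by delta_j and keep the fractional part
    z = Xj + deltaj
    frac = z - np.floor(z)
    return ((1+alphaj)*np.abs(2*frac - 1)**alphaj + aj) / (1+aj)


@jit
def G(X, a, alpha, d):
    # X has shape (Ns, k); returns the product of the k factors, shape (Ns,)
    G_vector = np.ones(X.shape[0])

    for j, aj in enumerate(a):
        # In-place multiplication: G_vector <- G_vector * g_j
        np.multiply(G_vector, g(X[:, j], aj, alpha[j], d[j]), G_vector)
    return G_vector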

In [5]:
# --- cell: gstar_statistics ---

# Gstar-statistics
# import modules

import numpy as np


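def Vi(ai, alphai):
    # Partial variance of factor i, i.e. Var(g_i)
    return alphai**2 / ((1 + 2 * alphai) * (1 + ai) ** 2)


def V(a_prms, alpha):
    # Total variance of G*: product of (1 + V_i) over all factors, minus 1
    D = 1.0
    for ai, alphai in zip(a_prms, alpha):
        D *= (1.0 + Vi(ai, alphai))
    return D - 1.0


def S_i(a, alpha):
    # First-order indices S_i = V_i / V (float array, also for integer input)
    Si = np.zeros(len(a), dtype=float)
    Vtot = V(a, alpha)
    for i, (ai, alphai) in enumerate(zip(a, alpha)):
        Si[i] = Vi(ai, alphai) / Vtot
    return Si


def S_T(a, alpha):
    # Total-effect indices: V_i / V * (1 + V) / (1 + V_i)
    ST = np.zeros(len(a), dtype=float)
    Vtot = V(a, alpha)
    for i, (ai, alphai) in enumerate(zip(a, alpha)):
        ST[i] = (Vtot + 1.0) / (Vi(ai, alphai) + 1.0) * Vi(ai, alphai) / Vtot
    return ST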

**Note on the parameter $\delta$**

The parameters $\delta_i$ are included only to keep the slider interface consistent with other Sobol' test-function notebooks.
For Sobol's $G^{*}$ function, the sensitivity indices depend only on $(a_i,\alpha_i)$: because $X_i$ is uniform on $[0,1]$, the fractional part of $X_i+\delta_i$ is again uniform on $[0,1]$, so $\delta_i$ does **not** enter the analytical expressions for $S_i$ or $S_i^T$.
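
For reference, the expressions evaluated by the functions `Vi`, `V`, `S_i` and `S_T` in the cell above can be written out as follows. This is a restatement of the code, with $V_i$ denoting the partial variance of factor $i$ and $V$ the total variance of $G^{*}$:

$$
V_i = \frac{\alpha_i^2}{(1+2\alpha_i)(1+a_i)^2}, \qquad
V = \prod_{i=1}^{k} (1+V_i) - 1
$$

$$
S_i = \frac{V_i}{V}, \qquad
S_i^T = \frac{V_i}{V} \, \frac{1+V}{1+V_i}
$$

Since $(1+V)/(1+V_i) \geq 1$, the total-effect index is never smaller than the first-order index, and the two coincide only when all other factors have zero partial variance.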

In [7]:
# --- cell: gstar_sliders_colab ---
out = widgets.Output()

def mk_sliders(prefix, n, v0, vmin, vmax, step, desc=None, width="170px"):
    if desc is None:
        desc = [f"{prefix}{i}" for i in range(1, n+1)]
    return [
        widgets.FloatSlider(
            value=v0[i-1] if isinstance(v0, (list, tuple)) else v0,
            min=vmin, max=vmax, step=step,
            description=desc[i-1],
            continuous_update=False,
            layout=widgets.Layout(width=width),
        )
        for i in range(1, n+1)
    ]

# Parameters
a = mk_sliders("a", 4, [0.75, 0.80, 0.20, 0.20], 0.0, 2.0, 0.05)
alpha = mk_sliders("alpha", 4, [0.75, 0.20, 0.20, 0.20], 0.0, 1.0, 0.05,
                   desc=["α1", "α2", "α3", "α4"])
delta = mk_sliders("delta", 4, [0.60, 0.50, 0.20, 0.20], 0.0, 1.0, 0.05,
                   desc=["δ1", "δ2", "δ3", "δ4"])

row = widgets.Layout(display="flex", flex_flow="row wrap", gap="8px")

def redraw(*_):
    with out:
        clear_output(wait=True)

        aval = np.array([s.value for s in a], dtype=float)
        alphaval = np.array([s.value for s in alpha], dtype=float)

        Si = np.asarray(S_i(aval, alphaval), dtype=float).ravel()
        ST = np.asarray(S_T(aval, alphaval), dtype=float).ravel()

        fig, ax = plt.subplots(constrained_layout=True)

        # Correct call signature (ax first)
        plot_sobol_bars(ax, Si, ST, title="Sobol G* – sensitivities")

        plt.show()
        plt.close(fig)

# Connect sliders to redraw
for s in (*a, *alpha, *delta):
    s.observe(redraw, names="value")

ui = widgets.VBox([
    widgets.HTML("<b>G* sliders</b>"),
    widgets.HBox(a, layout=row),
    widgets.HBox(alpha, layout=row),
    widgets.HBox(delta, layout=row),
    out
])

display(ui)
redraw()


**Reflection**  
Use the sliders to explore how the Sobol indices change.  
 * Which parameters dominate the output uncertainty?  

 * When do first-order and total indices differ?  

 * What does this indicate about interactions?

In [9]:
# chaospy G-function and pce-approx with sliders
# --- cell 8: gstar_compare_all (clean, robust, two plots: Si and ST) ---

# MC functions (your estimator)
from monte_carlo import generate_sample_matrices_mc, calculate_sensitivity_indices_mc


# -----------------------------
# Assumes defined earlier:
#   - sliders: a, alpha, delta (lists of FloatSlider)
#   - model: G(X, a_val, alpha_val, d_val) where X is (Ns,k) or (Ns,P)
#   - analytical: S_i(a, alpha) and S_T(a, alpha)
#   - plotting helper: plot_sobol_bars(ax, Si, ST, title=...)
# -----------------------------

k = len(a)

# joint distribution (iid Uniform(0,1))
jpdf = cp.Iid(cp.Uniform(0, 1), k)

# ----- controls -----
NsMC = widgets.IntSlider(value=20000, min=2000, max=120000, step=2000,
                         description="NsMC", continuous_update=False)
NsPC = widgets.IntSlider(value=2000, min=400, max=20000, step=400,
                         description="NsPC", continuous_update=False)
p_order = widgets.IntSlider(value=2, min=1, max=6, step=1,
                            description="p", continuous_update=False)

show_tables = widgets.Checkbox(value=True, description="tables")
show_plots  = widgets.Checkbox(value=True, description="plots")

out = widgets.Output()

def _relerr(est, ref, eps=1e-12):
    est = np.asarray(est, dtype=float).ravel()
    ref = np.asarray(ref, dtype=float).ravel()
    return 100.0 * np.abs(est - ref) / (np.abs(ref) + eps)

def _current_params():
    a_val = np.array([s.value for s in a], dtype=float)
    alpha_val = np.array([s.value for s in alpha], dtype=float)
    d_val = np.array([s.value for s in delta], dtype=float)
    return a_val, alpha_val, d_val

def _safe_observe(w, handler):
    try:
        w.unobserve(handler, names="value")
    except Exception:
        pass
    w.observe(handler, names="value")

def redraw(*_):
    with out:
        clear_output(wait=True)

        # 1. always show the heading
        display(section_title(
            f"sobol indices: analytical vs mc vs pce  "
            f"(NsMC={NsMC.value}, NsPC={NsPC.value}, p={p_order.value})"
        ))

        # --------------------
        # current parameters
        # --------------------
        a_val, alpha_val, d_val = _current_params()

        # --------------------
        # analytical
        # --------------------
        Si_ref = np.asarray(S_i(a_val, alpha_val), dtype=float).ravel()
        ST_ref = np.asarray(S_T(a_val, alpha_val), dtype=float).ravel()

        # --------------------
        # Monte Carlo (your estimator)
        # --------------------
        Ns = int(NsMC.value)
        A_s, B_s, C_s = generate_sample_matrices_mc(
            Ns, k, jpdf, sample_method="R"
        )

        # Evaluate model on A, B, and each C_i
        f_A = np.asarray(G(np.ascontiguousarray(A_s), a_val, alpha_val, d_val), dtype=float).reshape(-1)
        f_B = np.asarray(G(np.ascontiguousarray(B_s), a_val, alpha_val, d_val), dtype=float).reshape(-1)

        f_C = np.zeros((k, Ns), dtype=float)
        for i in range(k):
            Ci = np.asarray(C_s[i], dtype=float)  # (Ns,k)
            f_C[i, :] = np.asarray(G(np.ascontiguousarray(Ci), a_val, alpha_val, d_val), dtype=float).reshape(-1)

        Si_mc, ST_mc = calculate_sensitivity_indices_mc(f_A, f_B, f_C)
        Si_mc = np.asarray(Si_mc, dtype=float).ravel()
        ST_mc = np.asarray(ST_mc, dtype=float).ravel()

        # --------------------
        # PCE (regression) - same logic as before
        # --------------------
        n_pc = int(NsPC.value)
        p = int(p_order.value)

        X = jpdf.sample(n_pc, rule="random", seed=12345)  # (k, n_pc)
        y = np.asarray(G(np.ascontiguousarray(X.T), a_val, alpha_val, d_val), dtype=float).reshape(-1)

        poly = cp.expansion.stieltjes(p, jpdf)
        approx = cp.fit_regression(poly, X, y)

        Si_pc = np.asarray(cp.Sens_m(approx, jpdf), dtype=float).ravel()
        ST_pc = np.asarray(cp.Sens_t(approx, jpdf), dtype=float).ravel()

        labels = [f"X{i+1}" for i in range(k)]

        # --------------------
        # tables
        # --------------------
        if show_tables.value:
            df_Si = pd.DataFrame({
                "analytical": Si_ref,
                "mc":         Si_mc,
                "err% (mc)":  _relerr(Si_mc, Si_ref),
                "pce":        Si_pc,
                "err% (pce)": _relerr(Si_pc, Si_ref),
            }, index=labels)

            df_ST = pd.DataFrame({
                "analytical": ST_ref,
                "mc":         ST_mc,
                "err% (mc)":  _relerr(ST_mc, ST_ref),
                "pce":        ST_pc,
                "err% (pce)": _relerr(ST_pc, ST_ref),
            }, index=labels)

            
            display(section_title("first-order indices  Sᵢ"))
            pretty_table(df_Si, floatfmt=".3f")

            display(section_title("total-effect indices  Sᵀ"))
            pretty_table(df_ST, floatfmt=".3f")

        # --------------------
        # plots (two plots: Si and ST)
        # --------------------
        if show_plots.value:
            # Plot 1: Si comparison
            fig1, ax1 = plt.subplots(figsize=(9, 3.2), constrained_layout=True)
            idx = np.arange(1, k+1)
            w = 0.25
            ax1.bar(idx - w, Si_ref, width=w, label="Si (analytical)")
            ax1.bar(idx,      Si_mc, width=w, label="Si (MC)")
            ax1.bar(idx + w,  Si_pc, width=w, label="Si (PCE)")
            ax1.set_xticks(idx)
            ax1.set_xticklabels(labels)
            ax1.set_ylim(0, 1)
            ax1.set_ylabel("Si")
            ax1.set_title("First-order Sobol indices (Si): analytical vs MC vs PCE")
            ax1.legend(ncol=3)
            plt.show()
            plt.close(fig1)

            # Plot 2: ST comparison
            fig2, ax2 = plt.subplots(figsize=(9, 3.2), constrained_layout=True)
            ax2.bar(idx - w, ST_ref, width=w, label="ST (analytical)")
            ax2.bar(idx,      ST_mc, width=w, label="ST (MC)")
            ax2.bar(idx + w,  ST_pc, width=w, label="ST (PCE)")
            ax2.set_xticks(idx)
            ax2.set_xticklabels(labels)
            ax2.set_ylim(0, 1)
            ax2.set_ylabel("ST")
            ax2.set_title("Total-effect Sobol indices (ST): analytical vs MC vs PCE")
            ax2.legend(ncol=3)
            plt.show()
            plt.close(fig2)

# --- display options row (clear + visible) ---
display_opts = widgets.HBox(
    [
        widgets.HTML("<b>Display:</b>"),
        show_tables,
        show_plots,
    ],
    layout=widgets.Layout(gap="16px", align_items="center")
)


# -----------------------------
# UI (MUST be outside out!)
# -----------------------------
ui = widgets.VBox(
    [
        widgets.HBox([NsMC, NsPC, p_order]),
        display_opts,          # separate row for checkboxes
        widgets.HBox(a),
        widgets.HBox(alpha),
        widgets.HBox(delta),
    ],
    layout=widgets.Layout(gap="10px")
)
display(ui)
display(out)

# Hook everything
for s in (*a, *alpha, *delta):
    _safe_observe(s, redraw)
for w in (NsMC, NsPC, p_order, show_tables, show_plots):
    _safe_observe(w, redraw)

redraw()


**Reflection** 

Compare the analytical, Monte Carlo, and PCE estimates.

Consider:

* Are the rankings of parameters consistent across methods?

* Which differences matter for decision-making, and which do not?

* Would increasing sample size or polynomial order change your conclusions?
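
The Monte Carlo results above come from the repository's `monte_carlo` module, whose internals are not shown in this notebook. As a rough stand-alone sketch of what such an estimator can look like, the block below assumes the Saltelli (2010) first-order estimator and the Jansen total-effect estimator with plain random sampling, which may differ from what `calculate_sensitivity_indices_mc` actually does. The names `gstar` and `sobol_mc` are illustrative and are not part of the repository.

```python
import numpy as np

def gstar(X, a, alpha, delta):
    # Vectorised G*: X has shape (n, k); parameters have shape (k,)
    frac = np.mod(X + delta, 1.0)
    g = ((1 + alpha) * np.abs(2 * frac - 1) ** alpha + a) / (1 + a)
    return g.prod(axis=1)

def sobol_mc(model, k, n, rng):
    # Two independent sample matrices with iid Uniform(0, 1) entries
    A = rng.random((n, k))
    B = rng.random((n, k))
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))

    S = np.empty(k)
    ST = np.empty(k)
    for i in range(k):
        # A with column i taken from B
        ABi = A.copy()
        ABi[:, i] = B[:, i]
        fABi = model(ABi)
        S[i] = np.mean(fB * (fABi - fA)) / var       # first-order (Saltelli 2010)
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var  # total effect (Jansen)
    return S, ST

# Default slider values from this notebook
a = np.array([0.75, 0.80, 0.20, 0.20])
alpha = np.array([0.75, 0.20, 0.20, 0.20])
delta = np.array([0.60, 0.50, 0.20, 0.20])

rng = np.random.default_rng(0)
S_mc, ST_mc = sobol_mc(lambda X: gstar(X, a, alpha, delta), k=4, n=2**17, rng=rng)
print("S  (MC):", np.round(S_mc, 3))
print("ST (MC):", np.round(ST_mc, 3))
```

Rerunning with a different seed or a smaller `n` gives a feel for how much of the difference between the methods is plain sampling noise.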

# References

1. <span id="Azzini_func_2022"></span> **I. Azzini and R. Rosati**. A Function Dataset for Benchmarking in Sensitivity Analysis, *Data in Brief*, 42, article 108071, 2022, [doi: 10.1016/j.dib.2022.108071](https://doi.org/10.1016/j.dib.2022.108071), <https://www.sciencedirect.com/science/article/pii/S2352340922002827>.