# CI width by changing sample size for QMC and IID Beta for the Hedged (Betting) CI Method

[This JRSSB article by Ian Waudby-Smith and Aaditya Ramdas](https://academic.oup.com/jrsssb/article/86/1/1/7043257) takes $X_1, X_2, \ldots \stackrel{\text{IID}}{\sim} F$ and computes a sequential confidence interval for $\mu = \mathbb{E}(X)$.
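
To make the idea of a betting confidence interval concrete, here is a minimal NumPy sketch of a hedged capital interval for observations in $[0, 1]$. It is a simplified illustration, not the `confseq` implementation used later in this notebook: the function name `hedged_betting_ci`, the grid over candidate means, the plug-in bet sizes, and the default `trunc_scale` are all assumptions made for this example.

```python
import numpy as np

def hedged_betting_ci(x, alpha=0.05, grid_size=1001, trunc_scale=0.5):
    """Simplified hedged-capital CI for the mean of observations in [0, 1].

    Hypothetical helper for illustration only; not the confseq routine.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(1, n + 1)
    # Regularised running mean and variance of the observations.
    mu_hat = (0.5 + np.cumsum(x)) / (t + 1)
    var_hat = (0.25 + np.cumsum((x - mu_hat) ** 2)) / (t + 1)
    # Shift by one step so the bet at step i depends only on x[0..i-1].
    var_prev = np.concatenate(([0.25], var_hat[:-1]))
    lam = np.sqrt(2 * np.log(2 / alpha) / (n * var_prev))

    # Candidate means; 0 and 1 are excluded to avoid dividing by zero.
    m = np.linspace(0.0, 1.0, grid_size)[1:-1]
    # Truncate the bets so every capital factor stays strictly positive.
    lam_pos = np.minimum(lam[:, None], trunc_scale / m[None, :])
    lam_neg = np.minimum(lam[:, None], trunc_scale / (1 - m[None, :]))
    diff = x[:, None] - m[None, :]
    log_cap_pos = np.sum(np.log1p(lam_pos * diff), axis=0)
    log_cap_neg = np.sum(np.log1p(-lam_neg * diff), axis=0)
    # Hedge: half of the initial wealth bets "mean > m", half bets "mean < m".
    log_cap = np.logaddexp(log_cap_pos, log_cap_neg) + np.log(0.5)
    # Keep every candidate mean whose capital stayed below 1 / alpha.
    keep = m[log_cap < np.log(1 / alpha)]
    return keep.min(), keep.max()

rng = np.random.default_rng(0)
x = rng.beta(10, 30, size=500)  # bounded data with true mean 0.25
lo, hi = hedged_betting_ci(x)
print(lo, hi, hi - lo)
```

The interval is only resolved up to the grid spacing, and it reports the smallest and largest surviving candidate mean, which can only make it wider than the exact set.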

For Quasi-Monte Carlo (QMC), which is based on low-discrepancy sequences, we take

$$
X_i = \frac{1}{n} \sum_{j=1}^n T_{ij},
$$ 

where for each $i$, $\{T_{ij}\}_{j=1}^n$ is a QMC set that mimics $F$. Therefore, $X_i$ is close to $\mu$, and the sequence $\{X_i\}_{i=1}^R$ is an IID sequence based on $N = nR$ samples.

In this notebook, $F$ is a Beta Distribution.


Similarly, for QMC, for $Y = f(X)$ where $X \sim U(0, 1)$ and $\mu = \mathbb{E}(Y) = \mathbb{E}(f(X))$, we are going to take

$$
Y_i = \frac{1}{n} \sum_{j=1}^n f(x_{ij})
$$ 

Therefore, $Y_i$ is close to $\mu$, and the sequence $\{Y_i\}_{i=1}^R$ is an IID sequence based on $N = nR$ samples. 

In this notebook, we use three integrands: 

$Y = f(X) = \frac{X e^X}{e}$

$Y =
f(X_1, X_2) = 
\begin{cases} 
1, & \text{if } X_1 + X_2 > \frac{2}{3} \\
0, & \text{otherwise}
\end{cases}$

$ Y = f(X) = \begin{cases} 
1, & \text{if } X < \frac{1}{3} \\
0, & \text{otherwise}
\end{cases}$
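
As a rough illustration of why the replicate means $Y_i$ concentrate around $\mu$, the sketch below compares plain IID sampling ($n = 1$) with replicated QMC for the smooth integrand $f(X) = X e^X / e$, whose true mean is $1/e$. It assumes a randomly shifted one-dimensional lattice as a NumPy-only stand-in for DigitalNetB2, and the helper name `shifted_lattice_means` is hypothetical.

```python
import numpy as np

def shifted_lattice_means(f, n, R, rng):
    """Return R replicate means, each over an n-point randomly shifted 1-D lattice."""
    base = np.arange(n) / n              # equally spaced points j / n
    shifts = rng.random((R, 1))          # one independent uniform shift per replication
    x = (base[None, :] + shifts) % 1.0   # shape (R, n); every point is U(0, 1) marginally
    return f(x).mean(axis=1)

f_smooth = lambda x: x * np.exp(x) / np.e   # smooth integrand with true mean 1 / e

rng = np.random.default_rng(7)
N = 2**12                                   # total budget N = n * R
y_iid = shifted_lattice_means(f_smooth, n=1, R=N, rng=rng)        # n = 1 is plain IID
y_qmc = shifted_lattice_means(f_smooth, n=64, R=N // 64, rng=rng)

print("true mean          :", 1 / np.e)
print("IID mean, std of Y :", y_iid.mean(), y_iid.std(ddof=1))
print("QMC mean, std of Y :", y_qmc.mean(), y_qmc.std(ddof=1))
```

With the same total budget, the QMC replicate means have a much smaller spread, but there are fewer of them, which is the trade-off between $n$ and $R$ explored in this notebook.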

We also use the following ridge functions:

1. $ g_{jmp}(w) = \mathbf{1}\{w \geq 1\} $
2. $ g_{knk}(w) = \frac{\min(\max(-2, w), 1) + 2}{3} $
3. $ g_{smo}(w) = \Phi (w)$
4. $ g_{fin}(w) = \min(1,\sqrt{\max(w + 2, 0)}/2) $

Here $w = \frac{1}{\sqrt{d}} \sum_{j=1}^{d}\Phi^{-1}(x_{j})$, where $\Phi(\cdot)$ is the CDF of the standard normal distribution $\mathcal{N}(0,1)$ on $\mathbb{R}$ and $x \sim U(0, 1)^d$.
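
A small worked evaluation may help here. The sketch below computes $w$ for a single point $x$ and evaluates the four ridge functions at it, using the standard library's `NormalDist` for $\Phi$ and $\Phi^{-1}$. The names `ridge_input` and `gs_scalar` are hypothetical and exist only for this example; the notebook itself uses vectorised versions.

```python
import numpy as np
from statistics import NormalDist

std_normal = NormalDist()  # standard normal N(0, 1)

def ridge_input(x):
    """Map a point x in (0, 1)^d to w = sum_j Phi^{-1}(x_j) / sqrt(d)."""
    z = np.array([std_normal.inv_cdf(float(xj)) for xj in x])
    return z.sum() / np.sqrt(len(x))

# Scalar versions of the four ridge functions listed above.
gs_scalar = {
    "jmp": lambda w: float(w >= 1),
    "knk": lambda w: (min(max(-2.0, w), 1.0) + 2.0) / 3.0,
    "smo": lambda w: std_normal.cdf(w),
    "fin": lambda w: min(1.0, float(np.sqrt(max(w + 2.0, 0.0))) / 2.0),
}

x_center = np.full(4, 0.5)   # centre of the unit cube in d = 4
w_center = ridge_input(x_center)
values = {name: g(w_center) for name, g in gs_scalar.items()}
print(w_center, values)
```

At the centre of the cube every $\Phi^{-1}(x_j)$ is $0$, so $w = 0$ and the four functions evaluate to $0$, $2/3$, $1/2$, and $\sqrt{2}/2$. All four ridge functions take values in $[0, 1]$, which is the boundedness the Betting and EB intervals rely on.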

We have used DigitalNetB2 (Sobol) for QMC.

Importing the necessary modules:

In [27]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm,t
from scipy.stats import beta,uniform
from confseq.betting import betting_ci_seq
from confseq.predmix import predmix_empbern_ci_seq
import qmcpy as qp
import math

The parameters used for our numerical experiments

In [37]:
alpha = 0.05 # Significance level, confidence level = 1 - alpha

# parameters used for the beta distribution simulations

beta_param = np.array([10,30]) #parameters for the beta distribution

# parameters used for the integrand problem

# The integrand functions:
fs = {
    "smooth_1d": lambda x: x[...,0]*np.exp(x[...,0])/np.exp(1), 
    "discontinuous_1d": lambda x: (x[...,0]) < (1/3),
    "discontinuous_2d": lambda x: (x[...,0]+x[...,1])>=(2/3),
}
# parameters used for the ridge functions

# The ridge functions:
gs = {
    "jmp": lambda w: w>=1, 
    "knk": lambda w: ((np.minimum(np.maximum(-2,w),1)) + 2) / 3,
    "smo": lambda w: norm.cdf(w),
    "fin": lambda w: np.minimum(1,((np.sqrt(np.maximum(w+2,0)))/2)),
}
d = np.array([1,2,4,16]) # The different d's to test on
ci_methods = np.array(["CLT", "EB", "Betting"]) # The different CI methods

# parameters used for the integrand problems and ridge functions


N_vary = np.array([2**8,2**10,2**12, 2**15]) # The total sample sizes N to be used. Keep these powers of 2, since n must be a power of 2 (QMC rules).
n_vary = 2 ** np.arange(0, 7) # The vector of number of low discrepancy or QMC samples generated per replication
M = 20 # The number of times the computation is repeated

# seed settings

global_seed = 7
parent_seed = np.random.SeedSequence(global_seed)

The function to generate IID replications of QMC samples

In [29]:
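def gen_qmc_samples_iid(discrete_distrib, true_measure, n = 2**8, function = None, ridge = False):
    """Generate IID replications of n-point samples from discrete_distrib.

    Returns the raw points of shape (R, n, d) if ridge is True; otherwise returns
    the R replication means together with the flattened function values.
    """
    assert isinstance(discrete_distrib,(qp.DigitalNetB2,qp.Lattice,qp.IIDStdUniform))
    assert true_measure in ["uniform","beta"]
    # one block of n points per replication: shape (R, n, d)
    x_rld = discrete_distrib.gen_samples(n).reshape((discrete_distrib.replications,n,discrete_distrib.d))
    if true_measure=="beta":
        # map the uniform points to the beta distribution through its inverse CDF
        x_rld = beta(a=beta_param[0], b=beta_param[1]).ppf(x_rld)
    if ridge is True:
        return x_rld
    if function is None:
        y_rld = x_rld[...,0] # no integrand given: use the first coordinate itself
    else:
        y_rld = function(x_rld)
    return y_rld.mean(1),y_rld.flatten()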

The function to return the sequence of CLT CI Widths

In [30]:
def clt_ci_seq (values, times, alpha = 0.05):
    assert np.all(times <= len(values)), f"Invalid values in times: {times[times > len(values)]}"
    ci_arr = np.zeros(len(times))
    for time in range (len(times)):
        curr_val = values[0:times[time]]
        ci_arr[time] = 2 * (t.ppf(1 - alpha / 2,times[time] - 1) * curr_val.std(ddof=1) / np.sqrt(times[time])) 
    return ci_arr
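
For reference, the width that `clt_ci_seq` computes at each time $R$ in `times` can be written out as below. The symbols $\bar{X}_R$ and $s_R$ are introduced here only for this formula; $t_{1-\alpha/2,\,R-1}$ denotes the $1-\alpha/2$ quantile of Student's $t$ distribution with $R-1$ degrees of freedom.

$$
\text{width}_R = 2\, t_{1-\alpha/2,\,R-1}\, \frac{s_R}{\sqrt{R}}, \qquad s_R^2 = \frac{1}{R-1} \sum_{i=1}^{R} \left(X_i - \bar{X}_R\right)^2, \qquad \bar{X}_R = \frac{1}{R} \sum_{i=1}^{R} X_i
$$

Because the $X_i$ are replicate means, $s_R$ shrinks as $n$ grows while $\sqrt{R}$ shrinks as $R = N/n$ falls, so the width depends on how the budget $N$ is split.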

Generating the QMC samples that will be used for both the ridge functions and the three test integrands

In [31]:
x_qmc_arr = np.empty(len(n_vary), dtype=object)
for i in range(len(n_vary)):
    R_vary = N_vary.max() // n_vary[i]
    child_seed = parent_seed.spawn(1)[0]
    x_qmc_arr[i] = gen_qmc_samples_iid(discrete_distrib=qp.DigitalNetB2(d[-1],seed = child_seed,replications=(M*R_vary)), true_measure="uniform",n = n_vary[i],ridge = True)
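
The loop above calls `parent_seed.spawn(1)` once per value of $n$. As a small standalone check of how that behaves in NumPy, the sketch below shows that successive `spawn` calls on the same `SeedSequence` return different children, so each call seeds an independent stream. The variable names are chosen for this example only.

```python
import numpy as np

parent = np.random.SeedSequence(7)

# Each call to spawn advances the parent's internal child counter,
# so the two children below have different spawn keys.
first_child = parent.spawn(1)[0]
second_child = parent.spawn(1)[0]
print(first_child.spawn_key, second_child.spawn_key)

# Generators built from different children produce different streams.
rng_a = np.random.default_rng(first_child)
rng_b = np.random.default_rng(second_child)
draws_a = rng_a.random(3)
draws_b = rng_b.random(3)
print(draws_a, draws_b)
```

Rebuilding the parent from the same `global_seed` and spawning in the same order reproduces the same children, which keeps the experiments repeatable.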

# Using Ridge Functions:

## Varying the $R$ and $n$

QMC Numerical Experiments:

In [33]:
qmc_arr_ridge = np.empty((len(N_vary),M, len(ci_methods),len(d),len(gs), len(n_vary))) # consists of CIs (CLT, EB, Betting) for QMC, the ridge functions, the different dimensions, different R's and n's, and different N's.
for i in range (len(n_vary)):
    x_qmc = norm.ppf(x_qmc_arr[i])
    R_vary = N_vary // n_vary[i]
    print("x_qmc.shape = %s"%str(x_qmc.shape))
    for j in range (len(d)):
        w_qmc = x_qmc[:, :, :d[j]].sum(axis = 2)/np.sqrt(d[j])
        m_counter = 0
        for m in range (M):
            m_qmc = w_qmc[m_counter:m_counter + R_vary.max()]
            counter = 0
            for g in gs.values():
                y = g(m_qmc).mean(axis = 1)
                qmc_arr_ridge[:,m,0,j,counter,i] = clt_ci_seq(y,times = R_vary,alpha=alpha) # CLT CI widths
                lower_bound_qmc_integrand_eb,upper_bound_qmc_integrand_eb = predmix_empbern_ci_seq(y, times=R_vary, alpha=alpha, parallel=False, truncation =1/2) 
                # Getting the sequential EB CI widths according to the code from the paper above
                qmc_arr_ridge[:,m,1,j,counter,i] = upper_bound_qmc_integrand_eb[0] - lower_bound_qmc_integrand_eb[0] # The EB CI based on N_vary
                lower_bound_qmc_integrand_bet,upper_bound_qmc_integrand_bet = betting_ci_seq(y, times=R_vary, alpha=alpha, parallel=False, m_trunc=True, trunc_scale=3 / 4) 
                # Getting the sequential Betting CI widths according to the code from the paper above
                qmc_arr_ridge[:,m,2,j,counter,i] = upper_bound_qmc_integrand_bet[0] - lower_bound_qmc_integrand_bet[0] # The Betting CI based on N_vary
                counter = counter + 1
            m_counter = m_counter + R_vary.max()

x_qmc.shape = (655360, 1, 16)


x_qmc.shape = (327680, 2, 16)
x_qmc.shape = (163840, 4, 16)
x_qmc.shape = (81920, 8, 16)
x_qmc.shape = (40960, 16, 16)
x_qmc.shape = (20480, 32, 16)
x_qmc.shape = (10240, 64, 16)


Here, we plot how the CLT, EB, and Betting CI widths, based on a total of N_vary samples, change as $R$ and $n$ change for the different dimensions:

In [None]:
# Assumption: the plotted widths are averaged over the M repetitions and taken at the largest total sample size, N_vary[-1]
qmc_arr = qmc_arr_ridge.mean(axis=1)[-1] # shape: (len(ci_methods), len(d), len(gs), len(n_vary))
fig, axs = plt.subplots(len(d),len(gs),figsize=(20, 18))
for k in range (len(d)):
    for counter, name in enumerate(gs.keys()):
        for methods in range (len(ci_methods)):
            axs[k,counter].plot(n_vary, qmc_arr[methods,k,counter,:], label=f"{ci_methods[methods]}")
        axs[k,counter].set_xlabel("samples generated per replication (n)")
        axs[k,counter].set_ylabel("CI Width")
        axs[k,counter].set_title(f"{name}($d = {d[k]}$)")
        axs[k,counter].legend()
        axs[k,counter].set_xscale('log', base = 2)
        axs[k,counter].set_yscale('log')
fig.text(0.3,1,"CI Width vs n for different ridge functions, dimensions, and CI Methods", fontsize = 14)
fig.tight_layout()

Observations:
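* In general, the CI widths satisfy CLT < Betting < EB, which is to be expected: the CLT interval is only asymptotically valid, whereas the Betting and EB intervals carry nonasymptotic guarantees and are therefore wider.
* The optimum $n$ is about $2^{10}$, or close to that, for CLT. For Betting and EB it is smaller, at $n = 2$ or $n = 4$.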


Here, we print the $R$ and $n$ at which each of the CLT, EB, and Betting CI methods attains its minimum width, and compare that width to the one for IID sampling:

In [None]:
for dim in range (len(d)):
    print("\nFor d =", d[dim],":")
    counter = 0
    for name in gs.keys():
        print("")
        print("For ridge function", name,":")
        for ci in range(len(ci_methods)):
            print("The IID",ci_methods[ci], "width =", iid_arr[ci,dim,counter])
            min_val = np.min(qmc_arr[ci,dim,counter,:])
            min_indices = np.where(qmc_arr[ci,dim,counter,:] == min_val)[0]
            print("The IID_QMC",ci_methods[ci], "width is minimum when R =",R_vary[min_indices],"n =", n_vary[min_indices],"and the width =", min_val)
        counter = counter + 1

Observations:
* IID_QMC tends to perform better than IID. 

# The Integrand Problems

## Varying $R$ and $n$

Here, we vary $R$ and $n$ for IID replications of QMC samples while keeping their product constant, so that $R \times n = N$ for each total sample size $N$ in N_vary. We then identify the combination that gives the minimum width for the Betting CI and the empirical Bernstein (EB) CI and compare it to plain IID sampling. We also compare the two CI methods with each other:
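
To make the fixed-budget constraint concrete, the short sketch below lists the $(n, R)$ pairs for one total sample size. It reuses the values of `n_vary` and the largest entry of `N_vary` from the parameter cell; the variable names `N_total` and `R_for_n` are introduced for this example only.

```python
import numpy as np

N_total = 2**15                  # largest total sample size in N_vary
n_vary = 2 ** np.arange(0, 7)    # QMC points per replication: 1, 2, ..., 64
R_for_n = N_total // n_vary      # number of IID replications for each n

for n, R in zip(n_vary, R_for_n):
    print(f"n = {n:3d}, R = {R:6d}, n * R = {n * R}")
```

Since every $n$ is a power of 2 that divides $N$, the integer division is exact and each pair uses the whole budget.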

IID replications of QMC experiments:

In [38]:
qmc_arr_func = np.empty((len(N_vary),M, len(ci_methods),len(fs), len(n_vary))) # consists of CIs (CLT, EB, Betting) for QMC, the integrands, different R's and n's, and different N's.
for i in range (len(n_vary)):
    R_vary = N_vary // n_vary[i]
    x_qmc = x_qmc_arr[i]
    print("x_qmc.shape = %s"%str(x_qmc.shape))
    m_counter = 0
    for m in range (M):
        m_qmc = x_qmc[m_counter:m_counter + R_vary.max()]
        counter = 0
        for f in fs.values():
            y = f(m_qmc).mean(axis = 1)
            qmc_arr_func[:,m,0,counter,i] = clt_ci_seq(y,times = R_vary,alpha=alpha) # CLT CI widths
            lower_bound_qmc_integrand_eb,upper_bound_qmc_integrand_eb = predmix_empbern_ci_seq(y, times=R_vary, alpha=alpha, parallel=False, truncation =1/2) 
            # Getting the sequential EB CI widths according to the code from the paper above
            qmc_arr_func[:,m,1,counter,i] = upper_bound_qmc_integrand_eb[0] - lower_bound_qmc_integrand_eb[0] # The EB CI based on N_vary
            lower_bound_qmc_integrand_bet,upper_bound_qmc_integrand_bet = betting_ci_seq(y, times=R_vary, alpha=alpha, parallel=False, m_trunc=True, trunc_scale=3 / 4) 
            # Getting the sequential Betting CI widths according to the code from the paper above
            qmc_arr_func[:,m,2,counter,i] = upper_bound_qmc_integrand_bet[0] - lower_bound_qmc_integrand_bet[0] # The Betting CI based on N_vary
            counter = counter + 1
        m_counter = m_counter + R_vary.max()


x_qmc.shape = (655360, 1, 16)


x_qmc.shape = (327680, 2, 16)
x_qmc.shape = (163840, 4, 16)
x_qmc.shape = (81920, 8, 16)
x_qmc.shape = (40960, 16, 16)
x_qmc.shape = (20480, 32, 16)
x_qmc.shape = (10240, 64, 16)


Here, we plot how the Betting CI and EB CI widths, based on a total of N_vary samples, change as $R$ and $n$ change:

In [None]:
plt.plot(n_vary,ci_vector_qmc_integrand_bet, label = "Betting CI");
plt.plot(n_vary,ci_vector_qmc_integrand_eb, label = "EB CI");
plt.xlabel("samples generated per replication (n)");
plt.ylabel("CI width");
plt.title("CI width vs n of QMC for $f(X)$ integrand");
plt.legend();
plt.xscale('log', base = 2)
plt.yscale('log')

Some further observations:
* For both methods, the CI width initially tends to decrease as the number of IID replications ($R$) increases, but then tends to increase slightly.
* The Betting CI is narrower for the most part.

Here, we print the $R$ and $n$ at which the Betting and EB CI methods attain their minimum width, and compare that width to the one for IID sampling:

In [None]:
min_ci_qmc_bet = np.min(ci_vector_qmc_integrand_bet) # The smallest Betting CI width for IID replications of QMC
min_ci_index_qmc_bet = np.argmin(ci_vector_qmc_integrand_bet) # The index at which we get the smallest Betting CI width for IID Replications of QMC
min_ci_qmc_eb = np.min(ci_vector_qmc_integrand_eb) # The smallest EB CI width for IID replications of QMC
min_ci_index_qmc_eb = np.argmin(ci_vector_qmc_integrand_eb) # The index at which we get the smallest EB CI width for IID Replications of QMC

print("IID Betting CI width at sample size N_vary =",N_vary, "is", ci_iid_integrand_bet)
print("IID EB CI width at sample size N_vary =",N_vary, "is", ci_iid_integrand_eb)
print("")
print("IID Replications of QMC Betting CI width based on sample size N_vary =",N_vary, "is lowest when R =",R_vary[min_ci_index_qmc_bet],
      "and n =",n_vary[min_ci_index_qmc_bet],"\nThe CI width for this R and n is", min_ci_qmc_bet)
print("")
print("IID Replications of QMC EB CI width based on sample size N_vary =",N_vary, "is lowest when R =",R_vary[min_ci_index_qmc_eb],
      "and n =",n_vary[min_ci_index_qmc_eb],"\nThe CI width for this R and n is", min_ci_qmc_eb)

Some further observations:
* The CIs through IID replications of QMC perform better than IID
* The Betting CI is narrower than the EB CI for QMC. For plain IID the two are similar, with EB slightly better.
