Demonstrate an application of approximation with Chebyshev polynomials.

We'll approximate the function
$$
f(t) = \frac{1}{2} - \frac{1}{\pi} \mathbb{E}\left\{
\arctan\left(\frac{\bar{x} - t}{s_x}\right)
\right\}
$$
where $\bar{x} = (X_1 + X_2) / 2$, $s_x^2 = (X_1^2 + X_2^2)/2 - \bar{x}^2$, and $X_1$, $X_2$ are independent, identically distributed normal random variables with mean $\mu_{ML}$ and variance $\sigma_{ML}^2$.
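
To see why the code below needs only a one-variable function, here is a short derivation. It assumes $s_x$ is the sample standard deviation and that the expectation is taken of $\arctan((\bar{x} - t)/s_x)$, as in the simulation cell `f_sim`. For two samples, $s_x = |X_1 - X_2|/2$, and $\bar{x}$ and $X_1 - X_2$ are independent normals, so with $Z_1$, $Z_2$ independent standard normals we can write
$$
\bar{x} = \mu_{ML} + \frac{\sigma_{ML}}{\sqrt{2}} Z_1, \qquad
s_x = \frac{\sigma_{ML}}{\sqrt{2}} |Z_2|, \qquad
\frac{\bar{x} - t}{s_x} = \frac{u + Z_1}{|Z_2|}, \qquad
u = \frac{\sqrt{2}\,(\mu_{ML} - t)}{\sigma_{ML}}.
$$
Hence
$$
f(t) = \frac{1}{2} - \frac{1}{\pi} h(u), \qquad
h(u) = \mathbb{E}\left\{\arctan\left(\frac{u + Z_1}{|Z_2|}\right)\right\},
$$
and $h(-u) = -h(u)$ because $Z_1$ is symmetric about zero. The symbols $u$, $h$, $Z_1$, and $Z_2$ are introduced here for illustration only; presumably $h$ restricted to $u \ge 0$ is what the Chebyshev series `g` approximates.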

For context, this function comes up when computing default Bayes factors for hypothesis testing of the mean of normally distributed data with unknown variance. See the blog post [Introduction to Objective Bayesian Hypothesis Testing](https://medium.com/towards-data-science/introduction-to-objective-bayesian-hypothesis-testing-06c9e98eb90b) and the reference [1] for more details.

1: Berger, J. and J. Mortera (1999). Default Bayes factors for nonnested hypothesis testing. Journal of the American Statistical Association 94 (446), 542–554. Postscript for paper: https://www2.stat.duke.edu/~berger/papers/mortera.ps

In [1]:
import numpy as np

# These are the coefficients of the approximation when represented as
# a series of Chebyshev polynomials of the first kind (the basis that
# np.polynomial.chebyshev.chebval evaluates)
coefs = [
    0.77143906574069909,     0.85314572098378538,     0.024348685879360281,
    -0.080216391111436719,   -0.016243633646524293,   0.014244249784927322,
    0.0083546074842089004,   -0.0013585115546325592,  -0.0032124111873301194,
    -0.00091825774110923682, 0.0007309343106888075,   0.00071403856022216007,
    9.4913853419609061e-05,  -0.00023489116699729724, -0.00017729416753392774,
    -8.6319144348730995e-06, 7.1368665041116644e-05,  5.0436256633845485e-05,
    1.8715564507905244e-06,  -2.1699237167998914e-05, -1.6200449174481386e-05,
]

In [2]:
def g(s):
    # Map s in [0, ∞) to t in [-1, 1), the interval on which the
    # Chebyshev series is evaluated
    t = s / (1 + s)
    t = 2 * t - 1
    return np.polynomial.chebyshev.chebval(t, coefs)
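
The notebook does not show how `coefs` was produced. As a rough illustration of the general technique only, the sketch below fits a 21-term Chebyshev series to a stand-in function on $[0, \infty)$ using the same change of variables. The function `target` (here simply `arctan`) and the names `target_on_interval`, `demo_coefs`, and `target_approx` are hypothetical and are not part of the original computation.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def target(s):
    # Hypothetical stand-in: a smooth function on [0, inf) with a finite limit
    return np.arctan(s)

def target_on_interval(t):
    # Undo the mapping t = 2 * s / (1 + s) - 1, i.e. s = (1 + t) / (1 - t)
    return target((1.0 + t) / (1.0 - t))

# Interpolate at 21 Chebyshev points (degree 20). The points lie strictly
# inside (-1, 1), so the division by (1 - t) never hits zero.
demo_coefs = C.chebinterpolate(target_on_interval, 20)

def target_approx(s):
    # Same mapping as g: send s in [0, inf) to t in [-1, 1), then evaluate
    t = 2.0 * s / (1.0 + s) - 1.0
    return C.chebval(t, demo_coefs)

s = np.linspace(0.0, 50.0, 1001)
max_err = np.max(np.abs(target_approx(s) - target(s)))
print(max_err)
```

Because the mapped function is smooth on $[-1, 1]$, the series converges quickly, which is why a few dozen coefficients can be enough for this kind of problem.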

In [3]:
# Approximate f using Chebyshev polynomials
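def f(mu, sigma, t):
    # f depends on (mu, sigma, t) only through s = sqrt(2) * (mu - t) / sigma
    s = np.sqrt(2) * (mu - t) / sigma
    # g covers s >= 0 only; for s < 0, evaluate at |s| and flip the sign
    # (scalar inputs only, because of the branch)
    sign = -1.0 if s < 0 else 1.0
    return 0.5 - sign * g(abs(s)) / np.pi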
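
The function above branches on the sign of its argument, so it accepts scalars only. A possible vectorized variant is sketched below, assuming the same coefficients and the same sign handling. The name `f_vec` is hypothetical, and `coefs` and `g` are repeated so that the block runs on its own.

```python
import numpy as np

coefs = [
    0.77143906574069909,     0.85314572098378538,     0.024348685879360281,
    -0.080216391111436719,   -0.016243633646524293,   0.014244249784927322,
    0.0083546074842089004,   -0.0013585115546325592,  -0.0032124111873301194,
    -0.00091825774110923682, 0.0007309343106888075,   0.00071403856022216007,
    9.4913853419609061e-05,  -0.00023489116699729724, -0.00017729416753392774,
    -8.6319144348730995e-06, 7.1368665041116644e-05,  5.0436256633845485e-05,
    1.8715564507905244e-06,  -2.1699237167998914e-05, -1.6200449174481386e-05,
]

def g(s):
    # Map s in [0, inf) to t in [-1, 1) and evaluate the Chebyshev series
    t = s / (1 + s)
    t = 2 * t - 1
    return np.polynomial.chebyshev.chebval(t, coefs)

def f_vec(mu, sigma, t):
    # Hypothetical array-friendly version of f: np.sign and np.abs replace
    # the scalar branch on the sign of s
    s = np.sqrt(2) * (mu - np.asarray(t, dtype=float)) / sigma
    return 0.5 - np.sign(s) * g(np.abs(s)) / np.pi

ts = np.array([0.0, 0.2, 5.0, 100.0])
print(f_vec(0.123, 1.5, ts))
```

For scalar inputs this should agree with `f`, since it performs the same arithmetic.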

In [4]:
# Approximate f using a brute force simulation approach
N = 1000000
def f_sim(mu, sigma, t):
    vals = np.random.normal(mu, sigma, size=(N, 2))
    X1 = vals[:, 0]
    X2 = vals[:, 1]
    xbar = (X1 + X2) / 2.0
    sx = np.sqrt(((X1 - xbar)**2 + (X2 - xbar)**2) / 2.0)
    vals = 0.5 - np.arctan((xbar - t) / sx) / np.pi
    return np.mean(vals)
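
When comparing against a simulation, it helps to know how noisy the simulation itself is. The sketch below is a hypothetical variant, `f_sim_with_se`, that also returns the Monte Carlo standard error of the estimate. It uses a local `np.random.default_rng` generator and a smaller sample size than the cell above; both are choices made for this example.

```python
import numpy as np

def f_sim_with_se(mu, sigma, t, n=200_000, seed=0):
    # Draw n independent pairs (X1, X2) from N(mu, sigma^2)
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(n, 2))
    x1, x2 = x[:, 0], x[:, 1]
    xbar = (x1 + x2) / 2.0
    sx = np.sqrt(((x1 - xbar) ** 2 + (x2 - xbar) ** 2) / 2.0)
    vals = 0.5 - np.arctan((xbar - t) / sx) / np.pi
    # Standard error of the sample mean: sample std / sqrt(n)
    return vals.mean(), vals.std(ddof=1) / np.sqrt(n)

est, se = f_sim_with_se(0.123, 1.5, 0.0)
print(est, se)
```

Differences between the Chebyshev approximation and the simulation that are within a few standard errors cannot be attributed to the approximation.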

In [5]:
# Compare the approximations against each other
mu_ml = 0.123
std_ml = 1.5
np.random.seed(0)
for t in [0.0, 0.01, 0.1, 0.2, 0.3, 5, 10, 100]:
    ft = f(mu_ml, std_ml, t)
    fpt = f_sim(mu_ml, std_ml, t)
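    rerr = np.abs((ft - fpt) / fpt)
    # Note: f_sim is called again here, so the printed simulation value is a
    # fresh Monte Carlo draw and differs slightly from the fpt used in rerr
    print(t, ft, f_sim(mu_ml, std_ml, t), rerr)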

0.0 0.47059206926744995 0.4705617410005234 0.0006829181799177236
0.01 0.47297735696592846 0.47284440543778317 0.00044800674755964237
0.1 0.49449429024483676 0.4947273411923894 0.0001393914288991927
0.2 0.518426675093887 0.5184967680183266 2.2982920448911075e-06
0.3 0.5422530672885648 0.5421840161788148 0.0008338542554252333
5 0.943798414491211 0.9438468766277546 7.809732402546117e-05
10 0.9726189450498345 0.9726397857047364 4.339034701734454e-05
100 0.9973033311969053 0.9973052128447453 1.2957899605273032e-06
