# The Gaffke Conjecture

[Norbert Gaffke](https://www.math.uni-magdeburg.de/institute/imst/ag_gaffke/files/pp1304.pdf) and [Learned-Miller and Thomas](https://arxiv.org/abs/1905.06208) proposed a
new test for the mean of a nonnegative random variable.

Suppose $X$ is a nonnegative random variable with CDF $F$. 

The expected value of $X$ is
\begin{equation}
\mathbb{E}X = \int_0^\infty xdF(x).
\end{equation}
Recall the identity for nonnegative random variables:
\begin{eqnarray} 
\int_0^\infty (1-F(x))dx &=& \int_0^\infty \mathbb{P}(X>x) dx \\
&=& \int_0^\infty dx \int 1_{y > x} dF(y) \\
&=& \int dF(y) \int_0^\infty 1_{y > x} dx \mbox{ (by Tonelli's theorem)}\\
&=&  \int dF(y) y \\
&=& \mathbb{E}X
\end{eqnarray}
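As a quick sanity check, the identity can be verified for an empirical CDF, where it holds exactly. The following is a minimal sketch assuming NumPy; the exponential sample and the variable names are illustrative choices, not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(0)
# Any nonnegative sample works; the exponential distribution is an arbitrary choice.
x = np.sort(rng.exponential(scale=2.0, size=1000))
n = x.size

# With x_(0) := 0, the empirical survival function 1 - F_n equals
# (n - i + 1)/n on the interval [x_(i-1), x_(i)), for i = 1, ..., n.
widths = np.diff(np.concatenate(([0.0], x)))   # x_(i) - x_(i-1)
survival = (n - np.arange(n)) / n              # (n - i + 1)/n for i = 1, ..., n
tail_integral = np.sum(widths * survival)      # integral of 1 - F_n over [0, inf)

print(tail_integral, x.mean())                 # the two agree up to rounding
```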

Define $F^{-1}(p) := \inf \{x: F(x) \ge p \}$, the generalized inverse of $F$.
Then $F(F^{-1}(p)) \ge p$ for all $p \in [0, 1]$.
Since $F$ is monotone increasing, if $U \sim U[0, 1]$,
\begin{equation}
\mathbb{P} \{F^{-1}(U) \le x \} = \mathbb{P} \{F(F^{-1}(U)) \le F(x) \} = \mathbb{P} \{U \le F(x) \} = F(x).
\end{equation}
Thus $F^{-1}(U) \sim F$. 
(Note: this derivation isn't quite rigorous, but the result is true.)
This identity is behind a common way of generating IID observations from an arbitrary distribution:
generate IID uniform random variables and apply $F^{-1}$ to them.

Now if $X \sim F$, 
\begin{equation}
\mathbb{E} X = \mathbb{E}(F^{-1}(U)) = \int_0^1 F^{-1}(p) dp.
\end{equation}
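The two facts above (inverse-transform sampling and the quantile-integral form of the mean) can be illustrated with a short sketch. It assumes NumPy and uses the exponential distribution with rate 2, whose inverse CDF is known in closed form; the function name `exp_quantile` is hypothetical.

```python
import numpy as np

def exp_quantile(p, rate=1.0):
    """Inverse of the Exponential(rate) CDF F(x) = 1 - exp(-rate * x)."""
    return -np.log1p(-p) / rate

rng = np.random.default_rng(1)

# Inverse-transform sampling: apply F^{-1} to IID uniforms.
u = rng.uniform(size=200_000)                  # values in [0, 1)
x = exp_quantile(u, rate=2.0)                  # approximately IID Exponential(2)

# E X = integral of F^{-1}(p) over [0, 1], approximated by a midpoint rule.
n_grid = 100_000
p_mid = (np.arange(n_grid) + 0.5) / n_grid
quantile_integral = exp_quantile(p_mid, rate=2.0).mean()

print(x.mean(), quantile_integral)             # both should be close to 1/2
```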

## Motivation/Intuition for Gaffke's conjecture

Let $\{U_j\}_{j=1}^n$ be IID $U[0, 1]$, so $\{F^{-1}(U_j)\}_{j=1}^n$ are IID $F$.
If we have a sample $\{X_j\}$ IID $F$, we can think of them as $\{F^{-1}(U_j)\}_{j=1}^n$.
Let $(u_{(j)})_{j=1}^n$ be increasing values in $[0, 1]$ and let $u_{(n+1)} := 1$. Then
\begin{equation}
\mathbb{E}X = \int_0^1 F^{-1}(p) dp \ge \sum_{j=1}^n F^{-1}(u_{(j)})(u_{(j+1)} - u_{(j)}),
\end{equation}
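since $F^{-1}$ is increasing and, because $X \ge 0$, nonnegative: the left Riemann sum over $[u_{(1)}, 1]$ is no larger than the integral over that interval, and the omitted integral over $[0, u_{(1)})$ is nonnegative.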
So we could lower-bound the mean by 
\begin{equation}
\sum_{j=1}^n X_{(j)}(U_{(j+1)} - U_{(j)})
\end{equation}
if we could observe $\{U_j\}$, but we can't. 
However, we could imagine simulating IID uniform random variables and using the observed
values of $X_j$, and accounting for the simulation uncertainty by using a small quantile of
the resulting distribution.
That is essentially Gaffke's method.
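To make the lower bound concrete, here is a minimal sketch for a case where we do get to see the uniforms, because we generate the data ourselves. It assumes NumPy and an Exponential(1) population (mean 1); both are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50

u = np.sort(rng.uniform(size=n))       # U_(1) <= ... <= U_(n)
x = -np.log1p(-u)                      # X_(j) = F^{-1}(U_(j)) for Exponential(1)
gaps = np.diff(np.append(u, 1.0))      # U_(j+1) - U_(j), with U_(n+1) := 1

# Left Riemann sum of F^{-1} over [U_(1), 1]; it cannot exceed the mean, 1.
lower_bound = np.sum(x * gaps)
print(lower_bound)
```

In practice the $U_j$ that generated the data are unobservable, which is why the method replaces them with simulated uniforms.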

Suppose $\{X_j\}_{j=1}^n$ are IID nonnegative random variables with mean $\mu$.
Let  $\{U_j\}_{j=1}^n$ be IID $U[0, 1]$ independent of $\{X_j\}$.
Let $U_{(j)}$ be the $j$th order statistic of $\{U_j\}$, with $U_{(n+1)} := 1$.
Let $x = (x_j)_{j=1}^n$.
Consider the function
\begin{equation}
K(x) := \mathbb{P}_U \left ( \sum_{j=1}^n x_{(j)} (U_{(j+1)} - U_{(j)})  \le \mu \right ) 
\end{equation}
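$K(x)$ has no simple closed form in general, but it is easy to approximate by simulating the uniforms. The following is a minimal Monte Carlo sketch assuming NumPy; the function name `gaffke_pvalue` and its parameters are hypothetical, and the result carries simulation error of order $1/\sqrt{\text{reps}}$.

```python
import numpy as np

def gaffke_pvalue(x, mu, reps=10_000, rng=None):
    """Monte Carlo estimate of K(x) = P_U( sum_j x_(j) (U_(j+1) - U_(j)) <= mu )."""
    rng = np.random.default_rng() if rng is None else rng
    x_sorted = np.sort(np.asarray(x, dtype=float))        # x_(1) <= ... <= x_(n)
    n = x_sorted.size
    u = np.sort(rng.uniform(size=(reps, n)), axis=1)      # each row: U_(1..n)
    gaps = np.diff(u, axis=1, append=1.0)                 # U_(j+1) - U_(j), U_(n+1) := 1
    stat = gaps @ x_sorted                                # one statistic per row
    return np.mean(stat <= mu)

# Usage: a sample far above mu gives a small estimate of K(x).
print(gaffke_pvalue([10.0, 10.0, 10.0], mu=1.0, rng=np.random.default_rng(0)))
```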

Gaffke conjectures that
\begin{equation}
\mathbb{P}\{ K(X) \le \alpha \} \le \alpha.
\end{equation}
(Indeed, he conjectures that it holds if $\{X_j\}$ are independent, nonnegative, and have the same mean $\mu$, even
if they are not identically distributed.)

Extensive simulations have not found a case where it fails, but the general case has not been
proved.
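A simulation of this kind can be sketched as follows. This is only an illustration, assuming NumPy, an Exponential population with mean $\mu = 1$, and $n = 5$; the function name and all tuning constants are hypothetical, and both the inner and outer loops add Monte Carlo noise to the estimated rejection rate.

```python
import numpy as np

def gaffke_pvalue(x, mu, reps, rng):
    """Monte Carlo estimate of K(x) for a nonnegative sample x."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    u = np.sort(rng.uniform(size=(reps, x_sorted.size)), axis=1)
    gaps = np.diff(u, axis=1, append=1.0)      # U_(j+1) - U_(j), U_(n+1) := 1
    return np.mean(gaps @ x_sorted <= mu)

rng = np.random.default_rng(3)
n, mu, alpha, trials = 5, 1.0, 0.05, 1000

# Draw many samples with true mean mu and record how often K(X) <= alpha.
pvalues = np.array([
    gaffke_pvalue(rng.exponential(scale=mu, size=n), mu, reps=2000, rng=rng)
    for _ in range(trials)
])
rejection_rate = np.mean(pvalues <= alpha)
print(f"estimated P(K(X) <= {alpha}) = {rejection_rate:.3f}")
```

If the conjecture holds, the estimated rate should not exceed $\alpha$ by more than simulation error.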

# Proof of some special cases

For $n=1$, the test is equivalent to Markov's inequality:
\begin{eqnarray}
\mathbb{P}_X \left \{ \mathbb{P}_U \left \{ \sum_{j=1}^n X_{(j)} (U_{(j+1)} - U_{(j)}) \le \mu \right \} \le \alpha \right \} &=& \mathbb{P}_X \{ \mathbb{P}_U \{ X(1-U) \le \mu \} \le \alpha \} \\
   &=& \mathbb{P}_X \{ \mathbb{P}_U \{ XU \le \mu \} \le \alpha \} \mbox{ (since $1-U \sim U$)}\\
   &=& \mathbb{P}_X \{ \mathbb{P}_U \{ U \le \mu/X \} \le \alpha \} \mbox{ (this assumes $X \ne 0$, but it is true regardless)}\\
   &=& \mathbb{P}_X \{ (\mu/X) \wedge 1 \le \alpha \} \\
   &=& \mathbb{P}_X \{ \mu/X \le \alpha \} \mbox{ (since $\alpha < 1$)}\\ 
   &=& \mathbb{P}_X \{ X \ge \mu/\alpha \} \\ 
   &\le& \alpha.  \mbox{ (by Markov's inequality)}
\end{eqnarray}
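The chain above says that for $n = 1$ the inner probability is $(\mu/x) \wedge 1$. A minimal numerical check, assuming NumPy and the arbitrary values $\mu = 1$, $x = 4$:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, x = 1.0, 4.0

# For n = 1 the statistic is x * (U_(2) - U_(1)) = x * (1 - U).
u = rng.uniform(size=100_000)
k_monte_carlo = np.mean(x * (1.0 - u) <= mu)

# Closed form from the derivation: min(mu / x, 1).
k_closed_form = min(mu / x, 1.0)

print(k_monte_carlo, k_closed_form)   # the estimate should be near 0.25
```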

For arbitrary $n$, if the population is binary, the test is equivalent to the standard, most powerful
Binomial test, and if the support of $X_j$ contains only two (nonnegative) points, the test remains valid, 
as we shall see.

Suppose $X_i \sim \mathrm{Bern}(p)$, so 
\begin{equation}
F(x) := \left \{ \begin{array}{ll}
     1-p, & x \in [0,1) \\
     1, &  x \ge 1,
     \end{array}
     \right .
\end{equation}
and
\begin{equation}
F^{-1}(q) = 
\begin{cases} 
0, & q \in [0, 1-p] \\
1, &  q \in (1-p, 1].
\end{cases}
\end{equation}
Let $S := \sum_j X_j \sim \mathrm{Binom}(n,p)$ be the sample sum.
Let $\mathbf{U} = \{U_j\}_{j=1}^n$ be IID $U[0,1]$, independent of $\{X_j\}$, and let $U_{(n+1)} := 1$.
Because the $S$ largest order statistics of the sample equal $1$ and the rest equal $0$,
\begin{align}
    \sum_{i=1}^n X_{(i)} \left (U_{(i+1)} - U_{(i)} \right ) 
    &= 
    \sum_{i=n-S+1}^n \left (U_{(i+1)} - U_{(i)} \right ) \nonumber \\
    &= 
    1 - U_{(n-S+1)}.
\end{align}
By symmetry, $1-U_{(n-S+1)}$ has the same distribution as
$U_{(S)}$.
We are thus interested in 
$\mathbb{P} \{U_{(S)} \le p\} = \mathbb{E} 1_{U_{(S)} \le p}$.

The event $U_{(s)} \le p$ is the event that there are $s$ or more "successes" in $n$ independent $\mathrm{Bernoulli}(p)$ trials, i.e., the upper tail probability of a $\mathrm{Binom}(n,p)$ random variable from $s$ to $n$.
Conditional on $X$ (and thus on $S=s$), 
\begin{equation}
 \mathbb{P}_{\mathbf{U}} \{U_{(s)} \le p \} =
  \sum_{j=s}^n \binom{n}{j}p^j(1-p)^{n-j}. 
\end{equation}
We need to bound
\begin{equation}
\mathbb{P}_S \left \{ \sum_{j=S}^{n} \binom{n}{j}p^j(1-p)^{n-j} \le \alpha \right \}.
\end{equation}
Partitioning on $S$ gives
\begin{eqnarray}
\mathbb{P}_S \left \{ \sum_{j=S}^n \binom{n}{j}p^j(1-p)^{n-j} \le \alpha \right \} &=&
\sum_{s=0}^n \mathbf{1}_{\sum_{j=s}^n \binom{n}{j}p^j(1-p)^{n-j} \le \alpha} \mathbb{P}_S(S=s) \nonumber \\
&=& 
\sum_{s=0}^n \mathbf{1}_{\sum_{j=s}^n \binom{n}{j}p^j(1-p)^{n-j} \le \alpha}
\binom{n}{s} p^s (1-p)^{n-s}.
\end{eqnarray}
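The indicator is zero when $s$ is small and equals $1$ once $s$ is sufficiently large, because the upper tail probability is nonincreasing in $s$.
The smallest $s$ for which the indicator is $1$ is the
first $s$ for which the upper tail probability is not greater than $\alpha$, and the terms that remain sum to exactly that upper tail probability.
I.e.,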
\begin{eqnarray}
\sum_{s=0}^n \mathbf{1}_{\sum_{j=s}^n \binom{n}{j}p^j(1-p)^{n-j} \le \alpha}
\binom{n}{s} p^s (1-p)^{n-s} &=&
\sum_{s: \sum_{j=s}^n \binom{n}{j}p^j(1-p)^{n-j} \le \alpha} \binom{n}{s} p^s (1-p)^{n-s} \nonumber \\
&\le& \alpha. \;\;\Box
\end{eqnarray}
This is thus equivalent to the most powerful one-sided test for a Binomial $p$.
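For Bernoulli data everything above is a finite sum, so the rejection probability can be computed exactly rather than simulated. The following sketch uses only the Python standard library; the function names are hypothetical.

```python
from math import comb

def binom_pmf(n, p, j):
    """P(S = j) for S ~ Binom(n, p)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

def upper_tail(n, p, s):
    """P(S >= s), which equals P_U(U_(s) <= p) in the derivation above."""
    return sum(binom_pmf(n, p, j) for j in range(s, n + 1))

def rejection_probability(n, p, alpha):
    """P_S(upper_tail(S) <= alpha): sum the pmf over s where the indicator is 1."""
    return sum(
        binom_pmf(n, p, s) for s in range(n + 1) if upper_tail(n, p, s) <= alpha
    )

# Usage: n = 10, p = 1/2, alpha = 0.05. The indicator is 1 only for s = 9, 10,
# so the rejection probability is (10 + 1)/1024.
print(rejection_probability(10, 0.5, 0.05))
```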

## General 2-point distributions

We now consider IID nonnegative random variables 
$\{X_i\}$ with
two points of support, $\{a, b\}$, $0 \le a < b$, with mass $(1-p)$ at $a$ and mass $p$ at $b$.
Then $\mu := \mathbb{E} X_i = (1-p)a + pb$.
If we knew $a$ and $b$, we could transform to $Y_i := \frac{X_i-a}{b-a}$. 
These $Y_i$ are
IID $\mathrm{Bernoulli}(p)$, for which we just saw Gaffke's conjecture holds.

We do not actually need to know $a$ and $b$ and transform $X_i$ that way in order to obtain a valid 
P-value:
\begin{eqnarray}
 \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n X_{(i)} \left (U_{(i+1)} - U_{(i)} \right ) \le \mu \right \}
& = &
 \mathbb{P}_\mathbf{U} \left \{ \frac{\sum_{i=1}^n X_{(i)} \left (U_{(i+1)} - U_{(i)} \right ) - a}{b-a} \le \frac{\mu-a}{b-a} \right \} \nonumber \\
 & = & 
 \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n 
 \frac{X_{(i)}-a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \right . \nonumber \\
 && \left . + \; \frac{a}{b-a} \left [ \sum_{i=1}^n \left (U_{(i+1)} - U_{(i)} \right ) -1 \right ] \le \frac{\mu-a}{b-a} \right \} \nonumber \\
 &=& 
 \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n 
 \frac{X_{(i)}-a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \right . \nonumber \\
 && \left . + \; \frac{a}{b-a} \left [ 1 - U_{(1)} - 1 \right ] \le \frac{\mu-a}{b-a} \right \} \nonumber \\
 &=&
 \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n \frac{X_{(i)} -a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \le \frac{\mu-a(1-U_{(1)})}{b-a} \right \} \nonumber \\
 &\ge& 
 \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n \frac{X_{(i)} -a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \le \frac{\mu-a}{b-a} \right \}.
 \nonumber \\
 && {}
\end{eqnarray}
The final inequality implies that the Gaffke P-value computed from the $X_i$ can be at most $\alpha$ only when the Gaffke P-value computed from the (Bernoulli) $Y_i$ is at most $\alpha$. 

In symbols,
\begin{equation}
    \left ( \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n X_{(i)} \left (U_{(i+1)} - U_{(i)} \right ) \le \mu \right \} \le \alpha \right )
    \subset
    \left (\mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n \frac{X_{(i)} -a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \le \frac{\mu-a}{b-a} \right \} \le \alpha \right ).
\end{equation}
Then by the Gaffke result for Bernoulli random variables, we obtain:
\begin{eqnarray}
 \mathbb{P}_\mathbf{X} \left ( \mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n X_{(i)} \left (U_{(i+1)} - U_{(i)} \right ) \le \mu \right \} \le \alpha \right )
& \le &
   \mathbb{P}_\mathbf{X} \left (\mathbb{P}_\mathbf{U} \left \{ \sum_{i=1}^n \frac{X_{(i)} -a}{b-a} \left (U_{(i+1)} - U_{(i)} \right ) \le \frac{\mu-a}{b-a} \right \} \le \alpha \right ) \nonumber \\
   &\le& \alpha
\end{eqnarray}
Gaffke's test is thus conservative if the lower bound of the two-point distribution is greater than 0, even if we do not know what this lower bound is. Since the final inequality in the P-value derivation above (the step that drops the term $a U_{(1)}$) is strict if $a > 0$, Gaffke's test is less sharp if we do not use the correct lower bound. 
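The comparison between the untransformed and transformed P-values can be illustrated numerically. This is a minimal sketch assuming NumPy, with arbitrary illustrative values $a = 2$, $b = 5$, $p = 0.3$, $n = 10$; the same simulated uniforms are used for both P-values so that the inequality holds draw by draw.

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, p, n = 2.0, 5.0, 0.3, 10
mu = (1 - p) * a + p * b                        # mean of the two-point distribution

x = np.sort(rng.choice([a, b], size=n, p=[1 - p, p]))   # sorted sample from {a, b}
y = (x - a) / (b - a)                           # Bernoulli(p) version of the sample

# Common uniforms for both P-values.
u = np.sort(rng.uniform(size=(20_000, n)), axis=1)
gaps = np.diff(u, axis=1, append=1.0)           # U_(i+1) - U_(i), U_(n+1) := 1

k_x = np.mean(gaps @ x <= mu)                   # P-value without knowing a and b
k_y = np.mean(gaps @ y <= p)                    # P-value after transforming to Y
print(k_x, k_y)                                 # k_x is at least k_y
```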

## Open problem

A number of smart people have spent time trying to prove Gaffke's conjecture. 
To the best of my knowledge, no proof is known.