# $E$-values and Betting Scores

Core references: 

+ Grünwald, P., R. de Heide, and W. Koolen, 2023. Safe Testing. https://arxiv.org/abs/1906.07801

+ Shafer, G., 2021. Testing by betting: A strategy for statistical and scientific communication,
_Journal of the Royal Statistical Society Series A: Statistics in Society_, _184_, 407–431, https://doi.org/10.1111/rssa.12647

+ Vovk, V., and R. Wang, 2021. E-values: Calibration, combination and applications, _Ann. Statist. 49_ (3) 1736-1754. https://doi.org/10.1214/20-AOS2020

+ Lecture notes by V. Vovk. https://www.isibang.ac.in/~statmath/pcm2020/talk1.pdf, https://www.isibang.ac.in/~statmath/pcm2020/talk2.pdf

+ Wang, R., and A. Ramdas, 2021. False discovery rate control with e-values, https://arxiv.org/pdf/2009.02824.pdf

$E$-values are a way of quantifying evidence about a statistical hypothesis. 
They are closely related to $P$-values, but more general in many ways, and possibly easier to understand.
They have a frequentist interpretation, but they are often easier to construct than $P$-values, and they
make it much easier to account for sequential testing and multiple testing than $P$-values do.


In particular (paraphrasing Shafer, 2021, and Grünwald et al., 2023):

+ An $E$-value is the observed value of a nonnegative random variable whose expected value under the null is 1: $\mathbb{E}_0 E = 1$. In contrast, a $P$-value is the observed value of a nonnegative random variable whose probability distribution under the null is dominated by the uniform distribution: $\mathbb{P}_0 \{P \le x\} \le x$, $\forall x \in [0, 1]$. It is generally much more straightforward to construct $E$-values than $P$-values.

+ $E$-values are like the returns on a bet. Most people know it's possible to win a lot of money by "getting lucky"
and winning a bet with long odds, or by identifying bets where the payoff odds don't reflect the chance odds. Fewer people understand $P$-values. It's common to think that a small $P$-value means the alternative is true or that the probability that the null is true is small--two common misconceptions.

+ Any particular bet implies an alternative hypothesis. The betting score is the likelihood of the alternative divided by the likelihood of the null. Likelihood ratios have intuitive appeal.

+ Power calculations require a fixed significance level, not a $P$-value, so there's no direct analog of power for $P$-values. In contrast, a bet also implies a target: a value for the betting score that might be expected if the alternative hypothesis is true. 

+ Betting strategies can be "informed" by almost everything, provided the bets are "predictable" from the data available before the bet is made. Betting scores can include arbitrarily complex strategies to "win" that use all currently available data to inform the next bet, including prior probability distributions, hunches about what's really going on, changing sets of covariates, and more--all while rigorously maintaining the $E$-value property, and hence, frequentist validity. The decision of what experiment to perform next (or whether to perform another experiment) can be based on anything. In contrast, the validity of $P$-values generally requires pre-specifying the entire analysis.  Betting scores thus may correspond better to how Science is conducted: a single hypothesis might be tested many times, and each experiment (including its design, what is measured, and the test used) might be informed by previous experiments--and the alternative may evolve from new information in other fields, intuition, elicited expert opinion, etc.

+ Betting scores can often be combined by multiplication, which corresponds to "reinvesting" the winnings in future bets. Betting scores can always be combined using averages (with or without weights). In contrast, combining $P$-values to get a valid $P$-value is much more subtle.

+ There is a tendency to treat $P$-values as if they did not depend on models and methods, only on the data. This may contribute to the mistake of treating a large $P$-value as evidence that the null is true. In contrast, most people understand that there's more than one betting strategy in any particular game, so the fact that an $E$-value depends on the strategy might be easier to remember, and users might be less prone to interpreting a small $E$-value as evidence that the null is true, rather than simply that a particular way of betting against the null did not multiply the initial stake by a large factor.



## Notation
In this chapter, we will use the notation $\mathbb{P}(\cdot) := \mathbb{E}_\mathbb{P} (\cdot)$ to denote expectation with respect to the distribution $\mathbb{P}$.
The expected value of the indicator function of a measurable set $A$ is the 
probability of $A$; in some sense, this notation generalizes from $\mathbb{E}_\mathbb{P} (1_A) = \mathbb{P}(A)$ to the expectation of other functions.

Any distributions that appear in the same expression will be assumed to have a single dominating measure
$\mu$, so we can talk about densities with respect to $\mu$.
The density of $\mathbb{P}$ with respect to $\mu$ will be denoted $f_\mathbb{P}$, so that 
$d\mathbb{P}(\omega) = f_\mathbb{P}(\omega) d\mu(\omega)$.

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space.
Let $\mathcal{I}$ be a totally ordered set with order relation $\le$.
Suppose that for all $i \in \mathcal{I}$, $\mathcal{F}_i$ is a sub-sigma-algebra of $\mathcal{F}$,
and that if $i < j$, $\mathcal{F}_i \subset \mathcal{F}_j$.
Then $\mathbb{F} := \{\mathcal{F}_i\}_{i \in \mathcal{I}}$ is a _filtration_ and $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ is
a _filtered probability space_.

Filtrations arise naturally in studying stochastic processes.
Let $\sigma(X)$ be the sigma-algebra generated by the random variable $X$ (the smallest sigma-algebra for which $X$ is measurable, i.e., the smallest sigma-algebra that contains the pre-image $X^{-1}(B)$
of every Borel set $B \in \mathcal{B}$), and let $\sigma(X_j : j \le i) := \sigma(\cup_{j \le i} \sigma(X_j))$.
As the process evolves, a richer and richer set of events becomes measurable.
Let $(X_i)_{i \in \mathbb{N}}$ be a stochastic process on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$,
and define $\mathcal{F}_i := \sigma(X_j : j \le i)$.
Then $\mathbb{F} := \{\mathcal{F}_i\}_{i \in \mathbb{N}}$ is a filtration.

## Warm-up 1: betting as evidence

A _predictor_ or _forecaster_ claims that $Y$ is a random variable with probability distribution $\mathbb{P}_0$.
The value $y$ of $Y$ will be revealed (to the predictor and you) later.
The predictor backs up the claim by offering to sell you any payoff $S(y)$ for the price $\mathbb{E}_0 S(Y) =: \mathbb{P}_0 S(Y)$,
the expected value of $S(Y)$ (before it is observed), computed
on the assumption that the predictor is right--that $Y$ is indeed a random variable and has distribution $\mathbb{P}_0$.
The payoff is required to be nonnegative, so that all you risk is what you bet: the expected value of $S$ on the assumption that $Y \sim \mathbb{P}_0$.

If you buy the bet $S$, your payoff is $S(y)$ if $Y=y$, and your _betting score_ is $S(y)/\mathbb{P}_0 S(Y)$, the amount by which you multiplied your initial stake.
Without loss of generality, we may assume that $\mathbb{P}_0 S(Y) = 1$ and allow you to buy any multiple of the bet $S$ you can afford; then your (eventual) betting score is $S(y)$. 
You don't have to bet your whole fortune, but if you withhold a fraction $\beta$ of your current fortune and 
bet the remaining fraction $1-\beta$ on $S$, that is equivalent to betting your whole fortune on $S' = \beta + (1-\beta)S$, which also has expected value 1 under the null since $\mathbb{P}_0 S = 1$.
That is, betting only a fraction of your current fortune is just another bet that is expected to break even
under the predictor's hypothesis, so without loss of generality, we can assume that you bet your entire fortune
on some $S$.

It isn't necessary that *you* believe $Y$ is really a random variable: you can still
bet if you think the predictor's claim is wrong, that is, if you think you can make money betting on some
$S$ with $\mathbb{P}_0 S(Y) = 1$.

Now suppose there is a series of trials, $(Y_j)$, which might or might not be random; and if they are random,
they might or might not be independent.
The predictor is allowed to make a series of predictions, say $\mathbb{P}_{0j}$ for
$j = 1, \ldots$.
The predictor need not make a prediction for every trial, 
and the prediction for the $j$th trial, $\mathbb{P}_{0j}$, might depend on the outcome of previous trials, $\{Y_i \}_{i<j}$.
(This is much closer to how science is conducted than the assumption that trials are independent
and involve the same parameters.)
Shafer (2021) writes:

> The probabilistic predictions that can be associated with a scientific hypothesis usually go beyond a single comprehensive probability distribution. In some cases, a scientist may begin with a joint probability distribution P for a sequence of variables $Y_1 , \ldots, Y_N$  and formulate a plan for successive experiments that will allow her to observe them. But the scientific enterprise is usually more opportunistic. A scientist might perform an experiment that produces $Y_1$’s value $y_1$ and then decide whether it is worthwhile to perform the further experiment that would produce $Y_2$’s value $y_2$. Perhaps no one even thought about $Y_2$ at the outset. One scientist or team tests the hypothesis using $Y_1$, and then, perhaps because the result is promising but not conclusive, some other scientist or team comes up with the idea of further testing the hypothesis with a second variable $Y_2$ from a hitherto uncontemplated new experiment or database.


Suppose you start with \\$1, and you are allowed to bet on any or all of the predictions: before the $j$th 
trial the predictor offers the prediction $Y_j \sim \mathbb{P}_{0j}$, which can depend on previous trials.
You are allowed to buy any nonnegative $S_j(Y_j)$ for the price $\mathbb{P}_{0j} S_j(Y_j)$, which is assumed to
be \\$1. 
Your fortune after the first bet is settled is $S_1(y_1)$. 
The predictor now offers to sell you any $S_2(Y_2)$ for its expected value under the null $Y_2 \sim \mathbb{P}_{02}$,
again assumed to be \\$1.
If you bet your current fortune to buy a multiple of $S_2(Y_2)$, then
your fortune when the second bet settles is $S_1(y_1)S_2(y_2)$, etc.: betting scores on successive bets multiply to give your current fortune.
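
The multiplication of betting scores can be sketched numerically. The example below is illustrative only: it assumes the predictor claims every $Y_j$ is a fair coin toss, and that you bet the likelihood ratio for a hypothetical alternative in which heads has chance $3/4$. The names `payoff` and `fortune_path`, and the particular sequence of outcomes, are made up for the illustration.

```python
from fractions import Fraction

def payoff(y, q, p0=Fraction(1, 2)):
    """Payoff S(y) of a unit-price bet: likelihood ratio of Bernoulli(q) to Bernoulli(p0)."""
    return q / p0 if y == 1 else (1 - q) / (1 - p0)

def fortune_path(ys, q, p0=Fraction(1, 2)):
    """Fortune after each bet when the entire current fortune is reinvested."""
    fortune, path = Fraction(1), []
    for y in ys:
        fortune *= payoff(y, q, p0)   # successive betting scores multiply
        path.append(fortune)
    return path

p0, q = Fraction(1, 2), Fraction(3, 4)   # null and (assumed) alternative chance of heads

# Price of the bet under the predictor's claim: its expected payoff under the null.
price = p0 * payoff(1, q) + (1 - p0) * payoff(0, q)

# Withholding a fraction beta of the fortune is just another unit-price bet.
beta = Fraction(1, 4)
price_partial = (p0 * (beta + (1 - beta) * payoff(1, q))
                 + (1 - p0) * (beta + (1 - beta) * payoff(0, q)))

ys = [1, 1, 0, 1, 1, 1, 0, 1]            # hypothetical outcomes: six heads, two tails
path = fortune_path(ys, q)
print(price, price_partial, path[-1])
```

With these assumed outcomes the final fortune is $(3/2)^6 (1/2)^2$ times the initial stake; a different sequence, or a different alternative, would give a different betting score.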

If you end up making a lot of money, that is evidence that the predictor was wrong--or that you were very lucky. 
If you don't end up making a lot of money, maybe the predictor was right--or maybe you chose bad bets (you didn't bet
on the right alternative).
Regardless, it is not evidence that the predictor was right.
This is the same asymmetry involved in hypothesis tests: a large $P$-value is not evidence that the null is true.

## Warm-up 2: hypothesis tests as bets

See [hypothesis testing](./tests.ipynb).

Core idea: if you can make money betting against the null hypothesis (by making bets
that are expected to be break-even if the null hypothesis is true), that's evidence that the
null hypothesis is false.

In the typical setup for hypothesis testing, we observe data $X \sim \mathbb{P}$.
To test the null hypothesis $\mathbb{P} = \mathbb{P}_0$, we choose a function $\phi(\cdot)$ with the property that $\mathbb{P}_{\mathbb{P}_0,U}\phi(X,U) = \alpha$, where $U$ is an auxiliary uniform random variable
independent of $X$, only needed for randomized tests.

We reject the hypothesis $\mathbb{P} = \mathbb{P}_0$ if $U \le \phi(X)$.

We can think of $\phi(X)$ as an "all-or-nothing" bet that pays $1/\alpha$ times the stake (which we will 
take to be \\$1) if $U \le \phi(X)$, and pays 0 otherwise. 
If the null is true, the expected value of such a bet is \\$1.
That is, $X$ plays the role of $Y$, above, and $S(Y)$ is $1/\alpha$ if $U \le \phi(Y)$ and zero otherwise.

Two scenarios:
+ bet once in a while, don't reinvest your winnings
+ bet whenever you want, reinvest your winnings

The first is like $P$-values and standard tests of significance: all-or-nothing bets, 
with no "combining evidence" across experiments.
The second leads to betting scores and $E$-values.

Multiple testing: suppose a hypothesis is tested 20 times at significance level 5%, producing one
"significant" result. From a testing perspective, we have to adjust for multiplicity to understand
how strong the evidence is that the null is false, and that adjustment requires knowing the dependence among the experiments. From an $E$-value perspective, the betting score is 1: \\$20 was wagered, and \\$20 was won.
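
The multiple-testing arithmetic can be checked directly. This is a minimal sketch of the "don't reinvest" scenario, assuming one unit is staked on each of 20 level-5% tests and exactly one rejects; the function name `all_or_nothing` is hypothetical.

```python
from fractions import Fraction

alpha = Fraction(1, 20)

def all_or_nothing(reject, alpha):
    """Payoff of a unit stake on a level-alpha test: 1/alpha if it rejects, else 0."""
    return 1 / alpha if reject else Fraction(0)

# If the null is true the test rejects with probability alpha, so the bet breaks even.
expected_payoff = (alpha * all_or_nothing(True, alpha)
                   + (1 - alpha) * all_or_nothing(False, alpha))

# Twenty separate unit bets, one of which wins (winnings are not reinvested).
rejections = [False] * 19 + [True]
wagered = len(rejections)
won = sum(all_or_nothing(r, alpha) for r in rejections)
score = won / wagered
print(expected_payoff, wagered, won, score)
```

No assumption about dependence among the 20 experiments enters the calculation: the overall score is the ratio of total winnings to total stake.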

## Betting scores for simple nulls are likelihood ratios, and vice versa

Suppose we have a nonnegative random variable $S(Y)$ with expected value $1$ under the null $Y \sim \mathbb{P}_0$,
i.e., $\int S(y) d\mathbb{P}_0(y) = 1$. 
Thus the measure $\mathbb{Q}$ defined by $d\mathbb{Q}(y) := S(y) d\mathbb{P}_0(y)$ is also a probability measure: $\mathbb{Q}(y) \ge 0$ and $\int d\mathbb{Q}(y) = 1$.
Hence, $S(y) = f_\mathbb{Q}(y)/f_{\mathbb{P}_0}(y)$ is the likelihood ratio of $\mathbb{Q}$ to $\mathbb{P}_0$.
The distribution $\mathbb{Q}$ is called _the alternative implied by $S$_.

Conversely, suppose $\mathbb{Q}$ is a probability distribution for $Y$.
Then $S(Y) := f_\mathbb{Q}(Y)/f_{\mathbb{P}_0}(Y)$ is a betting score, since it is nonnegative and
\begin{eqnarray}
\mathbb{P}_0 (f_\mathbb{Q}(Y)/f_{\mathbb{P}_0}(Y)) &=& \int (f_\mathbb{Q}(y)/f_{\mathbb{P}_0}(y)) d\mathbb{P}_0(y) \\
&=& \int (f_\mathbb{Q}(y)/f_{\mathbb{P}_0}(y)) f_{\mathbb{P}_0}(y) d\mu(y) \\
&=& \int f_\mathbb{Q}(y) d\mu(y) \\
&=& \int d\mathbb{Q}(y) \\
&=& 1.
\end{eqnarray}
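
A small discrete example may make the correspondence concrete. The sketch below assumes a three-point sample space with made-up null probabilities and a made-up payoff `S`; it checks that `S` times the null mass function is a probability mass function, and that the likelihood ratio of that implied alternative to the null recovers `S`.

```python
from fractions import Fraction as F

# Hypothetical null distribution on three outcomes.
p0 = {"a": F(1, 2), "b": F(1, 3), "c": F(1, 6)}
# Hypothetical nonnegative payoff with expected value 1 under the null.
S = {"a": F(1, 2), "b": F(1), "c": F(5, 2)}

price = sum(S[y] * p0[y] for y in p0)          # expected payoff under the null

# Alternative implied by S: dQ = S dP_0.
q = {y: S[y] * p0[y] for y in p0}
total_q = sum(q.values())

# Conversely, the likelihood ratio of Q to P_0 is the betting score S.
lr = {y: q[y] / p0[y] for y in p0}
print(price, total_q, lr == S)
```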

As mentioned above, the betting formulation makes sense even if $Y$ isn't a random variable.
But suppose I think $Y \sim \mathbb{Q} \ne \mathbb{P}_0$.
What payoff function $S$ should I bet on?

If the goal is to grow my capital at the fastest rate (the Kelly criterion), I want to maximize
\begin{equation}
\mathbb{Q} \ln S = \mathbb{Q} \ln \left( f_\mathbb{R}(Y)/f_{\mathbb{P}_0}(Y) \right)
\end{equation}
for some measure $\mathbb{R}$.
Gibbs' inequality says that
\begin{equation}
\mathbb{Q} \ln \left ( f_\mathbb{Q}(Y)/f_{\mathbb{P}_0}(Y) \right ) \ge \mathbb{Q} \ln \left ( f_\mathbb{R}(Y)/f_{\mathbb{P}_0}(Y) \right )
\end{equation}
for any distribution $\mathbb{R}$ for $Y$ (dominated by $\mu$).

Thus the optimal payoff function to bet on is $S(Y) = f_\mathbb{Q}(Y)/f_{\mathbb{P}_0}(Y)$.

(The Kullback-Leibler (KL) Divergence between a distribution $\mathbb{Q}$ and a distribution or subdistribution $\mathbb{P}$ that have densities $f_{\mathbb{Q}}$ and $f_{\mathbb{P}}$ with respect to a common dominating measure is
\begin{equation}
D(\mathbb{Q}||\mathbb{P}) := \mathbb{Q}[\ln(f_{\mathbb{Q}}/f_{\mathbb{P}})].
\end{equation}
Gibbs' inequality is an inequality on the KL divergence.)
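
Gibbs' inequality can be checked numerically on a finite sample space. The sketch below is an illustration rather than a proof: the null `p0`, the alternative `q`, and the randomly generated competitors `r` are all invented, and `growth_rate` is a hypothetical helper that computes the expected log payoff under `q` of the bet with payoff `r/p0`.

```python
import math
import random

def growth_rate(q, r, p0):
    """Expected log payoff under q of the bet S = r / p0 (finite sample space)."""
    return sum(qi * math.log(ri / pi) for qi, ri, pi in zip(q, r, p0) if qi > 0)

p0 = [0.5, 0.3, 0.2]     # hypothetical null
q = [0.2, 0.3, 0.5]      # hypothetical alternative the bettor believes

kelly = growth_rate(q, q, p0)   # equals the KL divergence D(q || p0)

rng = random.Random(0)
others = []
for _ in range(1000):
    w = [rng.random() + 1e-9 for _ in p0]       # random positive weights
    total = sum(w)
    r = [wi / total for wi in w]                # a competing distribution R
    others.append(growth_rate(q, r, p0))
best_other = max(others)
print(kelly, best_other)
```

None of the randomly drawn competitors should beat the likelihood-ratio bet, although some may come close when `r` happens to be near `q`.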


## Picking the bet: why maximize the expected log payoff (rather than the expected payoff)?

This is connected to the idea of repeated betting, rather than one-shot bets.
As noted previously, if you maximize the expected return on a single bet, you risk going broke.
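Maximizing the expected return on a single bet puts all your money on the single outcome for which the ratio of the alternative probability to the null probability is largest; if any other outcome occurs, you lose everything.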
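
Here is a minimal sketch of the contrast, assuming a made-up three-outcome null `p0` and alternative `q`. The bet with the largest expected payoff under `q` stakes everything on one outcome (the one where the ratio of alternative to null probability is largest), while the likelihood-ratio bet pays something for every outcome.

```python
from fractions import Fraction as F

p0 = [F(1, 2), F(3, 10), F(1, 5)]   # hypothetical null probabilities
q = [F(1, 5), F(3, 10), F(1, 2)]    # hypothetical alternative

ratios = [qi / pi for qi, pi in zip(q, p0)]
best = ratios.index(max(ratios))

# Expectation-maximizing unit-price bet: everything on one outcome.
all_in = [F(0)] * len(p0)
all_in[best] = 1 / p0[best]
# Log-optimal (Kelly) unit-price bet: the likelihood ratio.
kelly = ratios

def expect(dist, payoff):
    """Expected payoff of a bet under the distribution dist."""
    return sum(d * s for d, s in zip(dist, payoff))

price_all_in, price_kelly = expect(p0, all_in), expect(p0, kelly)
mean_all_in, mean_kelly = expect(q, all_in), expect(q, kelly)

# Chance (under q) that the all-in bettor is not broke after 10 reinvested bets.
survive_10 = q[best] ** 10
print(mean_all_in, mean_kelly, survive_10)
```

Even when the alternative is true, the all-in bettor who reinvests is very likely to be ruined after a few rounds, whereas the likelihood-ratio bettor can never hit zero in this example because every payoff is positive.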



## The Neyman-Pearson Lemma and the optimality of all-or-nothing bets

Neyman and Pearson propose maximizing _power_: among all tests of $\mathbb{P}_0$ with level $\alpha$, find the test with greatest power against some alternative $\mathbb{Q}$.

The Neyman-Pearson lemma says that the optimal randomized test $\phi(x,u)$ equals 1 for all $x$ with $f_\mathbb{Q}(x)/f_{\mathbb{P}_0}(x) > c$ for some $c$ and is randomized on the set $f_\mathbb{Q}(x)/f_{\mathbb{P}_0}(x) = c$ to make its level exactly $\alpha$. 
For instance, let $I_c$ denote the subset of $\mathcal{X}$ for which $f_\mathbb{Q}(x)/f_{\mathbb{P}_0}(x) > c$ and $P_c$ denote the subset of $\mathcal{X}$ for which $f_\mathbb{Q}(x)/f_{\mathbb{P}_0}(x) = c$.
Then $c = \inf\{b \in \Re^+ : \mathbb{P}_0 I_b \le \alpha \}$.
Define 
\begin{equation}
u^* := \frac{\alpha-\mathbb{P}_0 I_c}{\mathbb{P}_0 P_c}.
\end{equation}
Then the test function 
\begin{equation}
\phi(x, u) := \left \{ \begin{array}{ll}
       1, & x \in I_c, u \in [0, 1] \\
       1,  & x \in P_c, u \in [0, u^*] \\
       0, & \mbox{ otherwise }
       \end{array}
       \right .
\end{equation}
gives a most powerful test of $\mathbb{P}_0$ against the alternative $\mathbb{Q}$.
If $\mathbb{P}_0 I_c = \alpha$, it is non-randomized, and it is the unique most powerful test.

Among all "all-or-nothing" bets, the Neyman-Pearson test maximizes $\mathbb{Q} (S(Y) \ge 1/\alpha)$.
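
A worked discrete case may help. The sketch below assumes $X$ is the number of heads in 5 tosses, a null chance of heads of $1/2$, an alternative of $3/4$, and $\alpha = 1/20$; these numbers are chosen only for illustration. It finds the threshold $c$, rejects outright when the likelihood ratio exceeds $c$, and rejects with probability `u_star` when the ratio equals $c$, so that the level is exactly $\alpha$.

```python
from fractions import Fraction as F
from math import comb

def binom_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list indexed by x."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

n, alpha = 5, F(1, 20)
f0 = binom_pmf(n, F(1, 2))      # null
f1 = binom_pmf(n, F(3, 4))      # alternative
lr = [a / b for a, b in zip(f1, f0)]

def p0_lr_above(b):
    """Null probability that the likelihood ratio strictly exceeds b."""
    return sum(p for p, r in zip(f0, lr) if r > b)

# Smallest threshold whose strict-exceedance set has null probability <= alpha.
c = min(b for b in set(lr) if p0_lr_above(b) <= alpha)
p0_above = p0_lr_above(c)
p0_at = sum(p for p, r in zip(f0, lr) if r == c)
u_star = (alpha - p0_above) / p0_at    # randomization probability on {LR = c}

level = p0_above + u_star * p0_at
power = (sum(p for p, r in zip(f1, lr) if r > c)
         + u_star * sum(p for p, r in zip(f1, lr) if r == c))
print(c, u_star, level, power)
```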


## Implied targets

If you bet on $S$, implicitly you are suggesting that $Y \sim \mathbb{Q}$, 
where $f_\mathbb{Q}(y) := S(y) f_{\mathbb{P}_0}(y)$.
Implicitly, you expect 
\begin{eqnarray}
\mathbb{Q} [\ln S(Y)] &=& \int \ln S(y) d\mathbb{Q}(y) \\
&=&  \int \ln S(y) S(y) d\mathbb{P}_0(y) \\
&=& \mathbb{P}_0 [S(Y) \ln S(Y)].
\end{eqnarray}
The _implied target_ is the expected rate of growth in your wealth if the alternative $\mathbb{Q}$ for which your
bet $S$ is optimal is in fact true.
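
As a quick numerical check of the identity above, consider a hypothetical bet against a fair coin whose implied alternative gives heads chance $3/4$. The sketch computes the implied target both ways and, as a rough guide, how many such bets it would take on average (on the log scale) to double the stake if the implied alternative were true.

```python
import math

p0, q = 0.5, 0.75                      # null and (assumed) implied alternative
f0 = {1: p0, 0: 1 - p0}                # null mass function
S = {1: q / p0, 0: (1 - q) / (1 - p0)} # payoff: likelihood ratio of Q to P_0
fq = {y: S[y] * f0[y] for y in S}      # implied alternative: f_Q = S * f_{P_0}

target_under_q = sum(fq[y] * math.log(S[y]) for y in S)
target_under_p0 = sum(f0[y] * S[y] * math.log(S[y]) for y in S)

# Number of bets needed for the expected log fortune to reach log(2).
doubling_bets = math.log(2) / target_under_q
print(target_under_q, target_under_p0, doubling_bets)
```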

## Composite nulls

Suppose that the forecaster claims that $Y \sim \mathbb{P}$ for some (otherwise unspecified) $\mathbb{P} \in \mathcal{P}_0$, i.e., the null hypothesis is composite, rather than simple.
We can test such a hypothesis using betting by using nonnegative payoffs $S(Y)$ that have expected value no greater than 1
for any $\mathbb{P} \in \mathcal{P}_0$.
That is, the forecaster offers to sell any $S$ such that $\sup_{\mathbb{P} \in \mathcal{P}_0} \mathbb{P}S(Y) \le 1$
for \\$1.
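
One simple family of bets against a composite null, offered here as an assumed example that is not taken from the text above: if $Y \in [0, 1]$ and the null is that the mean of $Y$ is at most $1/2$, then $S(y) = 1 + \lambda (y - 1/2)$ with $\lambda \in [0, 2]$ is nonnegative and has expected value at most 1 under every null distribution. The sketch checks this for a few made-up discrete distributions.

```python
from fractions import Fraction as F

def bet(y, lam):
    """Payoff 1 + lam * (y - 1/2); nonnegative on [0, 1] when 0 <= lam <= 2."""
    return 1 + lam * (y - F(1, 2))

# A few hypothetical null distributions on [0, 1], each with mean <= 1/2.
null_dists = [
    {F(0): F(1, 2), F(1): F(1, 2)},                          # mean 1/2
    {F(0): F(3, 4), F(1): F(1, 4)},                          # mean 1/4
    {F(1, 4): F(1, 2), F(1, 2): F(1, 4), F(3, 4): F(1, 4)},  # mean 7/16
]
lams = [F(0), F(1, 2), F(1), F(2)]

def price(dist, lam):
    """Expected payoff of the bet under the distribution dist."""
    return sum(p * bet(y, lam) for y, p in dist.items())

worst_price = max(price(d, lam) for d in null_dists for lam in lams)
min_payoff = min(bet(y, lam) for lam in lams for y in (F(0), F(1)))

# Under a distribution outside the null (mean 3/4), the same bet is favorable.
alt = {F(0): F(1, 4), F(1): F(3, 4)}
alt_price = price(alt, F(1))
print(worst_price, min_payoff, alt_price)
```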

## Composite alternatives

The bets $S$ can be chosen in countless ways, including maximizing the expected
growth rate for weighted mixtures of alternatives or maximizing the minimum growth rate over a set of alternatives.



# Betting-based testing protocols

Suppose a statistical model gives a probability distribution $\mathbb{P}_0$ for data $Y$.

The _Forecaster_ proposes a probability distribution for a future observation.
The statistician is the _Skeptic_, who is testing the Forecaster. _Reality_ reveals the value of variables.

**Protocol for a single experiment or prediction:**  

+ Forecaster claims $Y \sim \mathbb{P}_0$
+ Skeptic selects a random variable $S \ge 0$ such that $\mathbb{P}_0 S(Y) = 1$.  
+ Reality announces $y$
+ $K \leftarrow S(y)$ is the Skeptic's winnings.

$\mathbb{P}_0 (K \ge 1/\alpha) \le \alpha$ for all $\alpha \in (0, 1]$, by Markov's inequality.

**Protocol for a sequence of experiments or predictions, with side information:**  
This protocol is for testing whether a forecaster is any good.
The forecaster and the skeptic have access to extra information: at step $i$, Reality
reveals a value $x_i$ (think of it as an independent variable).

+ $K_0 := 1$ (Skeptic's initial bankroll is \\$1)
+ For $j = 1, \ldots$:
    - Reality announces $x_j$, the independent variable
    - Forecaster selects $\mathbb{P}_{0j}$ and claims $Y_j \sim \mathbb{P}_{0j}$
    - Skeptic selects $S_j$ such that $\mathbb{P}_{0j} (S_j) = K_{j-1}$
    - Reality announces $y_j$
    - $K_j \leftarrow S_j(y_j)$

If every $Y_j$ had the distribution $\mathbb{P}_{0j}$ conditional on previous values $Y_i$, $i < j$, the Skeptic's capital process $(K_j)$ would be a nonnegative martingale. Thus Ville's inequality applies, and if
the forecaster's forecasts (nulls) were correct, $\mathbb{P} \{ \exists j \in \mathbb{N} : K_j \ge 1/\alpha \} \le \alpha$.
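
A small simulation can illustrate Ville's inequality. It is a sketch under assumed settings, not part of the protocol: the forecasts are all "fair coin," the forecasts are correct, there is no side information, and Skeptic always bets the likelihood ratio for a hypothetical alternative with chance of heads 0.6. The horizon, the number of simulated paths, and the seed are arbitrary.

```python
import random

def max_capital(n_steps, q, rng, p0=0.5):
    """Largest fortune reached along one path when the null Bernoulli(p0) is true."""
    k, k_max = 1.0, 1.0
    for _ in range(n_steps):
        y = 1 if rng.random() < p0 else 0                      # Reality follows the forecast
        k *= (q / p0) if y == 1 else ((1 - q) / (1 - p0))      # reinvest everything
        k_max = max(k_max, k)
    return k_max

rng = random.Random(12345)
alpha, n_paths, n_steps = 0.05, 2000, 100

# Fraction of simulated paths on which the fortune ever reaches 1/alpha.
crossings = sum(max_capital(n_steps, 0.6, rng) >= 1 / alpha for _ in range(n_paths))
crossing_fraction = crossings / n_paths
print(crossing_fraction)
```

Because the horizon is finite and the fortune overshoots the threshold when it crosses, the observed fraction is expected to fall below $\alpha$ rather than to match it.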

**Protocol for an abstract bounded strategy game:** (see Shafer & Vovk, 2019)  
This protocol is for a "number-guessing game."

+ Skeptic announces $K_0$ (initial bankroll)
+ For $j = 1, \ldots,$:
    - Forecaster announces $m_j \in [-1, 1]$ (a prediction)
    - Skeptic announces $M_j \in \Re$ (the bet, which can be positive or negative)
    - Reality announces $y_j \in [-1, 1]$
    - $K_j \leftarrow K_{j-1} + M_j(y_j - m_j)$
    
In the protocol above,
there's no loss in generality in setting $m_j=0$ (since reality can always announce $y_j' = y_j-m_j$ and everything else stays the same), and no increase in generality in replacing $[-1, 1]$ by $[-C, C]$ (by rescaling).
We will take $m_j = 0$ in the following discussion, in essence, removing Forecaster from the protocol.

It's possible that the Skeptic could get arbitrarily rich, with $K_j \rightarrow \infty$.
But Reality can always prevent that by setting $y_j = m_j$ (or $y_j = 0$ in the special case).
So Skeptic can't guarantee they will get rich. 
But to prevent Skeptic from becoming infinitely rich, Forecaster and Reality might have to behave in a particular way.

**Notation:** (See Table 1.1 of Shafer & Vovk, 2019)  

Concept          |   Definition | <div style="width:400px">Notation</div>   |
:----------------|:---------------------|:-------------------------------------- |
situation        | sequence of moves by Reality | $s= y_1 \cdots y_j$ |
situation space  | set of all situations        | $\mathbb{S}$ |
initial situation| empty sequence               | $\Box$ |
path             | complete sequence of moves by Reality | $\omega = y_1y_2 \cdots$ |
$j$th element of a path |  | $\omega_j$ |
prefix of a path | first $j$ elements of the path | $\omega^j = \omega_1 \omega_2 \cdots \omega_j$ |
sample space     | set of all paths             | $\Omega$ |
process | real-valued function on $\mathbb{S}$ | $S: \Omega \rightarrow \mathbb{R}^\infty$ |
$j$th term in the process $S$ | | $S_j : \omega \in \Omega \mapsto S(\omega^j) \in \mathbb{R}$ |
predictable process | function $T$ on $\mathbb{S} \setminus \{\Box\}$ for which $T_j$ depends only on $\omega^{j-1}$ | |
event            | subset of sample space       | $A \subset \Omega$ |
variable         | function on the sample space | $X: \Omega \rightarrow \mathbb{R}$ |
strategy for Skeptic | initial stake and predictable process of "moves" | $\psi = (\psi^\mathrm{stake}, \psi^M)$ |
capital process for strategy $\psi$ | sequence of fortunes that result from playing strategy $\psi$ | $K^\psi$, where $K_0^\psi := \psi^\mathrm{stake}$; $K_j^\psi := K_{j-1}^\psi + \psi_j^My_j$ |

In the next-to-last definition, $\psi^\mathrm{stake}$ is the initial stake $K_0$ for
the strategy and $\psi^M(\omega^j)$ is the move $M_j$ the strategy $\psi$ makes in situation $\omega^{j-1}$.

The set of strategies comprises a vector space: for any real number $\beta$, if $\psi = (\psi^\mathrm{stake}, \psi^M)$
is a strategy, so is 
\begin{equation}
\beta \psi := (\beta \psi^\mathrm{stake}, \beta\psi^M),
\end{equation}
and if
$\psi^1$ and $\psi^2$ are strategies, so is 
\begin{equation}
\psi^1 + \psi^2 := (\psi^{1,\mathrm{stake}}+\psi^{2,\mathrm{stake}}, \psi^{1,M} + \psi^{2,M}).
\end{equation}

Suppose $\beta \in [0, 1]$ and the strategies $\psi^1$ and $\psi^2$ have the same initial stake $K_0$. 
Then the convex combination of strategies 
\begin{equation}
\beta \psi^1 + (1-\beta) \psi^2
\end{equation}
amounts to splitting the initial stake across the two strategies ($\beta K_0$ for the first and $(1-\beta)K_0$ for the
second), and playing them in parallel. The capital process for the combination is
\begin{equation}
K^{\beta \psi^1 + (1-\beta) \psi^2} = \beta K^{\psi^1} + (1-\beta) K^{\psi^2}.
\end{equation}
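
The linearity of the capital process can be verified on a toy example. In the sketch below (with $m_j = 0$), a strategy's moves are represented by a function of Reality's previous moves; the two strategies `move1` and `move2`, the weight `beta`, and the sequence `ys` are invented for illustration.

```python
from fractions import Fraction as F

def capital_process(stake, move, ys):
    """K_0 = stake; K_j = K_{j-1} + M_j * y_j, where M_j = move(y_1, ..., y_{j-1})."""
    caps = [stake]
    for j, y in enumerate(ys):
        caps.append(caps[-1] + move(ys[:j]) * y)
    return caps

def move1(past):
    """Hypothetical strategy: always bet 1/4."""
    return F(1, 4)

def move2(past):
    """Hypothetical strategy: bet half of Reality's previous move (0 on the first round)."""
    return F(1, 2) * past[-1] if past else F(0)

beta = F(1, 3)

def mix(past):
    """Moves of the convex combination beta * psi^1 + (1 - beta) * psi^2."""
    return beta * move1(past) + (1 - beta) * move2(past)

ys = [F(1), F(-1, 2), F(1, 2), F(-1), F(1, 4)]   # Reality's moves in [-1, 1]
k1 = capital_process(F(1), move1, ys)
k2 = capital_process(F(1), move2, ys)
k_mix = capital_process(F(1), mix, ys)
k_combined = [beta * a + (1 - beta) * b for a, b in zip(k1, k2)]
print(k_mix == k_combined)
```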

Suppose that $\beta_i \ge 0$, $i \in \mathbb{N}$ with $\sum_{i \in \mathbb{N}} \beta_i = 1$;
that the strategies $\psi^i$, $i \in \mathbb{N}$ have the same initial capital $K_0$;
and that the sums $\sum_i \beta_i \psi^{i,M}(s)$ converge for every $s \in \mathbb{S} \setminus \{\Box\}$,
then 
\begin{equation}
\sum_i \beta_i K^{\psi^i}
\end{equation}
converges and is the capital process of the strategy $\sum_i \beta_i \psi^i$.
That amounts to splitting the initial capital across countably many accounts and playing the corresponding
strategy with each such account.

**Definition:**  
If Skeptic has a strategy $\psi$ that guarantees that $K_j^\psi \ge 0$ for all $j$ and either
$\lim_{j \rightarrow \infty} K_j^\psi = \infty$  
or some event $A$ occurs, then Skeptic can _force $A$_.
In the language of probability, we would say that $A$ is _almost sure_.

If Skeptic has a strategy $\psi$ that forces $A$, Skeptic has a strategy that forces $A$ and starts with $K_0 = 1$.
If Skeptic has a strategy $\psi$ that forces $A$, Skeptic has a strategy that forces $A$ and ensures $K_j \ge c$
for any $c \in (-\infty, K_0)$.


**Definition:**  
If Skeptic has a strategy $\psi$ that guarantees that $K_j^\psi \ge 0$ for all $j$ and either
$\sup_{j} K_j^\psi = \infty$  
or some event $A$ occurs, then Skeptic can _weakly force $A$_.

**Lemma:** (Shafer & Vovk, Lemma 1.5)  
If Skeptic can weakly force $A$ in the protocol above, Skeptic can force $A$ in that protocol.

_Proof_:  Suppose Skeptic can weakly force $A$ using the strategy $\psi$.
Define the strategy $\psi'$ as follows:

+ Play $\psi$ starting from $K_0 := K_0^\psi$ until $K_j \ge K_0+1$ (or forever if that never happens). Let $m$ be the smallest $j$ s.t. $K_j \ge K_0+1$. Starting at round $m+1$, play the strategy 
\begin{equation}
M_j := \frac{K_m-1}{K_m} \psi_j^M.
\end{equation}
That amounts to setting aside one unit of capital that will never be lost in future play.
+ Continue until $K_j \ge K_m+1$, then set aside another unit of capital by scaling down the moves $\psi_j^M$ further.
+ Keep repeating that process.

The resulting strategy has a nonnegative capital process. If $\omega \notin A$, 
the original strategy $\psi$ had a supremum of $\infty$, so this modified strategy would set aside a unit of capital infinitely many times: its capital would tend to infinity. Thus it forces $A$.

**Lemma:** (Shafer & Vovk, Lemma 1.6)  
If Skeptic can weakly force each of the events $A_1, A_2, \ldots$ in the previous protocol, then Skeptic can weakly force
$\cap_i A_i$ in that protocol.

_Proof:_  
Suppose strategy $\psi^k$ weakly forces $A_k$ and (wlog) starts with initial capital $K_0 = 1$.
Since the strategy guarantees $K_j^{\psi^k} \ge 0$ for all $j$ whatever $y_j \in [-1, 1]$ Reality announces, $|M_j| \le K_{j-1}$; thus 
\begin{equation}
K_j^{\psi^k} \le 2K_{j-1}^{\psi^k},
\end{equation}
and hence $K_j^{\psi^k} \le 2^j$ and $|\psi_j^{k,M}| \le 2^j$ for all $k$ and all $j$.
Now define 
\begin{equation}
\psi := \sum_{k \in \mathbb{N}} 2^{-k} \psi^k.
\end{equation}
The sums implicit in that sum all converge: the convex combination makes sense mathematically.
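Since $\psi^k$ weakly forces $A_k$, so does $2^{-k}\psi^k$, starting with initial capital $2^{-k}$; and because every component's capital process is nonnegative, $K^\psi \ge 2^{-k} K^{\psi^k}$, so $\psi$ weakly forces $A_k$ as well.
The strategy $\psi$ thus weakly forces $\cap_k A_k$.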

**Lemma:** (Shafer & Vovk, Lemma 1.7)  
Suppose $\kappa > 0$. In the protocol above, Skeptic can weakly force
\begin{equation}
{\lim \sup}_{j \rightarrow \infty} \bar{y}_j \le \kappa
\end{equation}
and 
\begin{equation}
{\lim \inf}_{j \rightarrow \infty} \bar{y}_j \ge -\kappa.
\end{equation}

_Proof:_  
Assume wlog that $\kappa \le 1/2$. Let $\psi$ have $K_0 := 1$ and $M_j := \kappa K_{j-1}$.
Then $K_0^\psi = 1$ and
\begin{equation}
 K_j^\psi = \prod_{i=1}^j (1 + \kappa y_i).
\end{equation}
Suppose $\omega$ has $\sup_j K_j^\psi(\omega) = C_\omega < \infty$.
Then 
\begin{equation}
\sum_{i=1}^j \ln (1+\kappa y_i) \le \ln C_\omega, \;\; \forall j.
\end{equation}
Now $\ln(1+t) \ge t - t^2$ for $t \ge -1/2$, so for all $j$, 
\begin{equation}
\kappa \sum_{i=1}^j y_i - \kappa^2 \sum_{i=1}^j y_i^2 \le \ln C_\omega.
\end{equation}
But $\sum_{i=1}^j y_i^2 \le j$, so
\begin{equation}
\kappa \sum_{i=1}^j y_i - \kappa^2 j \le \ln C_\omega,
\end{equation}
i.e.,
\begin{equation}
\bar{y}_j \le \frac{\ln C_\omega}{\kappa j} + \kappa,
\end{equation}
so the lim sup holds for that $\omega$.
The same argument, mutatis mutandis, shows that the lim inf holds too.
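
The chain of inequalities in the proof holds for any bounded sequence, so it can be checked on simulated moves. The sketch below assumes $\kappa = 1/4$ and lets Reality draw $y_j$ uniformly from $[-1, 1]$ (an arbitrary choice); $C$ is taken to be the largest capital seen over the finite run, standing in for $C_\omega$.

```python
import math
import random

kappa = 0.25
rng = random.Random(1)
ys = [rng.uniform(-1, 1) for _ in range(5000)]   # Reality's moves (simulated)

# Skeptic plays M_j = kappa * K_{j-1}, so K_j = prod_i (1 + kappa * y_i).
capital = [1.0]
for y in ys:
    capital.append(capital[-1] * (1 + kappa * y))

C = max(capital)          # finite-run stand-in for sup_j K_j

# Check the bound  ybar_j <= ln(C) / (kappa * j) + kappa  for every j in the run.
running_sum, bound_holds = 0.0, True
for j, y in enumerate(ys, start=1):
    running_sum += y
    bound = math.log(C) / (kappa * j) + kappa
    bound_holds = bound_holds and (running_sum / j <= bound + 1e-9)
print(bound_holds, C)
```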

**Proposition:** (Shafer & Vovk, Proposition 1.2)  
Skeptic can force $\lim_{j \rightarrow \infty} \bar{y}_j = 0$.

_Proof:_  
Skeptic can weakly force $\lim \sup \le \kappa$ and $\lim \inf \ge -\kappa$ for $\kappa = 2^{-k}$, $k = 1, \ldots$.
Hence Skeptic can weakly force the intersection.
Hence Skeptic can force the intersection.

These ideas can be used to prove important theorems in probability (e.g., the weak law of large numbers)
without mentioning or defining probability.
The basic equivalence is that the occurrence of an event with probability zero corresponds to the Skeptic winning an unbounded amount of money.


### Warranties

A _warranty_ is a claim that a set of hypotheses contains the truth, analogous to a confidence set.
A _$1/\alpha$-warranty_ corresponds to a $1-\alpha$ confidence set: the set of parameters for which the betting score is less than $1/\alpha$.
The Skeptic's bet against any of the hypotheses (i.e., parameters) in the warranty set would not have turned \\$1 into as much as \\$$1/\alpha$, but the Skeptic would have won at least \\$$1/\alpha$ betting against any hypothesis not in the $1/\alpha$-warranty set.
Warranty sets are automatically nested: for any given betting scheme, the $1/\alpha$-warranty set contains the $1/\alpha'$-warranty set if $\alpha \le \alpha'$.
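
As an illustration, warranty sets for the chance of heads $\theta$ of a coin can be computed on a grid. The betting scheme assumed here is one of many possible choices: against each $\theta$, Skeptic bets the likelihood ratio of a uniform mixture over the chance of heads to Bernoulli($\theta$). The data (15 heads in 20 tosses) and the function names are made up.

```python
from math import comb

def e_value(theta, n, k):
    """Betting score against theta: uniform-mixture likelihood over Bernoulli(theta) likelihood."""
    mixture = 1 / ((n + 1) * comb(n, k))     # integral of p^k (1-p)^(n-k) over [0, 1]
    return mixture / (theta**k * (1 - theta)**(n - k))

def warranty_set(alpha, n, k, grid):
    """Grid points theta against which the bet multiplied the stake by less than 1/alpha."""
    return [th for th in grid if e_value(th, n, k) < 1 / alpha]

n, k = 20, 15                                # hypothetical data: 15 heads in 20 tosses
grid = [i / 100 for i in range(1, 100)]
w_5 = warranty_set(0.2, n, k, grid)          # 1/alpha = 5
w_20 = warranty_set(0.05, n, k, grid)        # 1/alpha = 20
print(min(w_5), max(w_5), min(w_20), max(w_20))
```

The set for the larger threshold $1/\alpha = 20$ includes every grid point in the set for $1/\alpha = 5$, which is the nesting property in action.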

## E-values formalized

**Definition.**  
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space,
and let $E$ be a random variable $E: \Omega \rightarrow [0, \infty]$ such that
$\mathbb{P}(E) := \int_{\Omega} E(\omega) d\mathbb{P}(\omega) \le 1$. 
(Note that $E$ may take the value $\infty$, which
corresponds to the strongest possible evidence that the data do not come from $\mathbb{P}$.)
Then **$E$ is an $E$-variable for $\mathbb{P}$.**

Let $\mathcal{P}$ be a collection of probability distributions on the measurable space $(\Omega, \mathcal{F})$,
and let $E$ be a random variable $E: \Omega \rightarrow [0, \infty]$ such that for all $\mathbb{P} \in \mathcal{P}$,
$\mathbb{P}(E) \le 1$.
Then **$E$ is an $E$-variable for $\mathcal{P}$.**

The set of all $E$-variables for a collection $\mathcal{P}$ of probability distributions is $\mathcal{E}(\mathcal{P})$.

The observed value of an $E$-variable is an $E$-value (or $e$-value).

**Definition.**     
Suppose $\Omega$ is a sample space, $\mathcal{F}$ is a sigma-algebra on $\Omega$,
and $\mathbb{F}$ is a filtration on $\mathcal{F}$ indexed by $\mathbb{N}$.
Let $\mathcal{P}_0$ be a collection of distributions such that for each $\mathbb{P} \in \mathcal{P}_0$,
$(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ is a filtered probability space.
The nonnegative random variable $E_j$ is a $\mathcal{F}_{j-1}$-conditional $E$-variable (for the null hypothesis $\mathcal{P}_0$) if it is $\mathcal{F}_j$-measurable and 
\begin{equation}
\forall \mathbb{P} \in \mathcal{P}_0, \;\; \mathbb{P} \left ( \mathbb{E}_\mathbb{P} (E_j | \mathcal{F}_{j-1}) \le 1 \right ) = 1.
\end{equation}
If for every $j \in \mathbb{N}$, $E_j$ is a $\mathcal{F}_{j-1}$-conditional $E$-variable
for $\mathcal{P}_0$,
$\{E_j\}_{j \in \mathbb{N}}$ is a _conditional $E$-variable collection relative to $\mathbb{F} = \{\mathcal{F}_j\}_{j \in \mathbb{N}}$ for $\mathcal{P}_0$_, or an _$E$-process for $\mathcal{P}_0$_.

The running product of an $E$-process for $\mathcal{P}_0$ is a test supermartingale for $\mathcal{P}_0$:
define $E^j := \prod_{i=1}^j E_i$.
Then for every $\mathbb{P} \in \mathcal{P}_0$, $(E^j)_{j \in \mathbb{N}}$ is
a nonnegative supermartingale starting at 1.

Conversely, suppose $(T_i)_{i \in \mathbb{N}}$ is a test supermartingale for every $\mathbb{P} \in \mathcal{P}_0$.
Then $(T_i)_{i \in \mathbb{N}}$ is an $E$-process for $\mathcal{P}_0$. (Ramdas et al., 2021)
Hence, for any stopping time $\tau$, $T_\tau$ is an $E$-value for $\mathcal{P}_0$.
(This allows us to use Ville's inequality to turn $E$-processes into sequentially valid
$P$-values.)


**Definition.**  
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space,
and let $P$ be a random variable $P: \Omega \rightarrow [0, 1]$ such that
$\forall p \in [0, 1]$,
$\mathbb{P}(P \le p) \le p$.
Then **$P$ is a P-variable for $\mathbb{P}$.**

Let $\mathcal{P}$ be a collection of probability distributions on the measurable space $(\Omega, \mathcal{F})$,
and let $P$ be a random variable $P: \Omega \rightarrow [0, 1]$ such that for all $\mathbb{P} \in \mathcal{P}$,
$\forall p \in [0, 1]$,
$\mathbb{P}(P \le p) \le p$.
Then **$P$ is a P-variable for $\mathcal{P}$.**

The set of all $P$-variables for a collection $\mathcal{P}$ of probability distributions is $\mathcal{P}(\mathcal{P})$.


The observed value of a $P$-variable is a $P$-value.

## Picking the bets

The _GRO (growth rate optimal) criterion_ for testing the null $\mathcal{P}_0$
against a simple alternative $\mathbb{Q}$ is
\begin{equation}
 \mathrm{GRO}(\mathbb{Q}) := \sup_{E \in \mathcal{E}(\mathcal{P}_0)} \mathbb{Q} \ln E
\end{equation}
Grünwald et al. (2023) show this is equal to
$D(\mathbb{Q} || \mathbb{P}_0^*)$ for a particular essentially unique
subdistribution $\mathbb{P}_0^*$.
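
In the special case of a simple null $\mathcal{P}_0 = \{\mathbb{P}_0\}$, the GRO $E$-variable is the likelihood ratio of $\mathbb{Q}$ to $\mathbb{P}_0$, and the optimal growth rate is the Kullback-Leibler divergence $D(\mathbb{Q} || \mathbb{P}_0)$. The following sketch checks this numerically for a single Bernoulli observation; the parameter values and function names are assumptions chosen only for illustration.

```python
import math

def likelihood_ratio(x, q, p0):
    """Likelihood ratio of Bernoulli(q) to Bernoulli(p0) at the observation x in {0, 1}."""
    return (q if x == 1 else 1 - q) / (p0 if x == 1 else 1 - p0)

def expected_under(p, g):
    """Expected value of g(X) when X ~ Bernoulli(p)."""
    return p * g(1) + (1 - p) * g(0)

def kl_bernoulli(q, p0):
    """Kullback-Leibler divergence D(Bernoulli(q) || Bernoulli(p0))."""
    return q * math.log(q / p0) + (1 - q) * math.log((1 - q) / (1 - p0))

p0, q = 0.5, 0.7   # illustrative null and alternative

# The likelihood ratio has expected value 1 under the null, so it is an E-variable.
null_mean = expected_under(p0, lambda x: likelihood_ratio(x, q, p0))

# Its expected log under the alternative equals the KL divergence.
growth_lr = expected_under(q, lambda x: math.log(likelihood_ratio(x, q, p0)))

# A competing E-variable (likelihood ratio for the wrong alternative, 0.9) grows more slowly.
growth_other = expected_under(q, lambda x: math.log(likelihood_ratio(x, 0.9, p0)))
```

Here `growth_other` is negative: betting as if the alternative were 0.9 when it is really 0.7 loses money on average, even though that bet is also a valid $E$-variable.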

The _GROW (growth rate optimal in the worst case) criterion_ for testing the composite null 
$\mathcal{P}_0$ against the composite alternative $\mathcal{Q}$ is
\begin{equation}
\mathrm{GROW}(\mathcal{Q}) := \sup_{E \in \mathcal{E}(\mathcal{P}_0)} \inf_{\mathbb{Q} \in \mathcal{Q}}
\mathbb{Q} \ln E.
\end{equation}


The _REGROW (relative growth rate in the worst case) criterion_ requires a bit of math and definitions
to set up, but the idea is first to imagine that, if $\mathcal{Q}$ is true, we
know _which_ $\mathbb{Q} \in \mathcal{Q}$ is true, so we could pick $E \in \mathcal{E}(\mathcal{P}_0)$ to maximize $\mathbb{Q} \ln E$.
Then we consider picking $E$ so that no matter which $\mathbb{Q} \in \mathcal{Q}$ is true, the expected
growth rate would be near that optimum. This is analogous to _regret_ in the context of decision theory.


## Converting between $E$-variables and $P$-variables

Because an $E$-variable is nonnegative and has expected value 1 under the null, the chance that it exceeds $1/\alpha$
is at most $\alpha$ if the null is true.
Thus it's easy to convert an $E$-value into a $P$-value.
But it isn't as easy to convert a $P$-variable into an $E$-variable, because all an $E$-variable has to satisfy is nonnegativity and unit expectation, while a $P$-variable has to have $\mathbb{P} (P \le p) \le p$ for all $p \in [0, 1]$.
There are transformations that force that inequality, but there is no clear choice among them.

### $P$ to $E$ calibration function

A decreasing function $f : [0, 1] \rightarrow [0, \infty]$  is a ($P$-to-$E$) calibrator if, for any probability space 
$(\Omega, \mathcal{F}, \mathbb{P})$ and any $P$-variable $P$ for $\mathbb{P}$,
$f(P)$ is an $E$-variable for $\mathbb{P}$.

A calibrator $f$ *dominates* a calibrator $g$ if $f \ge g$; $f$ *strictly dominates* $g$ if $f \ge g$ and $f \ne g$.
A calibrator is *admissible* if it is not strictly dominated by any other calibrator.

The following proposition (Vovk & Wang, 2021 Proposition 2.2)
says that a calibrator is a nonnegative decreasing
function on $[0, 1]$ whose integral is at most 1.

**Proposition.**  
A decreasing function $f : [0, 1] \rightarrow [0, \infty]$ is a calibrator if
and only if $\int_0^1 fdp \le 1$. 
It is admissible if and only if it is upper semicontinuous,
$f(0) = \infty$, and $\int_0^1 fdp = 1$.

Examples.

\begin{equation}
f^\kappa(p) := \kappa p^{\kappa-1}, \;\; \kappa \in (0, 1).
\end{equation}

\begin{equation}
f(p) := \int_0^1 \kappa p^{\kappa-1} d\kappa = \frac{1 - p + p\ln p}{p (\ln p)^2}.
\end{equation}
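
The two example calibrators can be sketched as follows. The function names and the numerical check are illustrative assumptions; the check compares the closed form of the averaged calibrator with a midpoint-rule approximation of the integral over $\kappa$.

```python
import math

def power_calibrator(p, kappa):
    """The calibrator f^kappa(p) = kappa * p**(kappa - 1), for kappa in (0, 1)."""
    if not 0 < kappa < 1:
        raise ValueError("kappa must be in (0, 1)")
    if p == 0:
        return math.inf
    return kappa * p ** (kappa - 1)

def averaged_calibrator(p):
    """Closed form of the integral of kappa * p**(kappa - 1) over kappa in (0, 1)."""
    if p == 0:
        return math.inf
    if p == 1:
        return 0.5          # limit of the closed form as p -> 1
    log_p = math.log(p)
    return (1 - p + p * log_p) / (p * log_p ** 2)

def averaged_calibrator_numeric(p, n=20000):
    """Midpoint-rule approximation of the same integral, used only as a check."""
    h = 1.0 / n
    return h * sum(power_calibrator(p, (i + 0.5) * h) for i in range(n))

e_from_power = power_calibrator(0.01, 0.5)      # E-value from P = 0.01 with kappa = 1/2
e_from_average = averaged_calibrator(0.01)      # E-value from P = 0.01 without choosing kappa
```

The averaged calibrator avoids having to pick $\kappa$ in advance, at the cost of a somewhat smaller $E$-value than the best single $\kappa$ would give for this particular $P$-value.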



### $E$ to $P$ calibration function

An $E$-to-$P$ calibrator transforms $E$-variables to $P$-variables.
A decreasing function $f : [0, \infty] \rightarrow [0, 1]$ is an $E$-to-$P$ calibrator if, 
for any probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and any $E$-variable $E$ for $\mathbb{P}$,
$f(E)$ is a $P$-variable for $\mathbb{P}$. 
There is a clear "best" $E$-to-$P$ calibrator: $\min(1, 1/x)$.
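
A minimal sketch of this calibrator, with the conventions $1/0 = \infty$ and $1/\infty = 0$ made explicit. The function names are illustrative assumptions. The last lines show that converting a $P$-value to an $E$-value and back does not recover the original $P$-value: each conversion gives up some evidence.

```python
import math

def e_to_p(e):
    """The E-to-P calibrator min(1, 1/e), with 1/0 read as infinity and 1/infinity as 0."""
    if e < 0:
        raise ValueError("E-values are nonnegative")
    if e == 0:
        return 1.0
    if math.isinf(e):
        return 0.0
    return min(1.0, 1.0 / e)

def power_calibrator(p, kappa=0.5):
    """A P-to-E calibrator, kappa * p**(kappa - 1), used here only for the round trip."""
    return math.inf if p == 0 else kappa * p ** (kappa - 1)

p = 0.01
round_trip = e_to_p(power_calibrator(p))   # larger than p
```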


**Proposition:** Vovk & Wang (2021)  
The function $f : [0, \infty] \rightarrow [0, 1]$, $f(x) := \min(1, 1/x)$
is an $E$-to-$P$ calibrator.
It dominates every other $E$-to-$P$ calibrator, so it is the only admissible $E$-to-$P$ calibrator.

_Proof:_   
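For any nonnegative random variable $E$ with expected value at most $1$ and any $x \in (0, 1)$, 
\begin{eqnarray}
 \mathbb{P}\{f(E) \le x \} &=& \mathbb{P}\{E \ge 1/x \} \\
 &\le & \frac{\mathbb{P}E}{1/x} \mbox{ (by Markov's inequality)}\\
 &\le & x \mbox{ (since } \mathbb{P}E \le 1 \mbox{ )}.
\end{eqnarray}
The cases $x = 0$ and $x = 1$ are immediate: $\mathbb{P}\{E = \infty\} = 0$ because $\mathbb{P}E \le 1$, and every probability is at most 1.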
To show $\min(1, 1/x)$ is admissible, let $f$ be any other $E$-to-$P$-calibrator, and suppose
there is some $x \in [0, \infty]$ for which $f(x) < \min(1, 1/x)$.

+ Suppose $f(x) < \min(1, 1/x) = 1/x$ for some particular $x > 1$. 
Define an $E$-variable $E$ that has two possible values, 0 and $x$, with
$\mathbb{P} \{E = 0 \} = 1-1/x$ and $\mathbb{P} \{E = x \} = 1/x$.
Then 
\begin{equation}
 \mathbb{P} \{f(E) \le f(x)\} \ge \mathbb{P} \{E = x\} = 1/x > f(x),
\end{equation}
so $f(E)$ is not a $P$-variable. 

+ Suppose $f(x) < \min(1, 1/x) = 1$ for some particular $x \in [0, 1]$.
Define an $E$-variable $E$ with $\mathbb{P}\{ E = x \} = 1$. 
Then $\mathbb{P} \{f(E) \le f(x) \} = 1$, but $f(x) < 1$, so $f(E)$ is not a $P$-variable.


## Combining $E$-values

Combining $P$-variables is difficult, because the combination needs to have a distribution that
satisfies $\mathbb{P}_0\{P_{\mathrm{combined}} \le x\} \le x$, $\forall x \in [0, 1]$: in general, linear combinations and products of $P$-values do not satisfy that requirement.
(See [combining tests](./tests-combo.ipynb).)
Fisher's combining function is an example that works for independent $P$-values.
Sequentially valid $P$-values are also in general difficult to construct; the notable exception is
$P$-values based on nonnegative supermartingales--which are examples of $E$-values.

In contrast, every convex linear combination of nonnegative random variables that each have expected value 1 is a nonnegative random variable with expected value 1, and a product of independent nonnegative random variables with expected value 1 is a nonnegative random variable with expected value 1.
The relative simplicity of combining $E$-variables may be a good reason to use them in applications that involve testing
multiple hypotheses, intersection hypotheses, etc.

**Definition.**  
An $E$-merging function of $K$ $E$-values is an increasing Borel function 
$F : [0, \infty]^K \rightarrow [0, \infty]$
such that, for any probability space $(\Omega, \mathcal{F}, \mathbb{P})$,
if $\{E_j\}_{j=1}^K$ are $E$-variables on that space, so is $F(E_1, \ldots, E_K)$.

An $E$-merging function $F$ is _symmetric_ if it is invariant under permutations of its arguments,
i.e., if $F(E_1, \ldots, E_K) = F(E_{\pi_1}, \ldots, E_{\pi_K})$ for every permutation $\pi$ of $\{1, \ldots, K\}$.

**Definition.**  
An $E$-merging function $F$ _essentially dominates_ an $E$-merging function $G$ if, for all $e \in [0, \infty)^K$,
\begin{equation}
G(e) > 1 \;\; \Rightarrow \;\; F(e) \ge G(e).
\end{equation}
That is, whenever the $E$-values merged by $G$ give evidence against the null, the same $E$-values
merged by $F$ give evidence against the null that is at least as strong.

**Proposition.**  (Vovk & Wang)  
The arithmetic mean $M_K(e_1, \ldots, e_K) := \frac{1}{K}\sum_{i=1}^K e_i$ essentially dominates
all other symmetric $E$-merging functions.

**Definition.**  
An _independent $E$-merging function ($iE$-merging function)_ of $K$ 
independent $E$-values is an increasing Borel function 
$F : [0, \infty]^K \rightarrow [0, \infty]$
such that, for any probability space $(\Omega, \mathcal{F}, \mathbb{P})$,
if $\{E_j\}_{j=1}^K$ are independent $E$-variables on that space, then $F(E_1, \ldots, E_K)$
is an $E$-variable on that space.

Examples of $iE$-merging functions include multiplication and averaging. In particular, the $U$-statistics
are valid $iE$-merging functions:
\begin{equation}
   U_n(e_1, \ldots, e_K) := \frac{1}{{K \choose n}} \sum_{\{k_1, \ldots, k_n \} \subset \{1, \ldots, K\}} 
   e_{k_1}e_{k_2} \cdots e_{k_n}, \;\;\; n \in \{0, 1, \ldots, K\}.
\end{equation}
For $n=0$, $U_n = 1$. For $n=1$, $U_n$ is the mean of all $K$ $E$-values. For $n=K$, $U_n$ is the product of all $K$
$E$-values.
Convex combinations of $U$-statistics of $E$-values are also $iE$-merging functions.
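
A direct, brute-force sketch of the $U$-statistic merging functions follows. The function name is an illustrative assumption, and enumerating all subsets is only practical for small $K$.

```python
from itertools import combinations
from math import comb, prod

def u_statistic(e_values, n):
    """Average, over all size-n subsets of the E-values, of the product of the subset."""
    K = len(e_values)
    if not 0 <= n <= K:
        raise ValueError("n must be between 0 and the number of E-values")
    # For n = 0 there is one (empty) subset, whose product is 1.
    return sum(prod(subset) for subset in combinations(e_values, n)) / comb(K, n)

e_values = [2.0, 0.5, 4.0]
merged = [u_statistic(e_values, n) for n in range(len(e_values) + 1)]
# merged[0] is 1, merged[1] is the arithmetic mean, merged[-1] is the product
```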

**Definition.**
An $iE$-merging function $F$ _weakly dominates_ an $iE$-merging function $G$ if, for all 
$(e_1, \ldots , e_K) \in [1, \infty)^K$  (_not necessarily for all of_ $[0, \infty)^K$), 
\begin{equation}
F(e_1, \ldots ,e_K) \ge G(e_1, \ldots , e_K).
\end{equation}

**Proposition.** (Vovk & Wang)  
The product $(e_1, \ldots, e_K) \mapsto \prod_{i=1}^K e_i$
weakly dominates every $iE$-merging function.


Multiplication can also be used to combine sequential $E$-values: if the expected value of the next bet given the outcome of all previous bets is always 1, then the betting scores can be multiplied.

It isn't always desirable to use symmetric $E$-merging functions.

Suppose $(E_{jt})_{t \in \mathbb{N}}$ is an $E$-process for $\mathcal{P}_0$ for the filtration $(\mathcal{F}_{jt})_{t \in \mathbb{N}}$, $j = 1, \ldots, K$, 
and let $(\gamma_{jt})_{t \in \mathbb{N}}$ be predictable with respect to $(\mathcal{F}_{jt})$
and satisfy $\gamma_{jt} \ge 0$ and $\sum_{j=1}^K \gamma_{jt} \le 1$.

Define $\gamma_t \cdot E_t := \sum_{j=1}^K  \gamma_{jt} E_{jt}$.
Then $(\gamma_t \cdot E_t)_{t \in \mathbb{N}}$ is an $E$-process for $\mathcal{P}_0$.


The weights $\gamma_{jt}$ can be chosen adaptively, for example to bet more on the $E$-processes that have grown large so far.
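
One possible adaptive weighting scheme is sketched below. It is an assumption for illustration, not a method from the references: it supposes all $K$ processes are adapted to a common filtration, and at time $t$ it weights process $j$ in proportion to its running product through time $t-1$. Those weights depend only on the past, are nonnegative, and sum to 1.

```python
def adaptive_mixture(e_processes):
    """Combine K sequences of conditional E-values (equal length T) into one sequence.

    The weight on process j at time t is proportional to the running product of
    process j through time t-1, so the weights are predictable.
    """
    K, T = len(e_processes), len(e_processes[0])
    wealth = [1.0] * K                      # running product of each process so far
    combined = []
    for t in range(T):
        total = sum(wealth)
        # Fall back to equal weights if every process has already hit zero.
        weights = [w / total for w in wealth] if total > 0 else [1.0 / K] * K
        combined.append(sum(g * e[t] for g, e in zip(weights, e_processes)))
        wealth = [w * e[t] for w, e in zip(wealth, e_processes)]
    return combined

e_processes = [[2.0, 0.5, 3.0], [0.5, 1.0, 0.5]]
combined = adaptive_mixture(e_processes)
```

With this particular choice of weights, the running product of the combined sequence equals the arithmetic mean of the $K$ individual running products, so the scheme amounts to splitting the initial stake equally among the processes and letting each share ride.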

## Testing multiple hypotheses using $E$-values

See:
+ Marcus, R., E. Peritz, and K.R. Gabriel, 1976. On Closed Testing Procedures with Special Reference to Ordered Analysis of Variance, _Biometrika, 63_, 655-660, https://doi.org/10.2307/2335748

+ pp. 1743ff of Vovk & Wang (2021). 

We have a set of $K$ composite null hypotheses, $\{\mathcal{P}_k \}_{k=1}^K$.
We want to test all $K$ hypotheses in such a way that the chance of erroneously rejecting one or more true nulls is 
at most $\alpha$; that is,
we want the _familywise error rate_ to be at most $\alpha$.

Let $\mathcal{P}$ denote the set of all intersections of subsets of $\{\mathcal{P}_k\}$.
The _closure principle_ says the following.
Suppose we have a level $\alpha$ test of every (composite) null in $\mathcal{P}$.
If we reject a null $\mathcal{P}_0 \in \mathcal{P}$ only if its test and the 
test of every hypothesis in $\mathcal{P}$ that is a subset of $\mathcal{P}_0$ all reject, then
the familywise error rate is at most $\alpha$.

The proof is simple. Let $A$ be the event that any _true_ $\mathcal{P}_k$ is rejected.
Let $B$ be the event that the intersection of all the true nulls is rejected;
because $\mathcal{P}$ is closed under intersections, that is one of the hypotheses in $\mathcal{P}$.
Then
\begin{equation}
\mathbb{P} \{A \cap B \} = \mathbb{P}(A | B) \mathbb{P}(B) \le \alpha
\end{equation}
since the intersection is tested at level $\alpha$.
But since we only reject a true null if we also reject the intersection of all
true nulls, $\{A \cap B \} = A$, so $\mathbb{P}(A) \le \alpha$.

Let's translate this into tests using $E$-values.
Suppose that for each hypothesis $k$, we have an $E$-variable $E_k$, i.e., a nonnegative extended-real-valued random variable such that
\begin{equation}
   \mathbb{P}E_k \le 1 \;\;\; \forall \mathbb{P} \in \mathcal{P}_k.
\end{equation}
Vovk & Wang give two algorithms, one for any $E$-values and one for sequential $E$-values.


### Algorithm 1: adjusting $E$-values for multiplicity

+ Input: a multiset of $E$-values $e_1, \ldots, e_K$.
+ Order them so that $e_{(1)} \le e_{(2)} \le \cdots \le e_{(K)}$.
+ Set $S_0 := 0$.
+ for $i = 1, \ldots, K$:
    - $S_i := S_{i-1} + e_{(i)}$
+ for $k=1, \ldots, K$:
    - $e_{(k)}^* := e_{(k)}$
    - for $i = 1, \ldots, k-1$:
        + $e := \frac{e_{(k)} + S_i}{i+1}$
        + $e_{(k)}^* := \min(e_{(k)}^*, e)$
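
The steps above can be sketched in Python as follows. The function name is an illustrative assumption; the adjusted values are returned in the original input order rather than in sorted order.

```python
def adjust_e_values(e_values):
    """Adjust E-values for multiplicity, following the steps of Algorithm 1."""
    K = len(e_values)
    order = sorted(range(K), key=lambda k: e_values[k])   # indices in ascending order of E-value
    sorted_e = [e_values[k] for k in order]

    # partial[i] is S_i, the sum of the i smallest E-values (S_0 = 0).
    partial = [0.0]
    for e in sorted_e:
        partial.append(partial[-1] + e)

    adjusted = [0.0] * K
    for k in range(1, K + 1):              # k is the 1-based rank
        e_k = sorted_e[k - 1]
        e_star = e_k
        for i in range(1, k):
            # Mean of e_(k) together with the i smallest E-values.
            e_star = min(e_star, (e_k + partial[i]) / (i + 1))
        adjusted[order[k - 1]] = e_star
    return adjusted

adjusted = adjust_e_values([4.0, 1.0, 10.0])
```

For this input the adjusted value for each hypothesis coincides with the smallest arithmetic mean over all subsets of the $E$-values that contain it, which is what closed testing with the arithmetic-mean merging function would give.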
    
### Algorithm 2: adjusting sequential $E$-values for multiplicity

+ Input: a sequence of $E$-values $e_1, \ldots, e_K$.
+ let $a$ be the product of all the $E$-values that are less than $1$, or $1$ if there are none
+ for $k=1, \ldots, K$:
    - $e_k^* := ae_k$
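
A minimal sketch of these steps (the function name is an illustrative assumption):

```python
from math import prod

def adjust_sequential_e_values(e_values):
    """Multiply every E-value by the product of all E-values that are below 1."""
    a = prod(e for e in e_values if e < 1)   # the empty product is 1
    return [a * e for e in e_values]

adjusted_seq = adjust_sequential_e_values([2.0, 0.5, 4.0, 0.25])
unchanged = adjust_sequential_e_values([2.0, 3.0])   # no E-value below 1, so a = 1
```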
    
Both of these ways of adjusting the $E$-values give familywise valid $E$-values, in the sense that the expected
value of the maximum adjusted $E$-value across the true composite nulls is not greater than $1$.

## Controlling the false discovery rate (FDR) with $P$-values and $E$-values

See:

+ Benjamini, Y. and Y. Hochberg, 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing, _JRSSB, 57_, 289-300. https://www.jstor.org/stable/2346101

+ Benjamini, Y. and D. Yekutieli, 2001. The control of the false discovery rate in multiple testing under dependency, _Ann. Statist., 29_, (4) 1165-1188. https://doi.org/10.1214/aos/1013699998

+ Wang and Ramdas, 2022. False Discovery Rate Control with E-values, _Journal of the Royal Statistical Society Series B, 84_, 822–852. https://doi.org/10.1111/rssb.12489 

The _False Discovery Rate_ (FDR) of a testing procedure is the expected fraction of rejected null hypotheses that are in fact true, i.e., that were rejected erroneously. That is, let $V$ denote the number of incorrectly rejected null hypotheses and let $R$ denote the total number of rejected null hypotheses, and define
\begin{equation}
  \mbox{FDR} := \mathbb{E}(V/R | R>0) \mathbb{P}(R>0).
\end{equation}
(The conditioning is to prevent division by zero; it amounts to defining $V/R$ to be zero when $R$ (and therefore also $V$)
is 0.
This makes sense because if no hypothesis was rejected, no hypothesis was erroneously rejected.)

The Benjamini-Hochberg procedure for testing $K$ hypotheses at FDR level $\alpha$ is as follows:
Let $\{P_1, \ldots, P_K\}$ be the $P$-values of the $K$ hypotheses, 
and let $\{P_{(1)}, \ldots, P_{(K)}\}$
be the $P$-values ordered so that $P_{(1)} \le \cdots \le P_{(K)}$.

+ Find the largest $k$ such that $P_{(k)} \le \frac{k}{K} \alpha$. If there is no such $k$, reject nothing.
+ Otherwise, reject the null hypotheses corresponding to $P_{(1)}, \ldots, P_{(k)}$.

This procedure works for independent $P$-values and $P$-values that have _positive regression dependence_.
It guarantees that $\mbox{FDR} \le \frac{m_0}{K}\alpha \le \alpha$, where $m_0$ is the number of true null
hypotheses.
The Benjamini-Yekutieli procedure works for arbitrary dependence. It involves an additional
function $c(K):= \sum_{i=1}^K 1/i$:

+ Find the largest $k$ such that $P_{(k)} \le \frac{k}{Kc(K)} \alpha$. If there is no such $k$, reject nothing.
+ Otherwise, reject the null hypotheses corresponding to $P_{(1)}, \ldots, P_{(k)}$.
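
Both step-up procedures can be sketched with one function. The function name, the `arbitrary_dependence` flag, and the example $P$-values are illustrative assumptions. Note that a hypothesis can be rejected even if its own $P$-value exceeds the threshold at its rank, as long as a larger ordered $P$-value meets its threshold.

```python
def fdr_rejections(p_values, alpha, arbitrary_dependence=False):
    """Return the (sorted) indices of the hypotheses rejected at FDR level alpha.

    With arbitrary_dependence=False this is the Benjamini-Hochberg procedure;
    with arbitrary_dependence=True the thresholds are divided by c(K) = sum_{i<=K} 1/i,
    which gives the Benjamini-Yekutieli procedure.
    """
    K = len(p_values)
    c = sum(1.0 / i for i in range(1, K + 1)) if arbitrary_dependence else 1.0
    order = sorted(range(K), key=lambda j: p_values[j])   # indices by ascending P-value
    k_max = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / (K * c):
            k_max = rank                                  # keep the largest rank that passes
    return sorted(order[:k_max])

p_values = [0.001, 0.039, 0.021, 0.6, 0.025]
bh = fdr_rejections(p_values, alpha=0.05)
by = fdr_rejections(p_values, alpha=0.05, arbitrary_dependence=True)
```

In this example the second-smallest $P$-value, 0.021, is above its own Benjamini-Hochberg threshold of 0.02, but it is still rejected because the third- and fourth-smallest $P$-values meet their thresholds. The Benjamini-Yekutieli version is more conservative and rejects only the hypothesis with the smallest $P$-value.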


