`freqseq` stands for "Frequentist Sequential" and is an implementation of one-sided frequentist sequential A/B testing in `python`. Credit for the initial derivation goes to Evan Miller. I've simply gone over his derivation in more granularity, added calculations dealing with treatment assignment bias, and written a `python` implementation.

- `notes.md` - a derivation of the testing approach
- `app.py` - a `streamlit` app which generates test statistics for various input conditions
- `freqseq` - a `python` package containing an implementation of the testing strategy

Suppose we are running a random experiment to determine the efficacy of some intervention with a binary response (the response taking the value of
What we want here is a sequential test, a test that allows for early stopping if results look promising without incurring a heightened false positive rate.
The basic strategy is to construct a random walk that depends on conversions in a treatment group and a control group. We can then define a region such that the random walk escapes that region at fixed rates under both the hypothesis of equal conversion rates in the treatment and control groups and the hypothesis of unequal rates in each group.
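
To make the random walk concrete, here is a minimal sketch of one way to build such a walk from simulated conversions. It is not the `freqseq` package's code: the function name `simulate_walk`, the 50/50 assignment, and the rule "+1 on a treatment conversion, -1 on a control conversion" are all assumptions made for illustration, and the biased-assignment case is ignored.

```python
import random


def simulate_walk(p_control, p_treatment, n_visitors, seed=0):
    """Return the walk position after each conversion (hypothetical helper).

    Assumptions: each visitor is assigned to treatment or control with
    probability 1/2, and the walk moves +1 on a treatment conversion and
    -1 on a control conversion. Visitors who do not convert leave the
    walk unchanged.
    """
    rng = random.Random(seed)
    position = 0
    path = []
    for _ in range(n_visitors):
        in_treatment = rng.random() < 0.5
        rate = p_treatment if in_treatment else p_control
        if rng.random() < rate:  # this visitor converted
            position += 1 if in_treatment else -1
            path.append(position)
    return path


# Under equal conversion rates the walk has no drift; with a better
# treatment it tends to drift upward.
null_path = simulate_walk(0.10, 0.10, n_visitors=2000)
alt_path = simulate_walk(0.10, 0.15, n_visitors=2000)
```

With equal rates, each conversion is equally likely to come from either group, so the steps are symmetric. With unequal rates the steps are biased, which is what lets the walk escape a suitably chosen region more often.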
The test is defined by three parameters:
- $N$ - the maximum amount of time to run the test (vertical barrier)
- $d$ - the horizontal barrier - if the walk escapes $d$, we reject the null hypothesis.
- $\sigma$ - a variance conversion factor to transform the data into a random walk when treatment assignment probability is biased.

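
The roles of $N$ and $d$ can be sketched as a simple stopping rule. This is an illustrative sketch, not the package's API: `run_test` is a hypothetical name, "time" is taken to mean the number of walk steps, and assignment is assumed unbiased so that $\sigma$ is not needed.

```python
def run_test(steps, N, d):
    """Apply the two barriers to a sequence of +1/-1 walk steps.

    Hypothetical helper. Assumes unbiased treatment assignment, so no
    sigma rescaling is applied, and measures time in walk steps.
    Returns a (decision, steps_used) tuple.
    """
    position = 0
    n = 0
    for n, step in enumerate(steps, start=1):
        position += step
        if position >= d:  # horizontal barrier: reject the null
            return "reject", n
        if n >= N:  # vertical barrier: stop without rejecting
            return "fail_to_reject", n
    # The data ran out before either barrier was reached.
    return "undecided", n


print(run_test([1, 1, -1, 1, 1], N=10, d=3))  # ('reject', 5)
print(run_test([1, -1] * 5, N=10, d=3))       # ('fail_to_reject', 10)
```

Checking the horizontal barrier after every step is what allows early stopping, while the vertical barrier caps how long the test can run.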
Here is an example of what the region could look like:
See here for more information on test construction.
Modeling and code development was done in `conda` with `python 3.9.12`. Requirements are stored in `requirements.txt`.
To run the app, activate your `conda` environment, then run `streamlit run app.py`.
The app should look something like this: