In [1]:
import numpy as np

from flagger import BayesFlaggerBeta
from flagger_tools import get_sampler_model
from tmvbeta import TMVBeta



#### This notebook explains the basic steps of setting up and using the Bayes flagger with beta marginals.

First, we create a sampler to generate data. This is not necessary in practice, of course.

In [2]:
# We use the parameters of the first example in the paper
params = np.load("Data/parameters.npz")
a = params['a']
b = params['b']
cov = params['cov']

In [3]:
# The cheating rate is 20%
r = 0.2
P0 = TMVBeta(a[0], b[0], cov[0])
P1 = TMVBeta(a[1], b[1], cov[1])
sampler = get_sampler_model(r, P0, P1)
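To make the role of the sampler concrete, here is a minimal sketch of a two-group mixture sampler. It is not the implementation behind `get_sampler_model`: the names `make_mixture_sampler` and `independent_betas` are hypothetical, the beta parameters are made up, and the features are drawn from independent beta marginals, whereas `TMVBeta` also takes a covariance argument.

```python
import random


def make_mixture_sampler(r, draw_reference, draw_critical, seed=None):
    """Return a sampler for a two-group mixture with critical-group rate r."""
    rng = random.Random(seed)

    def sampler(N):
        c, X = [], []
        for _ in range(N):
            # Each test taker is in the critical group with probability r.
            label = 1 if rng.random() < r else 0
            draw = draw_critical if label == 1 else draw_reference
            c.append(label)
            X.append(draw(rng))
        return c, X

    return sampler


def independent_betas(a, b):
    """Hypothetical stand-in for P0/P1: independent beta marginals, no correlation."""
    return lambda rng: [rng.betavariate(ai, bi) for ai, bi in zip(a, b)]


M = 10  # size of the feature vector
toy_sampler = make_mixture_sampler(
    r=0.2,
    draw_reference=independent_betas([2.0] * M, [5.0] * M),  # made-up parameters
    draw_critical=independent_betas([5.0] * M, [2.0] * M),   # made-up parameters
    seed=0,
)
c_toy, X_toy = toy_sampler(N=100)
```

The sketch returns the labels first and the features second, matching the order in which the notebook unpacks the sampler output.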

Next, we initialize the flagger. It takes two parameters: the number of test takers that should get flagged per test administration, `K`, and the size of the feature vector, `M`.

In [4]:
bayes_flagger = BayesFlaggerBeta(K=20, M=10)

Now, let's generate some data. Here, we draw 100 samples. The sampler returns the true labels, `c`, and the feature vectors, `X`. That is, the $n$th test taker belongs to the critical group if `c[n] = 1` and to the reference group if `c[n] = 0`. In practice, `c` would be unknown at this point.

In [5]:
c, X = sampler(N=100)

We can now flag a first batch of test takers. Note that in this example the first batch will always be flagged randomly, since the flagger is initialized with uninformative priors. However, the steps shown here are the same for all subsequent administrations.

First, the flagger needs to be given the feature vectors of the current administration:

In [6]:
bayes_flagger.observe(X)

Next, we calculate the posterior probabilities of test takers belonging to the critical group. Note that we pass a parameter `update_model=False`, meaning we do not want to update the underlying model yet, only get the critical group probabilities.

In [7]:
bayes_flagger.update_posterior(update_model=False)
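As an illustration of what a critical-group probability is, the sketch below applies Bayes' rule for the simplified case where the rate `r` and the beta parameters of both groups are known and the features are independent. This is an assumption made for the example: the flagger itself starts from uninformative priors and learns its model, and its internals are not shown here. The function names are hypothetical.

```python
import math


def beta_logpdf(x, a, b):
    """Log density of a Beta(a, b) distribution at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def critical_posterior(x, r, a0, b0, a1, b1):
    """P(c = 1 | x) by Bayes' rule, assuming independent beta marginals."""
    log_p0 = math.log(1.0 - r) + sum(
        beta_logpdf(xi, a, b) for xi, a, b in zip(x, a0, b0)
    )
    log_p1 = math.log(r) + sum(
        beta_logpdf(xi, a, b) for xi, a, b in zip(x, a1, b1)
    )
    # Normalize in log space to avoid underflow for long feature vectors.
    top = max(log_p0, log_p1)
    w0, w1 = math.exp(log_p0 - top), math.exp(log_p1 - top)
    return w1 / (w0 + w1)
```

If both groups have the same feature distribution, the features carry no information and the function returns the prior rate `r`.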

Based on the calculated probabilities we can flag `K` test takers for review.

In [8]:
bayes_flagger.flag()
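One simple way to turn probabilities into flags is to pick the `K` test takers with the highest critical-group probability. The sketch below shows that rule only; whether `flag()` uses exactly this selection is an assumption, and `flag_top_k` is a hypothetical name.

```python
def flag_top_k(probabilities, K):
    """Return the indices of the K largest probabilities, highest first."""
    order = sorted(
        range(len(probabilities)),
        key=lambda n: probabilities[n],
        reverse=True,
    )
    return order[:K]


# Test takers 3 and 1 have the highest probabilities in this toy example.
flagged = flag_top_k([0.10, 0.85, 0.40, 0.95, 0.05], K=2)
```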

The selected test takers are then reviewed. Note that here we pass the true labels `c` that were provided by our sampler. In practice, `c` would be given by the outcomes of the reviews.

In [9]:
bayes_flagger.review(c)

Now that we have access to `K` true labels, we can redo the posterior update with more information. This time we also want to learn from the reviews and update the model, so we pass `update_model=True`.

In [10]:
bayes_flagger.update_posterior(update_model=True)

This closes the loop. The model is now updated and the flagger is ready for the next administration.
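Since the same five calls are repeated for every administration, they can be bundled into a small helper. The sketch below uses only the method calls shown in this notebook; the helper name `run_administration` is hypothetical, and nothing is assumed about the return values of the flagger methods.

```python
def run_administration(flagger, X, review_labels):
    """Run one flag-and-review cycle using the calls shown in this notebook."""
    flagger.observe(X)                            # register the feature vectors
    flagger.update_posterior(update_model=False)  # probabilities only, no learning
    flagger.flag()                                # select K test takers for review
    flagger.review(review_labels)                 # record the review outcomes
    flagger.update_posterior(update_model=True)   # learn from the reviewed labels
```

With the objects defined above, a simulation over several administrations would draw a new batch with `c, X = sampler(N=100)` and call `run_administration(bayes_flagger, X, c)` once per batch.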