Galaxy Zoo Accuracy

What's the maximum accuracy that an automatic GZ classifier could have?

The perfect classifier would always give the same answer as the recorded votes of the volunteers. So the maximum accuracy is limited by the noise in those votes. 

Let's define truth as the vote fraction that an **infinite population** of volunteers would produce. What we actually measure is the vote fraction that a **finite sample** of volunteers produces.
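To make the gap between the two vote fractions concrete, here is a minimal simulation sketch. It assumes each vote is an independent yes/no draw with probability equal to the true vote fraction; the value `true_mu = 0.45`, the 40 votes per galaxy, and the function name `measured_vote_fraction` are illustrative choices, not taken from Galaxy Zoo data.

```python
import random

def measured_vote_fraction(mu, n_votes, rng):
    """Simulate n_votes independent yes/no votes with true yes-probability mu."""
    yes_votes = sum(rng.random() < mu for _ in range(n_votes))
    return yes_votes / n_votes

rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_mu = 0.45          # hypothetical 'true' vote fraction (true label: negative)
fractions = [measured_vote_fraction(true_mu, 40, rng) for _ in range(1000)]

# Share of simulated samples whose measured label (> 0.5) contradicts the true label
flipped = sum(f > 0.5 for f in fractions) / len(fractions)
print(f"labelled positive despite mu={true_mu}: {flipped:.3f}")
```

Even though the true vote fraction is below 0.5 in every simulated case, a noticeable share of finite samples lands above 0.5. That share is the label noise the rest of this section tries to quantify.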

In the binary classification context, there are two ways our final label (based on the measured vote fraction $\hat{\mu}$) could differ from the 'true' label (based on the 'true' vote fraction $\mu$):

$$ p(\textrm{wrong label}) = p( \hat{\mu} > 0.5 \cap \mu < 0.5) + p( \hat{\mu} < 0.5 \cap \mu > 0.5) $$

For simplicity of notation, let's look only at the case of a false positive for now:

$$ p(\textrm{false positive}) = p( \hat{\mu} > 0.5 \cap \mu < 0.5) $$

$$= p( \hat{\mu} > 0.5 \mid \mu < 0.5) * p( \mu < 0.5) $$

$\hat{\mu} = X / N $ for $X$ yes votes (successes) of $N$ votes 

$$= p( \frac{X}{N} > 0.5 \mid \mu < 0.5) * p( \mu < 0.5) $$

$$= \int_{0}^{0.5} p( \frac{X}{N} > 0.5 \mid \mu) * p(\mu) d\mu $$



For the left factor in the integrand, if we assume that the votes are independent, so that $X$ follows a binomial distribution, then

$$= \int_{0}^{0.5} \textrm{Bin}(X > \frac{N}{2} \mid N, p=\mu) * p(\mu) d\mu $$
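The binomial factor inside the integral can be evaluated exactly for any single $\mu$. The sketch below does so with only the standard library (`math.comb`) rather than scipy; the function name `binomial_tail` and the example values of $\mu$ are illustrative assumptions.

```python
from math import comb

def binomial_tail(n_votes, mu):
    """P(X > n_votes / 2) for X ~ Binomial(n_votes, mu)."""
    first_winning_count = n_votes // 2 + 1  # smallest X with X / n_votes > 0.5
    return sum(
        comb(n_votes, x) * mu**x * (1 - mu) ** (n_votes - x)
        for x in range(first_winning_count, n_votes + 1)
    )

# How likely is a measured 'yes' majority when the true fraction is below 0.5?
for mu in (0.1, 0.3, 0.45, 0.49):
    print(f"mu={mu:.2f}  P(X > N/2) = {binomial_tail(40, mu):.4f}")
```

The probability grows as $\mu$ approaches 0.5 from below, so galaxies with a true vote fraction near the threshold contribute most of the false positives.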


This could perhaps be solved analytically or with Mathematica. For now, let's approximate the true $\mu$ as also being discrete, just like $\hat{\mu}$.

TODO sum equation
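One plausible form of the discretised expression, assuming evenly spaced grid points $\mu_i$ on $[0, 1]$ that each carry probability mass $p(\mu_i)$ (the grid itself is an assumption, not something fixed by the derivation above):

$$ p(\textrm{false positive}) \approx \sum_{\mu_i < 0.5} \textrm{Bin}(X > \frac{N}{2} \mid N, p=\mu_i) * p(\mu_i) $$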

In [1]:
import scipy


In [None]:
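# https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binom.html
from scipy.stats import binom  # scipy's binomial distribution is `binom`

N = 40  # number of votes per galaxy

def p_bin_under_half(mu):
    # P(X > N/2) for X ~ Bin(N, mu): the chance that the measured vote
    # fraction exceeds 0.5. Intended for a true mu under 0.5.
    # sf is the survival function, 1 - cdf, with better numerical precision.
    return binom.sf(N // 2, N, mu)

def p_mu_uniform(mu):
    # Uniform prior: every grid point carries the same probability mass
    return 1 / N  # treat as discrete on a grid of size N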
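The two functions above could be combined into the discretised false-positive probability. The sketch below is self-contained and uses `math.comb` instead of scipy so that it runs without extra packages. The midpoint grid $(i + 0.5)/N$ and the name `p_label_positive` are assumptions made for illustration.

```python
from math import comb

N = 40  # votes per galaxy, matching the cell above

def p_label_positive(mu, n_votes=N):
    """P(X > n_votes / 2) for X ~ Binomial(n_votes, mu), via the exact pmf."""
    return sum(
        comb(n_votes, x) * mu**x * (1 - mu) ** (n_votes - x)
        for x in range(n_votes // 2 + 1, n_votes + 1)
    )

# Assumed grid: N midpoints (i + 0.5) / N, each with uniform weight 1 / N
grid = [(i + 0.5) / N for i in range(N)]

# Sum the binomial tail over grid points whose true label is negative (mu < 0.5)
p_false_positive = sum(p_label_positive(mu) / N for mu in grid if mu < 0.5)
print(f"p(false positive) ~ {p_false_positive:.4f}")
```

Using midpoints keeps every grid point strictly on one side of 0.5, which avoids having to decide how to label $\mu = 0.5$ exactly.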