# Plausibility of Lottery Luck

#### [Dylan D. Daniels](http://statistics.berkeley.edu/people/dylan-david-daniels) and [Philip B. Stark](http://www.stat.berkeley.edu/~stark), Department of Statistics, University of California, Berkeley
#### Based on MATLAB code by [Skip Garibaldi](http://www.garibaldibros.com/)

This tool appraises whether it is plausible that a given individual won a set of lottery prizes honestly. 

The code reads a comma-separated values file (CSV) of wins and odds.

The user inputs the number of residents of the state, and a tiny "threshold" probability.

The code outputs a lower bound on the amount everyone in the state would have had to spend for any of them to have a tiny chance of winning so often, where "tiny" is the threshold number chosen by the user.

If the required spending amount is, for example, several times the median house price in the state, it may call into question whether the winner won honestly.
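Reading the code in this notebook, the bound appears to be the solution of the optimization problem sketched below. The notation ($k$, $x_i$, $N$, $\varepsilon$) is introduced here for illustration and is not taken from the paper; treat it as a summary of what the code does rather than an authoritative statement of the method.

$$
\begin{aligned}
\min_{x_1,\dots,x_k}\quad & \sum_{i=1}^{k} c_i x_i \\
\text{subject to}\quad & \sum_{i=1}^{k} \log \Pr\bigl(\mathrm{Bin}(x_i, p_i) \ge n_i\bigr) \;\ge\; \log\frac{\varepsilon}{N}, \\
& n_i \le x_i \le n_i / p_i, \qquad i = 1,\dots,k.
\end{aligned}
$$

Here $p_i$, $n_i$, and $c_i$ are the probability of winning, the number of wins, and the cost per play for wager $i$; $x_i$ is the number of times the gambler played wager $i$, treated as a continuous variable; $\varepsilon$ is the threshold probability; and $N$ is the population, so $\varepsilon / N$ is the Bonferroni cutoff.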

The current version can analyze data for only one gambler at a time. 

The code implements the mathematics described in the first reference below. The third link is to a public lecture about the method and its results for reported lottery winners in Florida.

See:
+ Arratia, R., S. Garibaldi, L. Mower, and P.B. Stark, 2015. Some people have all the luck. _Mathematics Magazine_, _88_, 196–211. doi:10.4169/math.mag.88.3.196.c. Reprint: http://www.stat.berkeley.edu/~stark/Preprints/luck15.pdf JSTOR: http://www.jstor.org/stable/10.4169/math.mag.88.3.196
+ Arratia, R., S. Garibaldi, L. Mower, and P.B. Stark, 2015. Some people have all the luck &hellip; or do they? _MAA Focus_, August/September, 37–38. http://www.maa.org/sites/default/files/pdf/MAAFocus/Focus_AugustSeptember_2015.pdf
+ Public lecture about the method: https://www.youtube.com/watch?v=s8cHHWNblA4
+ Mower, L., 2014. Lottery odds: To win, you’d have to be a loser. _Palm Beach Post_, 28 March. http://www.mypalmbeachpost.com/news/news/lottery-odds-to-win-youd-have-to-be-a-loser/nfL57


## Instructions:
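1. Compile a CSV file for each gambler. The CSV file should contain three columns: "Probability," "Number," and "Cost." The first line of the file must be exactly `Probability,Number,Cost`; the code rejects any other header.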

Each row corresponds to one type of wager. "Probability" is the chance of winning that wager; "Number" is the number of times the gambler collected on that wager; and "Cost" is the cost per ticket or play on that wager.

2. Put the filename of your CSV file in the box below, along with the values of POPULATION and THRESHOLD.

3. On the toolbar of this browser window (under the jupyter logo), click "Cell" --> "Run All". Wait a bit for your results to appear at the bottom of this page. 
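As a minimal sketch of step 1, the snippet below writes a CSV file in the expected layout. The filename and the wager values are made up for illustration and do not describe any real gambler; the snippet assumes Python 3.

```python
import csv

# Hypothetical wagers, for illustration only:
# (probability of winning, number of wins collected, cost per play in dollars)
rows = [
    (0.001, 20, 1.0),
    (0.0005, 5, 2.0),
]

# 'example_gambler.csv' is a made-up filename
with open('example_gambler.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Probability', 'Number', 'Cost'])  # header the notebook checks for
    writer.writerows(rows)

# Read the header back to confirm it matches what the notebook expects
with open('example_gambler.csv') as f:
    header = f.readline().strip()
print(header)
```

To analyze such a file, set `CSV_FILENAME` to its name in the first code cell.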

In [86]:
from __future__ import print_function, division

# Put the name of your CSV file here:
# CSV_FILENAME = 'FILL_ME_IN.csv'
CSV_FILENAME = 'manning-edit.csv'
# CSV_FILENAME = 'pandya.csv'

# Set the population size and the threshold probability.
# The commented-out values are 10**7 (population of North Carolina) and 10**(-7) (a one-in-ten-million threshold).
POPULATION = 5 # 10**7
THRESHOLD =  0.05 # 10**(-7)

CUT = THRESHOLD / POPULATION # Bonferroni cutoff probability

debugMode = True
print(CUT)

0.01
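Dividing `THRESHOLD` by `POPULATION` is a Bonferroni (union bound) correction: if each of N people has probability at most `CUT` of winning so often, the probability that at least one of them does is at most N times `CUT`, which equals `THRESHOLD`. The sketch below illustrates this with the commented-out values from the cell above. The exact calculation additionally assumes the gamblers are independent, which is an assumption made here for illustration; the union bound itself does not need it.

```python
import math

# Commented-out values from the cell above
population = 10**7
threshold = 10**(-7)

cut = threshold / population  # Bonferroni cutoff, as in the notebook

# Probability that at least one of `population` independent gamblers succeeds
# when each succeeds with probability `cut` (independence is assumed here
# only for the comparison). Computed as 1 - (1 - cut)**population in a
# numerically stable way.
exact = -math.expm1(population * math.log1p(-cut))

print(cut, exact, threshold)
```

The exact probability is slightly below `threshold`, so the cutoff is conservative.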


In [92]:
import numpy as np
from scipy.special import betainc
from scipy.optimize import minimize

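def binTail(p, n, t):
    # P(X >= n) for X ~ Binomial(t, p), written with the regularized incomplete
    # beta function so that the number of plays t need not be an integer
    return betainc(n, t - n + 1, p)

def constraintFn(p, n):
    # Nonnegative when the product of the per-wager tail probabilities is at least CUT
    return lambda x: np.sum(np.log(binTail(p, n, x))) - np.log(CUT)

def objectiveFn(c):
    # Total amount spent: number of plays of each wager times its cost per play
    return lambda x: np.dot(x, c)

def solve(x0, upperBoundVec, p, n, c, eps, debugMode, maxiter):
    # Minimize total spending subject to the probability constraint, keeping the
    # number of plays of each wager between its number of wins and its upper bound
    cons = ({'type': 'ineq', 'fun': constraintFn(p, n)})
    bnds = tuple((n[i], upperBoundVec[i]) for i in range(len(n)))
    return minimize(objectiveFn(c), x0, method='SLSQP', jac=(lambda x: c),
                    constraints=cons, bounds=bnds,
                    options={'disp': debugMode, 'maxiter': maxiter, 'eps': eps})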

def readCsv(filename):
    # Check the header, then load the numeric columns
    with open(filename, 'r') as f:
        firstLine = f.readline()
        if firstLine.strip() != "Probability,Number,Cost":
            raise Exception('First line of CSV must be "Probability,Number,Cost"')
    values = np.loadtxt(filename, dtype=np.float64, delimiter=',', skiprows=1)
    values = np.atleast_2d(values)  # handles a file with a single data row
    pValues = values[:,0]
    nValues = values[:,1]
    cValues = values[:,2]
    return (pValues, nValues, cValues)

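def calculateBound(eps, debugMode, maxiter):
    (p, n, c) = readCsv(CSV_FILENAME)
    that = n / p   # number of plays at which the expected number of wins equals n; used as the upper bound
    x0 = that / 4  # starting point for the optimizer
    return solve(x0, that, p, n, c, eps, debugMode, maxiter)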

def solveProblem(tries=3, debugMode=False, epsilon=1e-6, epsFac=5, maxiter=10**4):
    # Try epsFac values of SLSQP's finite-difference step size ('eps'),
    # each 10 times the previous one, starting from epsilon.
    # Note: calculateBound always starts from the same x0, so repeated tries
    # at a given step size return the same result.
    optimalValues = []
    for epsIndex in range(epsFac):
        for i in range(tries):
            optimOutput = calculateBound(epsilon*10**epsIndex, debugMode, maxiter)
            if optimOutput['status'] == 0:  # status 0: SLSQP terminated successfully
                optimalValues.append(optimOutput['fun'])
                print(optimOutput)
    if len(optimalValues) == 0:
        raise Exception('Something went wrong while solving the problem. Please ask for assistance.')
    bestValue = np.min(optimalValues)
    if debugMode:
        print("Found {} local minima: {}".format(len(optimalValues), optimalValues))
    print("Everyone in the population would have to spend at least ${:,} to have probability {} that at least one would win so much."
          .format(int(bestValue), THRESHOLD))
    return bestValue
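
The `binTail` function above relies on a standard identity: for X distributed Binomial(t, p), the upper tail P(X >= n) equals the regularized incomplete beta function evaluated at (n, t - n + 1, p). A minimal sanity check against `scipy.stats.binom` is sketched below; the values of `p`, `n`, and `t` are illustrative and not taken from any data set, and `bin_tail` is a local copy of the formula made for this check.

```python
import numpy as np
from scipy.special import betainc
from scipy.stats import binom

def bin_tail(p, n, t):
    # Same formula as binTail in the notebook: P(X >= n) for X ~ Binomial(t, p)
    return betainc(n, t - n + 1, p)

# Illustrative values only
p, n, t = 0.01, 3, 200

via_beta = bin_tail(p, n, t)
via_binom = binom.sf(n - 1, t, p)  # P(X > n - 1) = P(X >= n)

print(via_beta, via_binom)
assert np.isclose(via_beta, via_binom)
```

Unlike the binomial survival function, the beta-function form is defined for non-integer `t`, which is what lets the optimizer treat the number of plays as a continuous variable.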

In [93]:
solveProblem(tries = 5, debugMode=debugMode)

Inequality constraints incompatible    (Exit mode 4)
            Current function value: 2414017.5508
            Iterations: 81
            Function evaluations: 708
            Gradient evaluations: 78
Inequality constraints incompatible    (Exit mode 4)
            Current function value: 2414017.5508
            Iterations: 81
            Function evaluations: 708
            Gradient evaluations: 78
Inequality constraints incompatible    (Exit mode 4)
            Current function value: 2414017.5508
            Iterations: 81
            Function evaluations: 708
            Gradient evaluations: 78
Inequality constraints incompatible    (Exit mode 4)
            Current function value: 2414017.5508
            Iterations: 81
            Function evaluations: 708
            Gradient evaluations: 78
Inequality constraints incompatible    (Exit mode 4)
            Current function value: 2414017.5508
            Iterations: 81
            Function evaluations: 708
            Gradi

2412581.7509473749