### Classic:

Riddler Nation’s neighbor to the west, Enigmerica, is holding an election between two candidates, A and B. Assume every person in Enigmerica votes randomly and independently, and that the number of voters is very, very large. Moreover, due to health precautions, 20 percent of the population decides to vote early by mail.

On election night, the results of the 80 percent who voted on Election Day are reported out. Over the next several days, the remaining 20 percent of the votes are then tallied.

What is the probability that the candidate who had fewer votes tallied on election night ultimately wins the race?

### The Attempt With Code Math

1) Determine the range of vote shares where a candidate can trail on Election Day and still take the lead
- With 20% of the vote still outstanding, a trailing candidate can only come back if their final total exceeds 50% of all votes.
- This means the minimum is 30% of the total vote plus one vote, which translates to 3/8 = 0.375 of the Election Day vote (+1 vote)
- The maximum is one vote short of half the Election Day vote, or just under 40% of the total (4/8 = 0.5 of the Election Day vote, -1 vote)
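
A minimal sketch of the range calculation in step 1, using integer arithmetic. The function name `feasible_range` and its `mail_share_pct` parameter are illustrative and not part of the original notebook, and the sketch assumes the total splits into whole Election Day and mail-in votes.

```python
def feasible_range(total_votes, mail_share_pct=20):
    """Return (lowest, highest) Election Day counts for a candidate
    who trails on election night but can still win overall."""
    mail_votes = total_votes * mail_share_pct // 100
    day_votes = total_votes - mail_votes
    win_threshold = total_votes // 2 + 1   # strict majority of all votes
    lowest = win_threshold - mail_votes    # would need every mail-in vote
    highest = (day_votes - 1) // 2         # strictly fewer than half of the day votes
    return lowest, highest


print(feasible_range(100))    # (31, 39)
print(feasible_range(1000))   # (301, 399)
```

For 100 total votes this gives 31 to 39 Election Day votes, i.e. 30% of the total plus one vote up to one vote short of 40%.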

2) Once the range is established, sum the likelihood over each discrete vote count in it (e.g. P(Win | 30% Day of), P(Win | 30.1% Day of), ....)
- For each Election Day vote count x in the range, compute the following:
    - Likelihood of exactly that Election Day vote count ---> P(X = x)
    - Likelihood of getting at least the needed number of mail-in votes ---> P(Y >= needed)
- Summing the product of these two values over the range will yield the solution
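
Written out, the sum in step 2 would look roughly like the expression below. The symbols are labels chosen here rather than taken from the notebook: $n$ is the total number of votes, $X \sim \mathrm{Binomial}(0.8n, \tfrac12)$ is the candidate's Election Day count, and $Y \sim \mathrm{Binomial}(0.2n, \tfrac12)$ is the candidate's mail-in count.

$$
P(\text{trail, then win}) = \sum_{x = 0.3n + 1}^{0.4n - 1} P(X = x)\; P\!\left(Y \ge \tfrac{n}{2} + 1 - x\right)
$$

The lower and upper limits are the range from step 1, and $\tfrac{n}{2} + 1 - x$ is the number of mail-in votes needed to reach a strict majority.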


In [1]:
import numpy as np
from scipy.stats import binom

In [2]:
# Example of P(X >= 4)
1 - binom.cdf(4 - 1, 10, 0.5)

0.828125

In [3]:
# Example of P(X = 4)
binom.pmf(4,10,0.5)

0.20507812500000022
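
As a quick sanity check on the "at least" calculation, the same tail probability can be obtained three ways. This is a small sketch, and the variable names are illustrative. `binom.sf` is SciPy's survival function, P(X > k), so `binom.sf(k - 1, ...)` is P(X >= k).

```python
import numpy as np
from scipy.stats import binom

n, p, k = 10, 0.5, 4

tail_from_cdf = 1 - binom.cdf(k - 1, n, p)                   # 1 - P(X <= 3)
tail_from_sf = binom.sf(k - 1, n, p)                         # P(X > 3), computed directly
tail_from_pmf = binom.pmf(np.arange(k, n + 1), n, p).sum()   # P(X = 4) + ... + P(X = 10)

# All three agree with the 0.828125 shown above, up to floating-point rounding
print(tail_from_cdf, tail_from_sf, tail_from_pmf)
```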

In [4]:
def conditionalProb(election_day_count, n, prob = 0.5):
    """Return P(exactly election_day_count votes on Election Day) times P(at least the minimum mail-in votes needed to win)"""
    
    # likelihood of exactly election_day_count successes among the Election Day votes
    # (round() keeps the number of trials a whole number despite float multiplication)
    likelihood = binom.pmf(election_day_count, round(n * 0.8), prob)
    
    # min raw votes needed from mail-in for a strict majority (n/2 + 1 overall)
    need = (n / 2) + 1 - election_day_count
    
    # total mail-in votes
    tot_mail = round(n * 0.2)
    
    # calculate likelihood of getting AT LEAST need: binom.cdf(k) gives P(X <= k),
    # so 1 - binom.cdf(need - 1) gives P(X > need - 1) = P(X >= need)
    likelihood_need = 1 - binom.cdf(need - 1, tot_mail, prob)
    
    return likelihood * likelihood_need
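
As a cross-check on the function above, here is a vectorized sketch of the same sum. The name `comeback_probability` is hypothetical, and the sketch assumes `n` is a multiple of 10 so that the 80/20 split gives whole votes. Two things in it are additions rather than part of the original notebook. First, the sum covers one named candidate, while the puzzle asks about whichever candidate trailed, so by symmetry the figure is doubled. Second, it compares against arctan(1/2)/π ≈ 0.1476, the large-n limit that comes from approximating both vote margins as normal.

```python
import math

import numpy as np
from scipy.stats import binom


def comeback_probability(n, prob=0.5):
    """P(a named candidate trails on Election Day yet wins overall), for n total votes.

    Assumes n is a multiple of 10 so the 80/20 split gives whole votes.
    """
    day_total = n * 8 // 10
    mail_total = n - day_total
    # Election Day counts strictly below half of the Election Day vote
    x = np.arange(0, (day_total - 1) // 2 + 1)
    # Mail-in votes required to reach a strict majority overall
    need = n // 2 + 1 - x
    p_day = binom.pmf(x, day_total, prob)
    # P(Y >= need); this is 0 whenever need exceeds the mail-in total
    p_mail = binom.sf(need - 1, mail_total, prob)
    return float(np.sum(p_day * p_mail))


one_candidate = comeback_probability(100_000)
either_candidate = 2 * one_candidate      # symmetry between A and B
limit = math.atan(0.5) / math.pi          # normal-approximation limit (assumption)

print(f"one candidate: {one_candidate:.6f}")
print(f"either candidate: {either_candidate:.6f}")
print(f"large-n limit: {limit:.6f}")
```

Summing from zero instead of from the 30% lower bound is harmless here, because the mail-in term is exactly zero for any count that cannot reach a majority.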

In [5]:
vote_list = [100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]

for total_votes in vote_list:
    
    min_votes = .3 * total_votes + 1  # fewest Election Day votes that still allow a win
    max_votes = .4 * total_votes      # half of the Election Day votes (excluded by np.arange)
    election_day = np.arange(min_votes, max_votes) # vote possibilities where losing but can still win
    print(f"With a total vote count of {total_votes}, a candidate is in range\n of possibility if they have {min(election_day)} to {max(election_day)} of the Election Day Votes")
    # sum P(exact Election Day count) * P(enough mail-in votes) over the feasible range
    solution = sum(conditionalProb(xi, total_votes) for xi in election_day)
    print(f"Likelihood of Candidate losing Election Day but winning after mail-in votes counted: {solution:.6f}\n")

With a total vote count of 100, a candidate is in range
 of possibility if they have 31.0 to 39.0 of the Election Day Votes
Likelihood of Candidate losing Election Day but winning after mail-in votes counted: 0.038120

With a total vote count of 1000, a candidate is in range
 of possibility if they have 301.0 to 399.0 of the Election Day Votes
Likelihood of Candidate losing Election Day but winning after mail-in votes counted: 0.061091

With a total vote count of 10000, a candidate is in range
 of possibility if they have 3001.0 to 3999.0 of the Election Day Votes
Likelihood of Candidate losing Election Day but winning after mail-in votes counted: 0.069633

With a total vote count of 100000, a candidate is in range
 of possibility if they have 30001.0 to 39999.0 of the Election Day Votes
Likelihood of Candidate losing Election Day but winning after mail-in votes counted: 0.072462

With a total vote count of 1000000, a candidate is in range
 of possibility if they have 300001.0 to 39999