# COMS W1002 Computing in Economics/Social Sciences Project 1
## Election Simulation

Project created by Professor Karl Sigman and modified by Cannon






__Problem 1:__ (40 points)
Using the 51 [Electoral College](https://www.archives.gov/electoral-college/2000) numbers that were used in the 2000 USA Presidential Election, estimate the number of ways there could have been a tie in the 2000 Presidential Election. Use Monte Carlo simulation (using *iid* Bernoulli(1/2) variables, i.e., a fair coin flip for each state) to simulate many elections. The fraction of elections that end in a tie, multiplied by the number of possible outcomes ($2^{51}$), is your estimate. The exact answer to this question was determined in 2009 by K. Sigman and O. Watanabe to be 17,150,271,124,366. Your estimate should be fairly close to that number.


Repeat using the [EC numbers](https://www.archives.gov/electoral-college/2020) from the 2020 election. That is, create a new list of EC values and feed it to your function. *NOTE: Maine and Nebraska do not use a winner-take-all electoral college system. In this assignment we will make the simplifying assumption that they do.*

***To earn full marks you must write a parameterized function as described in the comments below to perform your simulation. Do not change the function header.***

`Monte Carlo Simulation`: It is used to model the probability of different outcomes in processes that are uncertain or involve random variables. It involves running a large number of simulations or trials with random inputs to observe the range of possible outcomes.
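How close "fairly close" can be expected to be follows from the binomial standard error of the simulated tie fraction. The sketch below is only an illustration and not part of the assignment: the helper name `mc_relative_error` is hypothetical, and it assumes the exact 2000 count quoted above as the true number of ties.

```python
import math

def mc_relative_error(p, trials):
    """Relative standard error of a Monte Carlo estimate of a probability p."""
    # The number of ties is Binomial(trials, p), so the estimated fraction
    # has standard deviation sqrt(p * (1 - p) / trials).
    return math.sqrt(p * (1 - p) / trials) / p

# Assumption: take the exact tie count from the problem statement as the truth.
exact_ties_2000 = 17_150_271_124_366
p_tie = exact_ties_2000 / 2 ** 51

for n in (10_000, 100_000, 1_000_000):
    print(n, 'trials:', round(100 * mc_relative_error(p_tie, n), 2), '% relative error')
```

Multiplying the fraction by the number of outcomes does not change the relative error, so with one million trials an estimate within roughly one or two percent of the exact count is what this calculation suggests.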


In [13]:
import random

#  list the EC values for the 2000 election
v_2000 = [9,3,8,6,54,8,8,3,3,25,13,4,4,22,12,7,6,8,9,4,10,12,18,10,7,11,3,5,4,4,15,5,33,14,3,21,8,7,23,4,8,3,11,32,5,3,13,11,5,11,3]
# list the EC values for the 2020 election
v_2020 = [9,3,11,6,55,9,7,3,3,29,16,4,4,20,11,6,6,8,8,4,10,11,16,10,6,10,3,5,6,4,14,5,29,15,3,18,7,7,20,4,9,3,11,38,6,3,13,12,5,10,3]

# Estimate the number of tie outcomes.
# ec is the list of Electoral College votes, target is the number of votes a
# candidate needs for a tie, and trials is the number of simulated elections.
def target_estimator(ec, target, trials):
    tie_count = 0  # number of simulated elections that end in a tie
    total_outcomes = 2 ** len(ec)  # each state is heads or tails, so 2 to the power of the list length (here 51)

    for _ in range(trials):
        # For each state, candidate 1 gets either 0 votes or all of that state's votes, with equal probability
        candidate_1_votes = sum(random.choice([0, votes]) for votes in ec)

        if candidate_1_votes == target:
            tie_count += 1

    # tie_count / trials estimates the probability of a tie; multiplying by
    # the total number of outcomes scales it up to an estimated number of ties
    estimated_tie_outcomes = tie_count / trials * total_outcomes
    return estimated_tie_outcomes

# Calculate the number of tie outcomes for the 2000 and 2020 elections
result_2000 = target_estimator(v_2000, 269, 1000000)
result_2020 = target_estimator(v_2020, 269, 1000000)

print('Estimated number of ties (2000):', round(result_2000))
print('Estimated number of ties (2020):', round(result_2020))


Estimated number of ties (2000): 17359124763700
Estimated number of ties (2020): 16911016600776
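The simulated estimates can be compared with an exact count. The sketch below is one possible way to get such a count and is not the assigned Monte Carlo method: it counts subsets of states by dynamic programming over vote totals. The function name `count_exact` is hypothetical.

```python
def count_exact(ec, target):
    """Count the win/lose assignments in which one candidate gets exactly `target` votes."""
    # ways[s] = number of subsets of the states processed so far that total s votes
    ways = [0] * (target + 1)
    ways[0] = 1
    for v in ec:
        # Walk downward so that each state is counted at most once per subset.
        for s in range(target, v - 1, -1):
            ways[s] += ways[s - v]
    return ways[target]

# Tiny check: from [1, 2, 3] the subsets that sum to 3 are {3} and {1, 2}.
print(count_exact([1, 2, 3], 3))  # 2
```

Calling `count_exact(v_2000, 269)` or `count_exact(v_2020, 269)` with the lists defined in the cell above gives the exact figures to set against the printed estimates.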


Beginning dummy trials, please ignore

In [13]:
#Dummy trials, please ignore
values = [2,3,4,5,6,7,8,9]  # not named `list`, which would hide the built-in list type
for i in range(len(values)):
    print(i)

0
1
2
3
4
5
6
7


Example from class

In [None]:
#Dummy trials, please ignore
import random 

def dice():
    return int(random.random()*6+1)

#probability that two dice sum to x
def dim_sum(x,trials):
    count=0
    for i in range(trials):
        if dice()+dice()==x:
            count+=1
    # return only after all trials have run, not inside the loop
    return count/trials

`sum()` adds up the numeric values in an iterable. It is not meant for joining strings: that is string concatenation, which is done with the string method `join()` instead.
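A short illustration of the difference, with made-up lists:

```python
numbers = [1, 2, 3]
words = ["tie", "vote"]

print(sum(numbers))      # adds the numbers: 6
print(" ".join(words))   # concatenates the strings with a space between them

# sum() refuses to concatenate strings and raises a TypeError instead.
try:
    sum(words, "")
except TypeError as err:
    print("TypeError:", err)
```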

In [21]:
#dummy trials, please ignore
#understanding the function random.choice()
import random
# Choosing from a list of numbers
numbers = [1, 2, 3, 4, 5]
print(random.choice(numbers))  # Outputs one of the numbers randomly

# Choosing from a list of strings
colors = ["red", "blue", "green"]
print(random.choice(colors))  # Outputs one of the colors randomly


5
red


End dummy trials

__Problem 2:__ In the year 2000 (Bush versus Gore), the situation right before the election was this: Bush had (in his pocket) 24 states totaling 210 EC votes, while Gore had 10 states totaling 146 EC votes. There were 17 states left over, the “Battleground States”, in which it was supposedly unclear who would win them. Look at the file [2000.pdf](http://www.cs.columbia.edu/~cannon/2000.pdf) to see exactly what states made up the 17, and the EC numbers for them.

__Part I:__ (30 points) First assume that each Battleground State outcome is determined by an *iid* fair coin toss; Bernoulli (1/2). Simulate (using 1 million copies to average, using Monte Carlo) to obtain the probability that Bush would win the election, and the probability that Gore would win the election, and the probability of a tie.

***To earn full marks you must write a parameterized function as described in the comments below to perform your simulation. Do not change the function header.***

## Note: the two win probabilities and the tie probability must sum to 1, so if one candidate's probability is 0.99, the other's can be at most 1 - 0.99.

In [19]:
#understanding the context: Bush had 24 states won, which is equivalent to 210 EC votes
#Gore had 10 states, which is equivalent to 146 EC votes
#there were 17 states left that could go either way; these are called swing states. They add up to 182 votes that had not yet been decided

import random

#list EC values that remain in play>>Must be a list of length 17, because of the 17 states left
v_in_play = [6, 3, 25, 22, 7, 4, 18, 10, 11, 4, 4, 7, 23, 11, 11, 5, 11]

# this function returns an estimate for the probability that candidate 1
# wins in a US Presidential election given that they already have
# ec_in_the_bag EC votes.
# v_left is a list of the remaining EC numbers,
# trials is the number of trials to be used in the MC simulation.
# This function assumes that the probability of winning any remaining state
# is 0.5

def ec_estimator(ec_in_the_bag, v_left, trials):
    wins = 0 #counter of wins for candidate 1

    for _ in range(trials):
        total_votes = ec_in_the_bag  # Start each trial with the votes already in the bag

        # Simulate each swing state
        for votes in v_left:
            if random.random() < 0.5: #true with probability 0.5, like a fair coin toss
                total_votes += votes  # add to this trial's total; ec_in_the_bag itself must stay fixed

        # Check if the candidate won
        if total_votes >= 270:
            wins += 1

    # Estimate the probability of winning
    return wins / trials

# Estimate for Bush win
print('Bush win: ', ec_estimator(210, v_in_play, 1000000))

# Estimate for Gore win. It is simulated directly; it is not 1 minus Bush's
# probability, because a tie is also possible
print('Gore win: ', ec_estimator(146, v_in_play, 1000000))

# Estimate for Tie: same contract as in Problem 1, i.e. the fraction of trials
# that hit the target multiplied by the number of possible outcomes
def target_estimator(v_left, target, trials):
    ties = 0
    for _ in range(trials):
        total_votes = 0
        for votes in v_left:
            if random.random() < 0.5:
                total_votes += votes
        if total_votes == target:
            ties += 1
    return ties / trials * 2 ** len(v_left)

#estimate for Tie: Gore needs exactly 269-146 = 123 of the 182 swing votes;
#dividing by the number of outcomes turns the estimated count back into a probability
print('tie: ',target_estimator(v_in_play,269-146,1000000)/2**len(v_in_play))

#check that it's the same (close) with Bush needing exactly 269-210 = 59 swing votes
print('tie: ',target_estimator(v_in_play,269-210,1000000)/2**len(v_in_play))

# Recorded run: the two tie calls hit their targets in 7415 and 7577 of
# 1,000,000 trials, i.e. tie probabilities of about 0.0074 and 0.0076


In [29]:
#verification of list
v_in_play=[6,3,25,22,7,4,18,10,11,4,4,7,23,11,11,5,11]
x = len(v_in_play)
x
y = sum(v_in_play)
y

182
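Every simulated election ends in exactly one of three ways (Bush wins, Gore wins, or a tie), so the three estimated probabilities should add up to 1. The sketch below makes this explicit by classifying each trial once. The function name `outcome_probabilities` and its parameters are hypothetical and are not the required `ec_estimator` header; it uses 100,000 trials only to keep the run short.

```python
import random

def outcome_probabilities(bag_1, bag_2, v_left, trials):
    """Estimate P(candidate 1 wins), P(candidate 2 wins) and P(tie) in one pass."""
    total = bag_1 + bag_2 + sum(v_left)  # all EC votes at stake
    wins_1 = wins_2 = ties = 0
    for _ in range(trials):
        # Candidate 1 wins each remaining state with probability 0.5;
        # candidate 2 gets every vote that candidate 1 does not.
        votes_1 = bag_1 + sum(v for v in v_left if random.random() < 0.5)
        votes_2 = total - votes_1
        if votes_1 > votes_2:
            wins_1 += 1
        elif votes_2 > votes_1:
            wins_2 += 1
        else:
            ties += 1
    return wins_1 / trials, wins_2 / trials, ties / trials

v_in_play = [6, 3, 25, 22, 7, 4, 18, 10, 11, 4, 4, 7, 23, 11, 11, 5, 11]
bush, gore, tie = outcome_probabilities(210, 146, v_in_play, 100_000)
print('Bush win:', bush, 'Gore win:', gore, 'tie:', tie)
print('sum:', bush + gore + tie)
```

Because each trial is counted in exactly one category, the three results cannot both be close to 1, which is a quick sanity check on any pair of separately simulated win probabilities.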

__Part II:__ (30 points) In the [2000.pdf](http://www.cs.columbia.edu/~cannon/2000.pdf) file, you will also see the probabilities that had been determined by extensive polling for Gore winning each of the 17 states. Denote these probabilities by $p_1,...,p_{17}$. They are no longer all $p = 1/2$ as we assumed in Part I. For example, for the state of Wisconsin (WI), $p = 0.946$, while for the state of Nevada (NV), $p = 0.146$. Only for the state of Maine (ME) is $p = 0.5$. Now redo the simulation in Part I using the 17 Bernoulli($p_i$) variables. The idea now is that each of the 17 states has its own coin, so to speak.
 
***You can do this one however you wish but someone who uses a parameterized function or functions for doing this will earn more praise and admiration than someone who does not***
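With only 17 states, simulated probabilities can also be checked against an exact calculation. The sketch below is one possible approach and not the assigned Monte Carlo method: it builds the full distribution of Gore's swing-state votes one state at a time. The function name `gore_vote_distribution` and the toy numbers are made up for illustration.

```python
def gore_vote_distribution(v_left, p_left):
    """Exact distribution of swing-state votes: dist[s] = P(Gore wins exactly s votes)."""
    dist = [1.0] + [0.0] * sum(v_left)  # before any state is decided, s = 0 for sure
    for v, p in zip(v_left, p_left):
        new = [0.0] * len(dist)
        for s, prob in enumerate(dist):
            if prob:
                new[s] += prob * (1 - p)  # Gore loses this state
                new[s + v] += prob * p    # Gore wins this state
        dist = new
    return dist

# Toy example (made-up numbers): two states worth 3 and 5 votes.
dist = gore_vote_distribution([3, 5], [0.5, 0.2])
print(dist[0], dist[3], dist[5], dist[8])
print('total probability:', sum(dist))
```

Applied to the 17 EC values and Gore's 17 probabilities, `sum(dist[124:])` would be Gore's win probability (146 + 124 = 270), `dist[123]` the tie probability, and `sum(dist[:123])` Bush's win probability.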

In [24]:


# your code here

# this time I have provided you with less structure.
# The goal is for you to develop a function
# or functions similar to those above.
# This part will test your ability to design solutions on your own.
import random

# Probabilities of Gore winning each of the 17 states.
# Each entry must line up with the same state in v_in_play. Check this list
# against 2000.pdf: the problem statement gives 0.5 for ME, and no entry here is 0.5.
p_left = [0.946, 0.146, 0.446, 0.522, 0.301, 0.444, 0.574, 0.641, 0.735, 0.370, 0.326, 0.620, 0.368, 0.615, 0.579, 0.539, 0.486]
v_in_play = [6, 3, 25, 22, 7, 4, 18, 10, 11, 4, 4, 7, 23, 11, 11, 5, 11]

# Bush wins a state whenever Gore loses it, so his probabilities are the complements
p_left_bush = [1 - p for p in p_left]

def ec_estimator_varying_probs(ec_in_the_bag, v_left, p_left, trials):
    wins = 0

    # Simulate the elections for each trial
    for _ in range(trials):
        total_votes = ec_in_the_bag  # Start with the votes already in the bag like we did previously

        # Simulate each state using the candidate's own probability of winning it
        for i in range(len(v_left)):
            if random.random() < p_left[i]:
                total_votes += v_left[i]  # the candidate wins the state

        # Check if the candidate has won (>= 270 votes)
        if total_votes >= 270:
            wins += 1

    # Estimate the probability of winning
    return wins / trials

# Estimate for Bush win (starting with 210 electoral votes, using Bush's probabilities)
print('Bush win: ', ec_estimator_varying_probs(210, v_in_play, p_left_bush, 1000000))

# Estimate for Gore win (starting with 146 electoral votes)
print('Gore win: ', ec_estimator_varying_probs(146, v_in_play, p_left, 1000000))

# Estimate for Tie (Gore with 146, Bush with 210)
def target_estimator_varying_probs(v_left, p_left, target, trials):
    ties = 0
    for _ in range(trials):
        total_votes = 0
        for i in range(len(v_left)):
            if random.random() < p_left[i]:
                total_votes += v_left[i]  # the candidate wins the state
        if total_votes == target:
            ties += 1
    # The outcomes are no longer equally likely, so report the fraction of trials directly
    return ties / trials

# Tie from Gore's side: he needs exactly 269 - 146 = 123 swing votes
print('Tie (Gore 146, Bush 210): ', target_estimator_varying_probs(v_in_play, p_left, 269 - 146, 1000000))
# Check from Bush's side: he needs exactly 269 - 210 = 59 swing votes, with his own probabilities
print('Tie (Gore 146, Bush 210): ', target_estimator_varying_probs(v_in_play, p_left_bush, 269 - 210, 1000000))

# Recorded run with the probabilities listed above: Gore win 0.131547, and
# 10196 ties in 1,000,000 trials from Gore's side, i.e. about 0.0102
