## Motivations
 
The year is 2016. It is May, and you, a young and vibrant candidate in the coming November election, sit with your election team in a meeting to decide where to spend the next 5 months of your campaign. You could visit every state at least once, but your intuition says there may be a better way to garner just enough support to win, so your team comes up with an idea: only visit the states with the smallest populations. Because of the design of the electoral college, this is a genuinely useful strategy. You don't need all of the popular vote to win, just enough states with enough electoral college votes. Picking the winning set of states with the fewest people gives you a massive advantage, since you don't have to knock on as many doors, kiss as many babies, or, most importantly, spend lots of money on your campaign. Granted, your opponent may focus on battleground states, but appealing to the states with the fewest people is cheap, and of course you care about campaign spending; you're all about the money. So how do you choose the best combination of low-population states to actually visit?

In a recent post by [Geoffrey De Smet titled, 'How to become US president with less than a quarter of the votes'](https://www.optaplanner.org/blog/2016/12/06/HowToBecomeUSPresidentWithLessThanAQuarterOfTheVotes.html), an optimization method using OptaPlanner is used to find a combination of states that wins the election while representing as few voters as possible. This notebook shows how to use a specific optimization method, simulated annealing, to produce the same outcome. Simulated annealing is an attractive tool for optimization problems because of its simplicity. It is also attractive for its ability to handle non-linear constraints, whereas many classic constraint algorithms require linear ones. If you can code a for loop, you can optimize.

## Some initial loading 

This block loads up our requirements, and provides a look at the data available through De Smet's post. It's just a simple csv file with the names of the states, the population, and the number of electoral college votes available for that state.

In [28]:
%matplotlib inline

import pandas as pd
import numpy as np
from scipy import stats
import math
import random
import matplotlib
import matplotlib.pyplot as plt

random.seed(54321)

data = pd.read_csv('./data/president2016.txt')
data.columns = ["name", "population", "votes"]
total_votes = data['votes'].sum()
total_population = data['population'].sum()

data.head()

|   | name | population | votes |
|---|------|-----------:|------:|
| 0 | Alabama | 4858979 | 9 |
| 1 | Alaska | 738432 | 3 |
| 2 | Arizona | 6828065 | 11 |
| 3 | Arkansas | 2978204 | 6 |
| 4 | California | 39144818 | 55 |


## Defining our simulation

Here we begin writing the simulated annealing code. For people familiar with Markov chain Monte Carlo (MCMC) simulations, it should look familiar. If you aren't familiar with MCMC, the simple explanation is:

1. We create a random variable with a probability distribution function and randomly pick its initial state.
2. We write a for loop that, at each step, proposes a new state and draws a uniform random number (a weighted coin flip). Depending on the odds of being in the old state versus the new state, and on the outcome of the coin flip, we either move to the new state or stay where we are.
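
The two steps above can be sketched on a toy problem. This is a minimal illustration rather than the notebook's own code: the four-state target `weights`, the function name `metropolis_sample`, and the step count are all assumptions made up for the example.

```python
import random

def metropolis_sample(weights, nsteps=20000, seed=0):
    """Visit indices 0..len(weights)-1 with frequency proportional to weights."""
    rng = random.Random(seed)
    n = len(weights)
    # Step 1: pick the initial state at random.
    current = rng.randrange(n)
    counts = [0] * n
    for _ in range(nsteps):
        # Step 2: propose a state uniformly at random (a symmetric proposal) ...
        proposal = rng.randrange(n)
        # ... and flip a weighted coin: accept with probability min(1, w_new / w_old).
        if rng.random() < weights[proposal] / weights[current]:
            current = proposal
        counts[current] += 1
    # Fraction of steps spent in each state.
    return [c / nsteps for c in counts]

weights = [1.0, 2.0, 3.0, 4.0]  # hypothetical unnormalized target distribution
frequencies = metropolis_sample(weights)
```

After enough steps, the fraction of time spent in each state should approach its share of the total weight, here roughly 0.1, 0.2, 0.3 and 0.4.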

While I won't dive too deeply into why this algorithm works, I'll briefly mention that it's closely tied to physics, specifically to [statistical mechanics and the maximum entropy principle](http://bayes.wustl.edu/etj/articles/theory.1.pdf).

Going back to our real world problem, we assign to each election outcome a probability that is a function of the number of voters in the states that we win, where an outcome with *more* voters has a *lower* probability than an outcome with fewer voters. Then we can apply simulated annealing.
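
In symbols, one standard way to write such a distribution is the Boltzmann form. The notation is introduced here for illustration: $s$ is an outcome, $E(s)$ is its cost (for example, the number of voters in the states won), and $\beta$ is an inverse temperature.

$$
P(s) \propto e^{-\beta E(s)}, \qquad \frac{P(s')}{P(s)} = e^{-\beta\,[E(s') - E(s)]}
$$

With $\beta > 0$, an outcome with a higher cost is less probable, and only the cost difference between two outcomes matters when deciding whether to move from $s$ to $s'$. Raising $\beta$ over the run concentrates the probability on the cheapest outcomes, which is the "annealing" part.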

Let's start with a way to generate an election outcome, and a way to randomly transition to a new election outcome. To pick a new outcome, we simply pick a state at random and flip its result. For simplicity, we define an outcome as a dictionary that maps each state name to True (won) or False (lost).

In [54]:
def getInitialState():
    """Generates the starting outcome, in which every state is won.

    The start is deterministic; randomness enters through generateNewState.
    """
    # Map each state name to True (won) or False (lost)
    return {name: True for name in data['name']}

def generateNewState(prev):
    """Takes the previous state, copies it, then changes a randomly selected 
       state outcome.
    """
    # Copy the states so the previous outcome is left untouched
    state = dict(prev)
    # Randomly select a state and update it to its opposite value.
    # list(...) is needed because dict keys are not indexable in Python 3.
    updating = random.choice(list(state))
    state[updating] = not state[updating]
    return state

## Defining the cost

The cost function is a simple sum that feeds into the exponential acceptance probability. It will:

1. Penalize outcomes that don't let you win.
2. Penalize outcomes that require more voters.

If all goes well, the code will randomly "anneal" to an outcome that lets you win with the fewest voters.
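
Written out, the two penalties might look like the following. This is a sketch in notation introduced here, not taken from De Smet's post: $s_i \in \{0, 1\}$ marks whether state $i$ is won, $p_i$ and $v_i$ are its population and electoral votes, and $P_{\text{total}} = \sum_i p_i$. Treating a tie as a loss is an assumption.

$$
E(s) = \sum_i s_i\, p_i \;+\; P_{\text{total}} \cdot \mathbf{1}\!\left[\sum_i s_i\, v_i \le \sum_i (1 - s_i)\, v_i\right]
$$

The first term counts the voters in the states won. The second term adds the whole population whenever the votes won do not exceed the votes lost, so every losing outcome costs at least as much as any winning one.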

In [52]:
def computeEnergy(state):
    """Cost of an outcome: population of the states won, plus a penalty
       equal to the total population if those states do not win the election.
    """
    # Get the winning states and their data
    winning_states = [key for key in state if state[key]]
    winning_state_data = data[data.name.isin(winning_states)]
    # Compute sums of votes and population
    winning_votes = winning_state_data['votes'].sum()
    population_energy = winning_state_data['population'].sum()
    win_energy = 0
    # Penalize outcomes without a majority of electoral votes (a tie is not a win)
    if winning_votes <= (total_votes - winning_votes):
        win_energy = total_population
    return population_energy + win_energy
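
To see how the cost behaves, here is a small worked example in plain Python. It uses three made-up states rather than the real data, and the names `toy` and `toy_energy` are hypothetical; it assumes the intended rule that an outcome is penalized unless it carries a majority of the electoral votes.

```python
# Hypothetical toy data: name -> (population, electoral votes).
toy = {"A": (100, 3), "B": (400, 5), "C": (900, 9)}
toy_total_population = sum(pop for pop, _ in toy.values())  # 1400
toy_total_votes = sum(votes for _, votes in toy.values())   # 17

def toy_energy(state):
    """Population of the states won, plus a penalty without a majority of votes."""
    won = [name for name, is_won in state.items() if is_won]
    population = sum(toy[name][0] for name in won)
    votes = sum(toy[name][1] for name in won)
    # Assumption: a tie counts as not winning.
    penalty = toy_total_population if votes <= toy_total_votes - votes else 0
    return population + penalty

win_cheap = toy_energy({"A": False, "B": False, "C": True})  # 9 of 17 votes
win_all = toy_energy({"A": True, "B": True, "C": True})      # 17 of 17 votes
lose = toy_energy({"A": True, "B": True, "C": False})        # 8 of 17 votes
```

Winning with state C alone costs 900, winning every state costs 1400, and losing with A and B costs 500 + 1400 = 1900, so the annealer is pushed toward the cheap win.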

In [61]:
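def simulate(nsteps=10000, updateTemp=100):
    """Runs simulated annealing and returns the final outcome."""
    beta = .001
    state = getInitialState()
    energy = computeEnergy(state)
    for n in range(nsteps):
        # Cool down (raise beta) once every updateTemp steps
        if n % updateTemp == 0:
            beta *= 1.05
        possibleState = generateNewState(state)
        newEnergy = computeEnergy(possibleState)
        delta = newEnergy - energy
        # Always accept improvements; accept a worse outcome with probability
        # exp(-beta * delta). Checking delta first keeps math.exp from overflowing.
        shouldTransition = delta <= 0 or random.uniform(0.0, 1.0) < math.exp(-beta * delta)
        if shouldTransition:
            state = possibleState
            energy = newEnergy
    print(energy)
    return state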
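
Since `simulate` returns a dictionary of True/False outcomes, a small helper can turn that into numbers that are easier to read. The sketch below is an assumption about how one might inspect the result, not part of the original notebook: `summarize` and the three-state data are hypothetical.

```python
def summarize(state, populations, votes):
    """Return (electoral votes won, share of total population in the states won)."""
    won = [name for name, is_won in state.items() if is_won]
    votes_won = sum(votes[name] for name in won)
    population_share = sum(populations[name] for name in won) / sum(populations.values())
    return votes_won, population_share

# Hypothetical three-state example, not the real 2016 data.
populations = {"A": 100, "B": 400, "C": 900}
votes = {"A": 3, "B": 5, "C": 9}
votes_won, population_share = summarize(
    {"A": False, "B": False, "C": True}, populations, votes
)
```

With the real data, the two lookup dictionaries could be built with something like `dict(zip(data['name'], data['population']))` and `dict(zip(data['name'], data['votes']))`, and the first argument would be the state returned by `simulate()`.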