### Allan Lichtman and his Thirteen Keys

Allan Lichtman has correctly predicted nine of the ten presidential elections held since 1984, including the 2016 upset in which Donald Trump defeated former Secretary of State Hillary Clinton, which makes him one of the most accurate psephologists on record. The only time his model, titled 'The Thirteen Keys to the White House', failed him was the 2000 presidential election between Governor George W. Bush and Vice President Al Gore, when he predicted a victory for the Gore-Lieberman ticket. Lichtman still defends that prediction. The election was one of the most drawn-out and controversial in United States history: Bush and Dick Cheney carried Florida by just 537 votes, a margin of 0.009%, which was within the threshold for a recount, but the recount effort was ended by the controversial United States Supreme Court ruling in Bush v. Gore. There were reports of irregularities in the Florida vote count, and had Al Gore won Florida, he would have won the Electoral College vote. Nevertheless, Gore conceded the election after the ruling, despite winning the national popular vote. Hence, there is a chance that Lichtman was right even about this election.

To further corroborate this model, Lichtman applied it in retrospect to the elections from 1860 to 1980. Lichtman's keys predicted the final winner correctly in all but two of these elections, and the popular vote winner correctly in all of them.

The anomalies were the 1876 and 1888 elections. The 1876 election was widely disputed and abnormal: it was decided by a compromise between the two parties, in which Republican Rutherford B. Hayes was recognized as the winner over Democrat Samuel Tilden in exchange for ending the Reconstruction policies of the post-Civil War era. The model would have presented Tilden as the winner. In 1888, Democrat Grover Cleveland won the popular vote against Republican Benjamin Harrison but lost in the Electoral College. Cleveland, who would go on to win a rematch against Harrison in the next election and serve a non-consecutive second term, would have been predicted as the winner of the 1888 election by this model.

So, how does this model work?

This model is a system of thirteen binary responses, known as keys, where a true response favors the incumbent party, i.e., the party that controls the White House. The model uses these keys to approximate political stability and thereby predict the winning candidate. If five or fewer keys are false, i.e., against the incumbent party, this indicates political stability, or regularity, and the incumbent party's candidate is projected as the winner. If six or more keys are false, however, this indicates political instability, and the challenging party is projected to win.
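The decision rule described above can be sketched in a few lines. This is only a minimal illustration of the threshold logic, not Lichtman's own implementation; the function name `predict_winner`, the constant names, and the example answers are hypothetical.

```python
from typing import Sequence

NUM_KEYS = 13
# Six or more false keys project a win for the challenging party.
FALSE_KEYS_FOR_CHALLENGER = 6


def predict_winner(keys: Sequence[bool]) -> str:
    """Return 'incumbent' or 'challenger' from thirteen true/false key answers.

    A True answer favors the incumbent party; a False answer counts against it.
    """
    if len(keys) != NUM_KEYS:
        raise ValueError(f"expected {NUM_KEYS} keys, got {len(keys)}")
    false_count = sum(1 for answer in keys if not answer)
    if false_count >= FALSE_KEYS_FOR_CHALLENGER:
        return "challenger"
    return "incumbent"


# Hypothetical answers, not an assessment of any real election.
five_false = [False] * 5 + [True] * 8
six_false = [False] * 6 + [True] * 7
print(predict_winner(five_false))
print(predict_winner(six_false))
```

The only judgment the code encodes is the count of false keys; deciding whether each key is true or false remains a manual assessment.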

The following are Allan Lichtman's Thirteen Keys:

|#|Key|Description|
|-|-|-|
|1|Party mandate|After the midterm elections, does the incumbent party hold more seats in the House of Representatives than it did after the previous midterm elections?|
|2|Uncompetitive incumbent primary|A competitive primary indicates discontent with the incumbent president|
|3|Incumbent president seeking re-election|Incumbency is seen as a plus by this model|
|4|No significant third party campaign|-|
|5|Strong short-term economy|No recession during the election campaign|
|6|Strong long-term economy|Real per capita economic growth during the term equals or exceeds mean growth during the previous two terms|
|7|Major policy change|The incumbent administration is able to make major policy changes, showing either bipartisanship or a lack of opposition|
|8|No social unrest|No sustained, widespread social unrest|
|9|No scandal|The incumbent administration faces no scandal allegations that directly affect members of the administration, the cabinet, or the president themselves|
|10|No foreign policy or military failures|-|
|11|Major foreign policy or military success|The absence of failure and the presence of success in this realm have very different effects; one is not the complement of the other|
|12|Charismatic incumbent party nominee|The candidate of the incumbent party is a charismatic leader or a national hero|
|13|Uncharismatic challenger|The candidate of the challenging party is dull and lackluster|

From the above, one can see that this model is extremely qualitative. While it has been remarkably successful and can be used to corroborate the results of our model, it has its drawbacks.

The obvious disadvantage is the lack of a quantitative measure. Binary true-false 'keys' are not especially reassuring, since they give no sense of how strongly each factor applies. Moreover, it

## Methodology

The report begins with an analysis of existing election-prediction models and then introduces a slightly adjusted model to predict the 2024 United States presidential election. To simplify the forecast of what is expected to be one of the closest elections in United States history, we will not consider third-party and independent candidates, except where a candidate has enough momentum to raise their vote share into the upper single digits and thereby act as a potential spoiler.

Additionally, we will analyze only competitive races, i.e., states rated lean, likely, or toss-up in the non-partisan Cook Political Report's Electoral College ratings. In other words, we will not analyze solid Republican and solid Democratic states that have voted for their candidate of choice by double-digit margins in the last two elections. Table 1.1 lists the states analyzed by our model under their political rating, together with the number of electoral votes allocated to each as a result of the 2020 United States Census.

|Likely Democratic|Leans Democratic|Toss-Up|Leans Republican|Likely Republican|
|-----------------|----------------|-------|----------------|-----------------|
|Maine (2)||Arizona (11)||Florida (30)|
|Minnesota (10)||Georgia (16)||Iowa (6)|
|Nebraska (NE-02) (1)||Michigan (15)||Maine (ME-02) (1)|
|New Hampshire (4)||Nevada (6)||Texas (40)|
|New Mexico (5)||North Carolina (16)|||
|Virginia (13)||Pennsylvania (19)|||
|||Wisconsin (10)|||

To provide context, each state has a minimum of 3 electoral votes, and electoral votes are reapportioned among the states according to population after every census, with the national total fixed at 538. Reapportionment also leads to the redistricting of congressional districts. The number of electoral votes for each state equals the size of its congressional delegation, i.e., its two U.S. senators plus its U.S. representatives.

Note that Maine and Nebraska allocate their electoral votes differently. The other 48 states and the District of Columbia use a *winner-takes-all* method, where the candidate who wins the statewide popular vote wins all of the state's electoral votes. Maine and Nebraska allocate two of their electoral votes, representing their U.S. senators, to the winner of the statewide popular vote, and one electoral vote to the winner of each congressional district, representing their U.S. House representatives. This is also why we will not analyze the national popular vote: the election is decided state by state in the Electoral College, so the national total does not determine who wins.
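The two allocation methods can be contrasted with a short sketch. The function names and the "Party A" / "Party B" labels are hypothetical and serve only to illustrate the rules described above; the example is not tied to any actual result.

```python
from collections import Counter
from typing import Mapping


def allocate_winner_takes_all(statewide_winner: str, electoral_votes: int) -> Counter:
    """All of the state's electoral votes go to the statewide winner."""
    return Counter({statewide_winner: electoral_votes})


def allocate_by_district(statewide_winner: str,
                         district_winners: Mapping[str, str]) -> Counter:
    """Maine/Nebraska method: two votes to the statewide winner,
    plus one vote to the winner of each congressional district."""
    allocation = Counter({statewide_winner: 2})
    for winner in district_winners.values():
        allocation[winner] += 1
    return allocation


# Hypothetical state with two congressional districts (4 electoral votes).
split = allocate_by_district("Party A", {"CD-1": "Party A", "CD-2": "Party B"})
unified = allocate_winner_takes_all("Party A", 4)
print(dict(split))
print(dict(unified))
```

Under the district method the same statewide result can yield a split delegation, which is why NE-02 and ME-02 appear as separate entries in Table 1.1.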

In total, 205 electoral votes are considered competitive in the 2024 presidential election. Notice that no state is rated as merely leaning towards either party, which reflects the increase in political polarization in the United States.

From the above discussion, one can assume that by the end of the night on November 5, 2024, decision desks across the country will have called 191 electoral votes for the Harris-Walz Democratic ticket and 142 electoral votes for the Trump-Vance Republican ticket by default. Some of these states will be called with less than a quarter of the votes counted, as one of the candidates will gain a mathematically insurmountable lead within hours of the polls closing. The remaining 205 electoral votes will be called only after an arduous and excruciatingly long counting process.
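The arithmetic behind these figures can be checked directly from Table 1.1. The sketch below copies the electoral votes from the table and uses the 191 and 142 default totals from the text; the 270-vote threshold is the simple majority of 538, and the variable names are illustrative.

```python
# Electoral votes per rating, copied from Table 1.1.
competitive = {
    "Likely Democratic": {
        "Maine": 2, "Minnesota": 10, "Nebraska (NE-02)": 1,
        "New Hampshire": 4, "New Mexico": 5, "Virginia": 13,
    },
    "Toss-Up": {
        "Arizona": 11, "Georgia": 16, "Michigan": 15, "Nevada": 6,
        "North Carolina": 16, "Pennsylvania": 19, "Wisconsin": 10,
    },
    "Likely Republican": {
        "Florida": 30, "Iowa": 6, "Maine (ME-02)": 1, "Texas": 40,
    },
}

SAFE_DEMOCRATIC = 191   # votes assumed called for Harris-Walz by default
SAFE_REPUBLICAN = 142   # votes assumed called for Trump-Vance by default
TOTAL_VOTES = 538
MAJORITY = TOTAL_VOTES // 2 + 1  # 270 electoral votes are needed to win

votes_by_rating = {rating: sum(s.values()) for rating, s in competitive.items()}
competitive_total = sum(votes_by_rating.values())

# Competitive votes each ticket still needs on top of its default total.
democratic_needed = MAJORITY - SAFE_DEMOCRATIC
republican_needed = MAJORITY - SAFE_REPUBLICAN

print(votes_by_rating)
print(competitive_total, SAFE_DEMOCRATIC + SAFE_REPUBLICAN + competitive_total)
print(democratic_needed, republican_needed)
```

Under these assumptions the Democratic ticket needs 79 of the 205 competitive votes and the Republican ticket needs 128.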

## The model



### Nate Silver and FiveThirtyEight

In [60]:
import pandas as pd
import numpy as np

# Load the datasets
polls = pd.read_csv('president_polls.csv')
historical_polls = pd.read_csv('president_polls_historical.csv')
averages = pd.read_csv('presidential_general_averages.csv')
prev = pd.read_csv('1976-2020-president.csv')

# Clean the data: filter relevant columns and states
relevant_columns = ['poll_id', 'state', 'numeric_grade', 'answer', 'pct']
polls = polls[relevant_columns]
polls = polls.dropna()  # Drop rows with missing values

# Filter only lean, likely, and toss-up states
state_list = ['Florida', 'Maine', 'Minnesota', 'Nebraska CD-2', 'New Hampshire', 'New Mexico', 'Virginia', 'Maine CD-2', 'Texas', 'Iowa', 'Michigan', 'Pennsylvania', 'Wisconsin', 'Georgia', 'North Carolina', 'Arizona', 'Nevada']
candidates = ['Trump', 'Harris']

polls = polls[polls['state'].isin(state_list)]
polls = polls[polls['answer'].isin(candidates)]


In [62]:
def calculate_weighted_poll_average(df):
    """Return {state: {candidate: grade-weighted mean of pct}} for the tracked states."""
    state_results = {}
    for state in state_list:
        state_polls = df[df['state'] == state]
        candidate_results = {}
        for candidate in candidates:
            candidate_polls = state_polls[state_polls['answer'] == candidate]
            if candidate_polls.empty:
                continue

            # Weight each poll by its pollster's numeric grade. np.average
            # normalizes the weights itself, so nothing is written back to
            # the filtered slice of the DataFrame.
            weighted_avg = np.average(candidate_polls['pct'],
                                      weights=candidate_polls['numeric_grade'])
            candidate_results[candidate] = weighted_avg
        state_results[state] = candidate_results
    return state_results

# Apply the function
weighted_avg_results = calculate_weighted_poll_average(polls)
print("Weighted Polling Averages by State:")
# Use a separate loop variable so the `averages` DataFrame is not overwritten,
# and sum the dictionary's values (iterating a dict yields its string keys).
for state, state_averages in weighted_avg_results.items():
    print(f"{state}: {state_averages}, {sum(state_averages.values())}")

In [36]:
averages = pd.read_csv('honors-option/presidential_general_averages.csv')
averages[averages['candidate'] == 'Donald Trump']

|Unnamed: 0|candidate|date|pct_trend_adjusted|state|cycle|party|pct_estimate|hi|lo|
|-|-|-|-|-|-|-|-|-|-|
|1|Donald Trump|2020-11-03|57.36126|Alabama|2020|||||
|3|Donald Trump|2020-11-02|57.36126|Alabama|2020|||||
|5|Donald Trump|2020-11-01|57.47665|Alabama|2020|||||
|7|Donald Trump|2020-10-31|56.96877|Alabama|2020|||||
|9|Donald Trump|2020-10-30|56.94080|Alabama|2020|||||
|...|...|...|...|...|...|...|...|...|...|
|21291|Donald Trump|2020-10-07|67.68261|Wyoming|2020|||||
|21293|Donald Trump|2020-10-06|67.79659|Wyoming|2020|||||
|21295|Donald Trump|2020-10-05|67.89560|Wyoming|2020|||||
|21297|Donald Trump|2020-10-04|67.95186|Wyoming|2020|||||


In [1]:
import numpy as np
import pandas as pd

# Number of simulations
n_simulations = 10000

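# Electoral college votes for each state (example for a few states).
# These counts are the 2012-2020 apportionment; after the 2020 Census the
# figures are California 54, Texas 40, Florida 30, and Ohio 17.
# 'poll' is an example value used below as the probability of winning the state.
states = {
    'California': {'votes': 55, 'poll': 0.60},
    'Texas': {'votes': 38, 'poll': 0.45},
    'Florida': {'votes': 29, 'poll': 0.48},
    'Ohio': {'votes': 18, 'poll': 0.50},
    # Add all other states
}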

# Function to simulate election outcomes
def simulate_election(states, n_simulations):
    results = []
    for _ in range(n_simulations):
        electoral_votes = 0
        for state, data in states.items():
            poll_result = np.random.rand() < data['poll']  # Simulate based on poll
            if poll_result:
                electoral_votes += data['votes']  # Add votes if candidate wins state
        results.append(electoral_votes)
    return results

# Run simulations
results = simulate_election(states, n_simulations)

# Calculate probabilities of different outcomes
results_df = pd.DataFrame(results, columns=['Electoral Votes'])
outcome_probabilities = results_df['Electoral Votes'].value_counts(normalize=True)

# Display results
print(outcome_probabilities)


Electoral Votes
73     0.0868
55     0.0858
84     0.0821
102    0.0755
111    0.0700
93     0.0692
140    0.0640
122    0.0625
0      0.0578
18     0.0556
29     0.0550
47     0.0514
56     0.0478
38     0.0478
85     0.0450
67     0.0437
Name: proportion, dtype: float64
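The distribution above can be reduced to a single win probability by adding a ticket's default electoral votes and counting the simulations that reach 270. The following is a minimal sketch under loud assumptions: every competitive race is given a placeholder 0.5 win probability (not a polling estimate), states are simulated independently, and the helper name `win_probability` is hypothetical. The 191 default votes and the electoral vote counts come from the Methodology section.

```python
import numpy as np

# Electoral votes of the competitive races listed in Table 1.1.
competitive_votes = {
    'Maine': 2, 'Minnesota': 10, 'Nebraska CD-2': 1, 'New Hampshire': 4,
    'New Mexico': 5, 'Virginia': 13, 'Arizona': 11, 'Georgia': 16,
    'Michigan': 15, 'Nevada': 6, 'North Carolina': 16, 'Pennsylvania': 19,
    'Wisconsin': 10, 'Florida': 30, 'Iowa': 6, 'Maine CD-2': 1, 'Texas': 40,
}

# Placeholder: treat every competitive race as a coin flip (illustration only).
states = {name: {'votes': v, 'poll': 0.5} for name, v in competitive_votes.items()}


def win_probability(base_votes, states, n_simulations=10_000, threshold=270, seed=0):
    """Share of simulations in which base_votes plus simulated wins reach threshold.

    Each state is won independently with probability state['poll'].
    """
    rng = np.random.default_rng(seed)  # seeded so that runs are reproducible
    votes = np.array([s['votes'] for s in states.values()])
    probs = np.array([s['poll'] for s in states.values()])
    # One row per simulation, one column per state; True means the state is won.
    wins = rng.random((n_simulations, len(votes))) < probs
    totals = base_votes + (wins * votes).sum(axis=1)
    return float(np.mean(totals >= threshold))


print(win_probability(191, states))
```

Real forecasting models generally do not treat states as independent, since polling errors tend to be correlated across states, so this sketch likely understates the spread of outcomes.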
