# **Bayesian Statistics**

There are two fundamental philosophies of statistical inference: the frequentist and the Bayesian. The frequentist approach interprets probability as the long-run frequency of events across repeated trials and treats parameters as fixed but unknown constants. In contrast, the Bayesian approach treats parameters as random variables with associated probability distributions. Initial beliefs about the parameters (the prior) are combined with the observed data to yield posterior distributions. By formally incorporating prior knowledge, such as historical patterns, expert judgment, or previous datasets, the Bayesian approach can produce more stable estimates, especially when data are limited or noisy. It is also well suited to continuous data monitoring, since the posterior from one update can serve as the prior for the next as new information becomes available.

# **Application: Predicting Election Results with Pyro**

Bayesian approaches are a natural fit for election forecasting because they let us integrate what we already know about past elections with whatever fresh polling data we have on hand. By capturing long-term trends and correlations between states, a Bayesian model can tie the states together rather than treating each one as an independent coin flip. We use Pyro here because it lets us build probabilistic models directly in Python while keeping access to its distribution, sampling, and inference tools.

The concept is straightforward: use previous elections to build a prior belief about each state, use poll data from the swing states to refine those beliefs, and finally run thousands of simulations to estimate how often the Democrats win at least 270 electoral votes.

The model gives an initial probability of around 0.68 that the Democrats will win before any polls are conducted. The 2012 baseline and historical voting trends are reflected in this figure.

Using two different forms of polling data, the notebook next generates two posterior predictions:
  1. Synthetic Poll (generated from the true 2016 state preferences)

  *   A posterior of about 0.58 is produced when the model is given a poll that reflects the actual underlying 2016 preferences. This is a positive indicator: although the poll only covered swing states and contained sampling noise, the model responded by lowering the probability of a Democratic victory while retaining some uncertainty.
  2. Scaled Actual 2016 Vote Shares

  *   A similar update is obtained when the actual 2016 vote percentages are scaled to the same poll size. In this way, it behaves like a real "in-cycle" poll and demonstrates how the model would have changed expectations throughout the course of the campaign.



## **Bayes' Theorem**
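
For reference, the rule behind every update in this notebook can be written as follows. The symbols are chosen here to match the code: $\alpha$ stands for the latent state-level log-odds (`alpha_samples`) and $y$ for the observed poll counts (`y_obs`).

$$
p(\alpha \mid y) = \frac{p(y \mid \alpha)\, p(\alpha)}{p(y)},
\qquad
p(y) = \int p(y \mid \alpha)\, p(\alpha)\, d\alpha
$$

Here $p(\alpha)$ is the prior, $p(y \mid \alpha)$ is the likelihood of the poll, and $p(\alpha \mid y)$ is the posterior. The denominator $p(y)$ is a normalizing constant that does not depend on $\alpha$.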

In [None]:
# This file includes code adapted from Pyro tutorials
# Source: https://pyro.ai/examples/elections.html
# Licensed under the Apache License, Version 2.0

# Simplified Bayesian election model using swing states only
# Outputs TWO posterior results:
# 1. Using synthetic poll generated from 2016 results
# 2. Using actual 2016 vote percentages scaled to poll size

!pip install pyro-ppl
import pandas as pd
import torch
import numpy as np
import pyro
import pyro.distributions as dist

BASE_URL = "https://raw.githubusercontent.com/pyro-ppl/datasets/master/us_elections/"

# ============================================================
# LOAD DATA
# ============================================================

electoral_college_votes = pd.read_pickle(BASE_URL + "electoral_college_votes.pickle")
ec_votes_tensor = torch.tensor(electoral_college_votes.values,
                               dtype=torch.float).squeeze()

frame = pd.read_pickle(BASE_URL + "us_presidential_election_data_historical.pickle")

# Historical swing states (2000–2020)
swing_states = ['FL','PA','MI','WI','OH','NC','GA','NV', 'CO', 'NH']
swing_indices = [frame.index.get_loc(st) for st in swing_states]

In [None]:
# ============================================================
# PRIOR FROM HISTORICAL DATA
# ============================================================

results_2012 = torch.tensor(frame[2012].values, dtype=torch.float)
prior_mean = torch.log(results_2012[..., 0] / results_2012[..., 1])

idx = 2 * torch.arange(10)
all_results = torch.tensor(frame.values, dtype=torch.float)
logits = torch.log(all_results[..., idx] / all_results[..., idx + 1]).transpose(0, 1)

mean = logits.mean(0)
sample_cov = (1/(logits.shape[0] - 1)) * (
    (logits.unsqueeze(-1) - mean) * (logits.unsqueeze(-2) - mean)
).sum(0)

prior_covariance = sample_cov + 0.01 * torch.eye(sample_cov.shape[0])
prior_dist = dist.MultivariateNormal(prior_mean, covariance_matrix=prior_covariance)

This block constructs the prior distribution over each state's latent Democratic vs. Republican preference in 2016.

Here, we take the Democratic and Republican vote totals from 2012 and convert them into log-odds, which serve as the expected party preference for each state in 2016. We use 2012 as the prior mean because it is the most recent election before 2016, and in election modeling the previous result is usually the single best predictor of the next.

We then use data from 1976 to 2012 to estimate the covariance, since nearly 40 years of history reveal how volatile each state is and how states tend to move together.

We also add a small diagonal term (0.01 times the identity matrix) to the covariance. With only ten elections for 51 states, the sample covariance on its own is rank-deficient, so this regularization keeps the matrix positive definite and guards against overconfidence in the estimated correlations.
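
The same three steps (log-odds, sample covariance, diagonal regularization) can be sketched in plain Python on a tiny made-up dataset. The state names and vote totals below are hypothetical and exist only to show the arithmetic; the notebook itself does this with PyTorch tensors.

```python
import math

# Hypothetical two-party vote totals (Dem, Rep) for two made-up states
# across three past elections; the numbers are illustrative only.
history = {
    "State A": [(520, 480), (490, 510), (530, 470)],
    "State B": [(450, 550), (470, 530), (440, 560)],
}

def log_odds(dem, rep):
    """Log of the Democratic-to-Republican vote ratio; positive means a Dem lead."""
    return math.log(dem / rep)

names = list(history)
n_elections = 3

# logits[t][i] is the log-odds of state i in election t
logits = [[log_odds(*history[s][t]) for s in names] for t in range(n_elections)]

# Per-state mean log-odds across elections
means = [sum(row[i] for row in logits) / n_elections for i in range(len(names))]

def sample_cov(i, j):
    """Unbiased sample covariance between states i and j."""
    total = sum((row[i] - means[i]) * (row[j] - means[j]) for row in logits)
    return total / (n_elections - 1)

RIDGE = 0.01  # same diagonal term the notebook adds to the covariance
cov = [
    [sample_cov(i, j) + (RIDGE if i == j else 0.0) for j in range(len(names))]
    for i in range(len(names))
]

# The most recent election supplies the prior mean, as 2012 does in the notebook.
prior_mean = logits[-1]
```

The off-diagonal entry `cov[0][1]` is what lets a surprise in one state shift beliefs about the other.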

In [None]:
# ============================================================
# NATIONAL OUTCOME FUNCTION
# ============================================================

def election_winner(alpha_logits):
    dem_win_state = (alpha_logits > 0).float()
    dem_votes = ec_votes_tensor * dem_win_state
    return (dem_votes.sum() >= 270).float()


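This function computes the national election outcome given a vector of state-level log-odds. A state counts as a Democratic win when its log-odds are positive, and the function returns 1 when the Democratic states together carry at least 270 electoral votes.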
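
As a quick illustration of the same thresholding logic, here is a plain-Python toy with three hypothetical states. The state labels, electoral votes, and the threshold of 11 (standing in for 270) are all made up for this sketch.

```python
# Hypothetical electoral votes and log-odds for a three-state toy country.
ec_votes = {"A": 10, "B": 6, "C": 5}
alpha = {"A": -0.20, "B": 0.05, "C": 0.30}
THRESHOLD = 11  # majority of the 21 toy votes, playing the role of 270

def toy_winner(alpha, ec_votes, threshold):
    """Return 1.0 if states with positive log-odds carry at least `threshold` votes."""
    dem_votes = sum(votes for state, votes in ec_votes.items() if alpha[state] > 0)
    return float(dem_votes >= threshold)

# States B and C lean Democratic and together hold 11 votes.
result = toy_winner(alpha, ec_votes, THRESHOLD)
```

Because the outcome depends only on the sign of each log-odds value, a state that is barely positive counts the same as a landslide.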

In [None]:
# ============================================================
# POSTERIOR INFERENCE VIA IMPORTANCE SAMPLING
# ============================================================

def posterior_win_prob_given_y(y_obs, allocation, num_alpha_samples=5000):
    """Approximate P(Dem win | observed poll y_obs)."""
    alpha_samples = prior_dist.sample((num_alpha_samples,))
    dem_win = torch.stack([election_winner(a) for a in alpha_samples])

    binom = dist.Binomial(total_count=allocation, logits=alpha_samples)
    log_lik = binom.log_prob(y_obs).sum(-1)

    maxlog = log_lik.max()
    weights = torch.exp(log_lik - maxlog)

    return ((weights * dem_win).sum() / weights.sum()).clamp(1e-6, 1 - 1e-6)

This function computes the posterior probability of a Democratic win given poll data. It samples state-level preferences from the prior, checks which scenarios result in a Democratic win, and weights each scenario by how likely it is to produce the observed poll results. The weighted average of these outcomes gives the posterior probability, updating our prior belief based on the poll data.
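
To make the reweighting step concrete, here is a minimal self-normalized importance sampling sketch for a single hypothetical state, written with only the standard library. The prior parameters and poll numbers are assumptions chosen for illustration, and the constant binomial coefficient is dropped because it cancels in the ratio of weighted sums.

```python
import math
import random

random.seed(0)

# Assumed toy setup: one state, prior log-odds ~ Normal(0.1, 0.3^2),
# and a poll in which 70 of 150 respondents favor the Democrat.
PRIOR_MEAN, PRIOR_SD = 0.1, 0.3
N_POLLED, Y_OBS = 150, 70

def binom_log_lik(alpha, y, n):
    """Binomial log-likelihood in alpha, without the constant log C(n, y)."""
    p = 1.0 / (1.0 + math.exp(-alpha))
    return y * math.log(p) + (n - y) * math.log(1.0 - p)

# Step 1: draw candidate log-odds from the prior.
samples = [random.gauss(PRIOR_MEAN, PRIOR_SD) for _ in range(20000)]

# Step 2: score each draw by how well it explains the poll.
log_w = [binom_log_lik(a, Y_OBS, N_POLLED) for a in samples]

# Step 3: subtract the max before exponentiating for numerical stability.
max_log_w = max(log_w)
weights = [math.exp(lw - max_log_w) for lw in log_w]

# Step 4: compare the unweighted (prior) and weighted (posterior) win rates.
dem_win = [1.0 if a > 0 else 0.0 for a in samples]
prior_prob = sum(dem_win) / len(dem_win)
posterior_prob = sum(w * d for w, d in zip(weights, dem_win)) / sum(weights)
```

Since the poll shows the Democrat below 50%, the weighted estimate `posterior_prob` ends up lower than `prior_prob`, which mirrors the direction of the update in the notebook.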

In [None]:

# ============================================================
# PRIOR DEM WIN PROBABILITY (PRINT THIS)
# ============================================================

print("\n==============================================")
print("Computing PRIOR distribution…")
print("==============================================")

alpha_prior_samples = prior_dist.sample((25000,))
prior_wins = torch.stack([election_winner(a) for a in alpha_prior_samples])
prior_prob = prior_wins.mean().item()

print(f"\nPrior probability of DEMOCRATIC win (national): {prior_prob:.4f}")
print("\n==============================================\n")



Computing PRIOR distribution…

Prior probability of DEMOCRATIC win (national): 0.6783




This block computes the prior probability of a Democratic win before seeing any poll data. It samples many scenarios from the prior distribution, checks which ones result in a Democratic victory using the election_winner function, and averages these outcomes to estimate the national win probability based solely on historical data and 2012 results.
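
Because this prior probability is itself a Monte Carlo estimate, it carries some simulation error. A rough way to gauge it, assuming the 25,000 prior draws are independent, is the usual standard error of a proportion:

```python
import math

p_hat = 0.6783   # prior win probability printed by the cell above
n_draws = 25000  # number of prior samples used in the notebook

# Standard error of a Monte Carlo estimate of a proportion
std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_draws)

# Approximate 95% interval half-width (normal approximation)
half_width = 1.96 * std_err
```

The standard error comes out at about 0.003, so the third decimal place of the printed probability will vary from run to run.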

In [None]:

# ============================================================
# LOAD TRUE 2016 RESULTS
# ============================================================

test_data = pd.read_pickle(BASE_URL + "us_presidential_election_data_test.pickle")
results_2016 = torch.tensor(test_data.values, dtype=torch.float)
true_alpha_2016 = torch.log(results_2016[..., 0] / results_2016[..., 1])


This block loads the actual 2016 election results for each state and converts them into log-odds to represent the true underlying Democratic versus Republican preference. These values are later used to generate the synthetic and scaled polls that are fed to the model.

In [None]:
# ============================================================
# POLL SIZE AND ALLOCATION
# ============================================================

TOTAL_POLL = 1500
allocation = torch.zeros(51)
per_state = TOTAL_POLL // len(swing_states)

for st in swing_states:
    allocation[frame.index.get_loc(st)] = per_state

# Remainder → Florida
allocation[frame.index.get_loc('FL')] += TOTAL_POLL - allocation.sum()


This block sets up the poll size and how respondents are allocated across states. A total of 1,500 poll respondents is distributed evenly among the swing states, with any remainder added to Florida. With ten swing states, each state receives 150 respondents and the remainder is zero. All other states receive no respondents. This allocation is used for both the synthetic and the scaled 2016 polls.

In [None]:

# ============================================================
# OPTION 1 — SYNTHETIC POLL GENERATED FROM MODEL
# ============================================================

print("\nGenerating SYNTHETIC poll results based on 2016 true preferences...\n")

y_synth = torch.zeros(51)

for st in swing_states:
    idx = frame.index.get_loc(st)
    total_polled = allocation[idx]

    p_dem = torch.sigmoid(true_alpha_2016[idx])
    y_synth[idx] = dist.Binomial(total_count=total_polled, probs=p_dem).sample()

# ============================================================
# OPTION 2 — ACTUAL 2016 PERCENTAGES AS POLL RESULTS
# ============================================================

print("\nGenerating poll using ACTUAL 2016 vote percentages...\n")

y_real = torch.zeros(51)

for st in swing_states:
    idx = frame.index.get_loc(st)
    total_polled = allocation[idx].item()

    dem_votes = results_2016[idx, 0]
    rep_votes = results_2016[idx, 1]
    total_votes = dem_votes + rep_votes

    p_dem = dem_votes / total_votes
    y_real[idx] = (p_dem * total_polled).round()



Generating SYNTHETIC poll results based on 2016 true preferences...


Generating poll using ACTUAL 2016 vote percentages...



Here, we generate two types of polls for the swing states. The first, a synthetic poll, simulates survey results by sampling from a binomial distribution using the true 2016 state preferences, and serves as a sanity check for the model. The second scales the actual 2016 vote percentages to the poll size to create a “realistic” poll, which lets us see how the model's posterior responds to data that match the actual election outcome.
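
The difference between the two poll types is easiest to see for a single state. The sketch below uses a hypothetical Democratic share and the standard library only; it is not the notebook's tensor code.

```python
import random

random.seed(1)

# Hypothetical state: true two-party Democratic share and poll size.
P_DEM = 0.48
N_POLLED = 150

# Option 1: synthetic poll, a binomial draw built from N_POLLED Bernoulli trials.
# The count varies from run to run (sampling noise).
y_synth = sum(1 for _ in range(N_POLLED) if random.random() < P_DEM)

# Option 2: scaled vote share, the expected count rounded to an integer.
# This is deterministic and contains no sampling noise.
y_scaled = round(P_DEM * N_POLLED)
```

The scaled poll always lands on the expected count, while the synthetic poll scatters around it, so the two can lead to slightly different posteriors even though they come from the same underlying preferences.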

In [None]:
# ============================================================
# COMPUTE BOTH POSTERIORS
# ============================================================

posterior_synth = posterior_win_prob_given_y(y_synth, allocation)
posterior_real = posterior_win_prob_given_y(y_real, allocation)

tensor(0.5776)

Finally, we compute the posterior probability of a Democratic win for both the synthetic and the scaled 2016 polls. The posterior_win_prob_given_y function updates the prior belief based on the observed poll data, producing posterior estimates of how likely the Democrats are to win given either the simulated poll or the scaled 2016 results.

## **Conclusion & Further Reading**

This project explores a concise, end-to-end Bayesian election model. The notebook demonstrates how a model can move from a history-based expectation to a poll-informed prediction by using historical voting data as a multivariate prior, expressing state preferences as log-odds, and updating those beliefs with polling information through importance sampling. Despite polling only swing states and using a simple binomial likelihood, the model still captures the key trend of the 2016 election: Democrats begin with a strong historical lead, but their prospects decline once poll data that mirror the reality of 2016 are included.

The transparency of the entire system is what makes it so helpful. Every assumption is explicit. You can change the covariance to alter the degree of correlation between states, and you can build in extra variance to account for polling error. Because the underlying components (priors, likelihoods, and sampling) are kept separate and clearly structured, Pyro makes future adjustments straightforward.

There are some natural directions in which to take this further:



### **Enhancing via Markov Chain Monte Carlo (MCMC) Methods**




The notebook uses importance sampling, which is simple and effective for small models. Once the model becomes more expressive, with more states, more parameters, or turnout uncertainty, importance sampling from the prior starts to break down: the weights become unbalanced, and the majority of the samples contribute little to nothing.
Markov chain Monte Carlo (MCMC) methods address this issue by drawing samples that target the posterior directly instead of reweighting draws from the prior. MCMC generally scales better to higher-dimensional posteriors, where importance weights tend to collapse. This means you could add latent variables to account for poll bias, add hierarchical layers, or model polls in all 50 states with more confidence. MCMC can also give a more faithful picture of uncertainty.
Posterior samples from MCMC represent the whole shape of the distribution, not only the "best-fitting" portion, which helps with tail scenarios and correlated shifts between states. Every MCMC draw is a self-consistent "world" drawn from the updated posterior. Turning those draws into election simulations gives a richer set of outcomes, which is similar in spirit to how professional forecasters build their thousands of simulations.
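
As a minimal sketch of the idea, the code below runs a random-walk Metropolis sampler on a one-state toy posterior using only the standard library. It is not what Pyro does internally: Pyro provides `pyro.infer.MCMC` with a `NUTS` kernel, which would be the natural choice for the full model. The prior parameters, poll numbers, and step size here are assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

# Assumed toy target: one state's log-odds with a Normal(0.1, 0.3^2) prior
# and a binomial poll (70 of 150 for the Democrat). All numbers are illustrative.
PRIOR_MEAN, PRIOR_SD = 0.1, 0.3
N_POLLED, Y_OBS = 150, 70

def log_posterior(alpha):
    """Unnormalized log posterior: log prior plus binomial log-likelihood."""
    log_prior = -0.5 * ((alpha - PRIOR_MEAN) / PRIOR_SD) ** 2
    p = 1.0 / (1.0 + math.exp(-alpha))
    log_lik = Y_OBS * math.log(p) + (N_POLLED - Y_OBS) * math.log(1.0 - p)
    return log_prior + log_lik

def metropolis(n_steps, step_sd=0.2, start=0.0):
    """Random-walk Metropolis: propose a Gaussian step, accept with prob min(1, ratio)."""
    chain = []
    current, current_lp = start, log_posterior(start)
    for _ in range(n_steps):
        proposal = current + random.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if random.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)  # a rejected proposal repeats the current value
    return chain

chain = metropolis(20000)
draws = chain[2000:]  # discard an initial burn-in segment

posterior_mean = sum(draws) / len(draws)
posterior_win = sum(1 for a in draws if a > 0) / len(draws)
```

Every element of `draws` is an equally weighted sample, so there are no importance weights to degenerate. The trade-off is that successive draws are correlated, which is why practical samplers are checked with convergence diagnostics.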


### **Introducing Time Series Structure**


The model currently treats the election as a single, static update: one prior, one poll, and one posterior. In reality, polls arrive in sequence, not all at once, and campaigns change over time. Adding a time-series component lets the model capture this gradual evolution. One popular method is to represent each state's latent log-odds as a random walk or a simple autoregressive process, in which today's preference is yesterday's preference plus a small, normally distributed change. This favors smooth transitions over sudden jumps. With a temporal prior in place, the model can update the posterior each time fresh polling data comes in, creating a trajectory of state preferences leading up to Election Day. The time-series structure also helps distinguish actual opinion changes from poll noise, since it favors gradual shifts unless the evidence suggests otherwise. In effect, this transforms the model from a single-step update into one that tracks how uncertainty and voter preferences change over the course of the campaign.
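
One simple way to sketch the random-walk idea, without any inference library, is a grid-based filter for a single state: the belief over log-odds is stored on a grid, widened by a Gaussian step between polls, and reweighted by each new poll's binomial likelihood. The poll counts, drift size, and prior below are hypothetical, and a real model would track all states jointly.

```python
import math

# Assumed toy setup: one state, three weekly polls of 150 respondents each.
polls = [(150, 78), (150, 74), (150, 70)]  # (respondents, Dem supporters), made up
STEP_SD = 0.05                             # assumed weekly drift in log-odds

grid = [-1.0 + 0.01 * k for k in range(201)]  # candidate log-odds values

def normalize(weights):
    """Scale weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def gauss_kernel(x, mean, sd):
    """Unnormalized Gaussian density; the constant cancels after normalizing."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2)

# Initial belief: Normal(0.1, 0.3^2) prior on the log-odds.
belief = normalize([gauss_kernel(a, 0.1, 0.3) for a in grid])

trajectory = []  # posterior mean log-odds after each poll
for n_polled, y_dem in polls:
    # Predict: today's preference is yesterday's plus a small Gaussian step.
    predicted = [
        sum(b * gauss_kernel(a_new, a_old, STEP_SD) for a_old, b in zip(grid, belief))
        for a_new in grid
    ]
    # Update: reweight by the binomial likelihood of the new poll.
    updated = []
    for a, w in zip(grid, predicted):
        p = 1.0 / (1.0 + math.exp(-a))
        updated.append(w * p ** y_dem * (1.0 - p) ** (n_polled - y_dem))
    belief = normalize(updated)
    trajectory.append(sum(a * b for a, b in zip(grid, belief)))
```

With these made-up polls drifting toward the Republican, `trajectory` declines from one poll to the next, but more gradually than the raw poll numbers because each estimate is anchored to the previous belief.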