### Calculate the pairwise correlation of logits 
- Election odds imply a win probability for each swing state, and separately a probability of a sweep (all swing states won together)
- How correlated do the state outcomes have to be for the quoted sweep probability to be consistent with the per-state probabilities?
- Analytical approach: model an underlying continuous latent variable for each state as jointly multivariate Gaussian, then translate each one to success/failure in that swing state by thresholding at the matching percentile
    - Assume the correlation matrix has a single off-diagonal value (every pair of states is equally correlated)
    - What correlation is required to make prediction market pricing fair?
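
The setup above can be written out explicitly. This is a sketch under stated assumptions, not a derivation taken from the notebook: the symbols $q_i$ (per-state win probability), $M$ (shared factor) and $\varepsilon_i$ (state-specific noise) are introduced here for illustration, and the one-factor form only covers $0 \le \rho < 1$. Note that the code below thresholds with `latent_normals > norm.ppf(p)`, which by symmetry of the Gaussian corresponds to $q_i = 1 - p_i$.

```latex
% Equicorrelated latent Gaussians via one shared factor (assumes 0 <= rho < 1)
Z_i = \sqrt{\rho}\, M + \sqrt{1-\rho}\, \varepsilon_i,
\qquad M, \varepsilon_1, \dots, \varepsilon_n \overset{\text{iid}}{\sim} \mathcal{N}(0, 1)

% Threshold each latent variable at the percentile matching the state's odds
X_i = \mathbf{1}\{ Z_i \le \Phi^{-1}(q_i) \},
\qquad \Pr(X_i = 1) = q_i

% Conditional on M the states are independent, so the sweep probability is
\Pr(\text{sweep}) = \int_{-\infty}^{\infty} \varphi(m)
  \prod_{i=1}^{n} \Phi\!\left( \frac{\Phi^{-1}(q_i) - \sqrt{\rho}\, m}{\sqrt{1-\rho}} \right) dm
```

At $\rho = 0$ this reduces to $\prod_i q_i$, and it increases with $\rho$, so the "fair" correlation is the value of $\rho$ at which the integral equals the market's sweep probability.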


In [34]:
import numpy as np
from scipy.stats import norm
np.set_printoptions(threshold=np.inf)

In [49]:
def generate_correlated_bernoulli(n, p, rho, num_samples):
    """
    Generate correlated Bernoulli random variables.

    Parameters:
    - n: int, number of Bernoulli variables
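    - p: list of float, one quantile level per variable; because a sample is 1 when
      the latent normal exceeds norm.ppf(p[i]), variable i equals 1 with probability 1 - p[i]
    - rho: float, common pairwise correlation of the latent normals (not of the Bernoulli outcomes)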
    - num_samples: int, number of samples to generate

    Returns:
    - samples: ndarray, shape (num_samples, n), matrix of correlated Bernoulli samples
    """
    correlation_matrix = np.full((n, n), rho)
    np.fill_diagonal(correlation_matrix, 1)
    
    # Generate multivariate normal samples with the specified correlation matrix
    mean = np.zeros(n)
    latent_normals = np.random.multivariate_normal(mean, correlation_matrix, size=num_samples)
    
    # Thresholds are the p-quantiles of the standard normal
    thresholds = norm.ppf(p)
    
    # A variable is 1 when its latent normal exceeds its threshold, which happens
    # with probability 1 - p[i]; pass 1 - p to obtain P(1) = p
    samples = (latent_normals > thresholds).astype(int)
    
    return samples

def find_all_ones_samples(samples):
    """
    Find the indices of the samples (rows) in which every outcome is one.

    Parameters:
    - samples: ndarray, shape (num_samples, n), matrix of Bernoulli samples

    Returns:
    - indices: ndarray of int, indices of rows where all outcomes are ones
    """
    all_ones_indices = np.where(np.all(samples == 1, axis=1))[0]
    return all_ones_indices

n = 7  # Number of Bernoulli variables
p = [0.8, .66, .48, .58, .8, .5, .3]
rho = 0.6  # Correlation coefficient of the latent normals (the Bernoulli outcomes end up less correlated)
num_samples = 100000  # Number of samples to generate

correlated_bernoulli_samples = generate_correlated_bernoulli(n, p, rho, num_samples)
all_ones_indices = find_all_ones_samples(correlated_bernoulli_samples)

In [50]:
len(all_ones_indices) / num_samples

0.0675
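
Note that `rho` is the correlation of the latent normals, not of the 0/1 outcomes themselves. A minimal sketch of how one might check the gap for a single pair of states follows. The helper `binary_correlation` and its parameters are hypothetical names introduced here, and it thresholds with `<` so that each outcome is 1 with probability `p` (an assumption that differs from the `>` convention used above).

```python
import numpy as np
from scipy.stats import norm


def binary_correlation(p_a, p_b, rho, num_samples=200_000, seed=0):
    """Estimate the Pearson correlation of two thresholded latent normals.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=num_samples)
    # Outcome is 1 when the latent draw falls below the p-quantile,
    # so each outcome equals 1 with probability p.
    x = (z < norm.ppf([p_a, p_b])).astype(int)
    return np.corrcoef(x[:, 0], x[:, 1])[0, 1]


# Symmetric case: the closed form is (2 / pi) * arcsin(rho), about 0.41 for rho = 0.6
print(binary_correlation(0.5, 0.5, rho=0.6))
# Unequal marginals push the outcome correlation further below rho
print(binary_correlation(0.8, 0.3, rho=0.6))
```

The outcome-level correlation is always smaller in magnitude than the latent `rho`, which matters when comparing against correlations estimated from binary results.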

In [43]:
# Example usage:
n = 7 
p = [0.2, .34, .52, .42, .2, .5, .7]
rho = 0.4  # Correlation coefficient of the latent normals (the Bernoulli outcomes end up less correlated)
num_samples = 100000  # Number of samples to generate

correlated_bernoulli_samples = generate_correlated_bernoulli(n, p, rho, num_samples)
all_ones_indices = find_all_ones_samples(correlated_bernoulli_samples)

In [44]:
len(all_ones_indices) / num_samples

0.1106
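
The runs above evaluate the sweep probability at a hand-picked `rho`. To answer the original question directly, one could instead solve for the `rho` that reproduces a given sweep price. The sketch below is one possible approach under assumptions that are not in the notebook: it writes each latent normal as `sqrt(rho) * M + sqrt(1 - rho) * eps_i` with a shared standard normal `M`, integrates over `M` on a fixed grid, and uses `scipy.optimize.brentq` for the root. The names `sweep_probability`, `implied_rho`, `q`, and `target_sweep` are hypothetical, `q` holds per-state win probabilities, and only `0 <= rho < 1` is covered.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm


def sweep_probability(q, rho, grid_size=4001):
    """P(all outcomes are 1) for equicorrelated latent normals, 0 <= rho < 1."""
    q = np.asarray(q, dtype=float)
    t = norm.ppf(q)                          # per-state thresholds
    m = np.linspace(-8.0, 8.0, grid_size)    # grid for the shared factor M
    dm = m[1] - m[0]
    # Conditional on M = m the states are independent, each won with
    # probability Phi((t_i - sqrt(rho) * m) / sqrt(1 - rho)).
    cond = norm.cdf((t[None, :] - np.sqrt(rho) * m[:, None]) / np.sqrt(1.0 - rho))
    integrand = norm.pdf(m) * cond.prod(axis=1)
    return float(integrand.sum() * dm)       # Riemann sum over the grid


def implied_rho(q, target_sweep, lo=0.0, hi=0.999):
    """Latent correlation at which the model sweep probability hits the target.

    brentq raises ValueError if the target is not bracketed on [lo, hi].
    """
    return brentq(lambda r: sweep_probability(q, r) - target_sweep, lo, hi)


q = [0.8, 0.66, 0.48, 0.58, 0.8, 0.5, 0.3]
target_sweep = 0.10  # hypothetical market price for the sweep, for illustration only
print(implied_rho(q, target_sweep))
```

A deterministic integral avoids Monte Carlo noise inside the root finder; the simulation above remains a useful cross-check of the result.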