# Texas Candidate Strength Analysis - Interactive Exploration

This notebook provides interactive analysis of candidate strength using district-level election data.

## What This Notebook Does

1. **Load Data**: Import district-level election results (House/Senate/Congressional)
2. **Calculate Metrics**: Compute candidate strength scores and performance indicators
3. **Visualize**: Create charts and maps showing candidate performance
4. **Compare**: Analyze candidates across elections and geographies

## Getting Started

Run each cell in order by pressing Shift+Enter.
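
The import cell selects the `seaborn-v0_8-darkgrid` plot style, which matplotlib only ships under that name from version 3.6 onward. If the style call fails on an older install, a fallback along these lines may help. This is a minimal sketch: `pick_style` is a hypothetical helper, not part of this notebook or of matplotlib.

```python
import matplotlib.pyplot as plt

def pick_style(candidates=("seaborn-v0_8-darkgrid", "seaborn-darkgrid")):
    """Return the first style name this matplotlib install knows, else 'default'."""
    # pick_style is a hypothetical helper introduced for illustration.
    for name in candidates:
        if name in plt.style.available:
            return name
    return "default"

style = pick_style()
plt.style.use(style)
print(f"Using plot style: {style}")
```

The older `seaborn-darkgrid` name is tried second because it is what pre-3.6 releases used for the same style.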

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from candidate_strength_model import CandidateStrengthAnalyzer

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', None)

# Set plot style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("Libraries imported successfully")

## 1. Load Data and Initialize Analyzer

Choose your geographic level:
- `'house'` - 150 State House districts (most granular)
- `'senate'` - 31 State Senate districts
- `'congressional'` - 38 U.S. Congressional districts
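
As a sanity check after loading, the district counts listed above can be compared with what the data actually contains. The sketch below only encodes the three counts from this list. `EXPECTED_DISTRICTS` and `validate_level` are hypothetical names, not part of `candidate_strength_model`.

```python
# Expected district counts per level, taken from the list above.
EXPECTED_DISTRICTS = {"house": 150, "senate": 31, "congressional": 38}

def validate_level(geographic_level: str) -> int:
    """Return the expected district count, or raise on an unknown level."""
    # validate_level is a hypothetical helper introduced for illustration.
    try:
        return EXPECTED_DISTRICTS[geographic_level]
    except KeyError:
        valid = ", ".join(sorted(EXPECTED_DISTRICTS))
        raise ValueError(
            f"Unknown level {geographic_level!r}; choose one of: {valid}"
        ) from None

print(validate_level("house"))
```

Assuming the loaded data has a `district` column (the detail tables later in this notebook do), comparing `analyzer.data['district'].nunique()` against this count is one way to spot missing districts.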

In [None]:
# Initialize analyzer at State House level (most detailed)
analyzer = CandidateStrengthAnalyzer(geographic_level='house')

print(f"Loaded {len(analyzer.data):,} records")
print(f"Years: {sorted(analyzer.data['year'].unique())}")
print(f"Offices: {analyzer.data['office'].unique().tolist()}")

# Preview data
analyzer.data.head(10)

## 2. Quick Race Analysis - 2024 U.S. Senate

Compare all candidates in the 2024 U.S. Senate race.

In [None]:
# Analyze 2024 Senate race
senate_2024 = analyzer.analyze_race(2024, 'U.S. Senate')

# Display key metrics
senate_2024[[
    'candidate', 'party', 'is_incumbent', 'statewide_pct',
    'overall_strength_score', 'avg_vs_top_ticket', 'win_rate',
    'avg_overperf_opposite_districts'
]]

### Interpretation

**Key Metrics:**
- `overall_strength_score`: Higher = stronger candidate (can be positive or negative)
- `avg_vs_top_ticket`: How much better/worse than Presidential candidate (positive = strong personal brand)
- `win_rate`: % of districts won
- `avg_overperf_opposite_districts`: Performance in unfavorable districts (crossover appeal)
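
To make two of these metrics concrete, here is a small worked example on made-up numbers. It shows one plausible reading of `avg_vs_top_ticket` and `win_rate`. The actual formulas live in `candidate_strength_model` and are not shown in this notebook, so treat the column names `candidate_pct` and `top_ticket_pct` and the 50% win threshold as assumptions.

```python
import pandas as pd

# Made-up numbers for three districts, for illustration only.
toy = pd.DataFrame({
    "district": [1, 2, 3],
    "candidate_pct": [55.0, 48.0, 41.0],   # candidate's vote share (assumed column)
    "top_ticket_pct": [53.0, 49.5, 40.0],  # same-party presidential share (assumed column)
})

# Per-district gap to the top of the ticket, in percentage points.
toy["vs_top_ticket"] = toy["candidate_pct"] - toy["top_ticket_pct"]

# Average gap across districts: positive means the candidate ran ahead.
avg_vs_top_ticket = toy["vs_top_ticket"].mean()

# Share of districts carried; "carried" is simplified to more than 50% here.
win_rate = (toy["candidate_pct"] > 50).mean() * 100

print(f"avg_vs_top_ticket: {avg_vs_top_ticket:+.2f}")
print(f"win_rate: {win_rate:.1f}%")
```

In this toy case the candidate runs ahead of the top of the ticket in two of three districts yet carries only one, which is why the two metrics are reported separately.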

In [None]:
# Visualize candidate strength comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Chart 1: Overall Strength Score
senate_major = senate_2024[senate_2024['party'].isin(['D', 'R'])]
colors = {'D': 'blue', 'R': 'red'}

ax1 = axes[0]
bars = ax1.barh(senate_major['candidate'], senate_major['overall_strength_score'], 
                color=[colors[p] for p in senate_major['party']])
ax1.set_xlabel('Overall Strength Score')
ax1.set_title('2024 Senate: Candidate Strength Comparison')
ax1.axvline(x=0, color='black', linestyle='--', alpha=0.3)
ax1.grid(axis='x', alpha=0.3)

# Chart 2: vs Top of Ticket
ax2 = axes[1]
bars = ax2.barh(senate_major['candidate'], senate_major['avg_vs_top_ticket'],
                color=[colors[p] for p in senate_major['party']])
ax2.set_xlabel('Performance vs. Presidential Candidate (percentage points)')
ax2.set_title('2024 Senate: Personal Vote Beyond Party')
ax2.axvline(x=0, color='black', linestyle='--', alpha=0.3)
ax2.grid(axis='x', alpha=0.3)

plt.tight_layout()
plt.show()

print("\nKey Insight:")
# Look up each candidate's average gap to the top of the ticket (percentage points)
vs_ticket = senate_major.set_index('candidate')['avg_vs_top_ticket']
for name, top in [('Allred', 'Harris'), ('Cruz', 'Trump')]:
    gap = vs_ticket[name]
    # Choose the verb from the sign so the message always matches the data
    verb = 'outperformed' if gap >= 0 else 'underperformed'
    print(f"{name} {verb} {top} by {abs(gap):.2f} points")

## 3. District-Level Deep Dive - Cruz vs. Allred

Analyze performance district by district.

In [None]:
# Get detailed district-level analysis
cruz_detail = analyzer.calculate_candidate_performance(2024, 'U.S. Senate', 'Cruz')
allred_detail = analyzer.calculate_candidate_performance(2024, 'U.S. Senate', 'Allred')

print(f"Cruz districts analyzed: {len(cruz_detail)}")
print(f"Allred districts analyzed: {len(allred_detail)}")

# Show Cruz's best districts (highest overperformance vs Trump)
print("\n=== CRUZ'S TOP 10 DISTRICTS (vs Trump) ===")
cruz_top = cruz_detail.nlargest(10, 'vs_top_ticket')[[
    'district', 'percentage', 'vs_statewide', 'vs_top_ticket', 
    'partisan_lean', 'dem_margin'
]]
display(cruz_top)

print("\n=== ALLRED'S TOP 10 DISTRICTS (vs Harris) ===")
allred_top = allred_detail.nlargest(10, 'vs_top_ticket')[[
    'district', 'percentage', 'vs_statewide', 'vs_top_ticket',
    'partisan_lean', 'dem_margin'
]]
display(allred_top)

In [None]:
# Scatter plot: District partisan lean vs. candidate performance
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Cruz
ax1 = axes[0]
scatter1 = ax1.scatter(cruz_detail['dem_margin'], cruz_detail['percentage'],
                       c=cruz_detail['vs_top_ticket'], cmap='RdYlGn',
                       s=50, alpha=0.6, edgecolors='black', linewidth=0.5)
ax1.axhline(y=50, color='black', linestyle='--', alpha=0.3)
ax1.axvline(x=0, color='black', linestyle='--', alpha=0.3)
ax1.set_xlabel('District Partisan Lean (Negative = More R, Positive = More D)')
ax1.set_ylabel('Cruz Vote %')
ax1.set_title('Cruz Performance by District Partisan Lean')
plt.colorbar(scatter1, ax=ax1, label='vs Trump (green=better, red=worse)')
ax1.grid(alpha=0.3)

# Allred
ax2 = axes[1]
scatter2 = ax2.scatter(allred_detail['dem_margin'], allred_detail['percentage'],
                       c=allred_detail['vs_top_ticket'], cmap='RdYlGn',
                       s=50, alpha=0.6, edgecolors='black', linewidth=0.5)
ax2.axhline(y=50, color='black', linestyle='--', alpha=0.3)
ax2.axvline(x=0, color='black', linestyle='--', alpha=0.3)
ax2.set_xlabel('District Partisan Lean (Negative = More R, Positive = More D)')
ax2.set_ylabel('Allred Vote %')
ax2.set_title('Allred Performance by District Partisan Lean')
plt.colorbar(scatter2, ax=ax2, label='vs Harris (green=better, red=worse)')
ax2.grid(alpha=0.3)

plt.tight_layout()
plt.show()

## 4. Historical Comparison - Cruz 2018 vs 2024

In [None]:
# Compare Cruz across elections
cruz_history = analyzer.compare_candidates_across_elections('Cruz')

print("=== TED CRUZ PERFORMANCE OVER TIME ===")
display(cruz_history[[
    'year', 'office', 'statewide_pct', 'overall_strength_score',
    'avg_vs_top_ticket', 'win_rate', 'districts_won',
    'avg_overperf_opposite_districts'
]])

In [None]:
# Visualize Cruz's trajectory
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Statewide performance
ax1 = axes[0, 0]
ax1.plot(cruz_history['year'], cruz_history['statewide_pct'], 
         marker='o', linewidth=2, markersize=10, color='red')
ax1.axhline(y=50, color='black', linestyle='--', alpha=0.3)
ax1.set_ylabel('Vote Percentage')
ax1.set_title('Cruz Statewide Performance')
ax1.grid(alpha=0.3)

# Strength score
ax2 = axes[0, 1]
ax2.plot(cruz_history['year'], cruz_history['overall_strength_score'],
         marker='o', linewidth=2, markersize=10, color='darkred')
ax2.axhline(y=0, color='black', linestyle='--', alpha=0.3)
ax2.set_ylabel('Strength Score')
ax2.set_title('Cruz Overall Strength Score')
ax2.grid(alpha=0.3)

# vs Top of Ticket
ax3 = axes[1, 0]
ax3.plot(cruz_history['year'], cruz_history['avg_vs_top_ticket'],
         marker='o', linewidth=2, markersize=10, color='orange')
ax3.axhline(y=0, color='black', linestyle='--', alpha=0.3)
ax3.set_ylabel('Percentage Points vs Top Ticket')
ax3.set_title('Cruz vs Top-of-Ticket (Negative = Underperformed)')
ax3.grid(alpha=0.3)

# Districts won
ax4 = axes[1, 1]
ax4.bar(cruz_history['year'], cruz_history['districts_won'], color='red', alpha=0.7)
# 75 is half of the 150 State House districts; adjust for other geographic levels
ax4.axhline(y=75, color='black', linestyle='--', alpha=0.3, label='Half of districts (75 of 150)')
ax4.set_ylabel('Districts Won')
ax4.set_title('Cruz Districts Won')
ax4.legend()
ax4.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

# Summary stats: select rows by year instead of relying on row order
print("\n=== KEY FINDINGS ===")
by_year = cruz_history.set_index('year')
pct_change = by_year.loc[2024, 'statewide_pct'] - by_year.loc[2018, 'statewide_pct']
strength_change = by_year.loc[2024, 'overall_strength_score'] - by_year.loc[2018, 'overall_strength_score']
ticket_change = by_year.loc[2024, 'avg_vs_top_ticket'] - by_year.loc[2018, 'avg_vs_top_ticket']
print(f"2018 → 2024 Statewide change: {pct_change:.2f} points")
print(f"2018 → 2024 Strength score change: {strength_change:.2f}")
print(f"2018 → 2024 vs Top-of-Ticket change: {ticket_change:.2f} points")
# Only state the conclusion when the computed changes support it
if pct_change > 0 and strength_change < 0:
    print("\nConclusion: Cruz's vote share rose in 2024 but his strength score DECLINED")
    print("(Consistent with a stronger R environment, not personal improvement)")

## 5. O'Rourke Comparison - 2018 vs 2022

In [None]:
# Compare O'Rourke across elections
beto_history = analyzer.compare_candidates_across_elections("O'Rourke")

print("=== BETO O'ROURKE PERFORMANCE OVER TIME ===")
display(beto_history[[
    'year', 'office', 'statewide_pct', 'overall_strength_score',
    'avg_vs_top_ticket', 'win_rate', 'districts_won'
]])

# Compare
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

metrics = [
    ('statewide_pct', 'Statewide Vote %'),
    ('overall_strength_score', 'Strength Score'),
    ('avg_vs_top_ticket', 'vs Top-of-Ticket')
]

for idx, (metric, title) in enumerate(metrics):
    ax = axes[idx]
    ax.bar(beto_history['year'].astype(str) + '\n' + beto_history['office'], 
           beto_history[metric], color='blue', alpha=0.7)
    ax.set_ylabel(metric.replace('_', ' ').title())
    ax.set_title(title)
    if metric != 'statewide_pct':
        ax.axhline(y=0, color='black', linestyle='--', alpha=0.3)
    ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

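print("\n=== KEY FINDING ===")
# Report the computed scores rather than hard-coded values
# (the original run recorded +1.39 in 2018 and -4.39 in 2022)
for _, row in beto_history.sort_values('year').iterrows():
    print(f"{row['year']} {row['office']}: strength score {row['overall_strength_score']:+.2f}")
print("A swing from positive to negative suggests lost crossover appeal after national exposure")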

## 6. Custom Analysis - Choose Your Own Race

Modify the cell below to analyze any race.

In [None]:
# CUSTOMIZE THIS: Change year and office to analyze different races
YEAR = 2022  # Options: 2018, 2020, 2022, 2024
OFFICE = 'Governor'  # Options: 'President', 'U.S. Senate', 'Governor', 'Lieutenant Governor', 'Attorney General'

# Run analysis
race_analysis = analyzer.analyze_race(YEAR, OFFICE)

print(f"\n=== {YEAR} {OFFICE.upper()} RACE ===")
display(race_analysis[[
    'candidate', 'party', 'statewide_pct', 'overall_strength_score',
    'avg_vs_top_ticket', 'win_rate', 'districts_won'
]].sort_values('overall_strength_score', ascending=False))

# Visualize
major_party = race_analysis[race_analysis['party'].isin(['D', 'R'])]

fig, ax = plt.subplots(figsize=(10, 6))
colors = {'D': 'blue', 'R': 'red'}
bars = ax.barh(major_party['candidate'], major_party['overall_strength_score'],
               color=[colors[p] for p in major_party['party']])
ax.set_xlabel('Overall Strength Score')
ax.set_title(f'{YEAR} {OFFICE}: Candidate Strength Comparison')
ax.axvline(x=0, color='black', linestyle='--', alpha=0.3)
ax.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.show()

## 7. Export Results for Further Analysis

In [None]:
# Export detailed district analysis to CSV
cruz_detail.to_csv('cruz_2024_district_analysis.csv', index=False)
allred_detail.to_csv('allred_2024_district_analysis.csv', index=False)

print("Exported Cruz district analysis to: cruz_2024_district_analysis.csv")
print("Exported Allred district analysis to: allred_2024_district_analysis.csv")
print("\nThese files contain detailed performance metrics for every district")

## Next Steps

1. **Try different geographic levels**: Change `geographic_level` in Section 1 to `'senate'` or `'congressional'`
2. **Analyze other races**: Modify `YEAR` and `OFFICE` in Section 6
3. **Deep dive into specific districts**: Filter `cruz_detail` or `allred_detail` for districts of interest
4. **Compare multiple candidates**: Create your own comparison visualizations
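
For step 3, filtering a detail table might look like the sketch below. It uses a stand-in DataFrame with made-up values so it runs on its own. The column names follow the ones displayed in Section 3, and the 5-point "competitive" threshold is an assumption chosen for illustration.

```python
import pandas as pd

# Stand-in for cruz_detail with made-up values; column names follow the notebook.
detail = pd.DataFrame({
    "district": [10, 47, 112, 134],
    "dem_margin": [-22.0, 3.5, -4.0, 18.0],
    "vs_top_ticket": [-0.8, -2.1, -1.4, -3.0],
})

# "Competitive" is an assumed threshold: within 5 points either way.
competitive = detail[detail["dem_margin"].abs() <= 5]

# Largest drop-off from the top of the ticket first.
competitive = competitive.sort_values("vs_top_ticket")
print(competitive)
```

Swapping `detail` for `cruz_detail` or `allred_detail` should give the same view on the real data, provided those columns are present.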

For more advanced analysis, see: `02_advanced_candidate_analysis.ipynb`