# Project Canary: Policy Momentum Score Calculation

This notebook combines the three data vectors into a single **Policy Momentum Score** that detects emerging energy policy trends.

## Scoring Methodology
The Policy Momentum Score is a weighted combination of:
- **40% Money Vector**: Lobbying and grant spending intensity
- **30% People Vector**: Job posting activity intensity
- **30% Paper Vector**: Regulatory filing intensity from the Department of Energy

## Alert System
Alerts are triggered when the score exceeds 1.5x the 3-month rolling average, indicating accelerating momentum.

In [25]:
# Import required libraries
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

print("Libraries imported successfully!")

Libraries imported successfully!


## 1. Load Cleaned Data

Load the raw vectors and recreate the normalized intensity scores from the data cleaning step.

In [26]:
# Recreate the cleaned data from the raw CSVs
# Note: in the actual workflow, the cleaned data would be loaded from CSVs
# exported by the data cleaning notebook; for this demo it is rebuilt here

# Load raw data
money_df = pd.read_csv('../data/money_vector.csv')
people_df = pd.read_csv('../data/people_vector.csv')
paper_df = pd.read_csv('../data/paper_vector.csv')

# Convert dates
money_df['date'] = pd.to_datetime(money_df['date'])
people_df['date'] = pd.to_datetime(people_df['date'])
paper_df['date'] = pd.to_datetime(paper_df['date'])

# Aggregate and normalize money vector
money_df['year_month'] = money_df['date'].dt.to_period('M')
money_monthly = money_df.groupby('year_month')['spend_amount'].sum().reset_index()
money_monthly['date'] = money_monthly['year_month'].dt.to_timestamp()
money_min = money_monthly['spend_amount'].min()
money_max = money_monthly['spend_amount'].max()
money_monthly['money_norm'] = (money_monthly['spend_amount'] - money_min) / (money_max - money_min)

# Aggregate and normalize people vector
people_df['year_month'] = people_df['date'].dt.to_period('M')
people_monthly = people_df.groupby('year_month').size().reset_index(name='job_count')
people_monthly['date'] = people_monthly['year_month'].dt.to_timestamp()
people_min = people_monthly['job_count'].min()
people_max = people_monthly['job_count'].max()
people_monthly['people_norm'] = (people_monthly['job_count'] - people_min) / (people_max - people_min)

# Aggregate and normalize paper vector
paper_df['year_month'] = paper_df['date'].dt.to_period('M')
paper_monthly = paper_df.groupby('year_month')['keyword_count'].sum().reset_index()
paper_monthly['date'] = paper_monthly['year_month'].dt.to_timestamp()
paper_min = paper_monthly['keyword_count'].min()
paper_max = paper_monthly['keyword_count'].max()
paper_monthly['paper_norm'] = (paper_monthly['keyword_count'] - paper_min) / (paper_max - paper_min)

print("✓ Data loaded and normalized")
print(f"  Money vector: {len(money_monthly)} months")
print(f"  People vector: {len(people_monthly)} months")
print(f"  Paper vector: {len(paper_monthly)} months")

✓ Data loaded and normalized
  Money vector: 12 months
  People vector: 4 months
  Paper vector: 13 months
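The three normalization blocks above apply the same min-max formula, `(x - min) / (max - min)`. A minimal sketch of how that pattern could be factored into one helper follows. The function name `min_max_normalize` and the sample values are hypothetical, and the zero-range guard is an added assumption, not something the notebook does.

```python
import pandas as pd

def min_max_normalize(values: pd.Series) -> pd.Series:
    """Scale a series to [0, 1]; return all zeros if every value is identical."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Avoid 0/0, which would otherwise produce NaN for every row
        return pd.Series(0.0, index=values.index)
    return (values - lo) / (hi - lo)

# Made-up monthly totals, for illustration only
sample = pd.Series([10.0, 25.0, 40.0])
normalized = min_max_normalize(sample)
flat = min_max_normalize(pd.Series([7.0, 7.0, 7.0]))

print(normalized.tolist())  # [0.0, 0.5, 1.0]
print(flat.tolist())        # [0.0, 0.0, 0.0]
```

Min-max scaling is sensitive to outliers: a single very large month pushes every other month toward 0, which is worth keeping in mind when reading the `money_norm` column.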


## 2. Merge Data Vectors

Combine all three vectors into a single dataframe by date.

In [27]:
# Start with money vector as base
merged_df = money_monthly[['date', 'money_norm']].copy()

# Merge people vector
merged_df = merged_df.merge(
    people_monthly[['date', 'people_norm']],
    on='date',
    how='outer'
)

# Merge paper vector
merged_df = merged_df.merge(
    paper_monthly[['date', 'paper_norm']],
    on='date',
    how='outer'
)

# Sort by date
merged_df = merged_df.sort_values('date').reset_index(drop=True)

# Fill missing values with 0: months absent from a vector (People covers only
# 4 of the 13 months) are treated as zero intensity
merged_df = merged_df.fillna(0)

print(f"✓ Merged dataset created with {len(merged_df)} months")
merged_df.head()

✓ Merged dataset created with 13 months


| | date | money_norm | people_norm | paper_norm |
|---|---|---|---|---|
| 0 | 2024-10-01 | 0.0 | 0.0 | 0.4 |
| 1 | 2024-11-01 | 0.0 | 0.0 | 0.066667 |
| 2 | 2024-12-01 | 0.000401 | 0.0 | 0.466667 |
| 3 | 2025-01-01 | 0.417111 | 0.0 | 0.466667 |
| 4 | 2025-02-01 | 0.000531 | 0.0 | 0.2 |
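The `fillna(0)` step treats a month with no rows in a vector the same as a month at that vector's minimum. Since the People vector covers only 4 of the 13 months, it may be useful to record which values were imputed before overwriting them. The sketch below uses a hypothetical two-vector dataset and a hypothetical `people_missing` column; it is one possible approach, not part of the original workflow.

```python
import pandas as pd

# Hypothetical data: "people" has no row for the first month
money = pd.DataFrame({
    "date": pd.to_datetime(["2025-01-01", "2025-02-01"]),
    "money_norm": [0.4, 0.0],
})
people = pd.DataFrame({
    "date": pd.to_datetime(["2025-02-01"]),
    "people_norm": [0.0],
})

merged = (
    money.merge(people, on="date", how="outer")
    .sort_values("date")
    .reset_index(drop=True)
)

# Record which months had no source data before the NaNs are overwritten
merged["people_missing"] = merged["people_norm"].isna()
merged["people_norm"] = merged["people_norm"].fillna(0)

missing_count = int(merged["people_missing"].sum())
print(merged[["date", "people_norm", "people_missing"]])
print(f"Months imputed for people vector: {missing_count}")
```

With the flag in place, a true minimum (February here) can still be told apart from an imputed zero (January).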


## 3. Calculate Policy Momentum Score

Apply the weighted formula:
- **Policy Momentum Score = (0.4 × Money) + (0.3 × People) + (0.3 × Paper)**

The weights reflect the relative importance of each signal:
- Money is weighted highest (40%) because funding is treated as the strongest early indicator
- People and Paper are equally weighted (30% each) as supporting signals
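As a worked example of the formula, the sketch below recomputes the January 2025 score from the rounded values shown in the merged table in section 2. The helper name `policy_momentum_score` is introduced here for illustration only.

```python
# Weights as stated in this notebook
WEIGHT_MONEY = 0.4
WEIGHT_PEOPLE = 0.3
WEIGHT_PAPER = 0.3

def policy_momentum_score(money_norm, people_norm, paper_norm):
    """Weighted sum of the three normalized intensities (each expected in [0, 1])."""
    return (
        WEIGHT_MONEY * money_norm
        + WEIGHT_PEOPLE * people_norm
        + WEIGHT_PAPER * paper_norm
    )

# The weights sum to 1, so the score also stays within [0, 1]
assert abs(WEIGHT_MONEY + WEIGHT_PEOPLE + WEIGHT_PAPER - 1.0) < 1e-9

# January 2025: money_norm=0.417111, people_norm=0.0, paper_norm=0.466667
jan_2025 = policy_momentum_score(0.417111, 0.0, 0.466667)
print(f"{jan_2025:.3f}")  # 0.307
```

The arithmetic is 0.4 × 0.417111 + 0.3 × 0.0 + 0.3 × 0.466667 ≈ 0.3068, which agrees with the notebook's value for that month up to rounding of the displayed inputs.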

In [28]:
# Define weights for each vector
WEIGHT_MONEY = 0.4
WEIGHT_PEOPLE = 0.3
WEIGHT_PAPER = 0.3

# Calculate weighted Policy Momentum Score
merged_df['policy_momentum_score'] = (
    (WEIGHT_MONEY * merged_df['money_norm']) +
    (WEIGHT_PEOPLE * merged_df['people_norm']) +
    (WEIGHT_PAPER * merged_df['paper_norm'])
)

print("✓ Policy Momentum Score calculated")
print(f"  Score range: {merged_df['policy_momentum_score'].min():.3f} to {merged_df['policy_momentum_score'].max():.3f}")
print(f"  Mean score: {merged_df['policy_momentum_score'].mean():.3f}")

merged_df[['date', 'money_norm', 'people_norm', 'paper_norm', 'policy_momentum_score']].head(10)

✓ Policy Momentum Score calculated
  Score range: 0.020 to 0.422
  Mean score: 0.194


| | date | money_norm | people_norm | paper_norm | policy_momentum_score |
|---|---|---|---|---|---|
| 0 | 2024-10-01 | 0.0 | 0.0 | 0.4 | 0.12 |
| 1 | 2024-11-01 | 0.0 | 0.0 | 0.066667 | 0.02 |
| 2 | 2024-12-01 | 0.000401 | 0.0 | 0.466667 | 0.140161 |
| 3 | 2025-01-01 | 0.417111 | 0.0 | 0.466667 | 0.306845 |
| 4 | 2025-02-01 | 0.000531 | 0.0 | 0.2 | 0.060212 |
| 5 | 2025-03-01 | 0.025132 | 0.0 | 0.2 | 0.070053 |
| 6 | 2025-04-01 | 0.04977 | 0.0 | 0.466667 | 0.159908 |
| 7 | 2025-05-01 | 0.180277 | 0.0 | 1.0 | 0.372111 |
| 8 | 2025-06-01 | 0.164909 | 0.0 | 0.133333 | 0.105964 |
| 9 | 2025-07-01 | 1.0 | 0.006452 | 0.066667 | 0.421935 |


## 4. Calculate Rolling Average

Compute a trailing 3-month rolling average (the current month and the two before it) to smooth out short-term fluctuations and establish a baseline.

In [29]:
# Calculate 3-month rolling average
# center defaults to False, so the window is trailing: it covers the current
# month and the two months before it
merged_df['rolling_avg_3m'] = merged_df['policy_momentum_score'].rolling(
    window=3,
    min_periods=1  # Allow calculation even for first 2 months
).mean()

print("✓ 3-month rolling average calculated")
merged_df[['date', 'policy_momentum_score', 'rolling_avg_3m']].head(10)

✓ 3-month rolling average calculated


| | date | policy_momentum_score | rolling_avg_3m |
|---|---|---|---|
| 0 | 2024-10-01 | 0.12 | 0.12 |
| 1 | 2024-11-01 | 0.02 | 0.07 |
| 2 | 2024-12-01 | 0.140161 | 0.093387 |
| 3 | 2025-01-01 | 0.306845 | 0.155668 |
| 4 | 2025-02-01 | 0.060212 | 0.169072 |
| 5 | 2025-03-01 | 0.070053 | 0.145703 |
| 6 | 2025-04-01 | 0.159908 | 0.096724 |
| 7 | 2025-05-01 | 0.372111 | 0.200691 |
| 8 | 2025-06-01 | 0.105964 | 0.212661 |
| 9 | 2025-07-01 | 0.421935 | 0.300003 |
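Because the window is trailing, each month's own score is part of its baseline. The sketch below reproduces the notebook's calculation on the first four scores from the table above and, for comparison, shows a hypothetical variant that builds the baseline from prior months only using `shift(1)`. The variant is an illustration of the design choice, not the method used in this notebook.

```python
import pandas as pd

# First four scores, copied from the output table above
scores = pd.Series([0.12, 0.02, 0.140161, 0.306845])

# Trailing window as used in this notebook: current month plus the two before it
trailing = scores.rolling(window=3, min_periods=1).mean()

# Hypothetical variant: shift by one month so the current score is excluded
prior_only = scores.shift(1).rolling(window=3, min_periods=1).mean()

comparison = pd.DataFrame({
    "score": scores,
    "trailing_avg": trailing,
    "prior_only_avg": prior_only,
})
print(comparison.round(6))
```

In the prior-only variant the first month has no baseline (NaN), and a spike does not inflate the average it is compared against.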


## 5. Create Alert System

Flag periods when the current score exceeds 1.5x the rolling average, indicating accelerating momentum that warrants attention.

In [30]:
# Define alert threshold: 1.5x rolling average
ALERT_MULTIPLIER = 1.5

# Create alert flag
merged_df['alert'] = merged_df['policy_momentum_score'] > (merged_df['rolling_avg_3m'] * ALERT_MULTIPLIER)

# Count alerts
alert_count = merged_df['alert'].sum()
alert_months = merged_df[merged_df['alert']]

print(f"✓ Alert system configured (threshold: {ALERT_MULTIPLIER}x rolling average)")
print(f"  Total alerts triggered: {alert_count} months")

if alert_count > 0:
    print("\nAlert periods:")
    for idx, row in alert_months.iterrows():
        print(f"  {row['date'].strftime('%Y-%m')}: Score {row['policy_momentum_score']:.3f} (vs avg {row['rolling_avg_3m']:.3f})")

✓ Alert system configured (threshold: 1.5x rolling average)
  Total alerts triggered: 4 months

Alert periods:
  2024-12: Score 0.140 (vs avg 0.093)
  2025-01: Score 0.307 (vs avg 0.156)
  2025-04: Score 0.160 (vs avg 0.097)
  2025-05: Score 0.372 (vs avg 0.201)
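The alert rule can be checked by hand for individual months. The sketch below applies the same comparison to two rows, using values copied from the output tables in this notebook; the helper name `is_alert` is hypothetical.

```python
ALERT_MULTIPLIER = 1.5  # same multiplier as the notebook

def is_alert(score, rolling_avg, multiplier=ALERT_MULTIPLIER):
    """True when the score is strictly above multiplier times the baseline."""
    return score > rolling_avg * multiplier

# December 2024: threshold is 1.5 * 0.093387 = 0.1400805, so the alert fires
dec_2024 = is_alert(0.140161, 0.093387)

# July 2025: threshold is 1.5 * 0.300003 = 0.4500045, so no alert
jul_2025 = is_alert(0.421935, 0.300003)

print(dec_2024, jul_2025)  # True False
```

One consequence of including the current month in the baseline: with a full three-month window over the prior scores a and b and the current score s, the condition s > 1.5 × (a + b + s) / 3 simplifies to s > a + b. July 2025 has the highest score in the table but triggers no alert because 0.421935 is below the sum of the two preceding months (0.372111 + 0.105964).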


## 6. Summary Statistics

Review the final dataset with scores and alerts.

In [31]:
print("=== POLICY MOMENTUM SCORE SUMMARY ===")
print(merged_df[['policy_momentum_score', 'rolling_avg_3m']].describe())
print(f"\nAlert rate: {(alert_count / len(merged_df) * 100):.1f}% of months")

# Show full dataset
print("\n=== COMPLETE DATASET ===")
merged_df

=== POLICY MOMENTUM SCORE SUMMARY ===
       policy_momentum_score  rolling_avg_3m
count              13.000000       13.000000
mean                0.193538        0.173944
std                 0.147526        0.070461
min                 0.020000        0.070000
25%                 0.070053        0.120000
50%                 0.140161        0.169072
75%                 0.356930        0.212661
max                 0.421935        0.300003

Alert rate: 30.8% of months

=== COMPLETE DATASET ===


| | date | money_norm | people_norm | paper_norm | policy_momentum_score | rolling_avg_3m | alert |
|---|---|---|---|---|---|---|---|
| 0 | 2024-10-01 | 0.0 | 0.0 | 0.4 | 0.12 | 0.12 | False |
| 1 | 2024-11-01 | 0.0 | 0.0 | 0.066667 | 0.02 | 0.07 | False |
| 2 | 2024-12-01 | 0.000401 | 0.0 | 0.466667 | 0.140161 | 0.093387 | True |
| 3 | 2025-01-01 | 0.417111 | 0.0 | 0.466667 | 0.306845 | 0.155668 | True |
| 4 | 2025-02-01 | 0.000531 | 0.0 | 0.2 | 0.060212 | 0.169072 | False |
| 5 | 2025-03-01 | 0.025132 | 0.0 | 0.2 | 0.070053 | 0.145703 | False |
| 6 | 2025-04-01 | 0.04977 | 0.0 | 0.466667 | 0.159908 | 0.096724 | True |
| 7 | 2025-05-01 | 0.180277 | 0.0 | 1.0 | 0.372111 | 0.200691 | True |
| 8 | 2025-06-01 | 0.164909 | 0.0 | 0.133333 | 0.105964 | 0.212661 | False |
| 9 | 2025-07-01 | 1.0 | 0.006452 | 0.066667 | 0.421935 | 0.300003 | False |


## 7. Export Results

Save the Policy Momentum Score dataset for visualization and dashboard integration.

In [32]:
# Export to CSV
output_file = '../data/policy_momentum_score.csv'
merged_df.to_csv(output_file, index=False)

print(f"✓ Policy Momentum Score saved to: {output_file}")
print(f"✓ Dataset contains {len(merged_df)} months of data")
print(f"✓ {alert_count} alert periods identified")
print("\nReady for visualization in notebook 3!")

✓ Policy Momentum Score saved to: ../data/policy_momentum_score.csv
✓ Dataset contains 13 months of data
✓ 4 alert periods identified

Ready for visualization in notebook 3!
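When the exported file is read back in a later notebook, the `date` column arrives as plain text unless it is parsed explicitly. The sketch below shows one way to reload it, using an in-memory CSV with two rows copied from the complete dataset above. It assumes the column layout produced by the export cell; the downstream notebook's actual loading code is not shown in this document.

```python
import io
import pandas as pd

# Two rows in the exported column layout (values from the complete dataset above)
csv_text = (
    "date,money_norm,people_norm,paper_norm,"
    "policy_momentum_score,rolling_avg_3m,alert\n"
    "2025-04-01,0.04977,0.0,0.466667,0.159908,0.096724,True\n"
    "2025-06-01,0.164909,0.0,0.133333,0.105964,0.212661,False\n"
)

# parse_dates restores the datetime dtype that the CSV export flattens to text
loaded = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# The alert column is read back as booleans, so it can be used directly as a mask
alerts_only = loaded[loaded["alert"]]

print(loaded.dtypes)
print(alerts_only[["date", "policy_momentum_score"]])
```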
