# CSJ 2024 — Constituency Summary
This notebook loads the Canada Summer Jobs data, filters to **2024 only**, and summarizes **total jobs created** and **total amount paid** per constituency.

All numbers are formatted with comma separators throughout (e.g. `14,258`).

**My goal is to identify which constituency created the most jobs for the least funding in 2024, measured by cost per job (total amount paid divided by total jobs created).**




In [1]:
import pandas as pd
import numpy as np

# Load raw CSV
df = pd.read_csv('csj-results-master.csv', encoding='latin-1', dtype=str)
df.columns = ['Program Year','Region','Activity Constituency','Constituency','Employer','Amount Paid','Jobs Created']

# Filter to 2024 only
df = df[df['Program Year'].str.strip() == '2024'].copy()
df = df[['Constituency', 'Amount Paid', 'Jobs Created']]

# Clean text
df['Constituency'] = df['Constituency'].str.replace('\n', ' ', regex=False).str.replace(r'\s+', ' ', regex=True).str.strip()

# Convert to numeric
df['Amount Paid'] = pd.to_numeric(df['Amount Paid'].str.strip(), errors='coerce').fillna(0).astype(int)
df['Jobs Created'] = pd.to_numeric(df['Jobs Created'].str.strip(), errors='coerce').fillna(0).astype(int)

# Aggregate by constituency
summary = df.groupby('Constituency', as_index=False).agg(
    Total_Amount_Paid=('Amount Paid', 'sum'),
    Total_Jobs_Created=('Jobs Created', 'sum')
)
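# Cost per job = total amount paid / total jobs created.
# Constituencies with zero jobs get NaN rather than 0.0, so they can never
# appear to be the cheapest per job when ranking by this column.
summary['Cost Per Job'] = np.where(
    summary['Total_Jobs_Created'] > 0,
    (summary['Total_Amount_Paid'] / summary['Total_Jobs_Created']).round(2),
    np.nan
)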
summary = summary.sort_values('Total_Amount_Paid', ascending=False).reset_index(drop=True)

print(f'2024 records: {len(df):,} employer entries')
print(f'Constituencies: {len(summary):,}')
print(f'Total Amount Paid: ${summary["Total_Amount_Paid"].sum():,}')
print(f'Total Jobs Created: {summary["Total_Jobs_Created"].sum():,}')
print()

# Display with comma formatting
display_df = summary.copy()
display_df['Total_Amount_Paid'] = display_df['Total_Amount_Paid'].apply(lambda x: f'{x:,}')
display_df['Total_Jobs_Created'] = display_df['Total_Jobs_Created'].apply(lambda x: f'{x:,}')
display_df['Cost Per Job'] = display_df['Cost Per Job'].apply(lambda x: f'{x:,.2f}')
display_df.head(10)

2024 records: 26,452 employer entries
Constituencies: 338
Total Amount Paid: $293,372,944
Total Jobs Created: 71,204



| | Constituency | Total_Amount_Paid | Total_Jobs_Created | Cost Per Job |
|---|---|---|---|---|
| 0 | Ville-Marie - Le Sud-Ouest - Île-des-Soeurs | 1,913,923 | 414 | 4,623.00 |
| 1 | Long Range Mountains | 1,807,700 | 448 | 4,035.04 |
| 2 | Coast of Bays - Central - Notre Dame | 1,676,188 | 420 | 3,990.92 |
| 3 | London-Centre-Nord | 1,626,934 | 360 | 4,519.26 |
| 4 | Bonavista - Burin - Trinity | 1,566,115 | 372 | 4,209.99 |
| 5 | Ottawa-Sud | 1,535,622 | 370 | 4,150.33 |
| 6 | Mississauga - Erin Mills | 1,516,853 | 341 | 4,448.25 |
| 7 | Ottawa-Centre | 1,495,592 | 329 | 4,545.87 |
| 8 | St. John's-Sud - Mount Pearl | 1,494,810 | 371 | 4,029.14 |
| 9 | Calgary Confederation | 1,467,792 | 417 | 3,519.88 |
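
The table above is ordered by total amount paid, which shows where the most money went, not where it went furthest. One possible way to approach the stated goal is to rank by cost per job instead. The sketch below does that on a small frame rebuilt from the ten rows shown above, so its result only describes those ten constituencies, not all 338. The helper name `rank_by_cost_per_job` and the tie-break on job count are assumptions made for illustration, not part of the original notebook.

```python
import pandas as pd

# Sample data: the ten rows from the output table above.
# In the notebook itself, the full `summary` frame would be passed instead.
sample = pd.DataFrame({
    'Constituency': [
        'Ville-Marie - Le Sud-Ouest - Île-des-Soeurs',
        'Long Range Mountains',
        'Coast of Bays - Central - Notre Dame',
        'London-Centre-Nord',
        'Bonavista - Burin - Trinity',
        'Ottawa-Sud',
        'Mississauga - Erin Mills',
        'Ottawa-Centre',
        "St. John's-Sud - Mount Pearl",
        'Calgary Confederation',
    ],
    'Total_Amount_Paid': [
        1913923, 1807700, 1676188, 1626934, 1566115,
        1535622, 1516853, 1495592, 1494810, 1467792,
    ],
    'Total_Jobs_Created': [414, 448, 420, 360, 372, 370, 341, 329, 371, 417],
})


def rank_by_cost_per_job(frame):
    """Return rows sorted from lowest to highest cost per job.

    Hypothetical helper: rows with zero jobs are dropped because cost per
    job is undefined for them. Ties on cost are broken by job count
    (more jobs first), which is an illustrative choice.
    """
    ranked = frame[frame['Total_Jobs_Created'] > 0].copy()
    ranked['Cost Per Job'] = (
        ranked['Total_Amount_Paid'] / ranked['Total_Jobs_Created']
    ).round(2)
    return ranked.sort_values(
        ['Cost Per Job', 'Total_Jobs_Created'],
        ascending=[True, False],
    ).reset_index(drop=True)


ranked = rank_by_cost_per_job(sample)
print(ranked[['Constituency', 'Total_Jobs_Created', 'Cost Per Job']].to_string(index=False))
```

Within this ten-row sample, Calgary Confederation has the lowest cost per job. Calling `rank_by_cost_per_job(summary)` on the full 2024 frame would be needed before drawing a conclusion about all constituencies, and cost per job is only one reading of "most jobs for the least funding".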
