# Tri-Agency Research Grants Analysis

This notebook analyzes grant data from Canada's three main research funding agencies:
- NSERC (Natural Sciences and Engineering Research Council)
- CIHR (Canadian Institutes of Health Research)
- SSHRC (Social Sciences and Humanities Research Council)

## 1. Setup and Data Collection

First, let's import required libraries and set up our environment.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from data.fetcher import Fetcher, FetcherConfig

Now, let's fetch the grant data for 2019:

In [None]:
# Initialize fetcher with verbose output to see progress 
fetcher = Fetcher(FetcherConfig(verbose=True))

# Fetch 2019 data (latest amendment versions only)
year = '2019'
grants_df = fetcher.fetch_all_orgs(
    year=year,
    verify_ssl=False,
    handle_amendments='latest'
)

if not grants_df.empty:
    results = fetcher.analyze_grants(grants_df)
analysis = {}

## 2. Basic Analysis

Let's start with the built-in analysis functions:

In [None]:
analysis['org_summary'] = fetcher.get_org_summary(df = grants_df, display_table = True)

## 3. Additional Analysis

### 3.1 Temporal Distribution

Let's look at how grants are distributed across months:

In [None]:
# Convert dates to datetime and extract month
grants_df['start_month'] = pd.to_datetime(grants_df['agreement_start_date']).dt.month

# Plot monthly distribution by organization
fig, ax = plt.subplots(figsize=(12, 6))
monthly_dist = grants_df.pivot_table(
    index='start_month', 
    columns='org', 
    values='ref_number', 
    aggfunc='count'
).fillna(0)

monthly_dist.plot(kind='bar', stacked=True, ax=ax)
ax.set_title(f'Grant Distribution by Month and Organization in {year}')
ax.set_xlabel('Month')
ax.set_ylabel('Number of Grants')
ax.legend(title='Organization')
ax.set_xticklabels(ax.get_xticklabels(), rotation=45)
plt.tight_layout()
plt.show()

### 3.2 Grant Value Distributions
Let's examine the distribution of grant values:

In [None]:
# Create box plot of grant values by organization
plt.figure(figsize=(10, 6))
sns.boxplot(data=grants_df, x='org', y='agreement_value')
plt.xlabel('Organization')
plt.ylabel('Grant Value ($)')
plt.title(f'Distribution of Grant Values by Organization in {year}')
plt.ylabel('Grant Value ($)')
plt.yscale('log')  # Use log scale due to large range
plt.xticks(rotation=45)
plt.tight_layout()

### 3.3 Geographic Analysis

Let's analyze the geographic distribution of funding by province (and US states) in 2019:

In [None]:
analysis['provincial_distribution'] = fetcher.get_provincial_distribution(df = grants_df, display_table = True)

### 3.4 Recipient Analysis

Let's look at the top recipients by total funding:

In [None]:
analysis['top_recipients'] = fetcher.get_top_recipients(df = grants_df, display_table = True, top = 10)

## 4. Findings and Insights

1. **Temporal Patterns**: 
   - Clear seasonal peaks in grant distributions, with major spikes in April and May (~5,500-6,000 grants each)
   - NSERC shows the highest activity in these peak months, followed by SSHRC
   - Secondary peak in September (~3,000 grants)
   - Relatively quiet periods in summer (July-August) and winter (December-January)
   - Most agencies align their major grant distributions in spring months

2. **Funding Distribution**:
   - Total grant volume: 20,817 unique grants across all agencies
   - Agency breakdown:
     * NSERC: 12,080 grants (58% of volume but only 34% of total funding)
     * SSHRC: 5,454 grants (26% of volume, 34% of funding)
     * CIHR: 3,283 grants (16% of volume, 32% of funding)
   - Average grant values vary significantly:
     * CIHR: Highest average at $319,331 per grant
     * SSHRC: Middle range at $200,184 per grant
     * NSERC: Lowest average at $91,679 per grant
   - Box plot shows CIHR has the highest median funding values and largest spread

3. **Geographic Concentration**:
   - Strong regional disparities in funding distribution
   - Ontario dominates with $1.3B total funding (32% of all funding)
   - Quebec follows with $845M (21% of total funding)
   - British Columbia third with $424M (10% of total funding)
   - These three provinces account for 63% of all research funding
   - Territories and smaller provinces receive significantly less funding
   - Some international funding present, primarily in US states (though minimal)

4. **Institutional Leadership**:
   - Top recipients show strong institutional concentration
   - University of Toronto leads with single largest grant ($46.5M)
   - Major research universities dominate the top 10
   - Individual researchers appear in top recipients (e.g., Chertkow, Howard M with $31.6M CIHR grant)
   - Montreal/Quebec institutions feature prominently (McGill, UdeM, Laval)
   - All top 10 recipients received grants over $13.9M

These insights reveal a research funding landscape characterized by strong geographic and institutional concentration, significant variations in grant sizes between agencies, and clear temporal patterns in grant distributions.

## 5. Next Steps

Potential areas for deeper analysis:
1. Year-over-year comparison (2018-2023)
2. Research field analysis using grant titles/descriptions
3. Network analysis of institutions and researchers
4. Success rate analysis for different demographics
5. Impact analysis using citation data (if available)