# 2025 IWRC Seed Fund - INTERACTIVE Visualizations

This notebook creates interactive HTML visualizations using Plotly:
1. Interactive pie chart of research keywords
2. Interactive map of Illinois showing funded institutions

The visualizations will be saved as HTML files that can be opened in any web browser.

In [1]:
# Import required libraries
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from collections import Counter
import json

In [2]:
# Load the 2025 data
df = pd.read_excel('fact sheet data.xlsx', sheet_name='2025 data')
print(f"Loaded {len(df)} projects from 2025 data")
print(f"Columns: {df.shape[1]}")
df.head()

Loaded 75 projects from 2025 data
Columns: 33


Unnamed: 0,PI,Award Amount,College,Department,Institution,City,State,PhD,MS,undergrad,...,How many students were co-authors of this product?,Unnamed: 24,"Award, Achievement, or Grant\n (This may include awards and achievements for projects from the previous year to this 5-year cycle, so long as they were not already included in last year's report)",Source? Identify the Organization,"Description of Award, Achievement, or Grant\n (This may include awards and achievements for projects from the previous year to this 5-year cycle, so long as they were not already included in last year's report)","Award Recipient(s), Name",Date (Mo/Year),Who was the recipient?,Monetary Benefit of Award or Achievement (if applicable; use NA if not applicable),Additional Comments
0,2015,,,,,,,,,,...,,,Award,Southern Illinois University,University Level Scholar Excellence Award,Dr. Michael Lydy,2017.0,PI,"The permanent title of Distinguished Scholar, ...",The University-Level Scholar Excellence Award ...
1,Maia,9977.0,,,Eastern Illinois University,,,,,,...,,,Award,Southern Illinois University,Morris Doctoral Fellowship,Sam Nutile,2016.0,Student,The Morris Doctoral Fellowship is a five-year...,The Morris Doctoral Fellowship was established...
2,Rhoads,5591.0,College of Liberal Arts & Sciences,Geography and Geographic Information Science,University of Illinois Urbana-Champaign,Urbana-Champaign,Illinois,,,,...,,,Award,Southern Illinois University,REACH Award for undergraduate research,Andrew Derby,42826.0,Student,2000,The REACH (Research-Enriched Academic Challeng...
3,Stillwell,10000.0,Grainger College of Engineering,Civil and Environmental Engineering,University of Illinois Urbana-Champaign,Urbana-Champaign,Illinois,,,,...,,,Award,Southern Illinois University,Master’s Fellowship Award,Andrew Derby,2018.0,Student,A fellowship covers full tuition (nine hours m...,
4,Lydy,249329.0,,"Chemical and Biomolecular Science, Zoology",Southern Illinois University,,,2.0,2.0,3.0,...,,,Award,Southern Illinois University,REACH Award for undergraduate research,Tristin Miller,43556.0,Student,2000,The REACH (Research-Enriched Academic Challeng...


## Part 1: Interactive Research Keywords Pie Chart

In [3]:
# Combine keywords from the 'Keyword 2' and 'Keyword 3' columns (spreadsheet columns O and P)
keyword2 = df['Keyword 2'].dropna().tolist()
keyword3 = df['Keyword 3'].dropna().tolist()
all_keywords = keyword2 + keyword3

# Count keyword frequencies
keyword_counts = Counter(all_keywords)
print(f"Total keywords: {len(all_keywords)}")
print(f"Unique keywords: {len(keyword_counts)}")

# Prepare data for pie chart
sorted_keywords = keyword_counts.most_common()
top_n = 10
top_keywords = dict(sorted_keywords[:top_n])
other_count = sum(count for _, count in sorted_keywords[top_n:])

if other_count > 0:
    top_keywords['Other'] = other_count

# Create DataFrame for Plotly
pie_df = pd.DataFrame({
    'Keyword': list(top_keywords.keys()),
    'Count': list(top_keywords.values())
})

# Calculate percentages
pie_df['Percentage'] = (pie_df['Count'] / pie_df['Count'].sum() * 100).round(1)

print("\nKeyword Distribution:")
print(pie_df.to_string(index=False))

Total keywords: 83
Unique keywords: 28

Keyword Distribution:
      Keyword  Count  Percentage
SURFACE WATER      8         9.6
      ECOLOGY      7         8.4
 CONSERVATION      6         7.2
    HYDROLOGY      6         7.2
       MODELS      5         6.0
WATER QUALITY      5         6.0
      METHODS      5         6.0
       FLOODS      4         4.8
    NUTRIENTS      4         4.8
  AGRICULTURE      3         3.6
        Other     30        36.1
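
The bucketing step above (keep the top N keywords, fold the rest into "Other", then compute shares) can be sketched as a pair of small helpers. This is a minimal illustration on toy counts, not the real keyword list; the function names `top_n_with_other` and `percentages` are hypothetical and do not appear in the notebook.

```python
from collections import Counter


def top_n_with_other(counts, n, other_label="Other"):
    """Keep the n most common items and fold the remainder into one bucket."""
    ranked = counts.most_common()
    result = dict(ranked[:n])
    remainder = sum(count for _, count in ranked[n:])
    if remainder > 0:
        result[other_label] = remainder
    return result


def percentages(buckets):
    """Share of each bucket in percent, rounded to one decimal place."""
    total = sum(buckets.values())
    return {label: round(count / total * 100, 1) for label, count in buckets.items()}


# Toy counts for illustration only
toy = Counter({"SURFACE WATER": 8, "ECOLOGY": 7, "FLOODS": 4, "SOILS": 2, "WETLANDS": 1})
buckets = top_n_with_other(toy, n=3)
shares = percentages(buckets)
print(buckets)
print(shares)
```

With `top_n = 10` on the real data, the "Other" bucket holds 30 of 83 mentions, which is the largest slice in the chart; raising `top_n` would shrink it at the cost of more small slices.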


In [4]:
# Create interactive pie chart with Plotly
fig_pie = go.Figure(data=[go.Pie(
    labels=pie_df['Keyword'],
    values=pie_df['Count'],
    hole=0.3,  # Donut chart
    textposition='auto',
    textinfo='label+percent',
    hovertemplate='<b>%{label}</b><br>' +
                  'Count: %{value}<br>' +
                  'Percentage: %{percent}<br>' +
                  '<extra></extra>',
    marker=dict(
        colors=px.colors.qualitative.Set3,
        line=dict(color='white', width=2)
    )
)])

fig_pie.update_layout(
    title={
        'text': '2025 IWRC Seed Fund Projects<br>Research Topic Distribution',
        'x': 0.5,
        'xanchor': 'center',
        'font': {'size': 24, 'family': 'Arial Black'}
    },
    font=dict(size=14),
    showlegend=True,
    legend=dict(
        orientation='v',
        yanchor='middle',
        y=0.5,
        xanchor='left',
        x=1.1
    ),
    height=700,
    width=1000
)

# Save as HTML
fig_pie.write_html('2025_keyword_pie_chart_interactive.html')
print("Saved: 2025_keyword_pie_chart_interactive.html")

# Display in notebook
fig_pie.show()

Saved: 2025_keyword_pie_chart_interactive.html


## Part 2: Interactive Illinois Institutions Map

In [5]:
# Prepare institution data
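# Count projects per (Institution, City) pair.
# Note: groupby drops rows whose Institution or City is NaN by default (dropna=True),
# so projects with a missing City are not counted here.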
institution_data = df.groupby(['Institution', 'City']).size().reset_index(name='Project Count')

# Also calculate total funding per institution
funding_by_institution = df.groupby(['Institution', 'City'])['Award Amount'].sum().reset_index()
funding_by_institution.columns = ['Institution', 'City', 'Total Funding']

# Merge the data
institution_data = institution_data.merge(funding_by_institution, on=['Institution', 'City'])

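# Add coordinates (approximate city centers).
# Keys must match the values in the 'City' column exactly.
coordinates = {
    'Champaign': (40.1164, -88.2434),
    'Urbana': (40.1106, -88.2073),
    'Urbana-Champaign': (40.1106, -88.2073),  # combined label used in the data; reuses the Urbana point
    'Carbondale': (37.7272, -89.2167),
    'Normal': (40.5142, -88.9906),
    'Chicago': (41.8781, -87.6298),
    'Charleston': (39.4961, -88.1781),
    'Evanston': (42.0451, -87.6877),
    'Godfrey': (38.9556, -90.1868),
    'East Alton': (38.8803, -90.1112),  # approximate
    'Edwardsville': (38.8114, -89.9531)
}

# Cities missing from the table fall back to a point in central Illinois
fallback = (40.6331, -89.3985)
institution_data['Latitude'] = institution_data['City'].map(lambda x: coordinates.get(x, fallback)[0])
institution_data['Longitude'] = institution_data['City'].map(lambda x: coordinates.get(x, fallback)[1])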

# Clean institution names for display
short_names = {
    'University of Illinois Urbana-Champaign': 'UIUC',
    'Southern Illinois University': 'SIU',
    'Illinois State University': 'ISU',
    'Illinois Institute of Technology': 'IIT',
    'University of Illinois Chicago': 'UIC',
    'Eastern Illinois University': 'EIU',
    'Northwestern University': 'Northwestern',
    'Lewis and Clark Community College': 'Lewis & Clark CC'
}
institution_data['Short Name'] = institution_data['Institution']
for full_name, short_name in short_names.items():
    institution_data['Short Name'] = institution_data['Short Name'].str.replace(
        full_name, short_name, regex=False
    )

# Format funding as currency
institution_data['Funding Display'] = institution_data['Total Funding'].apply(lambda x: f'${x:,.0f}')

print("Institution data prepared:")
print(institution_data[['Short Name', 'City', 'Project Count', 'Funding Display']].to_string(index=False))

Institution data prepared:
      Short Name             City  Project Count Funding Display
             IIT          Chicago              2        $260,000
             ISU           Normal              1          $9,887
Lewis & Clark CC       East Alton              1         $15,000
  Not for profit          Chicago              1          $5,000
  SIU Carbondale       Carbondale              2         $30,000
            UIUC Urbana-Champaign             29      $5,893,349
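
The City column in the table above contains values such as `Urbana-Champaign` and `East Alton`. A city-to-coordinates lookup only resolves names that match a key exactly; anything else silently receives the fallback point. A small check like the following can list unmatched names before plotting. It is a sketch on made-up inputs, and `find_unmatched` is a hypothetical helper, not part of the notebook.

```python
def find_unmatched(cities, lookup):
    """Return distinct city names, in first-seen order, that have no entry in lookup."""
    unmatched = []
    for city in cities:
        if city not in lookup and city not in unmatched:
            unmatched.append(city)
    return unmatched


# Hypothetical lookup with the same shape as a city -> (lat, lon) table
lookup = {
    "Chicago": (41.8781, -87.6298),
    "Normal": (40.5142, -88.9906),
    "Carbondale": (37.7272, -89.2167),
}
cities = ["Chicago", "Normal", "East Alton", "Chicago", "Carbondale", "Urbana-Champaign"]

unmatched = find_unmatched(cities, lookup)
print(unmatched)  # ['East Alton', 'Urbana-Champaign']
```

Running such a check right after building the lookup makes a misplaced marker visible as a printed name rather than as a dot in the wrong part of the state.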


In [6]:
# Load county-level GeoJSON (used here to extract Illinois county boundaries)
with open('illinois_counties.json', 'r') as f:
    geojson_data = json.load(f)

# Filter for Illinois counties (FIPS codes starting with 17)
illinois_features = [feat for feat in geojson_data['features'] if feat['id'].startswith('17')]

illinois_geojson = {
    'type': 'FeatureCollection',
    'features': illinois_features
}

print(f"Loaded {len(illinois_features)} Illinois counties")

Loaded 102 Illinois counties
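
The filter above relies on county FIPS codes, where the first two digits identify the state (17 for Illinois). The sketch below shows the same idea on a tiny in-memory FeatureCollection. The `properties` layout is an assumption made for illustration, and `filter_by_state` is a hypothetical helper; it also pads numeric ids to five digits, in case a file stores ids as integers.

```python
def filter_by_state(geojson, state_fips):
    """Keep only features whose five-digit county FIPS id starts with state_fips."""
    kept = [
        feat for feat in geojson["features"]
        if str(feat.get("id", "")).zfill(5).startswith(state_fips)
    ]
    return {"type": "FeatureCollection", "features": kept}


# Tiny sample; geometry is omitted (null) to keep the example short
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "17019", "properties": {"name": "Champaign"}, "geometry": None},
        {"type": "Feature", "id": "17031", "properties": {"name": "Cook"}, "geometry": None},
        {"type": "Feature", "id": "18097", "properties": {"name": "Marion"}, "geometry": None},
        {"type": "Feature", "id": 1001, "properties": {"name": "Autauga"}, "geometry": None},
    ],
}

illinois = filter_by_state(sample, "17")
names = [feat["properties"]["name"] for feat in illinois["features"]]
print(names)  # ['Champaign', 'Cook']
```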


In [7]:
# Create an interactive map with go.Scattergeo, zoomed to Illinois.
# Note: illinois_geojson from the previous cell is not used by this figure.

fig_map = go.Figure()

# Add institution markers sized and colored by project count
fig_map.add_trace(go.Scattergeo(
    lon=institution_data['Longitude'],
    lat=institution_data['Latitude'],
    mode='markers+text',
    marker=dict(
        size=institution_data['Project Count'] * 3 + 10,  # Diameter in px: 3 per project plus a base of 10
        color=institution_data['Project Count'],
        colorscale='YlOrRd',
        showscale=True,
        colorbar=dict(
            title='Number of<br>Projects',
            thickness=20,
            len=0.5,
            x=1.02
        ),
        line=dict(width=2, color='DarkSlateGray'),
        sizemode='diameter',
        opacity=0.85
    ),
    text=institution_data['Short Name'],
    textposition='top center',
    textfont=dict(size=9, color='black', family='Arial'),
    hovertemplate='<b>%{customdata[0]}</b><br>' +
                  'City: %{customdata[1]}<br>' +
                  'Projects: %{customdata[2]}<br>' +
                  'Total Funding: %{customdata[3]}<br>' +
                  '<extra></extra>',
    customdata=institution_data[['Institution', 'City', 'Project Count', 'Funding Display']].values,
    showlegend=False
))

# Update layout for proper Illinois focus
fig_map.update_layout(
    title={
        'text': '2025 IWRC Seed Fund<br>Funded Institutions Across Illinois',
        'x': 0.5,
        'xanchor': 'center',
        'font': {'size': 22, 'family': 'Arial', 'color': '#333'}
    },
    geo=dict(
        scope='usa',
        projection_type='albers usa',
        showland=True,
        landcolor='rgb(250, 250, 250)',
        subunitcolor='rgb(217, 217, 217)',
        countrycolor='rgb(217, 217, 217)',
        showlakes=True,
        lakecolor='rgb(200, 230, 255)',
        showsubunits=True,
        showcountries=True,
        resolution=50,
        lonaxis=dict(
            range=[-91.5, -87.0]  # Illinois longitude range
        ),
        lataxis=dict(
            range=[36.9, 42.6]  # Illinois latitude range
        ),
        center=dict(
            lon=-89.0,
            lat=40.0
        ),
        projection_scale=5.5  # Zoom level for Illinois
    ),
    height=900,
    width=700,
    margin=dict(l=0, r=100, t=100, b=0),
    paper_bgcolor='white',
    plot_bgcolor='white'
)

# Save as HTML
fig_map.write_html('2025_illinois_institutions_map_interactive.html')
print("Saved: 2025_illinois_institutions_map_interactive.html")

# Display in notebook
fig_map.show()

Saved: 2025_illinois_institutions_map_interactive.html
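
The marker diameter in the map is a linear function of project count (`count * 3 + 10`). With counts ranging from 1 to 29, that gives diameters from 13 to 97 pixels, and because area grows with the square of the diameter, the largest marker looks far more than 29 times bigger than the smallest. The sketch below compares the linear rule with a square-root rule, which keeps marker area roughly proportional to the count. The function names and the `max_diameter` and `min_diameter` defaults are hypothetical choices for illustration.

```python
import math


def linear_diameter(count, scale=3, base=10):
    """Diameter rule used in the map: grows linearly with the count."""
    return count * scale + base


def sqrt_diameter(count, max_count, max_diameter=60, min_diameter=8):
    """Diameter that keeps marker area roughly proportional to the count."""
    return max(min_diameter, max_diameter * math.sqrt(count / max_count))


counts = [1, 2, 29]
linear = [linear_diameter(c) for c in counts]
area_scaled = [round(sqrt_diameter(c, max(counts)), 1) for c in counts]

print(linear)  # [13, 16, 97]
print(area_scaled)
```

Plotly markers also accept `sizemode='area'`, which is a related way to get area-proportional sizing without computing diameters by hand.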


## Summary

Two interactive HTML files have been created:

1. **2025_keyword_pie_chart_interactive.html** - Interactive donut chart
   - Hover to see detailed counts and percentages
   - Click legend items to show/hide categories
   - Use the toolbar to download the chart as a PNG

2. **2025_illinois_institutions_map_interactive.html** - Interactive Illinois map
   - Hover over markers to see institution details
   - Shows project count and total funding
   - Zoom and pan to explore different regions
   - Marker size and color represent the number of projects

These files can be:
- Opened in any web browser
- Shared via email or web hosting
- Embedded in websites or presentations
- Used for interactive data exploration
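
One common way to embed a saved HTML chart in another page is an `<iframe>` that points at the file. The helper below builds such a snippet as a string. It is a sketch under the assumption that the chart file is hosted next to the embedding page; `iframe_snippet` and its default sizes are hypothetical.

```python
from html import escape


def iframe_snippet(src, title, width="100%", height=700):
    """Build an <iframe> tag that embeds a saved chart file."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'title="{escape(title, quote=True)}" '
        f'width="{width}" height="{height}" style="border:none;"></iframe>'
    )


snippet = iframe_snippet(
    "2025_keyword_pie_chart_interactive.html",
    "Research topic distribution",
)
print(snippet)
```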

In [8]:
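# Display summary statistics.
# Note: len(df) counts every row in the sheet, including rows with no PI or award amount,
# while sum() and mean() skip missing award amounts.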
print("=" * 70)
print("2025 IWRC SEED FUND SUMMARY")
print("=" * 70)
print(f"\nTotal Projects: {len(df)}")
print(f"Total Institutions: {df['Institution'].nunique()}")
print(f"Total Funding: ${df['Award Amount'].sum():,.2f}")
print(f"Average Award: ${df['Award Amount'].mean():,.2f}")
print(f"\nResearch Topics Covered: {len(keyword_counts)}")
print(f"Total Keyword Mentions: {len(all_keywords)}")
print("\nTop 5 Research Topics:")
for i, (keyword, count) in enumerate(keyword_counts.most_common(5), 1):
    pct = (count / len(all_keywords) * 100)
    print(f"  {i}. {keyword}: {count} mentions ({pct:.1f}%)")
print("\nTop 3 Institutions by Project Count:")
inst_counts = df['Institution'].value_counts()
for i, (inst, count) in enumerate(inst_counts.head(3).items(), 1):
    funding = df[df['Institution'] == inst]['Award Amount'].sum()
    print(f"  {i}. {inst}: {count} projects (${funding:,.0f})")
print("\n" + "=" * 70)
print("\nInteractive HTML files created:")
print("  - 2025_keyword_pie_chart_interactive.html")
print("  - 2025_illinois_institutions_map_interactive.html")
print("\nOpen these files in your web browser for interactive exploration!")
print("=" * 70)

2025 IWRC SEED FUND SUMMARY

Total Projects: 75
Total Institutions: 12
Total Funding: $13,404,152.00
Average Award: $206,217.72

Research Topics Covered: 28
Total Keyword Mentions: 83

Top 5 Research Topics:
  1. SURFACE WATER: 8 mentions (9.6%)
  2. ECOLOGY: 7 mentions (8.4%)
  3. CONSERVATION: 6 mentions (7.2%)
  4. HYDROLOGY: 6 mentions (7.2%)
  5. MODELS: 5 mentions (6.0%)

Top 3 Institutions by Project Count:
  1. University of Illinois Urbana-Champaign: 30 projects ($5,903,349)
  2. Southern Illinois University: 5 projects ($289,163)
  3. Illinois State University: 4 projects ($39,885)


Interactive HTML files created:
  - 2025_keyword_pie_chart_interactive.html
  - 2025_illinois_institutions_map_interactive.html

Open these files in your web browser for interactive exploration!
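
A quick arithmetic check on the figures above: pandas `mean()` skips missing values by default, so the reported average is taken over rows that have an award amount, not over all 75 rows. Dividing the total by the reported mean suggests how many rows that is. The numbers below are copied from the printed summary; the variable names are illustrative.

```python
total_funding = 13_404_152.00   # "Total Funding" from the summary
reported_mean = 206_217.72      # "Average Award" from the summary
total_rows = 75                 # "Total Projects" from the summary

# Number of rows that must carry an award amount for the mean to match
implied_awards = round(total_funding / reported_mean)

mean_over_awards = round(total_funding / implied_awards, 2)
mean_over_rows = round(total_funding / total_rows, 2)

print(implied_awards)    # 65
print(mean_over_awards)  # 206217.72
print(mean_over_rows)    # 178722.03
```

The gap between 65 and 75 is consistent with the first row shown by `df.head()`, which has no PI or award amount, so the "Total Projects" figure is best read as a row count.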
