# 💧 10-Year and 100-Year Floods in Your Favorite Rivers

**Hands-on Hydrology Lab — Undergraduate Course**  
Instructor: *Prof. Zhi Li*, School of Civil and Environmental Engineering, University of Connecticut  
Environment: Jupyter Notebook (Binder-ready)

In this lab, you'll explore real river data from the U.S. Geological Survey (USGS) to estimate 10-year and 100-year flood magnitudes using **Extreme Value Analysis**.

By the end of this exercise, you'll be able to:
- Understand return periods and exceedance probabilities.
- Retrieve USGS annual peak discharge data using Python.
- Fit a **Gumbel distribution** (a type of extreme value distribution) to estimate flood magnitudes.
- Visualize flood frequency curves.
- Compare 10-year and 100-year floods across rivers.

In [None]:
# 🧩 Setup
# Import required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gumbel_r
import dataretrieval.nwis as nwis

plt.style.use('seaborn-v0_8')
pd.options.display.float_format = '{:,.2f}'.format

print('Environment ready!')

## 🌊 What is a Return Period?

The **return period** (or recurrence interval) represents the *average* interval of time between events of a certain magnitude or greater.

- A **10-year flood** has a 10% chance of being exceeded in any given year.
- A **100-year flood** has a 1% chance of being exceeded in any given year.

These are **probabilities**, not predictions — two 100-year floods can happen in consecutive years!
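
To make the link between return period and probability concrete, here is a minimal sketch that computes the chance of seeing at least one T-year flood over a span of years. It assumes that years are independent and that the annual exceedance probability stays constant at 1/T. The function name `prob_at_least_one` is introduced here for illustration only.

```python
def prob_at_least_one(T, n_years):
    """Probability that a T-year flood is exceeded at least once in n_years.

    Assumes independent years and a constant annual exceedance probability 1/T.
    """
    p = 1.0 / T
    return 1.0 - (1.0 - p) ** n_years


# Chance of at least one 100-year flood during a 30-year period
risk_30 = prob_at_least_one(100, 30)

# Chance that two specific consecutive years both exceed the 100-year flood
two_in_a_row = (1.0 / 100) ** 2

print(f"30-year risk of a 100-year flood: {risk_30:.1%}")
print(f"Two consecutive 100-year floods:  {two_in_a_row:.4%}")
```

Under these assumptions the 30-year risk is roughly one in four, which is why a "100-year flood" is far from a once-in-a-lifetime event.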

In [None]:
# 🏞️ Choose a USGS Site
# Examples:
# - 06752000: Cache la Poudre River near Fort Collins, CO
# - 07010000: Mississippi River at St. Louis, MO
# - 01463500: Delaware River at Trenton, NJ
# - 01638500: Potomac River at Point of Rocks, MD
# - 09379500: San Juan River near Bluff, UT

site = "01335750"  # <-- You can change this!
print(f"Selected USGS site: {site}")

In [None]:
# 📥 Retrieve Annual Peak Flow Data from USGS NWIS

peaks = nwis.get_record(sites=site, service='peaks')
print(f"Retrieved {len(peaks)} records from USGS site {site}.")
peaks

In [None]:
# 📥 Retrieve and Clean Annual Peak Flow Data from USGS NWIS

print(f"Attempting to retrieve annual peak data for site {site}...")

# Try to get 'peaks' data first
peaks = nwis.get_record(sites=site, service='peaks')

if peaks.empty:
    print("⚠️ No annual peak data found. Falling back to daily streamflow (dv) records.")
    
    # Retrieve daily mean streamflow data for long period (adjust as needed)
    daily = nwis.get_record(sites=site, service='dv', start='1950-01-01')
    
    # Identify the discharge column (usually contains '00060_Mean')
    flow_col = None
    for c in daily.columns:
        if '00060' in c or 'discharge' in c.lower():
            flow_col = c
            break
    if flow_col is None:
        raise ValueError("Could not find discharge column in daily data.")
    
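    # Compute annual maximum flows (calendar-year maxima of daily means).
    # Note: daily means are usually lower than instantaneous peaks.
    daily = daily[[flow_col]].dropna()
    daily.index = pd.to_datetime(daily.index)
    # Group by calendar year instead of resample('Y'), whose alias is
    # deprecated in recent pandas releases
    data = (
        daily.groupby(daily.index.year)[flow_col].max()
        .rename('peak_flow_cfs')
        .rename_axis('year')
        .reset_index()
    )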
else:
    print(f"✅ Retrieved {len(peaks)} annual peak records.")
    
    # Move a datetime index into a regular column so it can be searched below
    if isinstance(peaks.index, pd.DatetimeIndex):
        peaks = peaks.reset_index().rename(columns={'index': 'peak_dt'})
    
    # Find the date column
    date_col = None
    for c in peaks.columns:
        if 'peak_dt' in c or 'peak_date' in c or 'datetime' in c:
            date_col = c
            break
    if date_col is None:
        raise ValueError("Could not find date column in NWIS data.")
    
    # Find the discharge column
    flow_col = None
    for c in peaks.columns:
        if 'peak_va' in c or 'discharge' in c.lower() or 'flow' in c.lower():
            flow_col = c
            break

    if flow_col is None:
        raise ValueError("Could not find discharge column in NWIS data.")
    
    # Prepare the data
    data = peaks[[date_col, flow_col]].dropna().copy()
    data['year'] = pd.to_datetime(data[date_col]).dt.year
    data = data.rename(columns={flow_col: 'peak_flow_cfs'})
    data = data.groupby('year', as_index=False)['peak_flow_cfs'].max()

print(f"Data covers {data['year'].min()}–{data['year'].max()} ({len(data)} years)")
data
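
USGS reports annual peaks by water year (October 1 to September 30, named for the calendar year in which it ends). Grouping by calendar year, as the cell above does, can merge two water-year peaks that fall in the same calendar year. The sketch below shows one way to assign water years instead. The helper `water_year` and the example dates and flows are hypothetical and used only for illustration.

```python
import pandas as pd


def water_year(dates):
    """Return the water year (Oct 1 to Sep 30) for each date.

    The water year is named for the calendar year in which it ends,
    so October through December dates belong to the following year.
    """
    dates = pd.to_datetime(pd.Series(dates))
    return dates.dt.year + (dates.dt.month >= 10).astype(int)


# Hypothetical peak dates and flows, for illustration only
example = pd.DataFrame({
    'peak_dt': ['1995-01-20', '1995-11-12', '1996-09-07'],
    'peak_flow_cfs': [52000.0, 61000.0, 48000.0],
})
example['calendar_year'] = pd.to_datetime(example['peak_dt']).dt.year
example['water_year'] = water_year(example['peak_dt']).values
print(example)
```

In this example the first two peaks share calendar year 1995 but belong to different water years, so a calendar-year `groupby(...).max()` would silently drop one of them.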

In [None]:
# 📊 Visualize Annual Peak Flows

plt.figure(figsize=(9,5))
plt.plot(data['year'], data['peak_flow_cfs'], color='navy', marker='o', linestyle='-')
plt.xlabel('Year')
plt.ylabel('Annual Peak Flow (cfs)')
plt.title(f'Annual Peak Flows for USGS Site {site}')
plt.grid(True)
plt.show()

In [None]:
# 📈 Fit Gumbel Distribution to Annual Peak Flows

x = data['peak_flow_cfs'].values
loc, scale = gumbel_r.fit(x)
print(f"Fitted Gumbel parameters: loc = {loc:.2f}, scale = {scale:.2f}")

# Define return periods (years)
T = np.array([2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200])
prob_exceed = 1 / T

# Quantiles from the fitted distribution
Q = gumbel_r.ppf(1 - prob_exceed, loc=loc, scale=scale)

results = pd.DataFrame({
    'Return Period (yrs)': T,
    'Exceedance Probability': prob_exceed,
    'Estimated Flood (cfs)': Q
})
results
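
The quantiles returned by `gumbel_r.ppf` can also be written in closed form by inverting the Gumbel CDF, F(x) = exp(-exp(-(x - loc) / scale)), at F = 1 - 1/T. The sketch below checks that the two agree. The parameter values are hypothetical and are not fitted to any site, and `gumbel_quantile` is a name introduced here for illustration.

```python
import numpy as np
from scipy.stats import gumbel_r


def gumbel_quantile(T, loc, scale):
    """Flood magnitude for return period T from the Gumbel inverse CDF."""
    T = np.asarray(T, dtype=float)
    return loc - scale * np.log(-np.log(1.0 - 1.0 / T))


# Hypothetical parameters, for illustration only (not fitted to any site)
loc_demo, scale_demo = 20000.0, 8000.0
T_demo = np.array([2, 10, 100])

Q_formula = gumbel_quantile(T_demo, loc_demo, scale_demo)
Q_scipy = gumbel_r.ppf(1 - 1 / T_demo, loc=loc_demo, scale=scale_demo)

print("Closed form:", np.round(Q_formula))
print("SciPy ppf:  ", np.round(Q_scipy))
print("Match:", np.allclose(Q_formula, Q_scipy))
```

Writing the formula out shows that the estimated flood grows roughly linearly with the logarithm of the return period, which is why the curve looks straighter on a semilog axis.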

In [None]:
# 📉 Plot Flood Frequency Curve

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
ax1.plot(T, Q, color='navy', marker='o', linestyle='-', label='Gumbel Fit')
ax1.set_xlabel('Return Period (years)')
ax1.set_ylabel('Flood Discharge (cfs)')
ax1.set_title(f'Flood Frequency Curve – USGS {site}\n(Linear)')
ax1.grid(True, which='both', linestyle='--', alpha=0.7)
ax1.legend()
ax2.semilogx(T, Q, color='navy', marker='o', linestyle='-', label='Gumbel Fit')
ax2.set_xlabel('Return Period (years)')
ax2.set_ylabel('Flood Discharge (cfs)')
ax2.set_title(f'Flood Frequency Curve – USGS {site}\n(Semilogx)')
ax2.grid(True, which='both', linestyle='--', alpha=0.7)
ax2.legend()
plt.tight_layout()
plt.show()
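
The curve above shows only the fitted model. A common check is to overlay the observed peaks at empirical return periods. The sketch below uses the Weibull plotting position, one of several conventions, in which the m-th largest of n peaks gets exceedance probability m / (n + 1). The function name and the demo values are hypothetical.

```python
import numpy as np


def weibull_return_periods(peaks):
    """Empirical return periods from Weibull plotting positions.

    Returns (T_emp, Q_sorted), with peaks sorted from largest to smallest.
    """
    q_sorted = np.sort(np.asarray(peaks, dtype=float))[::-1]  # largest first
    n = len(q_sorted)
    rank = np.arange(1, n + 1)          # rank 1 = largest peak
    T_emp = (n + 1) / rank              # T = 1 / (m / (n + 1))
    return T_emp, q_sorted


# Hypothetical annual peaks (cfs), for illustration only
demo_peaks = [12000, 30000, 18000, 25000]
T_emp, Q_emp = weibull_return_periods(demo_peaks)
print(T_emp)
print(Q_emp)
```

In the notebook, calling `weibull_return_periods(data['peak_flow_cfs'])` and adding `ax2.scatter(T_emp, Q_emp, label='Observed')` before `ax2.legend()` would overlay the observations on the semilog panel.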

In [None]:
# 🔍 Extract Key Flood Estimates

Q10 = results.loc[results['Return Period (yrs)'] == 10, 'Estimated Flood (cfs)'].values[0]
Q100 = results.loc[results['Return Period (yrs)'] == 100, 'Estimated Flood (cfs)'].values[0]

print(f"Estimated 10-year flood:  {Q10:,.0f} cfs")
print(f"Estimated 100-year flood: {Q100:,.0f} cfs")

## 🚀 Try It Yourself

1. Change the **`site`** variable above to another river (see suggestions below).
2. Rerun all cells to see how the results differ.
3. Discuss: Why do flood magnitudes differ between rivers?

**Suggested Sites:**
- 01638500 — Potomac River at Point of Rocks, MD  
- 05555500 — Illinois River at Peoria, IL  
- 09379500 — San Juan River near Bluff, UT  
- 01463500 — Delaware River at Trenton, NJ  

Find more site numbers here: [https://nwis.waterdata.usgs.gov/usa/nwis/peak](https://nwis.waterdata.usgs.gov/usa/nwis/peak)
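
For step 3, it can help to put several rivers side by side. The sketch below wraps the Gumbel fit in a small helper that takes a dictionary of annual peak arrays and returns the 10-year and 100-year estimates for each. The helper name `summarize_floods` is hypothetical, and the two "rivers" are synthetic samples generated for illustration, not USGS data.

```python
import numpy as np
import pandas as pd
from scipy.stats import gumbel_r


def summarize_floods(peaks_by_site):
    """Fit a Gumbel distribution per site and tabulate Q10 and Q100."""
    rows = []
    for site_id, peaks in peaks_by_site.items():
        loc, scale = gumbel_r.fit(np.asarray(peaks, dtype=float))
        rows.append({
            'site': site_id,
            'Q10 (cfs)': gumbel_r.ppf(1 - 1 / 10, loc=loc, scale=scale),
            'Q100 (cfs)': gumbel_r.ppf(1 - 1 / 100, loc=loc, scale=scale),
        })
    out = pd.DataFrame(rows).set_index('site')
    out['Q100/Q10'] = out['Q100 (cfs)'] / out['Q10 (cfs)']
    return out


# Synthetic annual peaks, for illustration only (not real rivers)
rng = np.random.default_rng(0)
demo = {
    'river_A (synthetic)': gumbel_r.rvs(loc=5000, scale=1500, size=60, random_state=rng),
    'river_B (synthetic)': gumbel_r.rvs(loc=80000, scale=30000, size=60, random_state=rng),
}
comparison = summarize_floods(demo)
print(comparison.round(2))
```

With real data, each dictionary value would be the `peak_flow_cfs` column retrieved for one site. The `Q100/Q10` ratio is a convenient way to compare rivers of very different size.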

## 🧠 Discussion Questions

1. What does a *100-year flood* really mean?
2. What assumptions does the Gumbel model make?
3. How might climate change affect flood frequency analyses?
4. How could you improve this analysis using longer datasets or different distributions?
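
As a starting point for question 4, the sketch below fits both the two-parameter Gumbel distribution used in this lab and the three-parameter generalized extreme value (GEV) distribution to the same sample, then compares the 100-year estimates. The sample is synthetic and generated for illustration only. Starting the GEV optimizer from the Gumbel fit is a choice made here, not a requirement.

```python
import numpy as np
from scipy.stats import gumbel_r, genextreme

# Synthetic annual peaks (cfs), generated for illustration only
rng = np.random.default_rng(42)
synthetic_peaks = gumbel_r.rvs(loc=20000, scale=8000, size=80, random_state=rng)

# Two-parameter Gumbel fit, as in the lab
g_loc, g_scale = gumbel_r.fit(synthetic_peaks)
q100_gumbel = gumbel_r.ppf(0.99, loc=g_loc, scale=g_scale)

# Three-parameter GEV fit, started from the Gumbel solution (shape = 0).
# SciPy's shape parameter c has the opposite sign of the shape parameter xi
# used in much of the hydrology literature.
c, e_loc, e_scale = genextreme.fit(synthetic_peaks, 0.0, loc=g_loc, scale=g_scale)
q100_gev = genextreme.ppf(0.99, c, loc=e_loc, scale=e_scale)

print(f"Gumbel Q100: {q100_gumbel:,.0f} cfs")
print(f"GEV Q100:    {q100_gev:,.0f} cfs (shape c = {c:.3f})")
```

The extra shape parameter lets the GEV bend the upper tail, so the two 100-year estimates can differ noticeably on real records, especially short ones.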

## 📎 Appendix: 48 Rivers in the Lower 48 States

| #  | State          | River / Tributary      | USGS Gage Site No. | Station Name / Location                    |
| -- | -------------- | ---------------------- | ------------------ | ------------------------------------------ |
| 1  | Alabama        | Mobile River           | 02471000           | Mobile River at Coffeeville, AL            |
| 2  | Arizona        | Salt River             | 09506000           | Salt River near Tempe, AZ                  |
| 3  | Arkansas       | Arkansas River         | 07182500           | Arkansas River at Little Rock, AR          |
| 4  | California     | Sacramento River       | 11447650           | Sacramento River at Freeport, CA           |
| 5  | Colorado       | Colorado River         | 09152500           | Colorado River at Dotsero, CO              |
| 6  | Connecticut    | Connecticut River      | 01184000           | Connecticut River at Thompsonville, CT     |
| 7  | Delaware       | Delaware River         | 01463500           | Delaware River at Trenton, NJ              |
| 8  | Florida        | Apalachicola River     | 02358000           | Apalachicola River at Chattahoochee, FL    |
| 9  | Georgia        | Savannah River         | 02197000           | Savannah River at Augusta, GA              |
| 10 | Idaho          | Snake River            | 13241000           | Snake River at American Falls, ID          |
| 11 | Illinois       | Illinois River         | 05526000           | Illinois River at La Salle, IL             |
| 12 | Indiana        | Wabash River           | 03335500           | Wabash River at Lafayette, IN              |
| 13 | Iowa           | Des Moines River       | 05432000           | Des Moines River at Des Moines, IA         |
| 14 | Kansas         | Kansas River           | 06893000           | Kansas River at Kansas City, KS            |
| 15 | Kentucky       | Ohio River             | 03294500           | Ohio River at Louisville, KY               |
| 16 | Louisiana      | Mississippi River      | 07289000           | Mississippi River at Red River Landing, LA |
| 17 | Maine          | Penobscot River        | 01034500           | Penobscot River at Old Town, ME            |
| 18 | Maryland       | Potomac River          | 01638500           | Potomac River at Point of Rocks, MD        |
| 19 | Massachusetts  | Merrimack River        | 01100000           | Merrimack River at Lowell, MA              |
| 20 | Michigan       | Grand River            | 04119000           | Grand River at Grand Rapids, MI            |
| 21 | Minnesota      | Minnesota River        | 05325000           | Minnesota River at Mankato, MN             |
| 22 | Mississippi    | Yazoo River            | 07288950           | Yazoo River at Yazoo City, MS              |
| 23 | Missouri       | Missouri River         | 06934500           | Missouri River at Hermann, MO              |
| 24 | Montana        | Missouri River (Upper) | 06157100           | Missouri River near Wolf Point, MT         |
| 25 | Nebraska       | Platte River           | 06805500           | Platte River at Louisville, NE             |
| 26 | Nevada         | Truckee River          | 10348000           | Truckee River at Reno, NV                  |
| 27 | New Hampshire  | Merrimack River        | 01076000           | Merrimack River at Manchester, NH          |
| 28 | New Jersey     | Delaware River         | 01463500           | Delaware River at Trenton, NJ              |
| 29 | New Mexico     | Rio Grande             | 08279500           | Rio Grande at Embudo, NM                   |
| 30 | New York       | Hudson River           | 01335750           | Hudson River at Troy, NY                   |
| 31 | North Carolina | Neuse River            | 02091000           | Neuse River at Goldsboro, NC               |
| 32 | North Dakota   | Red River of the North | 05054000           | Red River at Fargo, ND                     |
| 33 | Ohio           | Cuyahoga River         | 04208000           | Cuyahoga River at Independence, OH         |
| 34 | Oklahoma       | Arkansas River         | 07164500           | Arkansas River at Tulsa, OK                |
| 35 | Oregon         | Willamette River       | 14191000           | Willamette River at Salem, OR              |
| 36 | Pennsylvania   | Susquehanna River      | 01576000           | Susquehanna River at Marietta, PA          |
| 37 | Rhode Island   | Pawtuxet River         | 01116500           | Pawtuxet River at Cranston, RI             |
| 38 | South Carolina | Congaree River         | 02169500           | Congaree River near Columbia, SC           |
| 39 | South Dakota   | Big Sioux River        | 06472000           | Big Sioux River at Sioux Falls, SD         |
| 40 | Tennessee      | Tennessee River        | 03568000           | Tennessee River at Chattanooga, TN         |
| 41 | Texas          | Brazos River           | 08096500           | Brazos River at Waco, TX                   |
| 42 | Utah           | Green River            | 09315000           | Green River at Green River, UT             |
| 43 | Vermont        | Connecticut River      | 04234000           | Connecticut River at Wilder, VT            |
| 44 | Virginia       | James River            | 02037500           | James River at Richmond, VA                |
| 45 | Washington     | Columbia River         | 14105700           | Columbia River at The Dalles, OR–WA border |
| 46 | West Virginia  | Kanawha River          | 03192000           | Kanawha River at Charleston, WV            |
| 47 | Wisconsin      | Wisconsin River        | 05427500           | Wisconsin River at Nekoosa, WI             |
| 48 | Wyoming        | Green River            | 09234500           | Green River at Green River, WY             |
