
# USGS Daily Streamflow Downloader — Site 05470500 (2001–2023)

This notebook downloads **USGS daily streamflow** (parameter **00060**, discharge; statistic **00003**, daily mean) for site **05470500** from the **NWIS Daily Values** service via the `hydrofunctions` package, saves it to CSV, and produces a quick QA hydrograph and annual summary.

**You will:**
1. Install and import dependencies.
2. Set parameters (site ID and dates).
3. Fetch daily discharge data from NWIS.
4. Save a clean CSV with both **cfs** and **m³/s**.
5. Plot a hydrograph and compute simple annual summaries.


## 1) Install requirements (run once if needed)

In [1]:

# If needed, install dependencies in your environment:
# !pip install --upgrade pip
# !pip install hydrofunctions pandas matplotlib


## 2) Parameters

In [2]:

SITE = "05470500"               # USGS site number
START = "2001-01-01"            # inclusive
END   = "2023-12-31"            # inclusive

# Output filename (saved in the current working directory by default)
OUT_CSV = f"usgs_{SITE}_{START}_to_{END}.csv"
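Before sending a request, it can help to sanity-check these parameters. The sketch below is one possible check, not part of the original notebook: `validate_params` is a hypothetical helper, and the 8-to-15-digit rule for site numbers is an assumption based on the usual USGS numbering convention.

```python
import re
from datetime import date


def validate_params(site: str, start: str, end: str) -> None:
    """Raise ValueError if the site ID or date range looks malformed."""
    # Assumption: site numbers are 8 to 15 digits. They stay strings so that
    # leading zeros (as in "05470500") are not lost.
    if not re.fullmatch(r"\d{8,15}", site):
        raise ValueError(f"Unexpected site number: {site!r}")
    # date.fromisoformat raises ValueError on anything other than YYYY-MM-DD.
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    if start_d > end_d:
        raise ValueError(f"START {start} is after END {end}")


# Passes silently for the values used in this notebook.
validate_params("05470500", "2001-01-01", "2023-12-31")
```

Keeping the site ID as a string matters here: converting it to an integer would drop the leading zero and change the ID.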


## 3) Imports

In [3]:

import pandas as pd
import matplotlib.pyplot as plt

# hydrofunctions for NWIS access
import hydrofunctions as hf

pd.set_option("display.max_rows", 8)


## 4) Fetch USGS daily values (discharge)

In [6]:

def fetch_usgs_daily(site: str, start: str, end: str) -> pd.DataFrame:
    """
    Fetch mean daily discharge (00060, stat 00003) from NWIS 'dv' service via hydrofunctions.
    Returns a DataFrame with Date index and columns: Discharge_cfs, Discharge_m3s.
    """
    # Request daily values (dv) for discharge (00060). The request filters by
    # parameter only; the mean statistic (00003) is selected below by column name.
    req = hf.NWIS(site, service='dv', start_date=start, end_date=end, parameterCd='00060')
    df = req.df()  # index is datetime
    df.index = pd.to_datetime(df.index, errors='coerce')
    df = df.sort_index()

    # Robustly locate the discharge column: USGS:<site>:00060:00003
    col = None
    for c in df.columns:
        if f":{site}:" in c and ":00060:" in c and c.endswith(":00003"):
            col = c
            break
    if col is None:
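        # Fallback: any 00060 value column. Skip qualifier columns (assumed to
        # carry a "_qualifiers" suffix), which hold flags rather than flow values.
        candidates = [c for c in df.columns if ":00060:" in c and not c.endswith("_qualifiers")]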
        if candidates:
            col = candidates[0]
        else:
            raise RuntimeError(f"No discharge (00060) column found in NWIS response. Columns: {list(df.columns)}")

    out = pd.DataFrame(index=df.index)
    out["Discharge_cfs"] = pd.to_numeric(df[col], errors="coerce")
    out["Discharge_m3s"] = out["Discharge_cfs"] * 0.0283168  # ft^3/s -> m^3/s
    out = out.dropna(how="all")
    out.index.name = "Date"
    return out
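The column lookup and unit conversion inside `fetch_usgs_daily` can be tried without a network call. The sketch below uses a synthetic frame in place of `req.df()`. The `agency:site:parameter:statistic` column layout and the `_qualifiers` suffix are assumptions about how `hydrofunctions` names its columns, and `pick_discharge_column` is a hypothetical helper.

```python
import pandas as pd

# 1 ft = 0.3048 m exactly, so 1 ft^3 = 0.3048^3 m^3 (about 0.0283168).
CFS_TO_CMS = 0.3048 ** 3


def pick_discharge_column(columns, site):
    """Return the mean-daily discharge column name, or None if absent."""
    for c in columns:
        parts = c.split(":")
        # Assumed layout: agency:site:parameter:statistic
        if len(parts) == 4 and parts[1:] == [site, "00060", "00003"]:
            return c
    return None


# Synthetic stand-in for req.df(); values mirror the first rows shown below.
demo = pd.DataFrame(
    {
        "USGS:05470500:00060:00003": [0.0, 0.14, 0.16],
        "USGS:05470500:00060:00003_qualifiers": ["A", "A", "A"],
    },
    index=pd.to_datetime(["2001-01-01", "2001-01-02", "2001-01-03"]),
)

col = pick_discharge_column(demo.columns, "05470500")
demo_m3s = demo[col] * CFS_TO_CMS
print(col)
print(demo_m3s.round(6).tolist())
```

Splitting on `:` and comparing whole fields avoids accidental substring matches, and it rejects the qualifier column because its last field is not exactly `00003`.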


## 5) Run the download

In [7]:

flow_df = fetch_usgs_daily(SITE, START, END)
flow_df.head()


Requested data from https://waterservices.usgs.gov/nwis/dv/?format=json%2C1.1&sites=05470500&parameterCd=00060&startDT=2001-01-01&endDT=2023-12-31


| Date | Discharge_cfs | Discharge_m3s |
|---|---|---|
| 2001-01-01 00:00:00+00:00 | 0.0 | 0.0 |
| 2001-01-02 00:00:00+00:00 | 0.0 | 0.0 |
| 2001-01-03 00:00:00+00:00 | 0.14 | 0.003964 |
| 2001-01-04 00:00:00+00:00 | 0.16 | 0.004531 |
| 2001-01-05 00:00:00+00:00 | 0.22 | 0.00623 |


## 6) Save to CSV

In [8]:

flow_df.to_csv(OUT_CSV, float_format="%.6f")
print(f"Saved {len(flow_df):,} rows to {OUT_CSV}")


Saved 8,399 rows to usgs_05470500_2001-01-01_to_2023-12-31.csv
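To reuse the saved file later, the CSV can be read back with the `Date` column restored as the index. The following is a minimal round-trip sketch on a synthetic two-row frame; `demo.csv` is a hypothetical filename written to a temporary directory so nothing touches the real output.

```python
import os
import tempfile

import pandas as pd

# Synthetic stand-in for flow_df so the sketch runs without a download.
demo = pd.DataFrame(
    {"Discharge_cfs": [0.0, 0.14], "Discharge_m3s": [0.0, 0.003964]},
    index=pd.DatetimeIndex(["2001-01-01", "2001-01-02"], tz="UTC", name="Date"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")  # hypothetical filename
    demo.to_csv(path, float_format="%.6f")
    # index_col restores the Date index; parse_dates asks pandas to parse
    # the index strings as timestamps instead of leaving them as text.
    restored = pd.read_csv(path, index_col="Date", parse_dates=True)

print(restored)
```

For the real file, the same `pd.read_csv(OUT_CSV, index_col="Date", parse_dates=True)` call should apply.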


## 7) Quick QA: Hydrograph

In [9]:

plt.figure(figsize=(12, 4))
plt.plot(flow_df.index, flow_df["Discharge_cfs"], linewidth=0.8)
plt.title(f"USGS {SITE} — Daily Discharge (cfs): {START} to {END}")
plt.xlabel("Date")
plt.ylabel("Discharge (cfs)")
plt.grid(True)
plt.tight_layout()
plt.show()




## 8) Annual Summary (optional)

In [10]:

annual = (
    flow_df
    .assign(Year=flow_df.index.year)
    .groupby("Year")
    .agg(
        mean_cfs = ("Discharge_cfs", "mean"),
        median_cfs = ("Discharge_cfs", "median"),
        max_cfs = ("Discharge_cfs", "max"),
        mean_m3s = ("Discharge_m3s", "mean"),
    )
)
annual


| Year | mean_cfs | median_cfs | max_cfs | mean_m3s |
|---|---|---|---|---|
| 2001 | 95.095671 | 23.00 | 1070.0 | 2.692805 |
| 2002 | 57.741068 | 27.00 | 830.0 | 1.635042 |
| 2003 | 103.783205 | 10.90 | 2160.0 | 2.938808 |
| 2004 | 138.561694 | 27.80 | 1960.0 | 3.923624 |
| ... | ... | ... | ... | ... |
| 2020 | 105.190519 | 54.85 | 1730.0 | 2.978659 |
| 2021 | 33.121123 | 19.80 | 294.0 | 0.937884 |
| 2022 | 96.727425 | 10.50 | 3180.0 | 2.739011 |
| 2023 | 77.367151 | 37.10 | 2370.0 | 2.190790 |
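An annual mean is only comparable across years if each year is close to complete. One way to check this is to count the days present per year next to the days expected. The sketch below runs on a synthetic two-year series with one day removed; the `n_days`, `expected`, and `missing` column names are illustrative choices, not part of the original summary.

```python
import pandas as pd

# Synthetic two-year daily series with one day removed, standing in for flow_df.
idx = pd.date_range("2001-01-01", "2002-12-31", freq="D")
demo = pd.DataFrame({"Discharge_cfs": 1.0}, index=idx)
demo = demo.drop(pd.Timestamp("2001-06-15"))

# Days with a value in each calendar year.
coverage = (
    demo.groupby(demo.index.year)["Discharge_cfs"]
    .count()
    .rename("n_days")
    .to_frame()
)
# Days expected in each year: day-of-year of 31 December (366 in leap years).
coverage["expected"] = [
    pd.Timestamp(year=int(y), month=12, day=31).dayofyear for y in coverage.index
]
coverage["missing"] = coverage["expected"] - coverage["n_days"]
print(coverage)
```

Applied to `flow_df`, the same grouping would show which years contribute fewer days to `mean_cfs` and `median_cfs`.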



## 9) Notes
- **Service**: NWIS Daily Values (`dv`) via `hydrofunctions`.
- **Parameter**: `00060` (discharge); **Statistic**: `00003` (mean daily).
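- **Time zone**: The index is labeled with midnight UTC timestamps (`+00:00` in the output above), but each daily value summarizes a calendar day at the site. Treat the index as dates, not instants.
- **Gaps**: `fetch_usgs_daily` drops rows with no value, so missing days show up as absent dates rather than `NaN`s. The 8,399 rows saved above are one short of the 8,400 calendar days in 2001–2023. Recent values may also be provisional and subject to revision.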
- **Licensing**: USGS data are public domain; please cite NWIS appropriately in publications.
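
The period 2001-01-01 to 2023-12-31 spans 8,400 calendar days, while 8,399 rows were saved, so at least one day appears to be absent. A sketch for listing absent dates follows. `missing_days` is a hypothetical helper, shown on a synthetic ten-day index with one day deleted.

```python
import pandas as pd


def missing_days(index, start, end):
    """Return the calendar days in [start, end] that are absent from index."""
    full = pd.date_range(start, end, freq="D")
    # Drop the time zone and time-of-day so timestamps compare as plain dates.
    have = pd.DatetimeIndex(index).tz_localize(None).normalize()
    return full.difference(have)


# Synthetic UTC-labeled daily index with 2001-01-04 removed.
demo_index = pd.date_range("2001-01-01", "2001-01-10", freq="D", tz="UTC").delete(3)
gaps = missing_days(demo_index, "2001-01-01", "2001-01-10")
print(list(gaps))
```

With the real data, `missing_days(flow_df.index, START, END)` would return the dates to inspect before any gap filling or interpolation.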
