Open-source Python toolkit for water data, hydrology, and agricultural water management — with an AI engine that recommends and auto-executes research methodologies.
Install · Examples · CLI · Features · Docs · Roadmap · Discussions
AquaScope unifies 15 global water-data APIs behind one Python schema, then layers a full scientific computing stack on top — from Bulletin 17C flood frequency to FAO-56 crop water requirements — wrapped in an AI engine that scores 26 research methodologies against your dataset and auto-executes 7 analysis pipelines. Validated against the CAMELS benchmark with 525 tests.
- 🌊 Pull water data from USGS, FAO AQUASTAT, FAO WaPOR, GEMStat, EU WFD, Copernicus ERA5, Taiwan MOENV/WRA/Civil IoT/DataGov, Japan MLIT, Korea WAMIS, OpenMeteo, UN SDG 6, US Water Quality Portal — one unified Python API.
- 📈 Run hydrological analyses — Bulletin 17C flood frequency (GEV / LP3 / Gumbel / non-stationary GEV / EMA), baseflow separation, rating curves, 22 hydrological signatures.
- 🌾 Plan agricultural water — FAO-56 Penman-Monteith ET₀, crop water requirements for 20 crops, irrigation scheduling, soil water balance with auto-irrigation.
- 🤖 Ask the AI engine — describe your goal in plain English and get a recommended methodology, scored against your dataset profile and auto-executed. LLM enhancement via OpenAI, Groq (free), HuggingFace (free), or local Ollama.
- 📊 Visualise + report — 16 plot types, Q-Q / P-P diagnostics, Markdown / HTML reports with embedded figures, threshold alerts (WHO / EPA / EU WFD).
- 🗺️ Spatial hydrology — DEM processing, D8 flow direction, watershed delineation, Strahler ordering.
For the full capability list see docs/features.md.
| AquaScope | HEC-SSP | R lmom |
Standalone collectors | |
|---|---|---|---|---|
| Bulletin 17C FFA + EMA | ✅ | ✅ | partial | — |
| Non-stationary GEV | ✅ | — | partial | — |
| Baseflow separation (Lyne-Hollick, Eckhardt) | ✅ | — | — | — |
| FAO-56 Penman-Monteith ET₀ + crop water | ✅ | — | — | — |
| 15 unified data collectors | ✅ | — | — | per-source |
| AI methodology recommender (OpenAI / Groq / HF / Ollama) | ✅ | — | — | — |
| Interactive Streamlit dashboard | ✅ | — | — | — |
| Free, MIT, Python-native | ✅ | partial | ✅ | varies |
pip install aquascope # core — collectors + hydrology
pip install "aquascope[all]" # everything — ML, viz, spatial, dashboardFeature-group extras:
pip install "aquascope[ml]" # sklearn, xgboost, statsmodels
pip install "aquascope[viz]" # matplotlib, seaborn, folium
pip install "aquascope[scientific]" # xarray, netcdf4, h5py
pip install "aquascope[spatial]" # rasterio, geopandas, shapely
pip install "aquascope[dashboard]" # streamlit
pip install "aquascope[forecast]" # prophet, torch (for LSTM)For development:
git clone https://github.com/Rekin226/aquascope.git
cd aquascope
pip install -e ".[all,dev]"from aquascope.api import flood_analysis
result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
print(result.return_levels)
# return_period return_level lower_ci upper_ci
# 0 10 1840.2 1690.4 2010.6
# 1 50 2530.7 2280.1 2820.9
# 2 100 2870.4 2540.6 3260.5Switch method to "lp3", "gumbel", "gpd", or "ns_gev" for non-stationary analysis. Pass censored=True for EMA on records with peak-over-threshold gaps.
from aquascope.api import baseflow_analysis, compute_all_signatures
bf = baseflow_analysis(daily_discharge, method="eckhardt") # or "lyne_hollick"
sig = compute_all_signatures(daily_discharge)
print(bf.bfi) # baseflow index, e.g. 0.42
print(sig["Q5"], sig["Q95"]) # high-flow / low-flow exceedances
print(sig["flashiness"]) # Richards-Baker flashiness index22 signatures across magnitude, variability, timing, recession, and flashiness — see docs/features.md.
from aquascope.collectors import USGSCollector, AquastatCollector, WaporCollector
usgs = USGSCollector()
flow = usgs.collect(station_id="01646500", parameter="00060", days=365)
aquastat = AquastatCollector()
egy_water = aquastat.collect(country="EGY", variables=[4263, 4253, 4312])
wapor = WaporCollector()
et = wapor.collect(
bbox=(30.5, 29.8, 31.1, 30.2),
variable="RET",
start_date="2026-04-01",
end_date="2026-07-31",
)Every collector returns records in the same Pydantic schema, so downstream analyses don't care where the data came from. See docs/data_sources.md for the full list.
from datetime import date
from aquascope.agri import (
penman_monteith_daily,
crop_water_requirement,
SoilWaterBalance,
)
from aquascope.agri.water_balance import SoilProperties
# Reference ET (FAO-56 Penman-Monteith) — Cairo, July
eto = penman_monteith_daily(
t_min=18.0, t_max=32.0, rh_min=40, rh_max=80,
u2=2.0, rs=22.0, latitude=30.0, elevation=70, doy=180,
)
# Crop water requirement for maize from planting through harvest
cwr = crop_water_requirement(eto_series, crop="maize", planting_date=date(2026, 4, 1))
# Soil water balance with auto-irrigation triggers
soil = SoilProperties(field_capacity=0.30, wilting_point=0.15, root_depth=1.0)
balance = SoilWaterBalance(soil).auto_irrigate(
etc=cwr.etc, precip=precip_series, efficiency=0.7,
)
print(balance.total_irrigation_mm, balance.deficit_days)from aquascope.ai_engine import recommend
# Describe your dataset and goal — get ranked, scored methodologies
recs = recommend(
parameters=["DO", "BOD5", "COD"],
n_records=4_500,
temporal=True,
spatial=False,
goal="detect long-term pollution trends with seasonality",
)
for r in recs[:3]:
print(f"{r.score:.2f} {r.method_id:<20} {r.rationale}")
# 0.92 mann_kendall Strong fit: temporal data, >30 records, trend goal
# 0.87 stl_decomposition Seasonal patterns + multi-year data
# 0.81 prophet Forecasting-capable, handles seasonality nativelyThen auto-execute the top result with run_pipeline(recs[0].method_id, df).
from aquascope.api import detect_changepoints, fit_copula
cps = detect_changepoints(annual_runoff, method="pettitt")
cop = fit_copula(rainfall, runoff, family="auto") # AIC-selects Gaussian/Clayton/Gumbel/Frank
print(cps.change_year, cps.p_value)
print(cop.family, cop.theta, cop.aic)from aquascope.api import bayesian_regression
# Annual rainfall → runoff with full posterior + convergence diagnostics
posterior = bayesian_regression(X=annual_precip, y=annual_runoff)
print(posterior.posterior_mean)
# {'beta_0': 12.4, 'beta_1': 0.82, 'sigma2': 41.6}
print(posterior.credible_intervals["beta_1"])
# (0.78, 0.86) ← 95% credible interval on slope
print(posterior.r_hat)
# {'beta_0': 1.00, 'beta_1': 1.00, 'sigma2': 1.00} ← Gelman–Rubin, converged
print(posterior.dic, posterior.effective_sample_size["beta_1"])
# 124.7 9842.0 ← model fit + effective sample sizeSwitch to MCMC with degree>1 for polynomial models, or pass prior_precision for informative priors. Conjugate linear, polynomial, and Metropolis-Hastings backends are all available.
AquaScope ships a 14-command CLI for the most common workflows:
# Collect data
aquascope collect --source usgs --station 01646500 --days 365
aquascope collect --source wapor --bbox 30.5,29.8,31.1,30.2 --variable RET --start-date 2026-04-01
# Hydrological analysis
aquascope hydro --method flood_frequency --file discharge.csv
aquascope hydro --method baseflow --file discharge.csv
# Agriculture planning
aquascope agri plan --crop maize --planting-date 2026-04-01 --lat 30.0 --lon 31.25
# AI recommendation + natural-language problem solving
aquascope recommend --parameters DO,BOD5,COD --goal "pollution trend detection"
aquascope solve --problem "Assess flood risk for a 100-year return period"
# Interactive Streamlit dashboard
aquascope dashboardRun aquascope --help for the full command list.
12 unified data sources spanning four regions:
- 🌎 Americas — USGS (streamflow + WQ), Water Quality Portal (400+ agencies)
- 🌍 Europe — EU Water Framework Directive, Copernicus ERA5
- 🌏 Asia-Pacific — Taiwan MOENV / WRA / Civil IoT / DataGov, Japan MLIT, Korea WAMIS
- 🌐 Global — GEMStat (170 countries), UN SDG 6, OpenMeteo, FAO AQUASTAT, FAO WaPOR
Full details, endpoints, and API-key requirements: docs/data_sources.md. Want to add your country's water service? See adding a data source.
- 534 tests — covering every collector, hydrology method, and pipeline (525 passing in the core suite; spatial tests require
rasterio) - CAMELS benchmark — a 10-catchment validation subset of the CAMELS dataset ships with the repo at
data/camels_benchmark/and runs as part of CI - Every method cited — equations, decision trees, and DOI references for all 26 methodologies live in the theory guide
- JOSS paper in submission — see
paper.mdandpaper.bib
| Resource | What it covers |
|---|---|
| Features | Full capability list — hydrology, agriculture, ML, spatial, I/O |
| Data sources | All 12 sources, endpoints, API-key requirements |
| Theory guide | Equations, DOI citations, decision trees for every method |
| Methodology matrix | When to use which method |
| Architecture | How AquaScope is structured internally |
| FAQ · Troubleshooting | Common questions and fixes |
| Use cases | Real-world applications and case studies |
| Integration guides | xarray, QGIS, R interoperability |
| Contributing | How to add a data source, methodology, or test |
We welcome contributions from the global water and agriculture research community. Highest-impact contributions right now:
- New data source collectors — your country / region
- New research methodologies — expand the AI recommender
- New crop coefficients — extend the FAO Kc table
- Jupyter tutorials and validation studies — compare against HEC-SSP, R packages, etc.
See CONTRIBUTING.md, the adding a data source guide, and the adding a methodology guide.
If you use AquaScope in your research, please cite:
@software{aquascope2026,
title = {AquaScope: Open-Source Water Data Aggregation, Hydrological Analysis, and Agricultural Water Management Toolkit},
author = {AquaScope Contributors},
year = {2026},
url = {https://github.com/Rekin226/aquascope},
version = {0.5.0},
license = {MIT}
}MIT — see LICENSE.