# Quickstart Guide

In [1]:
from hydrodata import Station, utils
import hydrodata.datasets as hds
from hydrodata import plot
from hydrodata import services
from arcgis2geojson import arcgis2geojson
import matplotlib.pyplot as plt
import geopandas as gpd
import warnings

In [2]:
warnings.filterwarnings("ignore")

Here are some examples to quickly get you started with the basic capabilities of HydroData. The rest of the documentation covers HydroData's other functionalities in more detail.

## Station

`Station` retrieves information about a USGS station identified either by its station ID or by its coordinates (longitude and latitude). It requires at least three parameters: a start date, an end date, and either a USGS station ID or coordinates. Upon instantiation, the station and its watershed characteristics are found using the [NLDI](https://labs.waterdata.usgs.gov/about-nldi/) and [USGS Site Information](https://waterdata.usgs.gov/nwis/si) services.

In [None]:
wshed = Station('2000-01-01', '2010-01-21', station_id='01031500')
wshed.hcdn

The ``wshed.hcdn`` property shows that this station is located in a natural environment and is not affected by human activity.

## NHDPlus

The river network for the watershed, including the tributaries and the main river channel, can be retrieved from the NHDPlus database. Additionally, the retrieved information, such as the watershed geometry, can be used with the `datasets` module to access other databases. For example, we can find the USGS stations upstream (or downstream) along the main river channel (or the tributaries) up to a certain distance, say 150 km. All the USGS stations inside the watershed can be found as well:

In [None]:
tributaries = hds.NLDI.tributaries(wshed.station_id)
main_channel = hds.NLDI.main(wshed.station_id)
stations = hds.NLDI.stations(wshed.station_id)
stations_upto_150 = hds.NLDI.stations(wshed.station_id, navigation="upstreamMain", distance=150)

ax = wshed.basin.plot(color='white', edgecolor='black', zorder=1, figsize=(8, 8))
tributaries.plot(ax=ax, label='Tributaries', zorder=2)
main_channel.plot(ax=ax, color='green', lw=3, label='Main', zorder=3)
stations.plot(ax=ax, color='black', label='All stations', marker='s', zorder=4)
stations_upto_150.plot(ax=ax, color='red', label='Stations up to 150 km upstream of main', marker='*', zorder=5)
ax.legend(loc='best')
ax.figure.set_dpi(100);

## Data for Single Pixel

The climate data and streamflow observations for the location of interest can be retrieved and then plotted with the ``plot`` module, which combines five hydrologic signature graphs in a single figure.

In [None]:
clm_p = hds.daymet_byloc(wshed.lon, wshed.lat, start=wshed.start, end=wshed.end)
clm_p['Q (cms)'] = hds.nwis_streamflow(wshed.station_id, wshed.start, wshed.end)

In [None]:
plot.signatures({"Q": (clm_p['Q (cms)'], wshed.drainage_area)}, clm_p["prcp (mm/day)"])
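
The call above passes the drainage area together with the discharge. A plausible reason, which is an assumption here and not something this guide states, is that discharge in m³/s must be converted to a depth in mm/day before it can be compared with precipitation. The sketch below shows that unit conversion on made-up numbers. The helper `cms_to_mm_per_day` is hypothetical and not part of HydroData, and the drainage area is assumed to be in km².

```python
def cms_to_mm_per_day(q_cms, area_km2):
    """Convert discharge in m^3/s to a runoff depth in mm/day over a basin."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    seconds_per_day = 86_400
    area_m2 = area_km2 * 1.0e6  # km^2 -> m^2
    depth_m_per_day = q_cms * seconds_per_day / area_m2
    return depth_m_per_day * 1000.0  # m -> mm


# Made-up values for illustration: 10 m^3/s draining an 864 km^2 basin.
runoff = cms_to_mm_per_day(10.0, 864.0)
print(f"{runoff:.3f} mm/day")
```

Expressed this way, streamflow and the ``prcp (mm/day)`` series share the same units, which is what makes signatures such as a runoff ratio meaningful.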

## Gridded Datasets

Other than point-based data, gridded data can also be accessed at the desired resolution. Furthermore, the watershed geometry can be used to mask the gridded data.

The DEM can be retrieved for the station's contributing watershed at 1 arc-second (about 30 m) resolution, as follows:

In [None]:
dem = hds.nationalmap_dem(wshed.geometry, resolution=1)
ax = dem.plot(size=6)
ax.figure.set_dpi(100);
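
The 30 m figure quoted for a 1 arc-second resolution is a rounded value. The following sketch works through the arithmetic under a spherical-Earth assumption with a mean radius of 6,371 km. The helper `arcsec_to_meters` is hypothetical (it is not a HydroData function), and the latitude of 45° is only an illustrative value.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean radius; spherical-Earth assumption


def arcsec_to_meters(arcsec, lat_deg=0.0):
    """Return (north-south, east-west) ground distances in meters for an angle in arc-seconds."""
    angle_rad = math.radians(arcsec / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    # Meridians converge toward the poles, so east-west spacing shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = arcsec_to_meters(1.0, lat_deg=45.0)
print(f"north-south: {ns:.1f} m, east-west at 45 degrees latitude: {ew:.1f} m")
```

The north-south spacing comes out close to 31 m, while the east-west spacing is noticeably smaller at mid-latitudes, so "30 m" should be read as a nominal cell size.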

In [None]:
variables = ["tmin", "tmax", "prcp"]
clm_g = hds.daymet_bygeom(wshed.geometry, start='2005-01-01', end='2005-01-31', variables=variables, pet=True)
eta_g = hds.ssebopeta_bygeom(wshed.geometry, start='2005-01-01', end='2005-01-31')

All gridded data are returned as [xarray](https://xarray.pydata.org/en/stable/) datasets, which come with efficient data processing tools. Note that the Daymet dataset uses a [Lambert Conformal Conic](https://daymet.ornl.gov/overview) projection.

In [None]:
fig, axes = plt.subplots(ncols=2, figsize=(13, 5))
clm_g.prcp.isel(time=10).plot(ax=axes[0])
eta_g.isel(time=10).plot(ax=axes[1])
fig.set_dpi(100);

## Adding New Database

The ``services`` module can be used to access the [Los Angeles GeoHub](http://geohub.lacity.org/) RESTful service, the National Map's [3D Elevation Program](https://www.usgs.gov/core-science-systems/ngp/3dep) via WMS, and the [FEMA National Flood Hazard Layer](https://www.fema.gov/national-flood-hazard-layer-nfhl) via WFS. The example below does so for a watershed in Los Angeles:

In [None]:
la_wshed = Station('2005-01-01', '2005-01-31', station_id='11092450')

url_rest = "https://maps.lacity.org/lahub/rest/services/Stormwater_Information/MapServer/10"
s = services.ArcGISREST(url_rest, outFormat="json")
s.get_featureids(la_wshed.geometry)
storm_pipes = s.get_features()

url_wms = "https://elevation.nationalmap.gov/arcgis/services/3DEPElevation/ImageServer/WMSServer"
slope = services.wms_bygeom(
    url_wms,
    "3DEP",
    geometry=wshed.geometry,
    version="1.3.0",
    layers={"slope": "3DEPElevation:Slope Degrees"},
    outFormat="image/tiff",
    resolution=1,
)

url_wfs = "https://hazards.fema.gov/gis/nfhl/services/public/NFHL/MapServer/WFSServer"
wfs = services.WFS(
    url_wfs,
    layer="public_NFHL:Base_Flood_Elevations",
    outFormat="esrigeojson",
    crs="epsg:4269",
)
r = wfs.getfeature_bybox(la_wshed.geometry.bounds, in_crs="epsg:4326")
flood = utils.json_togeodf(r.json(), "epsg:4269", "epsg:4326")

## Flow Accumulation

For demonstration purposes, let's assume that the flow in each river segment is equal to the length of the segment. Accumulating this flow through the network should therefore reproduce the ``arbolatesu`` (arbolate sum) variable in the NHDPlus database.

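In [3]:
# Prepare the NHDPlus flowlines of station 11092450 for routing.
flw = utils.prepare_nhdplus(hds.NLDI.flowlines('11092450'), 0, 0, purge_non_dendritic=False)

def routing(qin, q):
    # Flow leaving a segment is its inflow plus the segment's own contribution.
    return qin + q

qsim = utils.vector_accumulation(flw[["comid", "tocomid", "lengthkm"]], routing, "lengthkm", ["lengthkm"], threading=False)
flw = flw.merge(qsim, on="comid")
# Total absolute difference between the NHDPlus arbolate sum and the accumulated lengths.
diff = flw.arbolatesu - flw.acc
diff.abs().sum()

In [None]:
# Inspect the length of a single segment; this needs `flw` from the cell above.
flw.loc[flw["comid"] == 22515878, "lengthkm"].array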
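
To make the idea concrete without a network connection, here is a minimal, self-contained sketch of the same kind of accumulation on a made-up four-segment network. It does not use HydroData: the segment IDs, the lengths, and the helper `accumulate` are hypothetical, and `utils.vector_accumulation` may well be implemented differently. Only the `routing` rule is taken from the cell above.

```python
from collections import defaultdict

# Hypothetical toy network: comid -> (tocomid, lengthkm); tocomid 0 marks the outlet.
network = {
    1: (3, 2.0),
    2: (3, 1.5),
    3: (4, 4.0),
    4: (0, 3.0),
}


def routing(qin, q):
    """Same rule as in the guide: inflow plus the segment's own contribution."""
    return qin + q


def accumulate(network, route):
    """Accumulate a per-segment value from the headwaters down to the outlet."""
    # Invert the downstream links so each segment knows its direct upstream neighbors.
    upstream = defaultdict(list)
    for comid, (tocomid, _) in network.items():
        upstream[tocomid].append(comid)

    acc = {}

    def visit(comid):
        # Resolve all upstream segments first, then apply the routing rule once.
        if comid not in acc:
            qin = sum(visit(up) for up in upstream[comid])
            acc[comid] = route(qin, network[comid][1])
        return acc[comid]

    for comid in network:
        visit(comid)
    return acc


acc = accumulate(network, routing)
print(acc)  # {1: 2.0, 2: 1.5, 3: 7.5, 4: 10.5}
```

The value at the outlet (segment 4) equals the total length of all segments, which is what an arbolate sum represents. The recursion keeps the sketch short; a real network with many thousands of segments would more likely be traversed iteratively in topological order.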
