# Filter point observations to pre-defined site networks

To launch this notebook interactively in a Jupyter notebook-like browser interface, please click the "Launch Binder" button below. Note that Binder may take several minutes to launch.

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/hydroframe/subsettools-binder/HEAD?labpath=hf_hydrodata/point/example_site_networks.ipynb)

This notebook shows how to use the `get_point_data` and `get_point_metadata` functions to filter sites by a pre-defined site network.

For USGS stream gages, the currently supported site networks are:

  - [GAGES-II](https://pubs.usgs.gov/publication/70046617) ('gagesii')
  - [GAGES-II reference gages](https://pubs.usgs.gov/publication/70046617) ('gagesii_reference')
  - [HCDN-2009](https://water.usgs.gov/osw/hcdn-2009/) ('hcdn2009')
  - [CAMELS](https://ral.ucar.edu/solutions/products/camels) ('camels')

For USGS groundwater wells, the currently supported site networks are:

  - [Climate Response Network](https://water.usgs.gov/ogw/networks.html) ('climate_response_network')

Please see the full [point module](https://hf-hydrodata.readthedocs.io) documentation for information on what data is available, our data collection process, and new features we are working on! Our [Metadata Description](https://hf-hydrodata.readthedocs.io/en/latest/available_metadata.html#point-observations-metadata) page itemizes the fields that get returned from `get_point_metadata`.

In [1]:
# Import packages
from hf_hydrodata import register_api_pin, get_point_data, get_point_metadata
import pandas as pd

In [None]:
# You need to register on https://hydrogen.princeton.edu/pin 
# and run the following with your registered information
# before you can use the hydrodata utilities
register_api_pin("your_email", "your_pin")

Note that `get_point_data` and `get_point_metadata` require the parameters `dataset`, `variable`, `temporal_resolution`, and `aggregation` (plus `depth_level` when requesting soil moisture data). Please see [the documentation](https://hf-hydrodata.readthedocs.io/en/latest/available_data.html) for information about which point observation datasets are available and the parameters used to query them.

The [hf_hydrodata API Reference](https://hf-hydrodata.readthedocs.io/en/latest/hf_hydrodata.point.html) includes information on what optional filtering parameters are available. These include filters for things like a geographic region or date range. Those parameters work cumulatively, so if `state` and `site_ids` are both supplied, for example, then only sites within `site_ids` that are *also* in `state` will be returned.
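The cumulative behavior described above can be pictured with a small local sketch. It does not call the hydrodata service and is not the library's implementation: `filter_sites` and the site table are hypothetical stand-ins (the third site ID is a made-up placeholder) that only illustrate how supplying both `state` and `site_ids` narrows the result to sites that satisfy both.

```python
# Hypothetical site table for illustration only; the real filtering
# happens inside the hydrodata service, not in user code.
sites = [
    {"site_id": "06614800", "state": "CO"},
    {"site_id": "06620000", "state": "CO"},
    {"site_id": "00000001", "state": "WY"},  # made-up placeholder site
]


def filter_sites(sites, state=None, site_ids=None):
    """Keep only the sites that satisfy every filter that was supplied."""
    selected = sites
    if state is not None:
        selected = [s for s in selected if s["state"] == state]
    if site_ids is not None:
        wanted = set(site_ids)
        selected = [s for s in selected if s["site_id"] in wanted]
    return selected


# Both filters are supplied, so a site must pass both to be kept.
result = filter_sites(sites, state="CO", site_ids=["06614800", "00000001"])
selected_ids = [s["site_id"] for s in result]
print(selected_ids)  # ['06614800']
```

The placeholder site is requested by ID but dropped because it is not in Colorado, and `06620000` is in Colorado but dropped because it was not requested.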

## Example: Query stream gage data for GAGES-II sites in Colorado

In this example, we are interested in querying the stream gages that are part of the GAGES-II network within the state of Colorado (`state = 'CO'`). We'll focus on data within Water Year 2003, so we'll set `date_start='2002-10-01'` and `date_end='2003-09-30'`. Note that we are setting `site_networks='gagesii'` to get only stream gages that are part of the GAGES-II network.
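If you query several water years, it may be convenient to derive the date strings rather than type them. The helper below is a minimal sketch, not part of `hf_hydrodata`; the name `water_year_bounds` is hypothetical, and it assumes the USGS convention that water year N runs from October 1 of year N-1 through September 30 of year N.

```python
from datetime import date


def water_year_bounds(water_year):
    """Return (date_start, date_end) as ISO strings for a water year.

    Assumes the USGS convention: water year N runs from
    October 1 of year N-1 through September 30 of year N.
    """
    start = date(water_year - 1, 10, 1)
    end = date(water_year, 9, 30)
    return start.isoformat(), end.isoformat()


date_start, date_end = water_year_bounds(2003)
print(date_start, date_end)  # 2002-10-01 2003-09-30
```

The two strings can then be passed as `date_start` and `date_end` in the queries that follow.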

In [2]:
# Get point observations data
data_df = get_point_data(dataset="usgs_nwis", variable="streamflow", temporal_resolution="daily", aggregation="mean",
                         date_start="2002-10-01", date_end="2003-09-30", 
                         state="CO", site_networks="gagesii")

# View the first five records
data_df.head(5)

|   | date | 06614800 | 06620000 | 06659580 | 06696980 | 06700000 | 06701500 | 06701620 | 06701900 | 06707500 | ... | 09371000 | 09371010 | 09371492 | 09371520 | 09372000 | 393109104464500 | 394308105413800 | 394839104570300 | 401733105392404 | 402114105350101 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2002-10-01 | 0.01981 | 0.97635 |  | 0.116879 |  | 8.4051 |  | 9.5371 | 11.3766 | ... | 0.0 | 14.15 | 0.023489 | 0.46978 | 0.274793 | 0.04811 | 0.61977 | 1.26784 | 0.071316 | 0.33677 |
| 1 | 2002-10-02 | 0.021508 | 1.01031 |  | 0.148009 |  | 8.3485 |  | 9.8767 | 12.169 | ... | 0.066505 | 15.6782 | 0.040469 | 0.53204 | 0.28583 | 0.281868 | 0.81504 | 2.81019 | 0.071316 | 0.39903 |
| 2 | 2002-10-03 | 0.022357 | 1.23388 |  | 0.16414 |  | 7.1882 |  | 8.5466 | 10.2729 | ... | 0.15565 | 19.1874 | 0.091975 | 1.10936 | 0.47261 | 0.249889 | 0.88862 | 1.23954 | 0.069618 | 0.46695 |
| 3 | 2002-10-04 | 0.025753 | 1.81969 |  | 0.146877 |  | 5.3204 |  | 5.9996 | 8.2919 | ... | 0.35092 | 19.6119 | 0.043582 | 0.50657 | 0.89428 | 0.219325 | 0.70184 | 0.68769 | 0.068203 | 0.43299 |
| 4 | 2002-10-05 | 0.024621 | 1.981 |  | 0.143198 |  | 4.4997 |  | 5.0374 | 6.509 | ... | 0.060279 | 22.7249 | 0.026885 | 0.47261 | 0.58581 | 0.191591 | 0.64807 | 0.47827 | 0.066788 | 0.43865 |


In [3]:
# Get site-level attributes for these sites
metadata_df = get_point_metadata(dataset="usgs_nwis", variable="streamflow", temporal_resolution="daily", aggregation="mean",
                                 date_start="2002-10-01", date_end="2003-09-30", 
                                 state="CO", site_networks="gagesii")

# View the first five records
metadata_df.head(5)

|   | site_id | site_name | site_type | agency | state | latitude | longitude | first_date_data_available | last_date_data_available | record_count | ... | doi | huc8 | conus1_x | conus1_y | conus2_x | conus2_y | gagesii_drainage_area | gagesii_class | gagesii_site_elevation | usgs_drainage_area |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 6614800 | MICHIGAN RIVER NEAR CAMERON PASS, CO | stream gauge | USGS | CO | 40.496094 | -105.865012 | 1973-10-01 | 2023-12-01 | 18322 | ... |  | 10180001 | 1054.0 | 818.0 | 1481 | 1764 | 4.0284 | Ref | 3188.0 | 1.54 |
| 1 | 6620000 | NORTH PLATTE RIVER NEAR NORTHGATE, CO | stream gauge | USGS | CO | 40.936639 | -106.339194 | 1904-06-01 | 2023-12-01 | 39782 | ... |  | 10180001 | 1020.0 | 870.0 | 1448 | 1817 | 3702.637 | Non-ref | 2388.0 | 1431.0 |
| 2 | 6659580 | SAND CREEK AT COLORADO-WYOMING STATE LINE | stream gauge | USGS | CO | 40.99365 | -105.759703 | 1968-10-01 | 2020-09-01 | 10075 | ... |  | 10180010 |  |  | 1496 | 1814 | 79.11089 | Non-ref | 2323.0 | 29.2 |
| 3 | 6696980 | TARRYALL CREEK AT UPPER STATION NEAR COMO, CO | stream gauge | USGS | CO | 39.339433 | -105.911681 | 1978-06-01 | 2023-10-13 | 5420 | ... |  | 10190001 | 1036.0 | 690.0 | 1466 | 1639 | 61.9065 | Ref | 3040.0 | 23.9 |
| 4 | 6700000 | SOUTH PLATTE RIVER ABOVE CHEESMAN LAKE, CO. | stream gauge | USGS | CO | 39.162769 | -105.310273 | 1924-10-01 | 2023-09-30 | 9523 | ... |  | 10190002 |  |  | 1515 | 1617 | 4213.538 | Non-ref | 2092.0 | 1627.0 |


Together, these two calls return the observations and site-level attributes for the 289 Colorado GAGES-II sites that have data within the specified date range.
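A common next step is to combine the two results, for example to keep only the reference gages. The sketch below rebuilds a small sample from the first rows and columns of the outputs above so that it runs without an API call; `sample_data_df` and `sample_metadata_df` are illustrative names, not objects returned by the library. It assumes `site_id` values are strings that match the site columns of the data frame. The metadata output above prints the IDs without their leading zero, so check the types on your own results before relying on this.

```python
import pandas as pd

# Small sample copied from the first rows/columns of the outputs above.
sample_data_df = pd.DataFrame({
    "date": ["2002-10-01", "2002-10-02"],
    "06614800": [0.01981, 0.021508],
    "06620000": [0.97635, 1.01031],
    "06659580": [float("nan"), float("nan")],
    "06696980": [0.116879, 0.148009],
})
# Assumption: site_id is a string matching the data column names.
sample_metadata_df = pd.DataFrame({
    "site_id": ["06614800", "06620000", "06659580", "06696980"],
    "gagesii_class": ["Ref", "Non-ref", "Non-ref", "Ref"],
})

# Site IDs classified as GAGES-II reference gages.
is_ref = sample_metadata_df["gagesii_class"] == "Ref"
ref_ids = sample_metadata_df.loc[is_ref, "site_id"].tolist()

# Keep the date column plus the reference-gage columns only.
ref_data_df = sample_data_df[["date"] + ref_ids]

# Count the sites with at least one non-missing observation in the sample.
n_sites_with_data = int(sample_data_df.drop(columns="date").notna().any().sum())

print(ref_ids)            # ['06614800', '06696980']
print(n_sites_with_data)  # 3
```

The same selection could be made at query time with `site_networks='gagesii_reference'`; filtering locally is mainly useful when you want both the reference and non-reference gages from a single request.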