# USGS dataretrieval Python Package NLDI Data Access Examples

This notebook provides examples of using the Python dataretrieval package to retrieve data from the United States Geological Survey (USGS) Hydro Network-Linked Data Index (NLDI). The package's `nldi` module provides functions for retrieving basins, flowlines, and features from the NLDI.

### Install the Package

Use the following code to install the package if it is not already installed in your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` installation does not support NLDI data access.

In [None]:
!pip install dataretrieval[nldi]

Load the package so that you can use its functions in this notebook.

In [None]:
from dataretrieval import nldi
from IPython.display import display
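
If the import fails, it can help to check which packages are missing before reinstalling. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and the assumption that NLDI support depends on `geopandas` is inferred from the GeoDataFrame return values described in this notebook, not taken from the package metadata.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumption: NLDI support needs dataretrieval plus geopandas.
missing = missing_packages(["dataretrieval", "geopandas"])
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Try: pip install dataretrieval[nldi]")
else:
    print("dataretrieval and geopandas are both available")
```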

### Basic Usage

The dataretrieval package provides the following functions to get data from the USGS NLDI: `get_basin()`, `get_flowlines()`, `get_features()`, and `search()`.

#### The following examples show how to use the `get_basin()` function from the dataretrieval package to get basin data from the USGS NLDI. The following arguments are supported:

* **feature_source** (string): The name of the NLDI feature source.
* **feature_id** (string): The identifier of the NLDI feature.
* **simplified** (boolean): If True, the data will be returned with simplified polygons. If False, the data will be returned with full-resolution polygons (default is False).
* **split_catchment** (boolean): If True, the data will be returned with split catchment polygons. If False, the data will be returned as a single polygon (default is False). NOTE: Setting this to True may result in an error due to a known issue with the NLDI API.
* **as_json** (boolean): If True, the data will be returned as a Python dictionary. If False, the data will be returned as a geopandas GeoDataFrame (default is False).


#### Example 1: Get aggregated basin level data for a single feature source.

In [None]:
# set the parameters needed to retrieve data
feature_source = "WQP"
feature_id = "USGS-01031500"

Get the basin data as a geopandas dataframe

In [None]:
gdf = nldi.get_basin(feature_source, feature_id)
display(gdf)

Get the basin data as GeoJSON (as_json=True)

In [None]:
basin_json_data = nldi.get_basin(feature_source, feature_id, as_json=True)
print(basin_json_data)
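
With `as_json=True`, the result can be handled with plain Python. The following sketch assumes the dictionary follows the GeoJSON FeatureCollection layout with Polygon geometries. The `sample_basin` dictionary is a small hand-made stand-in (not real NLDI output), and `bounding_box` is a hypothetical helper, not part of dataretrieval.

```python
def bounding_box(feature_collection):
    """Return (min_lon, min_lat, max_lon, max_lat) over all Polygon features."""
    lons, lats = [], []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Polygon":
            continue  # this sketch only handles plain polygons
        for ring in geometry["coordinates"]:
            for position in ring:
                # GeoJSON positions are ordered longitude, latitude
                lons.append(position[0])
                lats.append(position[1])
    if not lons:
        raise ValueError("no Polygon coordinates found")
    return min(lons), min(lats), max(lons), max(lats)


# Hand-made stand-in for a basin response; the coordinates are illustrative only.
sample_basin = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[-69.5, 45.0], [-69.0, 45.0], [-69.0, 45.5], [-69.5, 45.5], [-69.5, 45.0]]
                ],
            },
        }
    ],
}

print(bounding_box(sample_basin))  # → (-69.5, 45.0, -69.0, 45.5)
```

If a real response matches the assumed layout, the same helper could be applied to it directly, for example `bounding_box(basin_json_data)`.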

#### The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowlines data from the USGS NLDI. The following arguments are supported:

* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').
* **feature_source** (string): The name of the NLDI feature source.
* **feature_id** (string): The identifier of the NLDI feature.
* **comid** (integer): COMID (required if feature_source is not specified).
* **distance** (integer): Distance in kilometers (default is 5).
* **as_json** (boolean): If True, the data will be returned as a Python dictionary. If False, the data will be returned as a geopandas GeoDataFrame (default is False).
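
The `navigation_mode` codes listed above are NLDI abbreviations: UM and DM follow the upstream and downstream mainstem, UT adds upstream tributaries, and DD adds downstream diversions. The sketch below spells out the argument rules from the list as a small pre-flight check. `check_flowline_args` is a hypothetical helper written for this notebook, not the package's own validation.

```python
# Meaning of the NLDI navigation mode codes.
NAVIGATION_MODES = {
    "UM": "upstream mainstem",
    "DM": "downstream mainstem",
    "UT": "upstream with tributaries",
    "DD": "downstream with diversions",
}


def check_flowline_args(navigation_mode, feature_source=None, feature_id=None, comid=None):
    """Validate a get_flowlines-style argument set and describe the request."""
    if navigation_mode not in NAVIGATION_MODES:
        allowed = ", ".join(NAVIGATION_MODES)
        raise ValueError(f"navigation_mode must be one of {allowed}; got {navigation_mode!r}")
    if feature_source is not None:
        if feature_id is None:
            raise ValueError("feature_id is required when feature_source is given")
        origin = f"{feature_source}/{feature_id}"
    elif comid is not None:
        origin = f"comid/{comid}"
    else:
        raise ValueError("provide either feature_source and feature_id, or comid")
    return f"{NAVIGATION_MODES[navigation_mode]} from {origin}"


print(check_flowline_args("UM", feature_source="WQP", feature_id="USGS-01031500"))
# → upstream mainstem from WQP/USGS-01031500
print(check_flowline_args("DM", comid=13294314))
# → downstream mainstem from comid/13294314
```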

#### Example 1: Get the flowlines data using feature_source and feature_id

Get the flowlines data as a geopandas dataframe

In [None]:
gdf = nldi.get_flowlines(navigation_mode='UM', feature_source="WQP", feature_id="USGS-01031500")
display(gdf)

Get the flowlines data as GeoJSON (as_json=True)

In [None]:
flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', feature_source="WQP", feature_id="USGS-01031500", as_json=True)
print(flowlines_json_data)

#### Example 2: Get the flowlines data using comid

Get the flowlines data as a geopandas dataframe

In [None]:
gdf = nldi.get_flowlines(navigation_mode='UM', comid=13294314)
display(gdf)

Get the flowlines data as GeoJSON (as_json=True)

In [None]:
flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', comid=13294314, as_json=True)
print(flowlines_json_data)

#### The following examples show how to use the `get_features()` function from the dataretrieval package to get features data from the USGS NLDI. The following arguments are supported:

* **data_source** (string): The name of the NLDI data source.
* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').
* **feature_source** (string): The name of the NLDI feature source.
* **feature_id** (string): The identifier of the NLDI feature (required if feature_source is specified).
* **comid** (integer): COMID (required if feature_source is not specified).
* **distance** (integer): Distance in kilometers (default is 50).
* **lat** (float): Latitude (required when looking up the feature at a specific location).
* **long** (float): Longitude (required when looking up the feature at a specific location).
* **as_json** (boolean): If True, the data will be returned as a Python dictionary. If False, the data will be returned as a geopandas GeoDataFrame (default is False).
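
The arguments above describe three alternative origins: a location (`lat` and `long`), a feature (`feature_source` and `feature_id`), or a `comid`. The sketch below shows one way to work out which origin a given argument set points to before making a request. `describe_origin` and its precedence order are assumptions made for illustration and may differ from how the package resolves its arguments.

```python
def describe_origin(feature_source=None, feature_id=None, comid=None, lat=None, long=None):
    """Classify which origin a get_features-style argument set points to."""
    # lat and long only make sense as a pair
    if (lat is None) != (long is None):
        raise ValueError("lat and long must be given together")
    if lat is not None:
        if not (-90 <= lat <= 90 and -180 <= long <= 180):
            raise ValueError("lat must be within [-90, 90] and long within [-180, 180]")
        return ("location", (lat, long))
    if feature_source is not None:
        if feature_id is None:
            raise ValueError("feature_id is required when feature_source is given")
        return ("feature", (feature_source, feature_id))
    if comid is not None:
        return ("comid", comid)
    raise ValueError("no origin given: use lat/long, feature_source/feature_id, or comid")


print(describe_origin(lat=43.073051, long=-89.401230))
print(describe_origin(feature_source="WQP", feature_id="USGS-01031500"))
print(describe_origin(comid=13294314))
```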

#### Example 1: Get all features along the specified navigation path.

Get the features data using navigation path (UM) and origin type feature source

In [None]:
gdf = nldi.get_features(data_source="census2020-nhdpv2", navigation_mode="UM", feature_source="WQP", feature_id="USGS-01031500")
display(gdf)

Get the features data using navigation path (UM) and origin type COMID 

In [None]:
gdf = nldi.get_features(data_source="census2020-nhdpv2", navigation_mode="UM", comid=13294314)
display(gdf)

Get the features data using origin type feature source (no navigation path)

In [None]:
gdf = nldi.get_features(feature_source="WQP", feature_id="USGS-01031500")
display(gdf)

Get the same features data (navigation path UM, origin type COMID) with the keyword arguments passed in a different order

In [None]:
gdf = nldi.get_features(comid=13294314, data_source="census2020-nhdpv2", navigation_mode="UM")
display(gdf)

Get the features data for a specific location (lat, long)

In [None]:
gdf = nldi.get_features(lat=43.073051, long=-89.401230)
display(gdf)

#### The following examples show how to use the `search()` function from the dataretrieval package to get data (basins, flowlines, and features) from the USGS NLDI. You can use the `search()` function instead of the `get_basin()`, `get_flowlines()`, and `get_features()` functions described above. The `search()` function returns data as a Python dictionary. The following arguments are supported:

* **feature_source** (string): The name of the NLDI feature source.
* **feature_id** (string): The identifier of the NLDI feature (required if feature_source is specified).
* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').
* **data_source** (string): The name of the NLDI data source.
* **find** (string): The specific data type to search for. Allowed values are 'basin', 'flowlines', and 'features' (default is 'features').
* **comid** (integer): COMID (required if feature_source is not specified).
* **lat** (float): Latitude (required when looking up the feature at a specific location).
* **long** (float): Longitude (required when looking up the feature at a specific location).
* **distance** (integer): Distance in kilometers (default is 50).
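
Because `search()` covers the same three kinds of data as the dedicated functions, the `find` argument effectively selects between them. The lookup below is a minimal illustration of that correspondence. `FIND_TO_FUNCTION` and `equivalent_function` are hypothetical names for this sketch, and it says nothing about how `search()` is implemented internally.

```python
# Hypothetical lookup table: which dedicated function covers each `find` value.
FIND_TO_FUNCTION = {
    "basin": "get_basin",
    "flowlines": "get_flowlines",
    "features": "get_features",
}


def equivalent_function(find="features"):
    """Return the name of the dedicated function matching a `find` value."""
    try:
        return FIND_TO_FUNCTION[find]
    except KeyError:
        allowed = ", ".join(sorted(FIND_TO_FUNCTION))
        raise ValueError(f"find must be one of: {allowed}; got {find!r}") from None


print(equivalent_function("basin"))      # → get_basin
print(equivalent_function("flowlines"))  # → get_flowlines
print(equivalent_function())             # → get_features
```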

#### Example 1: Get aggregated basin level data for a single feature source.

In [None]:
# set the parameters needed to retrieve data
feature_source = "WQP"
feature_id = "USGS-01031500"

In [None]:
basin_data = nldi.search(feature_source=feature_source, feature_id=feature_id, find="basin")
print(basin_data)

#### Example 2: Get flowlines data for a specified feature source.

In [None]:
flowlines_data = nldi.search(navigation_mode='UM', feature_source=feature_source, feature_id=feature_id, find="flowlines")
print(flowlines_data)

#### Example 3: Get all features along the specified navigation path.

In [None]:
features_data = nldi.search(data_source="census2020-nhdpv2", navigation_mode='UM', feature_source=feature_source,
                            feature_id=feature_id, find="features")
print(features_data)