This example shows how to use the hydrolinking methods to address (link) a point to the NHD High Resolution dataset.

For this example we use a point near the Red Cedar River in East Lansing, MI.

##### First set required parameters.
The input parameters are the same for NHDPlusV2.1 and NHD High Resolution hydrolinking methods.

Required parameters include:
    * input_identifier: str or int, user-supplied identifier
    * input_lat: float, latitude of the point to be hydrolinked
    * input_lon: float, longitude of the point to be hydrolinked

Optional parameters include:
    * input_crs: int, coordinate reference system, default is 4269 (NAD83)
    * stream_name: str, name of the stream the point is intended to be linked to (passed to the constructor as water_name)
    * buffer_m: int, search distance in meters, default is 1000, max is 2000
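As a minimal sketch of the parameter rules listed above, the inputs could be bundled and checked before any hydrolinking is attempted. The `PointInputs` class and `MAX_BUFFER_M` constant are hypothetical names used only for illustration and are not part of the hydrolink package. Rejecting a buffer above 2000 is an assumption, since the text only states that 2000 is the maximum.

```python
from dataclasses import dataclass
from typing import Optional, Union

MAX_BUFFER_M = 2000  # maximum search distance stated in the parameter list


@dataclass
class PointInputs:
    """Hypothetical container for the inputs above (not part of hydrolink)."""
    input_identifier: Union[int, str]
    input_lat: float
    input_lon: float
    input_crs: int = 4269               # NAD83, the documented default
    stream_name: Optional[str] = None   # optional stream name
    buffer_m: int = 1000                # documented default search distance

    def __post_init__(self):
        # Range checks only make sense for geographic degrees, so they are
        # applied only when the CRS is the default NAD83 (assumption).
        if self.input_crs == 4269:
            if not -90 <= self.input_lat <= 90:
                raise ValueError(f"latitude out of range: {self.input_lat}")
            if not -180 <= self.input_lon <= 180:
                raise ValueError(f"longitude out of range: {self.input_lon}")
        # Assumption: a buffer above the documented maximum is rejected.
        if not 0 < self.buffer_m <= MAX_BUFFER_M:
            raise ValueError(f"buffer_m must be in (0, {MAX_BUFFER_M}]")


inputs = PointInputs(1, 42.7284, -84.5026,
                     stream_name='The Red Cedar River', buffer_m=2000)
```

Leaving out `input_crs` and `buffer_m` falls back to the documented defaults of 4269 and 1000.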

In [1]:
#identifier -> number or string
input_identifier = 1
#latitude
input_lat = 42.7284
#longitude
input_lon = -84.5026
#coordinate system
input_crs = 4269  
#stream name
stream_name = 'The Red Cedar River'
#buffer
buffer_m = 2000

The next cell runs three steps. First, the hydrolink module for NHD High Resolution (nhd_hr) is imported. Next, a Python object is instantiated. Finally, the code attempts to set the coordinate system to NAD83 (crs = 4269).

Note: Using crs 4269 is recommended, although this code attempts to accommodate other common coordinate systems using some simple logic. This conversion has not been extensively tested, and hydrolinking may be bypassed if the conversion fails. If this happens, a message is generated for the user.

In [2]:
#Import module from local folder hydrolink
from hydrolink import nhd_hr
import warnings; warnings.simplefilter('ignore')

#instantiate the point object; this tries to reproject the data to NAD83 and checks that the point is within the bounding box of the U.S.
hydrolink = nhd_hr.HighResPoint(input_identifier, input_lat, input_lon, water_name=stream_name, input_crs=input_crs, buffer_m=buffer_m)

In [3]:
#builds query against EPA web services
hydrolink.build_nhd_query(service='hem_flow')

#executes query and measures distances from point to each line and their nodes
hydrolink.get_closest_reaches(n=6)

#stream name match for each of the reaches
hydrolink.water_name_match()

#show results; the table below lists the 6 closest reaches returned for this point (n=6)
hydrolink.reach_df

Unnamed: 0,GNIS_NAME,LENGTHKM,PERMANENT_IDENTIFIER,REACHCODE,closest_node_m,snap_m,snap_xy,closest,name_check,name_check_txt
0,Red Cedar River,7.032,152093413,4050004000126,4733.247401,52.940605,POINT (-84.50249251260891 42.7288672003429),1,0.882353,most likely name match based on fuzzy match
1,,0.035,152093660,4050004002359,,559.105386,POINT (-84.5081933126001 42.72549706701477),2,0.0,"no name match, water name and or gnis name not..."
2,,0.116,152091797,4050004002359,,566.724935,POINT (-84.50850391259956 42.7257132003478),3,0.0,"no name match, water name and or gnis name not..."
3,,0.455,152091798,4050004002359,585.733504,585.733504,POINT (-84.50368877927372 42.72321920035165),4,0.0,"no name match, water name and or gnis name not..."
4,,0.011,152093560,4050004002359,595.535732,590.99255,POINT (-84.50938251259822 42.72651486701324),5,0.0,"no name match, water name and or gnis name not..."
5,Red Cedar River,1.746,152093412,4050004000126,1963.969166,595.535732,POINT (-84.50947937926475 42.7265880670131),6,0.882353,most likely name match based on fuzzy match
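The `snap_m` and `name_check` columns in the table above can be approximated with standard-library tools. The sketch below is illustrative only: `haversine_m` and `name_score` are hypothetical helpers, and using a great-circle distance and `difflib.SequenceMatcher` is an assumption about how such values might be computed, not a description of the hydrolink internals.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def name_score(user_name, gnis_name):
    """Similarity ratio (0-1) between two names, ignoring case."""
    return SequenceMatcher(None, user_name.lower(), gnis_name.lower()).ratio()


# Input point and the snap point reported for the first reach in the table.
snap_m = haversine_m(42.7284, -84.5026, 42.7288672003429, -84.50249251260891)

score = name_score('The Red Cedar River', 'Red Cedar River')
print(round(score, 6))  # 0.882353
```

The name score matches the `name_check` value shown for the two Red Cedar River reaches. The spherical distance is close to, though not identical to, the reported `snap_m` for the first reach, which is expected if the package measures distance in a different way.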


In the next cell, one reach is selected using the criteria below. This selection identifies the "best reach": of the reaches returned by the query, the one most likely associated with the input data point.

        * First check for exact stream name matches (GNIS_NAME == stream_name)
            * If exactly one reach has an exact name match, that is the reach to recommend
            * If more than one reach has an exact name match, take the one that is closest to the point
            * If more than one reach has an exact name match and they are equally close to the point, take the first one but note that multiple reaches matched, so that a closer look can be recommended
        * If no reaches have an exact name match and the user-supplied name does not refer to a tributary, check for fuzzy name matches over the 0.75 cutoff
            * If exactly one reach has a fuzzy name match, that is the reach to recommend
            * If more than one reach has a fuzzy name match, take the one that is closest to the point
            * If more than one reach has a fuzzy name match and they are equally close to the point, take the first one but note that multiple reaches matched, so that a closer look can be recommended
        * If the fuzzy match is < 0.75, just take the closest reach
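The selection rules above can be sketched in a few lines of plain Python. This is a simplified illustration, not the package's implementation: `select_best` is a hypothetical function, the tributary check is omitted, a score of exactly 0.75 is treated as a match (the text leaves that boundary open), and the returned tie count is only an assumed stand-in for how multiple equally close reaches might be flagged.

```python
def select_best(reaches, stream_name, cutoff=0.75):
    """Return (best_reach, tie_count) from a list of candidate reach dicts.

    Each dict needs 'GNIS_NAME', 'snap_m' and 'name_check' keys.
    """
    if not reaches:
        return None, 0

    def closest(candidates):
        # Take the nearest candidate; on a tie keep the first and report
        # how many candidates shared that distance.
        best_dist = min(r['snap_m'] for r in candidates)
        ties = [r for r in candidates if r['snap_m'] == best_dist]
        return ties[0], len(ties)

    # 1. Exact name matches win.
    exact = [r for r in reaches if r['GNIS_NAME'] == stream_name]
    if exact:
        return closest(exact)

    # 2. Otherwise use fuzzy matches at or above the cutoff.
    fuzzy = [r for r in reaches if r['name_check'] >= cutoff]
    if fuzzy:
        return closest(fuzzy)

    # 3. Otherwise fall back to the closest reach regardless of name.
    return closest(reaches)


# Candidate reaches copied from the table above (selected columns only).
reaches = [
    {'PERMANENT_IDENTIFIER': '152093413', 'GNIS_NAME': 'Red Cedar River',
     'snap_m': 52.940605, 'name_check': 0.882353},
    {'PERMANENT_IDENTIFIER': '152093660', 'GNIS_NAME': None,
     'snap_m': 559.105386, 'name_check': 0.0},
    {'PERMANENT_IDENTIFIER': '152091797', 'GNIS_NAME': None,
     'snap_m': 566.724935, 'name_check': 0.0},
    {'PERMANENT_IDENTIFIER': '152091798', 'GNIS_NAME': None,
     'snap_m': 585.733504, 'name_check': 0.0},
    {'PERMANENT_IDENTIFIER': '152093560', 'GNIS_NAME': None,
     'snap_m': 590.99255, 'name_check': 0.0},
    {'PERMANENT_IDENTIFIER': '152093412', 'GNIS_NAME': 'Red Cedar River',
     'snap_m': 595.535732, 'name_check': 0.882353},
]

best, tie_count = select_best(reaches, 'The Red Cedar River')
print(best['PERMANENT_IDENTIFIER'], tie_count)  # 152093413 1
```

Because 'The Red Cedar River' is not an exact match for 'Red Cedar River', the sketch falls through to the fuzzy step and picks the closer of the two Red Cedar River reaches.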

In [4]:
#Return reach most likely associated with point
hydrolink.select_best_reach()

#show results
hydrolink.best_reach

{'input_id': 1,
 'input_water': 'The Red Cedar River',
 'water_name_ref': 'the red cedar river',
 'GNIS_NAME': 'Red Cedar River',
 'LENGTHKM': 7.032,
 'PERMANENT_IDENTIFIER': '152093413',
 'wb_id': None,
 'wb_gnis_name': None,
 'REACHCODE': '04050004000126',
 'snap_xy': <shapely.geometry.point.Point at 0x18f9d6decc8>,
 'snap_m': 52.94060499557293,
 'closest_node_m': 4733.247401301121,
 'closest': 1,
 'name_check': 0.8823529411764706,
 'mult_reach_ct': 1,
 'name_check_txt': 'most likely name match based on fuzzy match',
 'hr_meas': None,
 'message': ''}

The following cell finalizes point addressing by retrieving the measure along the "best reach" where the point should be associated. This method takes a straight-line snap approach. The measure is expressed as a percentage of the way "up" a reach (0-100), where 0 is the most downstream node of the reach and 100 is the most upstream node. Note that every reachcode runs from 0 to 100, but a single reach may be made up of multiple flowlines.

In [5]:
hydrolink.get_hl_measure()

hydrolink.best_reach['hr_meas']

19.83811
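To illustrate the idea of a straight-line snap and a 0-100 measure, the sketch below projects a point onto a polyline and converts the position into a measure. It is a simplified, planar illustration with hypothetical function names. It assumes the vertices are ordered from the downstream end to the upstream end, and that a flowline covering only part of a reach carries its own `from_meas` and `to_meas` range. None of this is taken from the hydrolink source.

```python
import math


def snap_fraction(point, line):
    """Fraction (0-1) along `line` of the location nearest to `point`.

    `line` is a list of (x, y) vertices in planar units.
    """
    px, py = point
    seg_lengths = [math.dist(a, b) for a, b in zip(line, line[1:])]
    total = sum(seg_lengths)
    if total == 0:
        raise ValueError("line has zero length")

    best_d2, best_along, walked = float('inf'), 0.0, 0.0
    for (ax, ay), (bx, by), seg_len in zip(line, line[1:], seg_lengths):
        if seg_len > 0:
            # Project the point onto the segment and clamp to its ends.
            t = ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / seg_len ** 2
            t = max(0.0, min(1.0, t))
            qx, qy = ax + t * (bx - ax), ay + t * (by - ay)
            d2 = (px - qx) ** 2 + (py - qy) ** 2
            if d2 < best_d2:
                best_d2, best_along = d2, walked + t * seg_len
        walked += seg_len
    return best_along / total


def measure(point, line, from_meas=0.0, to_meas=100.0):
    """Measure of the snapped point, interpolated over the flowline's range."""
    return from_meas + snap_fraction(point, line) * (to_meas - from_meas)


# Toy flowline of length 7 (downstream end first) and a nearby point.
line = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
m = measure((1.0, 1.0), line)
print(round(m, 4))  # 14.2857
```

The point snaps to (1, 0), one unit along a seven-unit line, so the measure is 100/7. For a flowline that covers only measures 40 to 60 of its reach, passing `from_meas=40, to_meas=60` rescales the result into that range.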

There are two options for writing data to CSV files. Both are set up for batch hydrolinking (multiple points; see the batch example notebook).

Both methods accept an optional 'outfile_name' parameter.
    * default for write_reach_options() is 'hr_hydrolink_output.csv'
    * default for write_best() is 'hr__hydrolink_output.csv'

In [6]:
#Creates a CSV file of all reach options.  This can be used for QAQC procedures.
hydrolink.write_reach_options()

#Creates a CSV file with one reach per point.  
hydrolink.write_best()
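For batch use, one common pattern is to append one row per point to the same CSV file and write the header only once. The sketch below shows that pattern with the standard `csv` module. `append_row` is a hypothetical helper, the second row is made-up example data, and whether the hydrolink writers behave this way is an assumption.

```python
import csv
import os
import tempfile


def append_row(row, outfile_name):
    """Append one dict as a CSV row, writing the header only for a new file."""
    is_new = (not os.path.exists(outfile_name)
              or os.path.getsize(outfile_name) == 0)
    with open(outfile_name, 'a', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)


# Write to a temporary folder so the example leaves the working folder alone.
out_path = os.path.join(tempfile.mkdtemp(), 'example_output.csv')

# First row uses values from this example; the second row is invented.
append_row({'input_id': 1, 'REACHCODE': '04050004000126',
            'hr_meas': 19.83811}, out_path)
append_row({'input_id': 2, 'REACHCODE': '04050004000126',
            'hr_meas': 42.0}, out_path)

with open(out_path, newline='') as f:
    rows = list(csv.DictReader(f))
print(len(rows))  # 2
```

Reading the file back with `csv.DictReader` keeps `REACHCODE` as text, so its leading zero is preserved.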

In [7]:
#View final data for best reach
import pandas as pd
pd.DataFrame([hydrolink.best_reach])

Unnamed: 0,input_id,input_water,water_name_ref,GNIS_NAME,LENGTHKM,PERMANENT_IDENTIFIER,wb_id,wb_gnis_name,REACHCODE,snap_xy,snap_m,closest_node_m,closest,name_check,mult_reach_ct,name_check_txt,hr_meas,message
0,1,The Red Cedar River,the red cedar river,Red Cedar River,7.032,152093413,,,4050004000126,POINT (-84.50249251260891 42.7288672003429),52.940605,4733.247401,1,0.882353,1,most likely name match based on fuzzy match,19.83811,
