# Extend the wqp search to nearby sites

Extend `wqp.get_results` to also return matching data from nearby sites.

1) Given a search and a set of neighborhood search parameters, search sites in the neighborhood for matching data.
2) The neighborhood search will use the NLDI. First, look within the search radius for any matching sites upstream and downstream.
3) Apply an additional filter, mainly on drainage area, to exclude watersheds of drastically different size.
4) For example, use `nwis.get_info(sites=site)` to return the drainage area, take its square root to get a distance measure, and multiply that by some factor, say 0.1, to set the search radius. Then keep only the sites within that distance that have a similar drainage area.
5) Finally, pull data from each of those sites and combine them into a single dataframe, optionally adding a boolean flag to indicate whether the data are "artificial".
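Steps 3 and 4 can be sketched without any web requests. This is a minimal illustration, not a settled design: the helper names `search_distance` and `filter_similar_area` are hypothetical, the symmetric ratio test is an assumption about what "similar area" means, and the site table holds placeholder values rather than real drainage areas. Only the `drain_area_va` and `site_no` column names come from `nwis.get_info`.

```python
import numpy as np
import pandas as pd


def search_distance(drain_area, factor=0.1):
    """Scale the square root of a drainage area into a search distance.

    Hypothetical helper; the factor of 0.1 is the example value from step 4.
    """
    return float(np.sqrt(drain_area) * factor)


def filter_similar_area(sites, target_area, min_ratio=0.1):
    """Keep sites whose drainage area is comparable to the target's.

    Hypothetical helper. The test is symmetric (an assumption): a site is
    rejected if it is much smaller or much larger than the target.
    """
    ratio = sites["drain_area_va"] / target_area
    keep = (ratio >= min_ratio) & (ratio <= 1.0 / min_ratio)
    return sites.loc[keep].reset_index(drop=True)


# Placeholder values for illustration only, not real sites or areas.
sites = pd.DataFrame(
    {
        "site_no": ["A", "B", "C", "D"],
        "drain_area_va": [10000.0, 9000.0, 500.0, 250000.0],
    }
)

distance = search_distance(10000.0)
similar = filter_similar_area(sites, target_area=10000.0)
print(distance)
print(similar)
```

Comparing against the target site's area, rather than the largest area in the group, keeps the filter stable when a very large downstream basin falls inside the search radius.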

## Notes

- Ultimately, we'd like to construct an NWQN dataset. We can identify those data using the project ID field in `wqp.get_results`.
- For testing, use the NWQN site on the Illinois River. Prior to 2018, the site was located at 05586100; it then moved to 05586300.
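For the Illinois River test case, combining records from the old and new site numbers with an "artificial" flag might look like the sketch below. The function `combine_results` is a hypothetical name, and the small frames stand in for the output of `wqp.get_results`; their values are placeholders, not real measurements. Treating 05586300 as the target site is also an assumption made for the example.

```python
import pandas as pd


def combine_results(frames_by_site, target_site):
    """Stack per-site results and flag rows that come from other sites.

    Hypothetical helper: `frames_by_site` maps a site identifier to the
    dataframe of results retrieved for that site.
    """
    frames = []
    for site_id, frame in frames_by_site.items():
        tagged = frame.copy()
        # True for data borrowed from a neighboring site.
        tagged["artificial"] = site_id != target_site
        frames.append(tagged)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Placeholder rows standing in for wqp.get_results output.
frames_by_site = {
    "USGS-05586100": pd.DataFrame(
        {
            "MonitoringLocationIdentifier": ["USGS-05586100"] * 2,
            "ResultMeasureValue": [1.0, 2.0],
        }
    ),
    "USGS-05586300": pd.DataFrame(
        {
            "MonitoringLocationIdentifier": ["USGS-05586300"],
            "ResultMeasureValue": [3.0],
        }
    ),
}

combined = combine_results(frames_by_site, target_site="USGS-05586300")
print(combined)
```

Collecting the frames in a list and concatenating once avoids repeatedly copying a growing dataframe inside the loop.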

In [None]:
from dataretrieval import nwis, wqp, nldi
import pandas as pd
import numpy as np

In [None]:
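# Look up the target site's drainage area and scale its square root
# into a search distance for the NLDI navigation.
df, _ = nwis.get_info(sites='05586100')
distance = (np.sqrt(df.drain_area_va) * 0.1).iloc[0]

# Find NWIS sites upstream (UT: with tributaries) and downstream (DD: with diversions).
gdf_features_up = nldi.get_features(feature_source="WQP", feature_id="USGS-05586100", navigation_mode="UT", distance=distance, data_source="nwissite")
gdf_features_down = nldi.get_features(feature_source="WQP", feature_id="USGS-05586100", navigation_mode="DD", distance=distance, data_source="nwissite")
gdf_features = pd.concat([gdf_features_up, gdf_features_down], ignore_index=True)

# Drop the 'USGS-' prefix; str.strip removes a set of characters, not a prefix.
df, _ = nwis.get_info(sites=list(gdf_features.identifier.str.replace('USGS-', '', regex=False)))
# Keep sites whose drainage area is more than 10% of the largest in the group.
df = df[df.drain_area_va / df.drain_area_va.max() > 0.1]

# Pull results for each remaining site and combine them into one dataframe.
frames = [wqp.get_results(siteid=f'USGS-{site}')[0] for site in df.site_no.values]
data = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()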

In [None]:
data

In [None]:
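# `gdf_basins` and `x` are not defined in the cells above; both must be
# GeoDataFrames created elsewhere in the session before this cell runs.
m = gdf_basins.explore()
m = gdf_features.explore(m=m, color='red')
x.explore(m=m, color='green')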