# Find Nearest Monitoring Station

When working with weather data, we occasionally need to identify the monitoring station nearest to a specific place, landmark, farm, or city.

Calculating the true shortest distance between two points on the Earth's surface requires detailed information about terrain topography and the Earth's ellipticity, which means accessing additional datasets. In most cases we only need approximate distances to identify the nearest station. The haversine formula is one of the most widely used approaches to approximate the great-circle distance between two points on a sphere given their longitudes and latitudes. The result is approximate because the rotating Earth is shaped like an oblate spheroid rather than a sphere, but it will suffice for our application.
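
For reference, the haversine formula as implemented in the `haversine` function later in this notebook can be written as follows. The symbols are chosen here for illustration: $\varphi_1, \lambda_1$ are the latitude and longitude of the point of interest, $\varphi_2, \lambda_2$ are those of a station (all in radians), and $R$ is the assumed average Earth radius.

$$
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right),
\qquad
d = 2R \, \arcsin\!\left(\sqrt{a}\right)
$$

When the two points share the same longitude, the second term vanishes and the distance reduces to $d = R\,|\varphi_2 - \varphi_1|$, the arc length along a meridian.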


In [1]:
# Import modules
import pandas as pd
import numpy as np

from bokeh.plotting import figure, output_notebook, show
output_notebook()

In [2]:
# Import stations from the Soil Climate Analysis Network
scan = pd.read_csv('../datasets/SCAN_stations_geoinfo.csv')
print(scan['state'].unique())
scan.head()



['AK' 'AL' 'AR' 'AZ' 'CA' 'CO' 'FL' 'GA' 'HI' 'IA' 'ID' 'IL' 'KS' 'KY'
 'MD' 'MN' 'MO' 'MS' 'MT' 'NC' 'ND' 'NE' 'NH' 'NM' 'NV' 'NY' 'OH' 'OK'
 'OR' 'PA' 'PR' 'SC' 'SD' 'TN' 'TX' 'UT' 'VA' 'VI' 'VT' 'WA' 'WI' 'WY']


Unnamed: 0,network,state,county,site_name,start,lat,lon,elev
0,SCAN,AK,Bethel,Aniak,2002,61.58,-159.58,80
1,SCAN,AK,Bethel,Canyon Lake,2014,59.42,-161.16,550
2,SCAN,AK,Nome,Checkers Creek,2014,65.4,-164.71,326
3,SCAN,AK,Yukon-koyukuk,Hozatka Lake,2014,65.2,-156.63,206
4,SCAN,AK,Yukon-koyukuk,Innoko Camp,2014,63.64,-158.03,83


In [3]:
# Select stations only in the conterminous U.S.
idx_drop = scan['state'].isin(['AK','HI','PR','VI'])
scan = scan[~idx_drop].reset_index(drop=True)
scan.head(100)

Unnamed: 0,network,state,county,site_name,start,lat,lon,elev
0,SCAN,AL,Madison,AAMU-JTG,2002,34.78,-86.55,860
1,SCAN,AL,Madison,Bragg Farm,2003,34.89,-86.60,798
2,SCAN,AL,Montgomery,Broad Acres,2010,32.28,-86.05,269
3,SCAN,AL,Cullman,Cullman-NAHRC,2006,34.19,-86.80,799
4,SCAN,AL,Pickens,Dee River Ranch,2010,33.11,-88.31,160
...,...,...,...,...,...,...,...,...
95,SCAN,NH,Grafton,Hubbard Brook,2002,43.93,-71.72,1480
96,SCAN,NH,Grafton,Mascoma River,1998,43.78,-72.03,1400
97,SCAN,NM,Lincoln,Adams Ranch #1,1993,34.25,-105.42,6175
98,SCAN,NM,Rio Arriba,Alcalde,2010,36.09,-106.06,5693


In [7]:
f = figure(match_aspect=True, height=300)  # `plot_height` was removed in Bokeh 3; `height` replaces it
f.circle(x='lon', y='lat', size=7, fill_color="blue", fill_alpha=0.8, source=scan)
show(f)

In [8]:
def haversine(lat_point, lon_point, lat_list, lon_list):
    
    """Haversine function: Computes the distance between the geographic
    coordinates of two points on a sphere, or between a point and a list of points.
    
    Inputs: geographic coordinates in decimal degrees.
    Output: Approximate distance in kilometers."""
    
    # Convert point coordinates to radians
    lat_point = np.radians(lat_point)
    lon_point = np.radians(lon_point)

    # Convert list coordinates to radians
    lat_list = np.radians(lat_list)
    lon_list = np.radians(lon_list)

    # Compute deltas to simplify equation below
    lat_delta = lat_list - lat_point
    lon_delta = lon_list - lon_point

    # Define average Earth radius in kilometers
    earth_radius = (6356.752 + 6378.137)/2 # (Radius at the poles + Radius at the Equator)/2

    # Haversine formula
    a = (np.sin(lat_delta/2))**2 + np.cos(lat_point) * np.cos(lat_list) * (np.sin(lon_delta/2))**2
    d = 2 * earth_radius * np.arcsin(np.sqrt(a))
    return d
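
As a quick sanity check of the formula, a minimal scalar sketch using only the standard `math` module can be compared against cases with known answers. The helper name `haversine_scalar` and its `radius_km` parameter are hypothetical and not part of the original notebook; the default radius is the same average of the polar and equatorial radii used above.

```python
import math

def haversine_scalar(lat1, lon1, lat2, lon2, radius_km=(6356.752 + 6378.137) / 2):
    """Scalar haversine distance in km (hypothetical helper, for checking only)."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Identical points are zero kilometers apart
same_point = haversine_scalar(39.0, -98.0, 39.0, -98.0)

# One degree of latitude along a meridian equals R * pi / 180
one_degree = haversine_scalar(0.0, 0.0, 1.0, 0.0)

# The distance does not depend on which point is listed first
forward = haversine_scalar(34.78, -86.55, 39.79, -99.33)
backward = haversine_scalar(39.79, -99.33, 34.78, -86.55)

print(same_point)
print(round(one_degree, 1))  # 111.1
print(math.isclose(forward, backward))
```

With this radius, one degree of latitude comes out at roughly 111.1 km, which is a convenient rule of thumb for judging whether the distances returned for the stations are plausible.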


In [14]:
# Define a location.
# Latitude and longitude of the geographic center of the conterminous U.S., located in Kansas.
lat_center_usa = 39.828344
lon_center_usa = -98.579473

f = figure(match_aspect=True, height=300)  # `plot_height` was removed in Bokeh 3; `height` replaces it
f.circle(x='lon', y='lat', size=7, fill_color="blue", fill_alpha=0.8, source=scan)
f.inverted_triangle(x=lon_center_usa, y=lat_center_usa, size=12, color='black', fill_color="red")
f.xaxis.axis_label = 'Longitude'
f.yaxis.axis_label = 'Latitude'
show(f)


In [15]:
# Compute distances using the haversine function defined above
distances = haversine(lat_center_usa, lon_center_usa, scan["lat"], scan["lon"])


In [16]:
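# Find the positions of the nearest and farthest stations.
# np.argmin and np.argmax return positions; these match the labels used by .loc
# because the index was reset after filtering the stations.
idx_nearest = np.argmin(distances)
idx_farthest = np.argmax(distances)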


# Summary
print('The nearest SCAN station is', scan.loc[idx_nearest,"site_name"], '(about',
      round(distances[idx_nearest]), 'km from the point of interest)')

# Print station details
print(scan.loc[idx_nearest])

The nearest SCAN station is Phillipsburg (about 64 km from the point of interest)
network              SCAN
state                  KS
county           Phillips
site_name    Phillipsburg
start                2004
lat                 39.79
lon                -99.33
elev                 1986
Name: 55, dtype: object
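
When a single station is not enough (for example, to fill gaps with neighboring stations), the same distances can be ranked to get the k nearest stations. The sketch below is one possible approach, not part of the original notebook: it uses `np.argsort` on a handful of stations copied from the table outputs above, and the names `haversine_km`, `radius_km`, and `k` are hypothetical.

```python
import numpy as np

# A few stations copied from the table outputs shown in this notebook
names = np.array(["AAMU-JTG", "Bragg Farm", "Broad Acres", "Phillipsburg", "Hubbard Brook"])
lat = np.array([34.78, 34.89, 32.28, 39.79, 43.93])
lon = np.array([-86.55, -86.60, -86.05, -99.33, -71.72])

def haversine_km(lat0, lon0, lat_arr, lon_arr, radius_km=(6356.752 + 6378.137) / 2):
    """Vectorized haversine distance in km from one point to arrays of points."""
    phi0, lam0 = np.radians(lat0), np.radians(lon0)
    phi, lam = np.radians(lat_arr), np.radians(lon_arr)
    a = np.sin((phi - phi0) / 2) ** 2 + np.cos(phi0) * np.cos(phi) * np.sin((lam - lam0) / 2) ** 2
    return 2 * radius_km * np.arcsin(np.sqrt(a))

# Point of interest: geographic center used in this notebook
dist = haversine_km(39.828344, -98.579473, lat, lon)

# argsort returns station positions ordered from nearest to farthest
order = np.argsort(dist)
k = 3
for i in order[:k]:
    print(names[i], round(float(dist[i])), "km")
```

For a DataFrame such as `scan`, the same ranking could be obtained by storing the distances in a new column and sorting on it.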


In [22]:
# Access individual values
scan.loc[idx_nearest,'lat']
scan.loc[idx_nearest,'lon']
scan.loc[idx_nearest,'site_name']


'Phillipsburg'

In [20]:
f = figure(match_aspect=True, height=300)  # `plot_height` was removed in Bokeh 3; `height` replaces it
f.circle(x='lon', y='lat', size=7, fill_color="blue", fill_alpha=0.8, source=scan)
f.inverted_triangle(x=lon_center_usa, y=lat_center_usa, size=12, color='black', fill_color="red")
f.diamond_cross(x=scan.loc[idx_farthest,'lon'], y=scan.loc[idx_farthest,'lat'], 
         size=25, color="green", fill_color=None, 
         legend_label='Farthest SCAN station')
f.xaxis.axis_label = 'Longitude'
f.yaxis.axis_label = 'Latitude'
f.legend.location='bottom_left'
show(f)


## References

Haversine formula: https://www.wikiwand.com/en/Haversine_formula

Versines: https://www.wikiwand.com/en/Versine
