<b><font size="5">Historical Weather Finder</font></b>

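The Historical Weather Finder fetches historical weather data for trails from their latitude and longitude. Ten years of daily data (2011 through 2020) are summarized into three climate figures: the 98th percentile of daily maximum temperature, the 2nd percentile of daily minimum temperature, and the average annual rainfall. For trails with a known start and end date, hourly weather for that specific period is fetched in addition to the climate data. Thanks to Open-Meteo for the free API (https://open-meteo.com/en/docs/historical-weather-api).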
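
The request URLs used in this notebook are assembled by string concatenation. As a minimal sketch of the same query built from named parameters, the snippet below uses only the standard library. `build_archive_url` is a hypothetical helper that is not part of the notebook, and the coordinates are arbitrary example values; the endpoint and parameter names are the ones that appear in the notebook's own requests.

```python
from urllib.parse import urlencode

# Endpoint taken from the requests made in this notebook.
ARCHIVE_URL = 'https://archive-api.open-meteo.com/v1/archive'

def build_archive_url(lat, lng, start_date, end_date, daily=None, hourly=None,
                      timezone='America/New_York'):
    """Hypothetical helper: assemble an archive query URL from named parameters."""
    params = {
        'latitude': lat,
        'longitude': lng,
        'start_date': start_date,
        'end_date': end_date,
    }
    if daily:
        params['daily'] = ','.join(daily)
    if hourly:
        params['hourly'] = ','.join(hourly)
    params['timezone'] = timezone
    # safe=',' keeps the comma-separated variable lists readable; '/' is still escaped as %2F
    return ARCHIVE_URL + '?' + urlencode(params, safe=',')

# arbitrary example coordinates, not taken from the trail data
url = build_archive_url(44.27, -71.30, '2011-01-01', '2020-12-31',
                        daily=['temperature_2m_max', 'temperature_2m_min', 'rain_sum'])
print(url)
```

Keeping the parameters in a dictionary makes it harder to drop a `&` or `=` by accident when the list of variables changes.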

In [1]:
### Imports

import numpy as np
import pandas as pd
import requests

In [2]:
### Defining Functions

# if there is an existing column, I use it; otherwise I build a new column from scratch
def getInitalArray(trails, col_name):
    if col_name in trails.columns:
        return trails[col_name].copy()
    else:
        return [None]*trails.shape[0]

# gets climate data via API and saves values
def addClimateDataAndSave(trails, printGap, outputCSV, forceOverwrite=False):

    numTrails = trails.shape[0]

    summerTemp = getInitalArray(trails, 'summer_temp')
    winterTemp = getInitalArray(trails, 'winter_temp')
    annualRain = getInitalArray(trails, 'annual_rain')
    
    # getting climate data one trail at a time
    for i, trail in enumerate(trails.iterrows()):
        trail = trail[1]
        if 'summer_temp' not in trail or 'winter_temp' not in trail or 'annual_rain' not in trail or \
            np.isnan(trail.summer_temp) or np.isnan(trail.winter_temp) or np.isnan(trail.annual_rain) or \
            forceOverwrite:
            
            response = requests.get('https://archive-api.open-meteo.com/v1/archive?' 
                                    + f'latitude={trail.lat}&longitude={trail.lng}&start_date' 
                                    + '=2011-01-01&end_date=2020-12-31&daily=temperature_2m_max,'
                                    + 'temperature_2m_min,rain_sum&timezone=America%2FNew_York')
            responseDaily = response.json()['daily']
            
            summerTemp[i] = np.quantile(responseDaily['temperature_2m_max'], 0.98) # 2% of days are hotter
            winterTemp[i] = np.quantile(responseDaily['temperature_2m_min'], 0.02) # 2% of days are colder
            annualRain[i] = np.sum(responseDaily['rain_sum'])/10 # average over 10 years
    
        # printing some progress notifications 
        if i%printGap==0:
            print(f'Climate data obtained for {i+1} out of {numTrails} trails')
            
    trails['summer_temp'] = summerTemp
    trails['winter_temp'] = winterTemp
    trails['annual_rain'] = annualRain

    # saving
    trails.to_csv(outputCSV, index=False)
    
# gets weather data via API and saves values
def addWeatherDataAndSave(trails, printGap, outputCSV, forceOverwrite=False):
    
    completedTrails = trails[trails.status == 0]
    numCompletedTrails = completedTrails.shape[0]
    
    maxTemp = getInitalArray(trails, 'max_temp')
    minTemp = getInitalArray(trails, 'min_temp')
    rain = getInitalArray(trails, 'rain')
    
    rainHours = getInitalArray(trails, 'rainHours')

    # getting weather data one trail at a time
    for i, trail in enumerate(completedTrails.iterrows()):
        trail = trail[1]
        if 'max_temp' not in trail or 'min_temp' not in trail or 'rain' not in trail \
            or np.isnan(trail.max_temp) or np.isnan(trail.min_temp) or np.isnan(trail.rain) or forceOverwrite:
            
            response = requests.get('https://archive-api.open-meteo.com/v1/archive?' 
                                    + f'latitude={trail.lat}&longitude={trail.lng}&' 
                                    + f'start_date={trail.start_date}&end_date={trail.end_date}' 
                                    + '&hourly=temperature_2m,rain&timezone=America%2FNew_York')
            responseHourly = response.json()['hourly']

            # I usually arrive and leave around noon, so I cut out the first and last 12 hours
            maxTemp[i] = np.max(responseHourly['temperature_2m'][12:-12]) # max over period
            minTemp[i] = np.min(responseHourly['temperature_2m'][12:-12]) # min over period
            rain[i] = np.sum(responseHourly['rain'][12:-12])
        
        # printing some progress notifications 
        if i%printGap==0:
            print(f'Weather data obtained for {i+1} out of {numCompletedTrails} trails')
    
    trails['max_temp'] = maxTemp
    trails['min_temp'] = minTemp
    trails['rain'] = rain

    # saving
    trails.to_csv(outputCSV, index=False)
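
Both functions above skip a trail when its values are already present, unless `forceOverwrite` is set. As a rough sketch of that check in isolation, the snippet below uses two hypothetical helpers, `is_missing` and `needs_fetch`, that are not part of the notebook. It assumes a row can be read like a dictionary, which holds for both a plain `dict` and a pandas row.

```python
import math

def is_missing(value):
    """Hypothetical helper: True when a cached value is absent (None or NaN)."""
    if value is None:
        return True
    try:
        return math.isnan(value)
    except TypeError:
        # non-numeric entries (for example a stray string) are treated as missing
        return True

def needs_fetch(row, columns, force_overwrite=False):
    """True when any of the named columns is absent or empty in a dict-like row."""
    if force_overwrite:
        return True
    return any(col not in row or is_missing(row[col]) for col in columns)

weather_columns = ['max_temp', 'min_temp', 'rain']
complete_row = {'max_temp': 21.5, 'min_temp': 3.0, 'rain': 0.4}
partial_row = {'max_temp': 21.5, 'min_temp': float('nan'), 'rain': 0.4}

print(needs_fetch(complete_row, weather_columns))  # nothing to fetch
print(needs_fetch(partial_row, weather_columns))   # one value is NaN
```

Treating `None` as missing matters when a column was freshly created as a list of `None` values, since a NaN test alone does not accept `None`.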

In [8]:
### Getting Climate Data for My Trails

myTrails = pd.read_csv('MyTrails.csv')
addClimateDataAndSave(myTrails, 4, 'MyTrails.csv', forceOverwrite=False)

Climate data obtained for 1 out of 57 trails
Climate data obtained for 5 out of 57 trails
Climate data obtained for 9 out of 57 trails
Climate data obtained for 13 out of 57 trails
Climate data obtained for 17 out of 57 trails
Climate data obtained for 21 out of 57 trails
Climate data obtained for 25 out of 57 trails
Climate data obtained for 29 out of 57 trails
Climate data obtained for 33 out of 57 trails
Climate data obtained for 37 out of 57 trails
Climate data obtained for 41 out of 57 trails
Climate data obtained for 45 out of 57 trails
Climate data obtained for 49 out of 57 trails
Climate data obtained for 53 out of 57 trails
Climate data obtained for 57 out of 57 trails
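
The climate figures rest on two quantiles: the daily high that only 2% of days exceed and the daily low that only 2% of days fall below. The sketch below spells out that calculation in plain Python using linear interpolation between sorted values, which is the default behaviour of `np.quantile`. `linear_quantile` is a hypothetical helper, the input lists are made up rather than real API output, and skipping `None` entries is an assumption about how gaps in a response might be handled.

```python
import math

def linear_quantile(values, q):
    """Quantile with linear interpolation between sorted values, skipping None entries."""
    data = sorted(v for v in values if v is not None)
    if not data:
        raise ValueError('no valid values')
    pos = q * (len(data) - 1)   # fractional position in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac

# made-up daily highs and lows, not real API output
daily_max = [float(t) for t in range(0, 101)]
daily_min = [t - 10.0 for t in daily_max] + [None]

summer_temp = linear_quantile(daily_max, 0.98)  # 2% of days are hotter
winter_temp = linear_quantile(daily_min, 0.02)  # 2% of days are colder
print(summer_temp, winter_temp)
```

Using the 98th and 2nd percentiles rather than the absolute extremes keeps a single freak day from defining a trail's climate.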


In [4]:
### Getting Weather Data for My Trails

myTrails = pd.read_csv('MyTrails.csv')
addWeatherDataAndSave(myTrails, 1, 'MyTrails.csv', forceOverwrite=False)

Weather data obtained for 1 out of 20 trails
Weather data obtained for 2 out of 20 trails
Weather data obtained for 3 out of 20 trails
Weather data obtained for 4 out of 20 trails
Weather data obtained for 5 out of 20 trails
Weather data obtained for 6 out of 20 trails
Weather data obtained for 7 out of 20 trails
Weather data obtained for 8 out of 20 trails
Weather data obtained for 9 out of 20 trails
Weather data obtained for 10 out of 20 trails
Weather data obtained for 11 out of 20 trails
Weather data obtained for 12 out of 20 trails
Weather data obtained for 13 out of 20 trails
Weather data obtained for 14 out of 20 trails
Weather data obtained for 15 out of 20 trails
Weather data obtained for 16 out of 20 trails
Weather data obtained for 17 out of 20 trails
Weather data obtained for 18 out of 20 trails
Weather data obtained for 19 out of 20 trails
Weather data obtained for 20 out of 20 trails
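
The weather function drops the first and last 12 hours of each trip with the slice `[12:-12]`. The sketch below shows the same trimming with the arrival and departure hours made explicit. `trim_to_stay` is a hypothetical helper, the temperatures are made up, and the code assumes the response holds 24 hourly values per requested day, starting at midnight local time.

```python
def trim_to_stay(hourly_values, arrival_hour=12, departure_hour=12):
    """Drop hours before arrival on the first day and from departure onward on the last day."""
    hours_after_departure = 24 - departure_hour
    end = len(hourly_values) - hours_after_departure
    return hourly_values[arrival_hour:end]

# made-up hourly temperatures for a three-day request (72 values), not real API output
temps = [float(h) for h in range(72)]
on_trail = trim_to_stay(temps)
trip_max = max(on_trail)
trip_min = min(on_trail)
print(len(on_trail), trip_max, trip_min)
```

Under the same assumption, a trip that starts and ends on the same day yields an empty slice, so a caller would need to handle that case before taking a maximum or minimum.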


In [63]:
### Getting Climate Data for USA National Park Trails

usaTrails = pd.read_csv('AllTrailsUsaNationalParks.csv', engine='python')
usaTrails['lat'] = usaTrails.apply(lambda row: row._geoloc.split(': ', 2)[1][:-7], axis=1)
usaTrails['lng'] = usaTrails.apply(lambda row: row._geoloc.split(': ', 2)[2][:-1], axis=1)
addClimateDataAndSave(usaTrails, 20, 'AllTrailsUsaNationalParks.csv')

Climate data obtained for 1 out of 3313 trails
Climate data obtained for 21 out of 3313 trails
Climate data obtained for 41 out of 3313 trails
Climate data obtained for 61 out of 3313 trails
Climate data obtained for 81 out of 3313 trails
Climate data obtained for 101 out of 3313 trails
Climate data obtained for 121 out of 3313 trails
Climate data obtained for 141 out of 3313 trails
Climate data obtained for 161 out of 3313 trails
Climate data obtained for 181 out of 3313 trails
Climate data obtained for 201 out of 3313 trails
Climate data obtained for 221 out of 3313 trails
Climate data obtained for 241 out of 3313 trails
Climate data obtained for 261 out of 3313 trails
Climate data obtained for 281 out of 3313 trails
Climate data obtained for 301 out of 3313 trails
Climate data obtained for 321 out of 3313 trails
Climate data obtained for 341 out of 3313 trails
Climate data obtained for 361 out of 3313 trails
Climate data obtained for 381 out of 3313 trails
Climate data obtained for 

Climate data obtained for 3301 out of 3313 trails
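
The slice offsets used to extract `lat` and `lng` in the cell above suggest that `_geoloc` holds text shaped like a Python dictionary literal. That shape is an inference from the code, not something confirmed by the dataset. Under that assumption, a sketch of a parser that does not depend on character counts could look like this; `parse_geoloc` is a hypothetical helper and the sample string is made up.

```python
import ast

def parse_geoloc(text):
    """Hypothetical helper: parse a "{'lat': ..., 'lng': ...}" string into (lat, lng) floats."""
    loc = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    return float(loc['lat']), float(loc['lng'])

# made-up value in the assumed format, not taken from the dataset
lat, lng = parse_geoloc("{'lat': 36.05, 'lng': -112.14}")
print(lat, lng)
```

Returning floats rather than substrings also means the coordinates can be checked or rounded before they are placed in a request URL.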
