# Meteorological Data Collection: Meteostat API
Alongside the NASA POWER API, we used the [Meteostat API](https://dev.meteostat.net/guide.html#our-services) to collect additional weather data points that might help us understand the nature of wildfires. The underlying data, provided by organizations like NOAA, DWD and Environment Canada, is a valuable resource for science, education, businesses and any individual looking for weather and climate data.  
**Unlike the space-based data we obtained through the POWER API, this data comes from ground monitoring stations, so it contained many missing values. Because of that, we did not use it in our predictions.**

In [2]:
# Imports
import pandas as pd
import numpy as np

from datetime import datetime, timedelta
from meteostat import Point, Daily, Hourly, Stations

# sys is needed to extend the module search path
import sys

# Make the project's config module importable
sys.path.insert(0, "../config/")

from config import Config

In [4]:
# Load wildfire dataset
df = pd.read_csv('../../data/cleaned/wildfires_all.csv')

df.head()

Unnamed: 0,X,Y,ContainmentDateTime,ControlDateTime,DailyAcres,DiscoveryAcres,FireCause,FireDiscoveryDateTime,IncidentTypeCategory,IncidentTypeKind,InitialLatitude,InitialLongitude,IrwinID,LocalIncidentIdentifier,POOCounty,POODispatchCenterID,POOFips,POOState,UniqueFireIdentifier
0,-111.348611,33.195755,2020-07-23 05:29:59+00:00,2020-07-23 05:29:59+00:00,8.0,2.5,Human,2020-07-22 21:51:00+00:00,WF,FI,33.19581,-111.3487,{951823FA-0B72-4295-87C8-E042D602324E},1450,Pinal,AZTDC,4021,US-AZ,2020-AZA3S-001450
1,-115.748812,40.617506,2020-08-03 23:00:00+00:00,2020-09-02 15:00:00+00:00,5985.9,5.0,Natural,2020-07-19 23:00:00+00:00,WF,FI,40.602563,-115.719777,{91E0CBAB-A24E-4590-B6C6-2B4A46907E8A},10145,Elko,NVEIC,32007,US-NV,2020-NVECFX-010145
2,-108.193611,39.858486,2020-08-30 00:00:00+00:00,2020-09-10 14:00:00+00:00,0.1,1.0,Natural,2020-08-29 21:46:00+00:00,WF,FI,39.89171,-108.2665,{3568D344-E3FB-415C-8014-ED34ECEAAB25},323,Rio Blanco,COCRC,8103,US-CO,2020-COWRD-000323
3,-109.703111,40.227646,2020-10-28 20:15:00+00:00,2020-10-28 20:15:00+00:00,0.1,0.1,Human,2020-10-28 19:37:00+00:00,WF,FI,40.2277,-109.703169,{4BEBC503-DACD-4198-A1D8-323B614DA555},100463,Uintah,UTUBC,49047,US-UT,2020-UTNES-100463
4,-110.385511,31.961145,2020-07-10 18:14:59+00:00,2020-07-10 18:14:59+00:00,0.1,0.1,Human,2020-07-09 16:34:59+00:00,WF,FI,31.9612,-110.3856,{FB125AAC-0DE2-4547-A2D3-32891D98CB0F},1263,Cochise,AZTDC,4003,US-AZ,2020-AZA3S-001263


In [5]:
# Convert date columns into pandas datetimes; unparseable values become NaT
df['FireDiscoveryDateTime'] = pd.to_datetime(df['FireDiscoveryDateTime'], errors='coerce')
df['ControlDateTime'] = pd.to_datetime(df['ControlDateTime'], errors='coerce')
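
With `errors='coerce'`, a value that cannot be parsed becomes `NaT` instead of raising an error, so it can be worth counting how many rows were affected before relying on the dates. A minimal sketch on made-up values (the sample strings are illustrative and are not rows from the wildfire dataset):

```python
import pandas as pd

# Illustrative values only: one valid timestamp, one malformed string, one missing value
raw = pd.Series(["2020-07-22 21:51:00+00:00", "not a date", None])

# utc=True keeps the result timezone-aware; unparseable entries become NaT
parsed = pd.to_datetime(raw, errors="coerce", utc=True)

n_unparsed = int(parsed.isna().sum())
print(f"{n_unparsed} of {len(parsed)} values could not be parsed")
```

The same `isna().sum()` check could be applied to `FireDiscoveryDateTime` and `ControlDateTime` after the conversion above.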

In [68]:
def get_weather_meteo(fire_id, lat, long, start, end, result_df):
    """Fetch daily Meteostat data for one wildfire and append it to the results.

    Args:
        fire_id (int): Wildfire id
        lat (float): Fire latitude
        long (float): Fire longitude
        start (datetime): Fire start date
        end (datetime): Fire end date
        result_df (DataFrame): Results accumulated so far

    Returns:
        DataFrame: result_df with this fire's weather rows appended
    """
    # Dates coerced to NaT cannot be queried, so leave the results unchanged
    if pd.isna(start) or pd.isna(end):
        return result_df

    # Keep only year-month-day, as naive datetimes in the format the API requires
    start_date = datetime(start.year, start.month, start.day)
    end_date = datetime(end.year, end.month, end.day)

    # Add 1 day if the fire starts and ends on the same date, since the API requires a whole day
    if start_date == end_date:
        end_date += timedelta(days=1)

    # Fetch daily data for the fire location and turn the date index into a column
    data = Daily(Point(lat, long), start_date, end_date).fetch().reset_index()

    # Attach the fire's coordinates and id
    data['lat'] = lat
    data['long'] = long
    data['pid'] = fire_id

    # Append the fetched rows to the accumulated results
    return pd.concat([result_df, data])
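
The date handling in the helper above can be checked on its own, without any network call. The sketch below isolates that step; `to_query_window` is a hypothetical name introduced here for illustration, and the timestamps are made up:

```python
from datetime import datetime, timedelta

import pandas as pd


def to_query_window(start, end):
    """Return naive midnight datetimes covering the fire's start and end dates."""
    # Drop the time of day and any timezone info, keeping only the calendar date
    start_date = datetime(start.year, start.month, start.day)
    end_date = datetime(end.year, end.month, end.day)

    # A fire that starts and ends on the same date still gets a one-day window
    if start_date == end_date:
        end_date += timedelta(days=1)
    return start_date, end_date


# Made-up example: discovered and controlled on the same calendar day
same_day = to_query_window(
    pd.Timestamp("2020-07-22 21:51:00+00:00"),
    pd.Timestamp("2020-07-22 23:30:00+00:00"),
)
print(same_day)
```

For fires spanning several days, the window is simply the two calendar dates with the time of day removed.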

In [69]:
# Initialize the Meteostat results DataFrame
meteo_res_df = pd.DataFrame()

# Call helper function to fetch weather data for each wildfire
for i in df.index:
    meteo_res_df = get_weather_meteo(
        i,
        df.loc[i, "InitialLatitude"],
        df.loc[i, "InitialLongitude"],
        df.loc[i, "FireDiscoveryDateTime"],
        df.loc[i, "ControlDateTime"],
        meteo_res_df,
    )
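
Calling `pd.concat` inside the loop copies the growing result on every iteration, which can become slow for a large number of fires. A common alternative is to collect the per-fire frames in a list and concatenate once at the end. The sketch below illustrates the pattern with `fetch_stub`, a hypothetical stand-in that returns made-up values instead of calling Meteostat:

```python
import pandas as pd


def fetch_stub(fire_id):
    """Hypothetical stand-in for a per-fire fetch; returns a tiny frame without any network call."""
    return pd.DataFrame({"pid": [fire_id, fire_id], "tavg": [20.0, 21.5]})


# Collect per-fire frames in a list, then concatenate once at the end
frames = [fetch_stub(fire_id) for fire_id in range(3)]
combined = pd.concat(frames, ignore_index=True)

print(combined.shape)
```

This is only one possible restructuring; the loop above produces the same rows, just with more intermediate copies.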

In [70]:
# Store result
meteo_res_df.to_csv(Config().get_raw_meteorology_path("meteo_weather"), index=False)
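
The introduction notes that the station-based data had many missing values. One way to quantify that is the share of missing entries per column. The sketch below uses a made-up sample; the column names `tavg`, `prcp` and `wspd` are assumed to match the daily output, and the values are illustrative only:

```python
import numpy as np
import pandas as pd

# Made-up sample shaped like a daily weather result; values are illustrative only
sample = pd.DataFrame(
    {
        "pid": [0, 0, 1, 1],
        "tavg": [25.1, np.nan, 18.3, 19.0],
        "prcp": [np.nan, np.nan, 0.0, np.nan],
        "wspd": [12.0, 9.5, np.nan, 7.2],
    }
)

# Fraction of missing values per weather column, largest first
missing_share = sample.drop(columns="pid").isna().mean().sort_values(ascending=False)
print(missing_share)
```

Running the same expression on `meteo_res_df` (dropping `pid`, `lat` and `long`) would show which variables are too sparse to use.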