# Gather HRRR Atmospheric Data

This notebook builds time series of atmospheric data at arbitrary longitude/latitude coordinates in North America. The data source is the NOAA HRRR (High-Resolution Rapid Refresh) product hosted on AWS. See the list of variables [here]().

The time series are intended for developing and training an ML model of fuel moisture content (FMC). To deploy an atmospheric model live across the country, see the Herbie Python package:

https://github.com/blaylockbk/Herbie/tree/main/herbie 

In [None]:
# Setup
## Packages
import os
import pandas as pd
import numpy as np
import xarray as xr
from datetime import date, timedelta, datetime
import matplotlib.pyplot as plt 
import pickle
## Local modules with very literal names
from gather_HRRR import extract_hrrr, gather_hrrr_time_range, download_grib

In [None]:
# Time period of 1 month of data, June 2022
start_time = "2022-06-01 00:00"
end_time = "2022-06-30 00:00"
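
As a quick check on the size of this request, the range can be expanded into hourly timestamps. This is a minimal sketch, assuming one HRRR file per hour and that both endpoints are included; whether `gather_hrrr_time_range` includes the end time is not shown in this notebook.

```python
import pandas as pd

start_time = "2022-06-01 00:00"
end_time = "2022-06-30 00:00"

# One timestamp per hour; date_range includes both endpoints by default
times = pd.date_range(start_time, end_time, freq=pd.Timedelta(hours=1))

print(len(times))            # number of hourly steps in the range
print(times[0], times[-1])   # first and last timestamp
```

Note that an end time of `2022-06-30 00:00` leaves out the remaining hours of June 30.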

## Coordinates of Interest

Read in a data frame of latitude and longitude coordinates for RAWS station locations. These stations have FMC sensors, and FMC is the primary response variable in the larger project.

In [None]:
# Read in list of RAWS Stations
df = pd.read_csv("raws_stations_WA.csv")

# Filter to those with complete fmda data
df = df[(df[['air_temp', 'relative_humidity', 'precip_accum',
       'fuel_moisture', 'wind_speed', 'solar_radiation']]==1).sum(axis=1)==6]

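# Get list of coords and the matching station IDs
points = list(df[["lon","lat"]].itertuples(index=False,name=None))
# Keep row order so names[i] matches points[i] (np.unique would sort the IDs)
names = df['STID'].to_numpy()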

print(df.shape)
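
The filter above can be tried on a small table first. The sketch below uses made-up stations and assumes, as the `==1` comparison implies, that each flag column holds 1 when a station reports that variable. It uses `.all(axis=1)`, which selects the same rows as summing the matches to 6.

```python
import pandas as pd

flag_cols = ['air_temp', 'relative_humidity', 'precip_accum',
             'fuel_moisture', 'wind_speed', 'solar_radiation']

# Toy station table; all values are invented for illustration
toy = pd.DataFrame({
    'STID': ['AAA', 'BBB', 'CCC'],
    'lon': [-120.0, -121.5, -119.2],
    'lat': [47.0, 46.5, 48.1],
    'air_temp': [1, 1, 1],
    'relative_humidity': [1, 1, 1],
    'precip_accum': [1, 0, 1],
    'fuel_moisture': [1, 1, 1],
    'wind_speed': [1, 1, 1],
    'solar_radiation': [1, 1, 0],
})

# Keep rows where every flag equals 1
complete = toy[(toy[flag_cols] == 1).all(axis=1)]

# Build coordinates and names from the same rows so they stay aligned
toy_points = list(complete[['lon', 'lat']].itertuples(index=False, name=None))
toy_names = complete['STID'].tolist()
print(toy_points, toy_names)
```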

In [None]:
# Get first 2 station locations
points = points[0:2]
names = names[0:2]
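
Each `(lon, lat)` pair in `points` is later matched to a location on the HRRR grid during extraction. How `gather_HRRR` does this is not shown here; the sketch below is one simple approach on a toy grid, picking the cell with the smallest squared distance in degrees. The grid and the helper `nearest_cell` are hypothetical.

```python
import numpy as np

# Toy 2-D latitude/longitude arrays standing in for a model grid
lats, lons = np.meshgrid(np.linspace(45.0, 49.0, 5),
                         np.linspace(-124.0, -117.0, 8),
                         indexing="ij")

def nearest_cell(lon, lat, grid_lons, grid_lats):
    """Return the (row, col) of the grid cell closest to (lon, lat).

    Uses squared distance in degrees, a rough approximation
    that is adequate only for illustration.
    """
    dist2 = (grid_lons - lon) ** 2 + (grid_lats - lat) ** 2
    return np.unravel_index(np.argmin(dist2), dist2.shape)

row, col = nearest_cell(-120.1, 47.2, lons, lats)
print(row, col, lons[row, col], lats[row, col])
```

If a grid stores longitudes as 0 to 360 degrees east, which is common for GRIB-derived data, the query longitude would need `lon % 360` before the lookup.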

## Extract Atmospheric Data

Variables of interest and their associated HRRR layers are:
* Temperature: "t2m", 2m layer
* RH: "r2", 2m layer
* Rain: "", surface layer
    - "PRATE", precipitation rate
    - "APCP", accumulated precipitation
* Solar Radiation: "", surface layer
    - "DSWRF", downward short-wave flux
    - "USWRF", upward short-wave flux
    - "DLWRF", downward long-wave flux
    - "ULWRF", upward long-wave flux
* Wind: "", 10m layer
    - "u10": Eastward component of wind
    - "v10": Northward component of wind
    - "si10": wind speed (listed as "wind" in the variable table below)
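
The two wind components can be combined into a single wind speed as the magnitude of the (u, v) vector. The sketch below uses made-up values in m/s; how this derived speed compares with HRRR's own 10m wind speed field is not addressed here.

```python
import numpy as np

# Made-up eastward (u) and northward (v) wind components in m/s
u10 = np.array([3.0, 0.0, -6.0])
v10 = np.array([4.0, 2.0, 8.0])

# Wind speed is the magnitude of the (u, v) vector: sqrt(u**2 + v**2)
speed = np.hypot(u10, v10)
print(speed)
```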

---

## Build Timeseries

The next step loops through the hours of the given time range and, for each hour:

* temporarily downloads the GRIB file for that hour
* extracts values at the desired coordinates from the temporary file to build the time series
* deletes the temporary file before the next iteration

The variables to extract are specified in a pandas DataFrame that gives each variable's layer and name.
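
The download, extract, delete pattern described above can be sketched with stand-in functions. This is not the `gather_HRRR` implementation: `fake_download`, `fake_extract`, and `build_series` are hypothetical helpers that only mimic the control flow, and the output layout (time, coordinates, variables) is assumed from how `hrrr_dat` is indexed later in this notebook.

```python
import os
import tempfile
import numpy as np
import pandas as pd

def fake_download(time, path):
    """Hypothetical stand-in for a GRIB download: writes a small placeholder file."""
    with open(path, "wb") as f:
        f.write(time.strftime("%Y%m%d%H").encode())

def fake_extract(path, pts, n_vars):
    """Hypothetical stand-in for extraction: one value per (point, variable)."""
    return np.zeros((len(pts), n_vars))

def build_series(start, end, pts, n_vars):
    times = pd.date_range(start, end, freq=pd.Timedelta(hours=1))
    out = np.empty((len(times), len(pts), n_vars))
    for i, t in enumerate(times):
        # Reserve a temporary file for this hour's data
        fd, path = tempfile.mkstemp(suffix=".grib2")
        os.close(fd)
        try:
            fake_download(t, path)
            out[i] = fake_extract(path, pts, n_vars)
        finally:
            # Delete the temporary file before moving to the next hour
            os.remove(path)
    return out

demo = build_series("2022-06-01 00:00", "2022-06-01 05:00",
                    pts=[(-120.0, 47.0), (-121.5, 46.5)], n_vars=3)
print(demo.shape)
```

Wrapping the extraction in `try`/`finally` means the temporary file is removed even if one hour fails, so a long run does not leave large files behind.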

In [None]:
vs = pd.DataFrame({
    'Common Name': ['temp', 'rh', 
                    'prate', 'dswrf', 'uswrf', 'dlwrf', 'ulwrf',
                    'ewind', 'nwind', 'wind'],
    'HRRR Name': ['t2m', 'r2', 
                  'prate', 'dswrf', 'uswrf', 'dlwrf', 'ulwrf',
                  'u10', 'v10', 'si10'],
    'Layer': ['2m', '2m', 
              'surface','surface','surface','surface','surface',
              '10m', '10m', '10m']
})
vs

In [None]:
## Break the time period into chunks for ease of running;
## the call below gathers only the chunk from end_time1 to end_time2
start_time = "2022-06-01 00:00"
end_time1 = "2022-06-10 00:00"
end_time2 = "2022-06-20 00:00"
end_time3 = "2022-06-30 00:00"

hrrr_dat = gather_hrrr_time_range(
    start = end_time1,
    end = end_time2,
    pts = points,
    vs = vs
)
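
Rather than typing chunk boundaries by hand, they can be generated. The sketch below uses a hypothetical helper, `chunk_bounds`, that splits a range into consecutive pieces of at most `days` days; its boundaries will not exactly match the hand-picked ones above.

```python
import pandas as pd

def chunk_bounds(start, end, days=10):
    """Split [start, end] into consecutive (chunk_start, chunk_end) pairs."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    step = pd.Timedelta(days=days)
    bounds = []
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last chunk may be shorter
        bounds.append((lo, hi))
        lo = hi
    return bounds

bounds = chunk_bounds("2022-06-01 00:00", "2022-06-30 00:00")
for lo, hi in bounds:
    print(lo, "->", hi)
```

Adjacent chunks share a boundary timestamp, so if the gathering function includes both endpoints, the duplicated hour should be dropped when the chunks are concatenated.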

In [None]:
print(hrrr_dat[5,0,0]) # Compare to demo notebook
print(hrrr_dat[5,1,0])

In [None]:
# Data summary

## Simple func to print summary
def summary(dat):
    # dat is indexed as (time, coordinates, variables)
    ntime, ncoords, nvars = dat.shape[:3]

    print('-'*25)
    print('Sample Size:')
    print(f'Time: {ntime}')
    print(f'Coordinates: {ncoords}')
    print(f'Atmospheric Vars: {nvars}')
    print('-'*25)

summary(hrrr_dat)

In [None]:
# Plot one time series at a given pt: coordinate index 1, variable index 0
temps2 = hrrr_dat[:,1,0]
plt.plot(temps2)

## Write Output

In [None]:
hrrr_dict={
    'time': pd.date_range(end_time1, end_time2, freq="1H"),
    'coords': points,
    'data': hrrr_dat,
    'variables': vs
}
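
Before writing the dictionary out, it can help to confirm that the time axis lines up with the data array. Whether `pd.date_range(end_time1, end_time2, ...)` has the same length as `hrrr_dat.shape[0]` depends on whether `gather_hrrr_time_range` includes both endpoints, which is not shown here. The sketch below uses toy stand-ins with the same (time, coordinates, variables) layout and builds a per-station table from them.

```python
import numpy as np
import pandas as pd

# Toy stand-ins; data is laid out as (time, coordinates, variables)
toy_time = pd.date_range("2022-06-10 00:00", periods=4,
                         freq=pd.Timedelta(hours=1))
toy_vars = ['temp', 'rh']
toy_data = np.arange(4 * 2 * 2, dtype=float).reshape(4, 2, 2)

# The time axis should have one entry per row of the data array
assert len(toy_time) == toy_data.shape[0]

# Time series for the first coordinate, one column per variable
station0 = pd.DataFrame(toy_data[:, 0, :], index=toy_time, columns=toy_vars)
print(station0)
```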

In [None]:
import pickle

filename='hrrr_'+end_time1[:10]+'.pickle' ## NOTE: assumes date format "%Y-%m-%d %H:%M"
print(filename)

with open(filename, 'wb') as handle:
    pickle.dump(hrrr_dict, handle, protocol=pickle.HIGHEST_PROTOCOL)
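
To read a saved file back, open it in binary mode and call `pickle.load`. The sketch below does a round trip on a small toy dictionary in a temporary directory; the file name `hrrr_toy.pickle` is made up for illustration.

```python
import os
import pickle
import tempfile
import numpy as np

toy = {'coords': [(-120.0, 47.0)], 'data': np.ones((3, 1, 2))}

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hrrr_toy.pickle')
    with open(path, 'wb') as handle:
        pickle.dump(toy, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as handle:
        loaded = pickle.load(handle)

print(loaded['data'].shape)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.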

## Sources

* https://registry.opendata.aws/noaa-hrrr-pds/

* https://spire.com/tutorial/spire-weather-tutorial-intro-to-processing-grib2-data-with-python/

* https://github.com/microsoft/AIforEarthDataSets/blob/main/data/noaa-hrrr.md

* https://nbviewer.org/github/microsoft/AIforEarthDataSets/blob/main/data/noaa-hrrr.ipynb

* https://github.com/ecmwf/cfgrib/issues/63

* https://github.com/blaylockbk/Herbie/discussions/45