# Data Ingest of HRRR Weather Model Data

Retrieval of HRRR weather model data is done with the software package `Herbie`.

A configuration file controls data ingest. For automated processes, the code looks for a JSON configuration file that depends on the use case:

* For building training data, `../etc/training_data_config.json`
* For deploying the model on a grid, `../etc/forecast_config.json`
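
As a minimal sketch of how such a file might be read, the snippet below parses a JSON configuration with Python's standard `json` module. The keys (`bbox`, `start_time`, `end_time`, `features_list`) and the helper `parse_config` are hypothetical, chosen to mirror the variables defined in the Setup section of this notebook; the actual schema of the project's config files may differ.

```python
import json
from datetime import datetime

# Hypothetical config contents; the real files in ../etc may use different keys
example_config = """
{
    "bbox": [40, -105, 45, -100],
    "start_time": "2024-06-01T00:00:00",
    "end_time": "2024-06-01T05:00:00",
    "features_list": ["Ed", "Ew", "rain"]
}
"""

def parse_config(text):
    """Parse a JSON config string and convert time strings to datetime objects."""
    cfg = json.loads(text)
    cfg["start_time"] = datetime.fromisoformat(cfg["start_time"])
    cfg["end_time"] = datetime.fromisoformat(cfg["end_time"])
    return cfg

config = parse_config(example_config)
print(config["start_time"], config["end_time"], config["features_list"])
```

Storing times as ISO 8601 strings keeps the file human-readable while still converting cleanly to `datetime` objects.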

For a complete set of predictors that could be useful for FMC modeling, and for compatibility with other areas of `wrfxpy`, we use the 3D pressure model product from HRRR. Additionally, since modeling requires rainfall, we use the 3-hour forecast from HRRR and take the difference in accumulated precipitation between the 2-hour and 3-hour forecasts, which gives the rain that fell during that hour.
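
The hourly rainfall calculation described above amounts to subtracting two accumulated totals. A minimal sketch with NumPy, using made-up accumulation values (not HRRR output) on a tiny grid:

```python
import numpy as np

# Made-up accumulated precipitation (mm) since model initialization, on a 2x2 grid
acc_f02 = np.array([[0.0, 1.2], [0.5, 3.0]])  # total through forecast hour 2
acc_f03 = np.array([[0.0, 2.0], [0.5, 4.5]])  # total through forecast hour 3

# Rain that fell during the hour between forecast hours 2 and 3
rain_2_to_3 = acc_f03 - acc_f02
print(rain_2_to_3)
```

When the inventory already lists this interval as its own field (the `2-3 hour acc fcst` search string used later in this notebook), no subtraction is needed.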

The module `retrieve_hrrr_api.py` contains functions and metadata for directing data ingest. A list of predictors controls which data are downloaded. Some of these predictors are derived features, such as equilibrium moisture content, which is calculated from relative humidity and air temperature. The module also contains hard-coded objects with related metadata, such as the regex-formatted search string used for each variable.
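
To illustrate how a predictor list could drive the download, the sketch below maps requested features to the raw fields they depend on. The column names `feature_name` and `required_fmda_name` match those used later in this notebook, but the table rows and the helper `required_fields` are hypothetical stand-ins, not the contents of `retrieve_hrrr_api.py`.

```python
import pandas as pd

# Hypothetical stand-in for the module's feature table
feature_df = pd.DataFrame({
    "feature_name": ["Ed", "Ew", "rain", "wind"],
    # Derived features need several raw fields (tuple); others need only one (string)
    "required_fmda_name": [("rh", "temp"), ("rh", "temp"), "rain", "wind"],
})

def required_fields(features, table):
    """Return the unique raw fields needed for the requested features, in order."""
    needed = []
    rows = table.loc[table["feature_name"].isin(features), "required_fmda_name"]
    for item in rows:
        names = item if isinstance(item, tuple) else (item,)
        for name in names:
            if name not in needed:
                needed.append(name)
    return needed

print(required_fields(["Ed", "Ew", "rain"], feature_df))
```

Removing duplicates means that two derived features sharing the same inputs, such as `Ed` and `Ew` here, trigger only one download of each raw field.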

## References

For more info on HRRR data bands and definitions, see [HRRR inventory](https://www.nco.ncep.noaa.gov/pmb/products/hrrr/hrrr.t00z.wrfprsf02.grib2.shtml) for pressure model f02-f38 forecast hours.

For more info on the Python package, see Brian Blaylock's `Herbie` [Python package](https://github.com/blaylockbk/Herbie).

## Setup

User definitions. In other areas of this project, these come from config files.

In [None]:
import matplotlib.pyplot as plt
import herbie
from herbie import FastHerbie
from datetime import datetime
import sys
import pandas as pd
import numpy as np
sys.path.append("../src")
import ingest.retrieve_hrrr_api as ih

In [None]:
bbox = [40, -105, 45, -100]
start = datetime(2024, 6, 1, 0)
end = datetime(2024, 6, 1, 5)
forecast_step = 3 # Do not change for now, code depends on it
features_list = ['Ed', 'Ew', 'rain', 'wind', 'solar', 'elev', 'lat', 'lon']

print(f"Start Date of retrieval: {start}")
print(f"End Date of retrieval: {end}")
print(f"Spatial Domain: {bbox}")
print(f"Required Features: {features_list}")

In [None]:
# Create a range of dates
dates = pd.date_range(
    start = start,
    end = end,
    freq="1h"
)

In [None]:
ih.feature_df

### Read Data

The `FastHerbie` class from `herbie` sets up access to the requested model runs, but only the fields requested later are downloaded. Available data can be viewed with the `inventory()` method. *Note:* the inventory repeats its rows for each time step requested.

In [None]:
FH = FastHerbie(
    dates, 
    model="hrrr", 
    product="prs",
    fxx=range(3, 4)
)

In [None]:
inv = FH.inventory()
inv

Data fields are accessed through the `.xarray()` method, which temporarily downloads the file and then delivers it in memory as an xarray object. Variables are selected with search strings that specify the variable name (e.g. air temperature), the vertical level (e.g. surface level), and the forecast hour relative to the f00 start time (e.g. hour 3, as used here). The `retrieve_hrrr_api` module in this project stores a dataframe with names and info on the variables that will be considered for modeling FMC.
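
Since the search strings are regular expressions, their effect can be previewed without downloading anything. The sketch below applies them with pandas to a few inventory-style lines; the sample lines are illustrative and only approximate the layout of a real HRRR inventory.

```python
import pandas as pd

# Illustrative inventory-style lines (variable : level : forecast period)
search_this = pd.Series([
    ":TMP:2 m above ground:3 hour fcst",
    ":RH:2 m above ground:3 hour fcst",
    ":TMP:500 mb:3 hour fcst",
    ":APCP:surface:0-3 hour acc fcst",
    ":APCP:surface:2-3 hour acc fcst",
])

# "|" is regex alternation, so one string can select several fields at once
pattern = "|".join([":RH:2 m", ":TMP:2 m"])
matches = search_this[search_this.str.contains(pattern, regex=True)]
print(matches.tolist())

# A more specific string distinguishes the two accumulation periods
rain = search_this[search_this.str.contains(":APCP:surface:2-3 hour acc fcst", regex=True)]
print(rain.tolist())
```

Including the level in the pattern matters: `:TMP:2 m` selects the near-surface temperature but not the 500 mb temperature.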

In [None]:
# Show HRRR naming dataframe
ih.name_df_hrrr

## Getting a Set of Predictors

We will demonstrate retrieval of a restricted set of predictors.

Equilibrium moisture content is calculated from relative humidity (RH) and air temperature.
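
For reference, here is a minimal sketch of one way to compute drying and wetting equilibrium moisture content from RH and temperature. The coefficients follow the form published for the WRF-SFIRE fuel moisture model, with RH in percent and temperature in Kelvin. Whether `calc_eq` uses exactly these equations and units is an assumption, and the helper `equilibrium_moisture` is hypothetical, not the module's implementation.

```python
import numpy as np

def equilibrium_moisture(rh, t2):
    """Drying (Ed) and wetting (Ew) equilibrium moisture content, in percent.

    rh : relative humidity in percent
    t2 : air temperature in Kelvin
    """
    rh = np.asarray(rh, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    # Temperature term shared by both equilibria
    temp_term = 0.18 * (21.1 + 273.15 - t2) * (1.0 - np.exp(-0.115 * rh))
    Ed = 0.924 * rh**0.679 + 0.000499 * np.exp(0.1 * rh) + temp_term
    Ew = 0.618 * rh**0.753 + 0.000454 * np.exp(0.1 * rh) + temp_term
    return Ed, Ew

# Made-up inputs: dry and warm, moderate, humid and cool
Ed, Ew = equilibrium_moisture(rh=[20.0, 50.0, 90.0], t2=[305.0, 295.0, 285.0])
print(Ed, Ew)
```

In this formulation the drying equilibrium lies above the wetting equilibrium for the same conditions, so fuel moisture between the two values is treated as being at equilibrium.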

In [None]:
features_list = ["Ed", "rain"]
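# HRRR fields required by each feature (an entry may be one name or a tuple of names)
required = ih.feature_df.loc[ih.feature_df['feature_name'].isin(features_list), 'required_fmda_name']
hrrr_vars = [x for item in required for x in (item if isinstance(item, tuple) else [item])]
# Herbie search string for each required HRRR field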
herbie_search_str = [
    ih.name_df_hrrr.loc[ih.name_df_hrrr['fmda_name'] == name, 'herbie_str'].iloc[0]
    for name in hrrr_vars
]
# "|".join(herbie_search_str)

print(f"Features used in modeling: {features_list}")
print(f"Needed HRRR fields: {hrrr_vars}")
print(f"HERBIE search strings: {herbie_search_str}")

In [None]:
ds = FH.xarray("RH:2 m|TMP:2 m")
ds

In [None]:
from ingest.retrieve_hrrr_api import calc_eq
calc_eq(ds)

In [None]:
ds

In [None]:
ds2 = FH.xarray(":APCP:surface:2-3 hour acc fcst")
ds2

In [None]:
from herbie.toolbox import EasyMap, pc, ccrs
from herbie import paint

ax = EasyMap("50m", figsize=[15, 9], crs=ds.herbie.crs).STATES().ax
p = ax.pcolormesh(
    ds.longitude,
    ds.latitude,
    ds.Ed.isel(time=0),
    transform=pc,
    cmap=paint.NWSPrecipitation.cmap,
)

plt.colorbar(
    p,
    ax=ax,
    orientation="horizontal",
    pad=0.01,
    shrink=0.8,
    label="Equilibrium Moisture Content",
)