# Climate data for hourly load forecast model
How can AE tools be used to generate synthetic time series for Demand Forecast Unit applications in the hourly load forecasting model?

In [None]:
import climakitae as ck
import xarray as xr
import hvplot.pandas 
import pandas as pd
import panel as pn
pn.extension()

In [None]:
app = ck.Application()

## Step 1) Average Meteorological Year (AMY) 
First, set `app.location` to the location you wish to subset the data by. We've selected the Demand Forecast Zone for the LA Metro region. **This can also be set in the app.select panel by just selecting your desired region from the dropdown.** <br><br>An area average will be computed over the grid cells in this selected region in the next step.

In [None]:
app.location.area_subset = "CA Electricity Demand Forecast Zones"
app.location.cached_area = "LA Metro"

Next, choose the variable you wish to observe and the units of that variable. Again, this can also be set in the app.select panel. 

In [None]:
app.selections.variable = "Air Temperature at 2m" 
app.selections.units = "degF"

Next, set the spatial resolution you would like to use. The options are 45 km, 9 km, and 3 km, but we recommend sticking with just 45 km or 9 km to avoid long compute times or overloading the available memory.

In [None]:
app.selections.resolution = "45 km"

Now, we'll use the function `retrieve_meteo_yr_data()` from climakitae.meteo_yr to grab the relevant data from the AWS catalog. You can modify `year_start` and `year_end` to set the data's date range; by default, the range covers the 30 years following the start year. <br><br>By default, this function retrieves SSP 3-7.0; to change this, set the argument `ssp` to the SSP you're interested in. We'll show you how to do this in step 2 when we compute a severe meteorological year. 

In [None]:
t2_data_historical = app.retrieve_meteo_yr_data(year_start=1984, year_end=2014)
t2_data_current_day = app.retrieve_meteo_yr_data(year_start=1993, year_end=2023)
t2_data_future_15yr = app.retrieve_meteo_yr_data(year_start=2023, year_end=2053)
t2_data_future_30yr = app.retrieve_meteo_yr_data(year_start=2038, year_end=2068)

The data is returned as an xarray DataArray object, an excellent data type for easily manipulating gridded data. 

### 1c) Calculate an average meteorological year 
Before performing any calculations, we'll load the data into memory. This cell will likely take a few minutes to run. 

In [None]:
%%time
t2_data_historical = app.load(t2_data_historical)
t2_data_current_day = app.load(t2_data_current_day)
t2_data_future_15yr = app.load(t2_data_future_15yr)
t2_data_future_30yr = app.load(t2_data_future_30yr)

Next, we'll use the function `compute_amy()` from climakitae.meteo_yr to compute the average meteorological year for each input dataset. We'll also set the argument `show_pbar=True` to output a progress bar. 

In [None]:
from climakitae.meteo_yr import compute_amy

In [None]:
%%time
t2_amy_historical = compute_amy(t2_data_historical, show_pbar=True)
t2_amy_current_day = compute_amy(t2_data_current_day, show_pbar=True)
t2_amy_future_15yr = compute_amy(t2_data_future_15yr, show_pbar=True)
t2_amy_future_30yr = compute_amy(t2_data_future_30yr, show_pbar=True)

Lastly, let's preview what this output data looks like. The data is returned as a pandas DataFrame, a tabular object that can easily be output to a csv file. Below, we can view the first 5 rows. 

In [None]:
t2_amy_historical.head()
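As a rough illustration of what an average-year table can contain, here is a minimal sketch that averages every (day of year, hour) slot across all years of an hourly series. It uses synthetic data and plain pandas. The function name `mean_by_day_and_hour` and the table layout are assumptions made for illustration, and this is not the implementation inside `compute_amy()`, which may select or weight values differently.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series standing in for an area-averaged temperature record.
# Three non-leap years keep the example small; the notebook uses 30-year windows.
n_hours = 3 * 365 * 24
time = pd.date_range("2001-01-01", periods=n_hours, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
seasonal = 10 * np.sin(2 * np.pi * (time.dayofyear.to_numpy() - 110) / 365)
diurnal = 5 * np.sin(2 * np.pi * (time.hour.to_numpy() - 9) / 24)
temps = pd.Series(65 + seasonal + diurnal + rng.normal(0, 2, n_hours), index=time)


def mean_by_day_and_hour(series):
    """Average each (day of year, hour) slot across all years in the series."""
    grouped = series.groupby([series.index.dayofyear, series.index.hour]).mean()
    grouped.index.names = ["day_of_year", "hour"]
    # Rows become days of the year, columns become hours of the day.
    return grouped.unstack("hour")


amy_sketch = mean_by_day_and_hour(temps)
print(amy_sketch.shape)
```

Each cell of `amy_sketch` is the multi-year mean for one hour of one calendar day, which is the kind of day-by-hour grid a heatmap can display directly.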

### 1d) Make a heatmap of the data 
We'll use the helper function `meteo_yr_heatmap` available in climakitae to easily make a heatmap of the data.

In [None]:
from climakitae.meteo_yr import meteo_yr_heatmap

In [None]:
var_and_units = "{0} ({1})".format(t2_data_historical.name,t2_data_historical.units)
meteo_yr_heatmap(
    meteo_yr_df=t2_amy_historical, 
    title="Historical Meteorological Year",
    clabel=var_and_units
)

### 1e) Compare the mean monthly data for each dataset 
To do this, we'll first compute the mean monthly value using the helper function `compute_mean_monthly_meteo_yr` from climakitae. Then, we'll use hvplot to make an interactive line plot of the data. You can see the mean monthly temperature is projected to be warmer for every month as we go farther into the future. 

In [None]:
from climakitae.meteo_yr import compute_mean_monthly_meteo_yr

In [None]:
# Compute mean monthly_values
t2_amy_historical_monmean = compute_mean_monthly_meteo_yr(t2_amy_historical, col_name="historical")
t2_amy_current_day_monmean = compute_mean_monthly_meteo_yr(t2_amy_current_day, col_name="current_day")
t2_amy_future_15yr_monmean = compute_mean_monthly_meteo_yr(t2_amy_future_15yr, col_name="future_15yr")
t2_amy_future_30yr_monmean = compute_mean_monthly_meteo_yr(t2_amy_future_30yr, col_name="future_30yr")

# Merge the individual dataframes so it's easier to plot 
data_all = pd.concat(
    [t2_amy_future_30yr_monmean,t2_amy_future_15yr_monmean,t2_amy_current_day_monmean,t2_amy_historical_monmean], 
    axis="columns"
)

# Display the lineplot! 
data_all.hvplot(width=800, height=300, grid=True, ylabel=var_and_units,title="Average Meteorological Year: Comparison Between Time Periods")
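For readers curious how a monthly comparison like this can be assembled, the sketch below collapses a day-of-year by hour table into one mean per calendar month and joins two periods side by side. The table layout, the `monthly_mean` helper, and the synthetic values are assumptions for illustration only; `compute_mean_monthly_meteo_yr` may work differently internally.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a meteorological-year table:
# 365 day-of-year rows by 24 hour columns, with a simple seasonal cycle.
days = np.arange(1, 366)
seasonal = 60 + 15 * np.sin(2 * np.pi * (days - 110) / 365)
historical = pd.DataFrame(
    np.repeat(seasonal[:, None], 24, axis=1),
    index=pd.Index(days, name="day_of_year"),
    columns=pd.Index(range(24), name="hour"),
)
future = historical + 3.0  # uniformly warmer table, purely for demonstration


def monthly_mean(table, col_name):
    """Collapse a day-of-year x hour table to one mean value per calendar month."""
    # Map day of year onto months using a non-leap reference year (assumption).
    offsets = pd.to_timedelta(table.index.to_numpy() - 1, unit="D")
    dates = pd.Timestamp("2001-01-01") + offsets
    daily = table.mean(axis="columns")
    monthly = daily.groupby(dates.month.to_numpy()).mean()
    monthly.index.name = "month"
    return monthly.to_frame(name=col_name)


data_all = pd.concat(
    [monthly_mean(historical, "historical"), monthly_mean(future, "future")],
    axis="columns",
)
print(data_all.round(1))
```

Giving each period its own column name is what lets a single plotting call draw one line per period.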

### 1f) Output the data to a csv file 

In [None]:
t2_amy_historical.to_csv("t2_amy_historical.csv")
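If you want to export several tables at once and confirm that they read back correctly, a small loop is enough. The sketch below uses stand-in DataFrames and a temporary directory; the dictionary keys and table contents are hypothetical and only illustrate the pattern.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical stand-in tables, keyed by the file stem to write.
tables = {
    "t2_amy_historical": pd.DataFrame(
        np.zeros((2, 3)), index=[1, 2], columns=["0", "1", "2"]
    ),
    "t2_amy_future_30yr": pd.DataFrame(
        np.ones((2, 3)), index=[1, 2], columns=["0", "1", "2"]
    ),
}

# Write one csv per table into a temporary directory.
out_dir = Path(tempfile.mkdtemp())
for stem, df in tables.items():
    df.to_csv(out_dir / f"{stem}.csv")

# Read one file back, using the first column as the index.
restored = pd.read_csv(out_dir / "t2_amy_historical.csv", index_col=0)
print(restored)
```

Note that column labels come back from `read_csv` as strings, so convert them if you need numeric labels again.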

### 1g) Repeat this process for another variable 
Now, we'll perform the same computations as above for dew point temperature. We've written the steps out separately here to show the process, but putting this code in a loop would be more concise, especially if you want to compute the AMY for several variables. 

In [None]:
# %%time
# # Reset variable and units 
# app.selections.variable = "Dew point temperature" 
# app.selections.units = "degF"

# # Retrieve data
# dpt_data_historical = app.retrieve_meteo_yr_data(year_start=1984, year_end=2014)
# dpt_data_current_day = app.retrieve_meteo_yr_data(year_start=1993, year_end=2023)
# dpt_data_future_15yr = app.retrieve_meteo_yr_data(year_start=2023, year_end=2053)
# dpt_data_future_30yr = app.retrieve_meteo_yr_data(year_start=2038, year_end=2068)

# # Read data into memory 
# dpt_data_historical = app.load(dpt_data_historical)
# dpt_data_current_day = app.load(dpt_data_current_day)
# dpt_data_future_15yr = app.load(dpt_data_future_15yr)
# dpt_data_future_30yr = app.load(dpt_data_future_30yr)

# # Compute AMY 
# dpt_amy_historical = compute_amy(dpt_data_historical, show_pbar=True)
# dpt_amy_current_day = compute_amy(dpt_data_current_day, show_pbar=True)
# dpt_amy_future_15yr = compute_amy(dpt_data_future_15yr, show_pbar=True)
# dpt_amy_future_30yr = compute_amy(dpt_data_future_30yr, show_pbar=True)
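One way to structure the loop mentioned above is to keep the time periods in a dictionary and pass the retrieve, load, and compute steps in as callables. This is a minimal sketch: the helper name `amy_for_periods` is hypothetical, and the stand-in lambdas exist only so the example runs without data access.

```python
def amy_for_periods(periods, retrieve, load, compute):
    """Run retrieve -> load -> compute for each named period and collect results."""
    results = {}
    for name, (year_start, year_end) in periods.items():
        data = retrieve(year_start=year_start, year_end=year_end)
        results[name] = compute(load(data))
    return results


# Same year ranges as used in this notebook.
periods = {
    "historical": (1984, 2014),
    "current_day": (1993, 2023),
    "future_15yr": (2023, 2053),
    "future_30yr": (2038, 2068),
}

# Stand-in callables (not climakitae functions) so the sketch is runnable.
demo = amy_for_periods(
    periods,
    retrieve=lambda year_start, year_end: list(range(year_start, year_end)),
    load=lambda data: data,
    compute=lambda data: sum(data) / len(data),
)
print(demo)
```

In this notebook you would pass `app.retrieve_meteo_yr_data`, `app.load`, and `lambda d: compute_amy(d, show_pbar=True)` instead of the stand-ins, after setting `app.selections.variable` and `app.selections.units` for the variable of interest.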

## Step 2) Severe Meteorological Year

### 2a) Compute the Severe Meteorological Year
Next we'll explore how to determine a 'severe' meteorological year, by quantifying severity with the 90th percentile of the data. 

Let's return to air temperature at 2m for our variable for ease. Because we already have this data loaded from step 1, we can skip ahead to computing the severe meteorological year. We will calculate the severe meteorological year with the climakitae function `compute_severe_yr()` which utilizes the 90th percentile to examine extremes.

In [None]:
from climakitae.meteo_yr import compute_severe_yr

In [None]:
%%time
t2_smy_historical = compute_severe_yr(t2_data_historical, show_pbar=True)
t2_smy_current_day = compute_severe_yr(t2_data_current_day, show_pbar=True)
t2_smy_future_15yr = compute_severe_yr(t2_data_future_15yr, show_pbar=True)
t2_smy_future_30yr = compute_severe_yr(t2_data_future_30yr, show_pbar=True)
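To illustrate what using the 90th percentile means in practice, the sketch below takes the 90th percentile of every (day of year, hour) slot across the years of a synthetic hourly series. The helper `quantile_by_day_and_hour` and the data are assumptions for illustration; `compute_severe_yr()` may apply the percentile differently.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series covering three non-leap years (illustrative only).
n_hours = 3 * 365 * 24
time = pd.date_range("2001-01-01", periods=n_hours, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(1)
seasonal = 10 * np.sin(2 * np.pi * (time.dayofyear.to_numpy() - 110) / 365)
temps = pd.Series(65 + seasonal + rng.normal(0, 3, n_hours), index=time)


def quantile_by_day_and_hour(series, q=0.90):
    """Take quantile q of each (day of year, hour) slot across all years."""
    grouped = series.groupby(
        [series.index.dayofyear, series.index.hour]
    ).quantile(q)
    grouped.index.names = ["day_of_year", "hour"]
    # Rows become days of the year, columns become hours of the day.
    return grouped.unstack("hour")


severe_sketch = quantile_by_day_and_hour(temps, q=0.90)
print(severe_sketch.shape)
```

Changing `q` shifts how extreme the resulting year is; `q=0.5` would give a median year instead.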

### 2b) Compare the Average Meteorological Year and Severe Meteorological Year

Here, we'll show mean monthly values for the average meteorological year and the severe meteorological year. We can use the helper function `compute_mean_monthly_meteo_yr` to compute the monthly mean values for each dataset, then hvplot to make interactive lineplots of the data (just as we did in step 1e). You can easily see in the outputted lineplot that the Severe Meteorological Year shows higher average monthly temperatures. 

In [None]:
# Compute monthly mean values 
t2_amy_historical_monmean = compute_mean_monthly_meteo_yr(t2_amy_historical, col_name="Average Meteorological Year")
t2_smy_historical_monmean = compute_mean_monthly_meteo_yr(t2_smy_historical, col_name="Severe Meteorological Year")

# Merge the individual dataframes so it's easier to plot 
data_all = pd.concat( 
    [t2_smy_historical_monmean,t2_amy_historical_monmean], 
    axis="columns"
)

# Display plot! 
data_all.hvplot(width=800, height=300, grid=True, ylabel=var_and_units, title="Historical Meteorological Year: Comparing Average vs. Severe")

### 2c) Mid and high climate scenarios for the future

Let's now repeat this process to calculate a severe meteorological year in a high climate scenario. We'll demonstrate how to do this for the period 30 years in the future, then compare with the SSP3-7.0 severe meteorological year. <br><br>To change the scenario used, simply set the argument `ssp` to your desired scenario in the function `app.retrieve_meteo_yr_data`. This function defaults to SSP3-7.0, so we only need to set this argument if we want to use SSP5-8.5 or SSP2-4.5.

In [None]:
%%time
# Retrieve data
t2_ssp585_data_future_30yr = app.retrieve_meteo_yr_data(year_start=2038, year_end=2068, ssp="SSP 5-8.5 -- Burn it All")

# Read data into memory 
t2_ssp585_data_future_30yr = app.load(t2_ssp585_data_future_30yr)

# Compute Severe Meteorological Year 
t2_ssp585_smy_future_30yr = compute_severe_yr(t2_ssp585_data_future_30yr, show_pbar=True)

Next, let's compute a monthly mean and produce a comparison lineplot, like we did in the previous step. 

In [None]:
# Compute monthly mean values 
t2_ssp370_30yr_mon_mean = compute_mean_monthly_meteo_yr(t2_smy_future_30yr, col_name="SSP 3-7.0 -- Business as Usual")
t2_ssp585_30yr_mon_mean = compute_mean_monthly_meteo_yr(t2_ssp585_smy_future_30yr, col_name="SSP 5-8.5 -- Burn it All")

# Merge the individual dataframes so it's easier to plot 
data_all = pd.concat( 
    [t2_ssp370_30yr_mon_mean,t2_ssp585_30yr_mon_mean], 
    axis="columns"
)

# Display plot! 
data_all.hvplot(width=800, height=300, grid=True, ylabel=var_and_units, title="Severe Meteorological Year: Comparing High vs. Mid Scenarios")

### 2d) Output the mid and high scenario severe meteorological year data

In [None]:
t2_ssp585_smy_future_30yr.to_csv("t2_ssp585_smy_future_30yr.csv") # high scenario (SSP 5-8.5)

In [None]:
t2_smy_future_30yr.to_csv("t2_smy_future_30yr.csv") # mid scenario (SSP 3-7.0)