# Custom Climate Profiles Generation


This notebook generates custom `Climate Profile` hourly datasets (8760 hourly values, one for each hour of a year) of two varieties:

1. Average Meteorological Year (AMY, single variable)
2. Typical Meteorological Year Profile (TMY, multi-variable)

You will be able to generate a climate profile for a warming level and/or location not already present in our catalog. If you are looking to generate profiles for one of the pre-generated locations and warming levels, check out this page.

### What is a climate profile?

A climate profile is a representative year of hourly data that is meant to capture the intra-annual weather patterns for a location of interest. This time series is constructed by piecing together either historical or synthetic data. This notebook generates two types of climate profiles: the average meteorological year (AMY) and the typical meteorological year (TMY). AMY is recommended for most applications, while TMY is a strictly defined construct suitable for a narrow range of applications in the utilities sector.


## Setup

Import the [climakitae](https://github.com/cal-adapt/climakitae) library and other dependencies.

In [None]:
%config InlineBackend.figure_format = 'svg' # Make plots look better in the notebook environment 
%reload_ext autoreload
%autoreload 2

import climakitae as ck
import pandas as pd
import matplotlib.pyplot as plt

from climakitae.explore.amy import get_climate_profile

from climakitae.explore.typical_meteorological_year import TMY
from climakitae.util.utils import read_csv_file

## Average Meteorological Year (AMY)

The average meteorological year (AMY) represents the mean weather conditions for a one-year period. 

Examining a particular month within the Average Meteorological Year can provide hourly information that could inform energy usage analysis. For example, a warm season month can be used to examine cooling demand; while a cold season month can be used for exploring heating demand change under future climate conditions.

The absolute AMY is computed for a given 30-year period: either the historical period 1981-2010, or a 30-year window centered on where each GCM simulation reaches the specified global warming level. For each hour of the year, the method identifies the hourly value that is closest to the mean hourly value across all years. The result is a full annual time series of hourly data that best represents the average conditions for the variable of interest.
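As a rough illustration of the selection rule described above, the sketch below builds synthetic hourly data for 30 years and, for each hour of the year, keeps the value from whichever year is closest to the multi-year mean for that hour. The array layout, the random data, and the helper name `closest_to_mean` are assumptions made for illustration; this is not the climakitae implementation.

```python
import numpy as np


def closest_to_mean(hourly_by_year: np.ndarray) -> np.ndarray:
    """For each hour, return the yearly value closest to the across-year mean.

    Hypothetical helper; `hourly_by_year` has shape (n_years, n_hours).
    """
    mean_per_hour = hourly_by_year.mean(axis=0)
    # Index of the year whose value is nearest to the mean, for every hour
    nearest_year = np.abs(hourly_by_year - mean_per_hour).argmin(axis=0)
    # Pick that year's value for each hour
    return hourly_by_year[nearest_year, np.arange(hourly_by_year.shape[1])]


rng = np.random.default_rng(0)
synthetic = rng.normal(loc=60.0, scale=10.0, size=(30, 8760))  # 30 years x 8760 hours
amy_sketch = closest_to_mean(synthetic)
print(amy_sketch.shape)
```

Because every output value is taken from an actual year rather than averaged, the resulting series contains only values that occurred in the input data.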

The resulting heatmap of AMY values for a full annual timeseries is then plotted, with day of year 1 being January 1st and hour of day given in Pacific Standard Time. 
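To make the heatmap layout concrete, here is a minimal sketch of how an 8760-value hourly series maps onto a day-of-year by hour-of-day grid. The placeholder values are an assumption for illustration; the actual profile returned by climakitae may already be arranged this way.

```python
import numpy as np

# Placeholder values, one per hour of a 365-day year
hourly = np.arange(8760, dtype=float)

# Rows: day of year (0-based), columns: hour of day (0-23)
grid = hourly.reshape(365, 24)

# Recover (day of year, hour of day) for an arbitrary hour index
day_index, hour_of_day = divmod(4000, 24)
print(day_index + 1, hour_of_day)  # 1-based day of year, then hour of day
```

Plotting the transpose of `grid` puts day of year on the x-axis and hour of day on the y-axis.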

To learn more about the data available on the Analytics Engine, [see our data catalog](https://analytics.cal-adapt.org/data/). 

**Intended Application**: As a user, I want to **<span style="color:#FF0000">analyze the average weather conditions</span>** of a region of interest by:
1. Computing the average weather conditions
2. Visualizing average conditions throughout the year in a heatmap

Because this represents average rather than extreme conditions, an AMY dataset is not suited for designing systems to meet the worst-case conditions occurring at a location.

**Runtime**: AMY generation time ranges from around **5 minutes** for a single grid cell to **1 hour** for the entire state of California.

### Generating AMY

Select the warming level, location and variable for which you want to generate the AMY.

1. **Select variable.** The default is "Air Temperature at 2m". Want to know what variables (and their associated units) are available to choose from? Check out our variable list [here](https://github.com/cal-adapt/climakitae/blob/main/climakitae/data/variable_descriptions.csv).

In [None]:
variable = "Air Temperature at 2m" 
units = "degF"

2. **Select quantile.** The quantile used for the profile calculation, given as a float or a list of floats between 0 and 1. The default is 0.5 (the median).

In [None]:
q = 0.5 # float | list[float], default 0.5, quantile for profile calculation
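For intuition about what a quantile between 0 and 1 selects, the sketch below computes per-hour quantiles across years of synthetic data with NumPy. It assumes the quantile is taken across the years of the 30-year window for each hour; that is an assumption for illustration, not a description of how `get_climate_profile` uses `q` internally.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data: 30 years x 8760 hours (illustrative only)
hourly_by_year = rng.normal(60.0, 10.0, size=(30, 8760))

q = 0.5
median_profile = np.quantile(hourly_by_year, q, axis=0)  # one value per hour
upper_profile = np.quantile(hourly_by_year, 0.9, axis=0)  # warmer-than-typical hours

# A list of quantiles returns one profile per quantile
multi = np.quantile(hourly_by_year, [0.1, 0.5, 0.9], axis=0)
print(median_profile.shape, multi.shape)
```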

3. **Select warming level.** The default is [1.2] and base options are [1.5, 1.75, 2.0, 2.25, 2.5]. If you decide to generate AMY for a warming level outside of that list, keep in mind that the realistic range of warming levels is _. 

In [None]:
warming_level = [1.45]  # List[float], default [1.2]
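The 30-year window centered on the year a simulation reaches a warming level can be pictured with the sketch below. It uses a synthetic, linearly increasing global temperature anomaly and a hypothetical helper `warming_level_window`; any smoothing and the exact window convention used by climakitae are not shown here and may differ.

```python
import numpy as np


def warming_level_window(years, anomaly, level, half_width=15):
    """Return (start_year, end_year) of a window centered on the first year
    the anomaly reaches `level`, or None if the level is never reached.

    Hypothetical helper for illustration only.
    """
    reached = np.nonzero(anomaly >= level)[0]
    if reached.size == 0:
        return None
    center = int(years[reached[0]])
    # 2 * half_width years in total, end year inclusive
    return center - half_width, center + half_width - 1


years = np.arange(1950, 2101)
anomaly = 0.02 * (years - 1950)  # synthetic warming of 0.02 degC per year
print(warming_level_window(years, anomaly, 1.45))
```

A simulation that never reaches the requested level yields no window in this sketch, which is one reason to keep custom warming levels within a realistic range.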

4. **Select location.** You have a choice of either selecting from one of 32 cached locations, _ cached areas, or providing a custom point location of interest. If you are only interested in generating an AMY for a cached location, please go to this notebook instead.

In [None]:
latitude = 32.7136 # float or tuple
longitude = -117.2031  # float or tuple

5. Now we define the selection from the choices you've made above and generate the climate profile. AMY generation time ranges from around 5 minutes for a single grid cell to an hour for the entire state of California.

In [None]:
# define the selection
selection = {
    "variable": variable,
    "resolution": "3 km",
    "warming_level": warming_level,
    "units": units,
    "latitude": latitude,
    "longitude": longitude
}

In [None]:

# generate the climate profile
profile = get_climate_profile(**selection)

In [None]:
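# for testing: generate a profile for a cached station instead
# (this overwrites the `profile` generated in the previous cell)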
profile = get_climate_profile(
    stations=["San Diego Lindbergh Field (KSAN)"],
    warming_level=[2.0],
)

### Visualizing AMY

If you generated AMY for a single warming level, you can visualize your result below. A figure is produced for each of the 8 simulations used in profile generation.

In [None]:
idx = pd.IndexSlice
sims = profile.columns.get_level_values("Simulation").unique().tolist()

for sim in sims:
    # Select columns for the current simulation
    data = profile.loc[:, idx[:, sim]]
    # Assuming the first column level is hour of day and the second is simulation,
    # with one row per day of year
    # We'll plot a heatmap of the data for each simulation
    plt.figure(figsize=(12, 6))
    plt.imshow(data.values.T, aspect="auto", cmap="coolwarm")
    plt.colorbar(label=f"{selection['variable']} ({selection['units']})")
    plt.yticks(range(data.shape[1]), data.columns.get_level_values(0))
    plt.title(
        f"{sim} - {selection['variable']} at {selection['warming_level'][0]}°C Warming"
    )
    plt.xlabel("Day of Year")
    plt.ylabel("Hour of Day")
    vmin = -max(abs(data.values.max()), abs(data.values.min()))
    vmax = max(abs(data.values.max()), abs(data.values.min()))
    xtick_freq = max(1, data.shape[0] // 12)
    plt.xticks(range(0, data.shape[0], xtick_freq), data.index[::xtick_freq])
    plt.clim(vmin, vmax)
    plt.show()
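The column selection with `pd.IndexSlice` used above can be tried on a small synthetic DataFrame. In this sketch the column level name `Hour`, the simulation names, and the random values are assumptions for illustration; only the `Simulation` level name comes from the code above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a profile: 365 days x (24 hours x 2 simulations)
columns = pd.MultiIndex.from_product(
    [range(1, 25), ["sim_a", "sim_b"]], names=["Hour", "Simulation"]
)
rng = np.random.default_rng(3)
demo = pd.DataFrame(
    rng.normal(60.0, 10.0, size=(365, len(columns))),
    index=pd.RangeIndex(1, 366, name="Day of Year"),
    columns=columns,
)

idx = pd.IndexSlice
# Keep every hour (first level) but only one simulation (second level)
data = demo.loc[:, idx[:, "sim_a"]]
print(data.shape)
```

The selection keeps both column levels, which is why `data.columns.get_level_values(0)` can still be used to label the hour axis.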

If you generated AMY for multiple warming levels, visualize your result here. This produces a figure for each of the 8 simulations used in profile generation, for each warming level you provided.

In [None]:
idx = pd.IndexSlice
sims = profile.columns.get_level_values("Simulation").unique().tolist()
wls = profile.columns.get_level_values("Warming_Level").unique().tolist()

for wl in wls:
    for sim in sims:
        # Select columns for the current warming level and simulation
        data = profile.loc[:, idx[:, wl, sim]]
        # Assuming the column levels are hour of day, warming level, and simulation,
        # with one row per day of year
        # We'll plot a heatmap of the data for each warming level and simulation
        plt.figure(figsize=(12, 6))
        plt.imshow(data.values.T, aspect='auto', cmap='coolwarm')
        plt.colorbar(label=f"{selection['variable']} ({selection['units']})")
        plt.yticks(range(data.shape[1]), data.columns.get_level_values(0))
        plt.title(f"{sim} - {selection['variable']} at {wl}°C Warming")
        plt.xlabel("Day of Year")
        plt.ylabel("Hour of Day")
        vmin = -max(abs(data.values.max()), abs(data.values.min()))
        vmax = max(abs(data.values.max()), abs(data.values.min()))
        xtick_freq = max(1, data.shape[0] // 12)
        plt.xticks(range(0, data.shape[0], xtick_freq), data.index[::xtick_freq])
        plt.clim(vmin, vmax)
        plt.show()

## Typical Meteorological Year (TMY)

<br>The [Typical Meteorological Year](https://nsrdb.nrel.gov/data-sets/tmy) is an hourly dataset used for applications in energy and building systems modeling. Because this represents average rather than extreme conditions, a TMY dataset is not suited for designing systems to meet the worst-case conditions occurring at a location.

The TMY methodology here mirrors that of the Sandia/NREL TMY3 methodology, and uses historic and projected downscaled climate data available through the Cal-Adapt: Analytics Engine catalog. The [TMY3 method](https://www.nrel.gov/docs/fy08osti/43156.pdf) selects a "typical" month based on ten daily variables: max, min, and mean air and dew point temperatures, max and mean wind speed, global irradiance and direct irradiance. As this methodology heavily weights the solar radiation input data, be aware that the final selection of "typical" months may not be typical for other variables. 
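The Sandia approach is generally described as choosing the candidate month whose cumulative distribution is closest to the long-term distribution, measured with the Finkelstein-Schafer (FS) statistic. The sketch below computes an FS score for a single variable on synthetic daily data. The helper names and the data are assumptions for illustration, and the weighting across the ten variables and any later selection steps are omitted, so this is not the climakitae or NREL implementation.

```python
import numpy as np


def empirical_cdf(sample, x):
    """Fraction of values in `sample` that are <= each value in `x`."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(ordered, x, side="right") / ordered.size


def fs_statistic(candidate, long_term):
    """Mean absolute gap between the candidate-month and long-term CDFs,
    evaluated at the candidate month's daily values."""
    candidate = np.asarray(candidate, dtype=float)
    gap = empirical_cdf(candidate, candidate) - empirical_cdf(long_term, candidate)
    return float(np.mean(np.abs(gap)))


rng = np.random.default_rng(2)
# Synthetic data: 30 Januaries x 31 daily mean temperatures
januaries = rng.normal(12.0, 3.0, size=(30, 31))

# Score each January against the pooled 30-year distribution
scores = np.array([fs_statistic(year, januaries.ravel()) for year in januaries])
most_typical = int(scores.argmin())  # index of the year with the smallest FS score
print(most_typical)
```

In the full method, one such score is computed per variable and combined as a weighted sum, which is where the heavy weighting of solar radiation enters.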

**Intended Application**: As a user, I want to <span style="color:#FF0000">**generate a typical meteorological year file**</span> for a location of interest:
- Visualize the TMY dataset across all input variables
- Export the TMY dataset for available models for input into my workflow

**Note**: 
1. For practical generation of a TMY dataset, a user <span style="color:#FF0000">**only needs to provide 2 elements**</span>: the **location**, and **reference time period**. These selections are highlighted below for you. 

**Runtime**: With the default settings, this section takes approximately **50 minutes** to run from start to finish. Modifications to selections may increase the runtime.

### Generating TMY

First, select the location and warming level for which the TMY is generated. TMY considers a set of 10 variables that, unlike for AMY, cannot be changed.

1. **Select location.** You can either select a station from our pre-generated set of 32 weather stations, or define a custom latitude and longitude. Run the block below to list all station name options.

In [None]:
# read in station file of CA HadISD stations
stn_file = read_csv_file("data/hadisd_stations.csv")
# Display station names
list(stn_file["station"])

In [None]:
latitude = 37.9
longitude = -122.06

#stn_name = "Bakersfield Meadows Field (KBFL)"

2. **Select warming level.** The base options are [1.5, 1.75, 2.0, 2.25, 2.5]. If you decide to generate TMY for a warming level outside of that list, keep in mind that the realistic range of warming levels is _. 

In [None]:
warming_level = 2.0

3. We can use the TMY object to set up, run, and output TMY results to file. The first step is to initialize the object with your selected location and warming level. True is the default value for verbose but we have also set it explicitly for demonstration.

In [None]:

tmy = TMY(
    warming_level=warming_level,
    latitude=latitude,
    longitude=longitude,
    # station_name=stn_name,
    verbose=True,
)

4. We can run the entire TMY workflow with a single command, as shown below. This will write 4 TMY files, one for each model.
The runtime for this command can reach up to 30 minutes. Because we set verbose to True, the TMY object will print updates as different parts of the workflow initialize.

In [None]:
tmy.generate_tmy()

5. **(Optional) Export to non-EPW format.** The TMY files are exported in .epw format by default, but they can be saved as .tmy files using the method `export_tmy_data` with the argument `extension="tmy"`, as shown here (uncomment to run):

In [None]:
# tmy.export_tmy_data(extension="tmy")

### Visualizing TMY

In [None]:
idx = pd.IndexSlice
sims = tmy.columns.get_level_values("Simulation").unique().tolist()

In [None]:
for sim in sims:
    # Select columns for the current simulation
    data = profile.loc[:, idx[:, sim]]
    # Assuming the first column level is hour of day and the second is simulation,
    # with one row per day of year
    # We'll plot a heatmap of the data for each simulation
    plt.figure(figsize=(12, 6))
    plt.imshow(data.values.T, aspect="auto", cmap="coolwarm")
    plt.colorbar(label=f"{selection['variable']} ({selection['units']})")
    plt.yticks(range(data.shape[1]), data.columns.get_level_values(0))
    plt.title(
        f"{sim} - {selection['variable']} at {selection['warming_level'][0]}°C Warming"
    )
    plt.xlabel("Day of Year")
    plt.ylabel("Hour of Day")
    vmin = -max(abs(data.values.max()), abs(data.values.min()))
    vmax = max(abs(data.values.max()), abs(data.values.min()))
    xtick_freq = max(1, data.shape[0] // 12)
    plt.xticks(range(0, data.shape[0], xtick_freq), data.index[::xtick_freq])
    plt.clim(vmin, vmax)
    plt.show()