In [1]:
import pandas as pd

from prereise.gather.demanddata.nrel_efs.aggregate_demand import (
    access_non_efs_demand,
    combine_efs_demand,
)
from prereise.gather.demanddata.nrel_efs.get_efs_data import (
    download_demand_data,
    partition_demand_by_sector,
)
from prereise.gather.demanddata.nrel_efs.map_states import (
    decompose_demand_profile_by_state_to_loadzone,
)

## Notebook Overview

This notebook demonstrates the modules used to obtain and clean the National Renewable Energy Laboratory's (NREL's) Electrification Futures Study (EFS) demand data. The example uses the **Reference** electrification scenario and **Slow** technology advancement for the year **2030**.

The NREL EFS demand data can be obtained from [this website](https://data.nrel.gov/submissions/126), with the specific dataset required for this example directly available [here](https://data.nrel.gov/system/files/126/EFSLoadProfile_Reference_Slow.zip). These datasets are generously provided by NREL, which is operated for the U.S. Department of Energy by the Alliance for Sustainable Energy, LLC. Before using these datasets, please read [this disclaimer](https://www.nrel.gov/disclaimer.html).

## Downloading and Extracting EFS Demand Data

NREL EFS demand data is accessed from the website referenced in the prior section. The website contains one .zip file for each combination of the three electrification scenarios (Reference, Medium, and High) and the three technology advancements (Slow, Moderate, and Rapid), for nine files in total. Each .zip file holds a single .csv file with the sectoral demand data for each state in the contiguous U.S. and each of the six tested years (2018, 2020, 2024, 2030, 2040, and 2050). The `download_demand_data` function downloads the .zip file of NREL EFS data and attempts to extract the .csv file. `download_demand_data` can download NREL EFS data for one electrification scenario-technology advancement pair:

In [2]:
download_demand_data(es={"Reference"}, ta={"Slow"}, fpath="")

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!


`download_demand_data` can download multiple electrification scenario-technology advancement pairs:

In [3]:
download_demand_data(es={"High"}, ta={"Moderate", "Rapid"}, fpath="")

EFSLoadProfile_High_Moderate.zip successfully downloaded!
EFSLoadProfile_High_Rapid.zip successfully downloaded!
EFSLoadProfile_High_Moderate.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_High_Moderate.csv successfully extracted!
EFSLoadProfile_High_Rapid.csv successfully extracted!


`download_demand_data` can also download all of the electrification scenario-technology advancement pairs:

In [4]:
download_demand_data(es={"All"}, ta={"All"}, fpath="")

EFSLoadProfile_Reference_Moderate.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Rapid.zip successfully downloaded!
EFSLoadProfile_Medium_Moderate.zip successfully downloaded!
EFSLoadProfile_Medium_Slow.zip successfully downloaded!
EFSLoadProfile_Medium_Rapid.zip successfully downloaded!
EFSLoadProfile_High_Moderate.zip successfully downloaded!
EFSLoadProfile_High_Slow.zip successfully downloaded!
EFSLoadProfile_High_Rapid.zip successfully downloaded!
EFSLoadProfile_Reference_Moderate.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Moderate.csv successfully extracted!
EFSLoadProfile_Reference_Slow.csv successfully extracted!
EFSLoadProfile_Reference_Rapid.csv successfully extracted!
EFSLoadProfile_Medium_Moderate.csv successfully extracted!
EFSLoadProfile_Medium_Slow.csv successfully extracted!
EFSLoadProfile_Medi
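
The archive names in the download messages follow a regular pattern, so the set of files implied by a given `es` and `ta` can be listed ahead of time. The snippet below is only an illustrative sketch: `expected_zip_names` is a hypothetical helper, not part of `prereise`, and it assumes that `{"All"}` simply stands for every scenario or advancement named in this notebook.

```python
from itertools import product

# Scenario and advancement names as listed earlier in this notebook.
SCENARIOS = ("Reference", "Medium", "High")
ADVANCEMENTS = ("Slow", "Moderate", "Rapid")


def expected_zip_names(es, ta):
    """Return the archive names implied by sets of scenarios and advancements.

    Hypothetical helper: it mirrors the naming seen in the download messages
    and treats {"All"} as shorthand for every available option.
    """
    scenarios = SCENARIOS if es == {"All"} else [s for s in SCENARIOS if s in es]
    advancements = (
        ADVANCEMENTS if ta == {"All"} else [t for t in ADVANCEMENTS if t in ta]
    )
    return [f"EFSLoadProfile_{s}_{t}.zip" for s, t in product(scenarios, advancements)]


print(expected_zip_names({"High"}, {"Moderate", "Rapid"}))
print(len(expected_zip_names({"All"}, {"All"})))
```

The order returned here is fixed by the tuples above; the order of the messages printed by `download_demand_data` may differ.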

The .zip files were created with a compression format (Deflate64) that is not supported by Python's `zipfile` module. Automated extraction of the .csv files therefore relies on Command Line- or Terminal-level tools. On macOS and Linux, the Terminal can extract the .csv file. On Windows, the Command Line extraction tools do not support Deflate64. [7-Zip](https://www.7-zip.org/), the popular file archiver tool, can extract the .csv file; by specifying the file path of 7-Zip when calling `download_demand_data`, Windows users can also use the automated .csv file extraction. If the automated extraction techniques do not work (e.g., a Windows user does not have 7-Zip), the .zip file is still downloaded and can be extracted manually (e.g., using the extraction tool built into Windows' File Explorer).
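
Although `zipfile` cannot decompress Deflate64 members, it can still read an archive's directory, which makes it possible to check in advance whether an external tool will be needed. The following is a minimal sketch of such a check; `members_needing_external_tool` is a hypothetical helper and is not how `download_demand_data` is implemented.

```python
import zipfile

# Compression methods that Python's zipfile module is able to decompress.
SUPPORTED_METHODS = {
    zipfile.ZIP_STORED,
    zipfile.ZIP_DEFLATED,
    zipfile.ZIP_BZIP2,
    zipfile.ZIP_LZMA,
}


def members_needing_external_tool(zip_path):
    """List archive members whose compression method zipfile cannot decode."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.compress_type not in SUPPORTED_METHODS
        ]
```

An empty list means `zipfile` should be able to extract every member itself. The ZIP specification assigns method 9 to Deflate64, so members of the EFS archives would be expected to appear in the returned list.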

## Splitting the EFS Demand Data by Sector and Year

The EFS demand data for a given electrification scenario and technology advancement is provided for each sector and each year. Although the sectoral demand is eventually aggregated for a given year (see next subsection), splitting the demand by sector can be useful, especially for users who only want a subset of the EFS sectoral demand (perhaps to pair with their own sectoral demand data). The `partition_demand_by_sector` function filters EFS demand data for a specified year and, for each time step, combines the subsector-level data within each state and sector. `partition_demand_by_sector` either reads an already extracted .csv file or calls `download_demand_data` to obtain the demand data; the extracted .csv file is searched for according to the provided electrification scenario (`es`) and technology advancement (`ta`) in the file path (`fpath`) provided by the user:

In [5]:
sect_dem = partition_demand_by_sector(es="Reference", ta="Slow", year=2030, fpath="")

print(sect_dem.keys())
print(sect_dem["Transportation"])

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
dict_keys(['Residential', 'Transportation', 'Industrial', 'Commercial'])
State                        AL         AR          AZ          CA  \
Local Time                                                           
2016-01-01 00:00:00   68.515074  38.191552  106.211180  309.266106   
2016-01-01 01:00:00   48.852300  21.686345   91.234131  179.870388   
2016-01-01 02:00:00   42.393089  19.647210   58.291866  159.485859   
2016-01-01 03:00:00   30.835974  16.668032   52.315227  131.571452   
2016-01-01 04:00:00   29.053236  11.469377   42.387554   95.092870   
...                         ...        ...         ...         ...   
2016-12-31 19:00:00  199.699625  94.357827  344.344738  740.103762   
2016-12-31 20:00:0

`partition_demand_by_sector` can also retain only a subset of sectoral demand data, if desired:

In [6]:
sect_dem = partition_demand_by_sector(
    es="Reference", 
    ta="Slow", 
    year=2030, 
    sect={"Commercial", "Industrial"}, 
    fpath="",
)

print(sect_dem.keys())

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
dict_keys(['Commercial', 'Industrial'])


Note that `partition_demand_by_sector` calls the `account_for_leap_year` function to create an extra day's worth of data (the EFS data sets have 8760-hour profiles whereas Breakthrough Energy uses 8784-hour profiles). To account for the extra day, demand data from January 2nd is copied and added to the end of the data set to create demand for December 31st; January 2nd is chosen since it occurs on the same day of the week as December 31st (in a leap year), so as to preserve any weekly trends that may be present.
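
The leap-year adjustment can be illustrated with plain Python lists. This is a sketch of the idea described above rather than the code of `account_for_leap_year`; `extend_to_leap_year` is a hypothetical name and the values are placeholders, not EFS data.

```python
from datetime import date

HOURS_PER_DAY = 24


def extend_to_leap_year(profile):
    """Append a copy of January 2nd to an 8760-hour profile (8784 hours total)."""
    if len(profile) != 365 * HOURS_PER_DAY:
        raise ValueError("expected an 8760-hour profile")
    # January 2nd occupies hours 24 through 47 of the year.
    january_2nd = profile[HOURS_PER_DAY : 2 * HOURS_PER_DAY]
    return list(profile) + list(january_2nd)


# January 2nd and December 31st share a weekday in the leap year 2016.
same_weekday = date(2016, 1, 2).weekday() == date(2016, 12, 31).weekday()

hourly = list(range(8760))  # placeholder values, not EFS data
extended = extend_to_leap_year(hourly)
print(len(extended), same_weekday)  # prints "8784 True"
```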

`partition_demand_by_sector` has the ability to save the resulting sectoral demand DataFrames, though it is not enabled by default. To save the DataFrames, set `save=True` when calling `partition_demand_by_sector`.

## Aggregating the Sectoral Demand Data

For use in the Breakthrough Energy production cost model, all sectoral demand must be aggregated within each location for each hour (i.e., only one demand data point per location per hour). The `combine_efs_demand` function aggregates sectoral demand data, both from NREL's EFS and from studies not related to the EFS. `combine_efs_demand` is intended to receive input from `partition_demand_by_sector` for EFS-related sectoral demand and from `access_non_efs_demand` for non-EFS-related demand; the `access_non_efs_demand` function loads sectoral demand that is not associated with the EFS and ensures that its formatting is suitable for aggregation with EFS sectoral demand. The following is an example of aggregating only the EFS sectoral demand:

In [7]:
sect_dem = partition_demand_by_sector(es="Reference", ta="Slow", year=2030, fpath="")
agg_dem = combine_efs_demand(efs_dem=sect_dem)

print(agg_dem)

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
                               AL           AR           AZ            CA  \
Local Time                                                                  
2016-01-01 00:00:00   8583.711287  4282.634701  8240.605207  26266.669236   
2016-01-01 01:00:00   8334.751328  4200.161507  8243.912717  25045.559218   
2016-01-01 02:00:00   8183.017211  4118.017085  8134.597876  24145.344159   
2016-01-01 03:00:00   8154.666506  4063.997126  8120.779652  23543.010852   
2016-01-01 04:00:00   8180.342767  4079.421848  8173.049191  23356.243470   
...                           ...          ...          ...           ...   
2016-12-31 19:00:00  13728.710656  6534.027144  9876.054819  34362.623982   
2016-12-31 20:00:00  13447.0

As described above, `combine_efs_demand` can also aggregate sectoral demand that is not related to the EFS. The following example shows how `access_non_efs_demand` is integrated with `combine_efs_demand` for this purpose. In this example, some EFS sectoral demand data is saved locally to stand in for non-EFS sectoral demand. Note that the most efficient way to aggregate all EFS sectoral demand is still the method presented in the previous example. Additionally, any non-EFS sectoral demand data that is included must adhere to the same formatting guidelines: 8784 hours by 48 states.

In [8]:
# Store two demand data sets locally to mimic sectoral demand external to NREL's EFS
sect_dem = partition_demand_by_sector(
    es="Reference", 
    ta="Slow", 
    year=2030, 
    sect={"Commercial", "Industrial"}, 
    fpath="", 
    save=True,
)

# Aggregate the local sectoral demand data with the sectoral demand that still needs to be accessed
sect_dem = partition_demand_by_sector(
    es="Reference", 
    ta="Slow", 
    year=2030, 
    sect={"Residential", "Transportation"}, 
    fpath="",
)
ext_sect_dem = access_non_efs_demand(
    [
        "Commercial_Demand_Reference_Slow_2030.csv",
        "Industrial_Demand_Reference_Slow_2030.csv",
    ]
)
agg_dem = combine_efs_demand(efs_dem=sect_dem, non_efs_dem=ext_sect_dem)

print(agg_dem)

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
                               AL           AR           AZ            CA  \
Local Time                                                                  
2016-01-01 00:00:00   8583.711287  4282.634701  8240.605207  26266.669236   
2016-01-01 01:00:00   8334.751328  4200.161507  8243.912717  25045.559218   
2016-01-01 02:00:00   8183.017211  4118.017085  8134.597876  24145.344159   
2016-01-01 03:00:00   8154.666506  4063.997126  8120.779652  23543.010852   
2016-01-01 04:00:00   8180.342767  4079.421848  8173.049191  23356.243470   
...                           ...          ...          ...           ...   
2016-12-31 19:00:00  13728.710656  6534.027144  9876.054819  34362.623982   
2016-12-31 20:00:00  13447.0

`combine_efs_demand` has the ability to save the resulting aggregated demand DataFrame, though it is not enabled by default. To save the DataFrame, set `save` equal to a valid file path and file name when calling `combine_efs_demand`.

Note that it is the user's responsibility to account for all sectors when building the aggregated demand profiles (i.e., users must ensure that no sector is excluded or double-counted).
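
One way to guard against excluded or double-counted sectors is to compare sector names before summing. The sketch below uses nested dictionaries in place of DataFrames; `combine_sector_profiles` and `missing_sectors` are hypothetical helpers written for illustration and do not reflect how `combine_efs_demand` is implemented.

```python
def combine_sector_profiles(efs, non_efs=None):
    """Sum sectoral profiles hour by hour, refusing double-counted sectors.

    Each argument maps sector name -> {state: [hourly demand, ...]}.
    """
    non_efs = non_efs or {}
    overlap = set(efs) & set(non_efs)
    if overlap:
        raise ValueError(f"sectors counted twice: {sorted(overlap)}")

    total = {}
    for sector_profiles in list(efs.values()) + list(non_efs.values()):
        for state, hourly in sector_profiles.items():
            if state not in total:
                total[state] = list(hourly)
                continue
            if len(hourly) != len(total[state]):
                raise ValueError(f"profile length mismatch for {state}")
            total[state] = [a + b for a, b in zip(total[state], hourly)]
    return total


def missing_sectors(efs, non_efs, expected):
    """Return the expected sectors that appear in neither input."""
    return set(expected) - set(efs) - set(non_efs)


# Tiny two-hour placeholder profiles, not EFS data.
efs = {"Residential": {"AL": [1.0, 2.0]}, "Transportation": {"AL": [0.5, 0.5]}}
external = {"Commercial": {"AL": [3.0, 3.0]}}
sectors = ["Residential", "Transportation", "Industrial", "Commercial"]

print(combine_sector_profiles(efs, external))
print(missing_sectors(efs, external, sectors))
```

In this toy example the second call reports `Industrial` as missing, which is the kind of omission the note above warns about.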

## Mapping the State Demand to the Appropriate Load Zones

Breakthrough Energy's production cost model requires demand to be specified for each load zone. The production cost model also operates in UTC time, meaning that demand provided according to each state's local time must be converted. The `decompose_demand_profile_by_state_to_loadzone` function takes a DataFrame of 8784-hour state-level demand and produces a DataFrame of 8784-hour load zone-level demand. `decompose_demand_profile_by_state_to_loadzone` calls the `shift_local_time_by_loadzone_to_utc` function, which shifts the demand data (provided with respect to each state's local time) to UTC time. Because of this shift, the last few hours' worth of demand are discarded and the first few hours' worth of demand are empty. To fill the first few hours (i.e., the first five hours for EST load zones and the first eight hours for PST load zones), the corresponding demand from the morning of December 30th is copied into the empty demand slots. Similar to the reasoning used in `account_for_leap_year`, December 30th was chosen since it occurs on the same day of the week as January 1st (in a leap year).
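
To make the shift concrete, here is a small sketch on a single 8784-hour list. It is not the implementation of `shift_local_time_by_loadzone_to_utc`: `shift_local_to_utc` is a hypothetical name, and the fill rule (reusing the same UTC clock hours from December 30th of the shifted series) is one plausible reading of the description above.

```python
HOURS_PER_DAY = 24
DEC_30_START = 364 * HOURS_PER_DAY  # first hour of December 30th in a leap year


def shift_local_to_utc(local_profile, utc_offset_hours):
    """Shift an 8784-hour local-time profile onto a UTC index.

    utc_offset_hours is how far local time lags UTC (5 for EST, 8 for PST).
    """
    n = len(local_profile)
    # UTC hour t holds local hour t - offset, so the last `offset` local
    # hours fall off the end and the first `offset` UTC hours start empty.
    shifted = [None] * utc_offset_hours + list(local_profile[: n - utc_offset_hours])
    # Assumed fill rule: reuse the same clock hours from December 30th.
    for t in range(utc_offset_hours):
        shifted[t] = shifted[DEC_30_START + t]
    return shifted


local = list(range(8784))  # placeholder values, not EFS data
utc = shift_local_to_utc(local, 5)  # an EST load zone
print(len(utc), utc[5], utc[0])  # prints "8784 0 8731"
```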

In [9]:
sect_dem = partition_demand_by_sector(es="Reference", ta="Slow", year=2030, fpath="")
agg_dem = combine_efs_demand(efs_dem=sect_dem)
agg_dem_lz = decompose_demand_profile_by_state_to_loadzone(
    df=agg_dem, profile_type="demand"
)

print(agg_dem_lz)

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
                             1            2           3            4    \
UTC Time                                                                 
2016-01-01 00:00:00  1530.275833  1473.987095  775.688480  7331.368931   
2016-01-01 01:00:00  1480.090997  1423.889929  750.744648  7147.848057   
2016-01-01 02:00:00  1405.286691  1365.685460  719.793818  6899.832339   
2016-01-01 03:00:00  1329.210999  1292.486049  693.919776  6560.864010   
2016-01-01 04:00:00  1238.024822  1181.417796  634.843632  6121.438662   
...                          ...          ...         ... 

Note that the above is the most direct path to acquiring EFS demand data for a particular electrification scenario, technology advancement, and year and formatting it for use in the Breakthrough Energy production cost model. `decompose_demand_profile_by_state_to_loadzone` can also be used to create a profile for a user-specified combination of interconnections and states.

In [10]:
sect_dem = partition_demand_by_sector(es="Reference", ta="Slow", year=2030, fpath="")
agg_dem = combine_efs_demand(efs_dem=sect_dem)
agg_dem_lz = decompose_demand_profile_by_state_to_loadzone(
    df=agg_dem, profile_type="demand", regions=["Texas"]
)

print(agg_dem_lz)

EFSLoadProfile_Reference_Slow.zip successfully downloaded!
EFSLoadProfile_Reference_Slow.zip is compressed using a method that is not supported by the zipfile module.
Trying other extraction methods supported by your OS.
EFSLoadProfile_Reference_Slow.csv successfully extracted!
                            301          302          303          304  \
UTC Time                                                                 
2016-01-01 00:00:00  960.883833  1083.575357  1232.121444  4964.524803   
2016-01-01 01:00:00  975.766654  1100.358507  1251.205377  5041.418731   
2016-01-01 02:00:00  982.066748  1107.463036  1259.283858  5073.968946   
2016-01-01 03:00:00  976.188075  1100.833738  1251.745757  5043.596056   
2016-01-01 04:00:00  952.362578  1073.966055  1221.194815  4920.498686   
...                         ...          ...          ...          ...   
2016-12-31 19:00:00  788.366462   889.029912  1010.905990  4073.192530   
2016-12-31 20:00:00  766.099260   863.919498   982.3532

`decompose_demand_profile_by_state_to_loadzone` has the ability to save the resulting aggregated demand DataFrame, though it is not enabled by default. To save the DataFrame, set `save` equal to a valid file path and file name when calling `decompose_demand_profile_by_state_to_loadzone`.