# ATB Electricity 2023

In this notebook, we demonstrate how to access and visualize the 2023 Annual Technology Baseline (ATB) electricity data from AWS S3 storage.

The following Python packages are required for this notebook:

* jupyter
* pyarrow
* pandas
* numpy
* matplotlib
* ipywidgets (used by the interactive sections)

## 1 Introduction

The ATB Electricity data sets from 2019 through 2023 are managed by the [Open Energy Data Initiative](https://openei.org/wiki/Open_Energy_Data_Initiative_(OEDI)) and stored in the [OEDI Data Lake](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=ATB%2Felectricity%2F) on AWS. For more information about the ATB, please refer to our website, [https://atb.nrel.gov/](https://atb.nrel.gov/).

### 1.1 Import Data

The first step in this notebook is to import the 2023 data into a Pandas dataframe.

In [None]:
import pandas as pd

url = 'https://oedi-data-lake.s3.amazonaws.com/ATB/electricity/parquet/2023/ATBe.parquet'
raw_data = pd.read_parquet(url)
raw_data = raw_data.astype(
    dtype = {
        'core_metric_key': 'string',
        'core_metric_parameter': 'string',
        'core_metric_case': 'string',
        'crpyears': 'string',
        'technology': 'string',
        'technology_alias': 'string',
        'techdetail': 'string',
        'techdetail2': 'string',
        'resourcedetail': 'string',
        'display_name': 'string',
        'scale': 'string',
        'maturity': 'string',
        'scenario': 'string',
        'units': 'string'
    }
)
print('Columns: \n')
print(raw_data.dtypes)
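
The `astype` call above converts the descriptive columns to the pandas `string` dtype. As a minimal offline sketch of the same conversion, the hypothetical frame below (its rows and the name `ExampleTech` are illustrative and not taken from the ATB data) shows that only the listed columns change dtype:

```python
import pandas as pd

# Hypothetical stand-in for raw_data; the values are illustrative only.
demo = pd.DataFrame({
    'technology': ['OffShoreWind', 'ExampleTech'],
    'crpyears': ['30', '20'],
    'value': [0.1, 0.2],
})

# Convert only the descriptive columns; the numeric column is left untouched.
demo = demo.astype({'technology': 'string', 'crpyears': 'string'})
print(demo.dtypes)
```

Passing a dictionary to `astype` keeps the conversion explicit per column, which is why `value` stays numeric here.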

### 1.2 Explore Data

Pandas dataframes provide convenient methods for exploring and filtering data. Here we use `unique()` to list the distinct values in a column.

In [None]:
from pprint import pprint

print('Here are all of the parameters present in this dataset:\n')
pprint(list(raw_data['core_metric_parameter'].unique()))

print('\nHere are all of the technologies:\n')
pprint(list(raw_data['technology'].unique()))
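
When several columns need the same treatment, the `unique()` calls above can be wrapped in a small helper. This is only a sketch: `unique_values` is a hypothetical helper name, and the demo frame uses made-up rows rather than the ATB data.

```python
import pandas as pd

def unique_values(frame, columns):
    """Return {column: sorted list of distinct non-missing values}."""
    return {col: sorted(frame[col].dropna().unique()) for col in columns}

# Hypothetical rows standing in for raw_data.
demo = pd.DataFrame({
    'technology': ['OffShoreWind', 'OffShoreWind', 'ExampleTech'],
    'scenario': ['Moderate', 'Advanced', 'Moderate'],
})

summary = unique_values(demo, ['technology', 'scenario'])
print(summary)
```

With the real data, the same helper could be called as `unique_values(raw_data, ['core_metric_parameter', 'technology'])`.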

We can drill down and see all of the available details for a given technology.

In [None]:
tech = 'OffShoreWind'   # Plug in any tech from the list above.
raw_data.loc[raw_data['technology'] == tech, ['technology', 'techdetail', 'techdetail2', 'resourcedetail', 'scale', 'maturity']].drop_duplicates()

### 1.3 Simple plot

Next, let's make a simple plot of the projected levelized cost of energy (LCOE) for fixed-bottom offshore wind (Class 1). In this time series data, 'core_metric_variable' holds the year.

In [None]:
import matplotlib.pyplot as plt

df = raw_data[
    (raw_data['core_metric_parameter'] == 'LCOE') &
    (raw_data['core_metric_case'] == 'Market') &
    (raw_data['crpyears'] == '30') &
    (raw_data['technology'] == 'OffShoreWind') &
    (raw_data['techdetail'] == 'Class1') &
    (raw_data['techdetail2'] == 'Fixed-Bottom')
].sort_values(by='core_metric_variable')
units = df.units.iloc[0]
scenarios = ['Conservative', 'Moderate', 'Advanced']

fig = plt.figure(figsize=(8,4))
for scenario in scenarios:
    plt.plot(
        df[df['scenario'] == scenario]['core_metric_variable'],
        df[df['scenario'] == scenario]['value'],
        figure = fig,
        marker='.'
        )
plt.title('Levelized Cost of Energy for Offshore Wind', size=10)
plt.xlabel('Year', size=10)
plt.ylabel(units, size=10)
plt.ylim(bottom=0)
plt.legend(scenarios)
plt.show()
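
An alternative to filtering by scenario inside the loop is to reshape the filtered rows into a wide table with one column per scenario. The sketch below assumes each (year, scenario) pair occurs only once after filtering, since `pivot` raises an error on duplicates; the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format rows shaped like the filtered frame above.
long_df = pd.DataFrame({
    'core_metric_variable': [2030, 2030, 2040, 2040],
    'scenario': ['Moderate', 'Advanced', 'Moderate', 'Advanced'],
    'value': [80.0, 70.0, 75.0, 60.0],
})

# One row per year, one column per scenario.
wide_df = long_df.pivot(index='core_metric_variable', columns='scenario', values='value')
print(wide_df)
```

Calling `wide_df.plot(marker='.')` on such a table would draw one line per scenario and build the legend from the column names.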

## 2 Financial Data

### 2.1 Filtering

Let's capture all the financial data. Note that 'core_metric_variable' is actually the year in this time series data, and 'value' is the value of the 'core_metric_parameter' given as a decimal rate.

In [None]:
# Select columns of interest
df = raw_data[
    [
        'core_metric_parameter',
        'core_metric_case',
        'crpyears',
        'technology',
        'scenario',
        'core_metric_variable',
        'value'
    ]
]

# Select rows that correspond to the financial parameters
df = df[
    df['core_metric_parameter'].isin(
        [
            'Calculated Rate of Return on Equity Real',
            'Calculated Interest Rate Real',
            'Debt Fraction',
            'FCR',
            'Interest Rate Nominal',
            'Rate of Return on Equity Nominal',
            'WACC Nominal',
            'WACC Real'
        ]
    )
]

# Additional filtering and sorting
df = df[
    (df.crpyears.isin(['30','*'])) &
    (df.scenario == 'Moderate')
    ]
df = df.sort_values(by='core_metric_variable')

df
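
Because 'value' holds decimal rates, it can help to add a percentage-formatted column when reading the table. This is a minimal sketch on made-up numbers; the column name `value_pct` is an assumption introduced for this example only.

```python
import pandas as pd

# Hypothetical financial rows; the rates are illustrative only.
rates = pd.DataFrame({
    'core_metric_parameter': ['WACC Real', 'Debt Fraction'],
    'value': [0.072, 0.5],
})

# Format each decimal rate as a percentage string with one decimal place.
rates['value_pct'] = rates['value'].map('{:.1%}'.format)
print(rates)
```

The formatted column is for display only; keep the numeric 'value' column for plotting and calculations.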

### 2.2 Interactive Visualization

Now that we have a dataframe of the financials, we can build interactive plots using ipywidgets. We also add functionality for the user to export their data selection as a .csv.

In [None]:
from ipywidgets import Dropdown, widgets, Button, Layout, SelectMultiple, Output, Text, Box
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
from math import ceil

# Create output widget
out = Output(layout=widgets.Layout(height='300px'))

def update_plot(w):
    with out:
        # Without clear_output(), figures get appended below each other inside
        # the output widget. wait=True delays clearing until new output is
        # available, which avoids flicker.
        out.clear_output(wait=True)
        
        # Filter the df based on the user's choices in the dropdowns
        df_f = df[
            (df['core_metric_parameter'] == core_metric_parameter_W.value) &
            (df['technology'].isin(technology_W.value))
        ]
        df_market = df_f[df_f['core_metric_case'] == 'Market']
        df_RandD = df_f[df_f['core_metric_case'] == 'R&D']
        
        # Plot the data
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize = (8,3), sharey = True)
        fig.suptitle(core_metric_parameter_W.value)
        for tech in technology_W.value:
            df_market[df_market['technology'] == tech].plot(x = 'core_metric_variable', y = 'value', ax = ax1, xlabel = 'year', title = 'Market')
            df_RandD[df_RandD['technology'] == tech].plot(x = 'core_metric_variable', y = 'value', ax = ax2, xlabel = 'year', title = 'R & D')
        ax1.get_legend().remove()
        ax1.set_ylim(ymin = 0, ymax = ymax[core_metric_parameter_W.value])
        ax1.yaxis.set_major_formatter(mtick.PercentFormatter(1.0)) # Converts the decimal rate value to be displayed as a percentage
        ax2.legend(technology_W.value, loc='best', bbox_to_anchor=(1, 0.5))
        fig.tight_layout()
        plt.show()

# Get unique values for user controls
core_metric_parameter = df['core_metric_parameter'].unique()
technology_options = df['technology'].unique()

# Find a reasonable maximum value for the y-axis for each core_metric_parameter
ymax = {}
for metric in core_metric_parameter:
    ymax[metric] = ceil((df[df['core_metric_parameter'] == metric]['value'].max() + .01) * 10)/10

# Make widgets
core_metric_parameter_W = Dropdown(options = core_metric_parameter, description='Parameter:')
technology_W = SelectMultiple(options = technology_options, value = [technology_options[0]],description='Technology:')

update_plot([])
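# names='value' limits the callback to selection changes rather than every trait change
core_metric_parameter_W.observe(update_plot, names='value')
technology_W.observe(update_plot, names='value')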
box_layout = Layout()
filter_box = Box(children=[core_metric_parameter_W, technology_W], layout=box_layout)
display(filter_box, out)

# Set up widgets to facilitate exporting the data to a file
# Text widget for export filename
filename_W = Text(
    value='ATB_filtered_financials.csv',
    description='File Name:',
    disabled=False
)

# Make and style a Button widget to trigger data export
button = Button(description="Export Selection to .csv", layout = Layout(width = '200px', height = '50px'))
button.style.button_color = 'green'

# Set button action that will export the ATB dataset, filtered as per user's dropdown choices, once they click the button
def on_button_clicked(b):
    with out:
        # Filter the df based on the user's choices in the dropdowns
        df_f = df[
            (df['core_metric_parameter'] == core_metric_parameter_W.value) &
            (df['technology'].isin(technology_W.value))
        ]
        df_f.to_csv(filename_W.value)
        print("CSV Export Successful!")

# Display button and link to action
export_box = Box(children=[button, filename_W], layout=box_layout)
display(export_box)
button.on_click(on_button_clicked)
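
The y-axis limit used in the cell above adds 0.01 of headroom to the largest value and then rounds up to the next multiple of 0.1. The sketch below isolates that arithmetic; `axis_max` is a hypothetical helper name introduced only for this example.

```python
from math import ceil

def axis_max(largest_value):
    """Add 0.01 of headroom, then round up to the next multiple of 0.1."""
    return ceil((largest_value + 0.01) * 10) / 10

# Try a few illustrative maxima.
for largest in (0.087, 0.1, 0.25):
    print(largest, '->', axis_max(largest))
```

A largest value of 0.087 gives an upper limit of 0.1, while a largest value of exactly 0.1 gives 0.2 because the headroom pushes it past the tenth.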

## 3 Core Metrics

### 3.1 Run Query

Next, let's make a new dataframe by querying the core metrics.

In [None]:
df_CM = raw_data[[
    'core_metric_parameter',
    'core_metric_case',
    'crpyears',
    'technology',
    'techdetail',
    'techdetail2',
    'scenario',
    'units',
    'core_metric_variable',
    'value'
]]

df_CM = df_CM[df_CM['core_metric_parameter'].isin([
    'LCOE',
    'CAPEX',
    'CF',
    'Fixed O&M',
    'Variable O&M'
])]

df_CM = df_CM.sort_values(by=['technology', 'techdetail', 'core_metric_variable'])
df_CM['techdetail_combined'] = df_CM['techdetail'] + ' - ' + df_CM['techdetail2']

df_CM
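
One property of the concatenation used for 'techdetail_combined' is worth knowing: with the `string` dtype, a missing part makes the whole combined label missing. Whether the 2023 data has missing 'techdetail2' values is not established here, so the sketch below uses made-up values and shows `str.cat` with `na_rep` as one possible alternative; the placeholder `'n/a'` is an arbitrary choice.

```python
import pandas as pd

# Hypothetical detail columns; the second row has no techdetail2.
detail = pd.Series(['Class1', 'Class2'], dtype='string')
detail2 = pd.Series(['Fixed-Bottom', pd.NA], dtype='string')

# Plain concatenation: a missing part makes the whole label missing.
plain = detail + ' - ' + detail2

# str.cat with na_rep substitutes a placeholder for missing parts instead.
labelled = detail.str.cat(detail2, sep=' - ', na_rep='n/a')

print(plain.isna().tolist())
print(labelled.tolist())
```

Rows whose combined label is missing cannot be matched by an equality filter, so a placeholder keeps them selectable.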

### 3.2 Interactive Visualization

Just like before, once we have a dataframe, we can use ipywidgets to build visualizations and export functionality.

In [None]:
from ipywidgets import Dropdown, widgets, Button, Layout, Output, Text, Box
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
from math import ceil

# Create output widget
out = Output(layout=widgets.Layout(height='600px'))

def update_plot(w):
    with out:

        out.clear_output(wait=True)

        # Filter the df based on the user's choices in the dropdowns
        df_CM_f = df_CM[
            (df_CM['technology'] == technology_W.value) &
            (df_CM['core_metric_case'] == case_W.value) &
            (df_CM['crpyears'] == crpyears_W.value) &
            (df_CM['techdetail_combined'] == tech_detail_W.value)
        ]

        # Plot the data in a grid
        fig, ((ax1, ax2, ax3),(ax4, ax5, ax6)) = plt.subplots(2, 3, figsize = (10,6))
        axes = iter([ax1, ax2, ax3, ax4, ax5])
        ax6.set_axis_off()
        for cmp in core_metric_parameters:
            ax = next(axes)
            ax.set_ylim(ymin = 0, ymax = ymax[technology_W.value][cmp])
            for scenario in scenarios:
                df_CM_f[(df_CM_f['core_metric_parameter'] == cmp) & (df_CM_f['scenario'] == scenario)].plot(
                    x = 'core_metric_variable',
                    y = 'value', ax = ax,
                    xlabel = 'Year',
                    ylabel = ylabels[cmp],
                    title = cmp)
            ax.get_legend().remove()
        
        ax5.legend(scenarios, loc='center left', bbox_to_anchor=(1, 0.5))
        fig.tight_layout()
        plt.show()

# Get unique values for user controls
core_metric_parameters = df_CM.core_metric_parameter.unique()
technology_options = df_CM.technology.unique()
case_options = df_CM.core_metric_case.unique()

# List of scenarios for filtering and legend
scenarios = ['Conservative', 'Moderate', 'Advanced']

# Create dictionary of max values for the y-scales in the plots based on technology and core_metric_parameter
ymax = {}
for tech in technology_options:
    ymax[tech] = {}
    for metric in core_metric_parameters:
        x = df_CM[(df_CM['core_metric_parameter'] == metric) & (df_CM['technology'] == tech)].value.max()
        if np.isnan(x) or x == 0:
            ymax[tech][metric] = 1
        else:
            ymax[tech][metric] = x * 1.1

# Create dictionary of units to label the y-axes
ylabels = {}
for metric in core_metric_parameters:
    units = df_CM[df_CM['core_metric_parameter'] == metric].iloc[0]['units']
    if isinstance(units, str):
        ylabels[metric] = units
    else:
        ylabels[metric] = ''

# The options for crpyears and tech_detail depend on which technology is selected. We need to make these
# widgets update based on the technology_W widget.
crpyears_dict = {}
tech_detail_dict = {}
for tech in technology_options: 
    crpyears_dict[tech] = list(df_CM[df_CM['technology'] == tech]['crpyears'].unique())
    tech_details = list(df_CM[df_CM['technology'] == tech]['techdetail_combined'].unique())
    tech_detail_dict[tech] = tech_details

def update_W_options(*args):
    tech = technology_W.value
    crpyears_W.options = crpyears_dict[tech]
    tech_detail_W.options = tech_detail_dict[tech]

# Make widgets
technology_W = Dropdown(options = technology_options, description='Technology')
tech_detail_W = Dropdown(options = tech_detail_dict[technology_W.value], description='Tech Detail')
case_W = Dropdown(options = case_options, description='Case')
crpyears_W = Dropdown(options = crpyears_dict[technology_W.value], description='CRP')

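# Update the CRP and tech detail options when the user selects a technology
technology_W.observe(update_W_options, names='value')

# Update plot each time a filter is changed (names='value' limits callbacks to selection changes)
update_plot([])
technology_W.observe(update_plot, names='value')
tech_detail_W.observe(update_plot, names='value')
case_W.observe(update_plot, names='value')
crpyears_W.observe(update_plot, names='value')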

# Display the filter widgets followed by a single copy of the output area
display(technology_W, tech_detail_W, case_W, crpyears_W, out)

# Set up widgets to facilitate exporting the data to a file
# Text widget for export filename
filename_W = Text(
    value='ATB_filtered_core_metrics.csv',
    description='File Name:',
    disabled=False
)

# Make and style a Button widget to trigger data export
button = Button(description="Export Selection to .csv", layout = Layout(width = '200px', height = '50px'))
button.style.button_color = 'green'

# Set button action that will export the ATB dataset, filtered as per user's dropdown choices, once they click the button
def on_button_clicked(b):
    with out:
        # Filter the df based on the user's choices in the dropdowns
        df_out = df_CM[
            (df_CM['technology'] == technology_W.value) &
            (df_CM['techdetail_combined'] == tech_detail_W.value) &
            (df_CM['core_metric_case'] == case_W.value) &
            (df_CM['crpyears'] == crpyears_W.value)
        ]
        df_out.to_csv(filename_W.value)
        print("CSV Export Successful!")

# Display button and link to action (a fresh Layout keeps this cell independent of section 2.2)
export_box = Box(children=[button, filename_W], layout=Layout())
display(export_box)
button.on_click(on_button_clicked)
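
The per-technology option dictionaries built in the cell above can also be derived with a single `groupby`. This is a sketch on made-up rows, not a replacement the notebook depends on; `options_by_tech` and `ExampleTech` are hypothetical names.

```python
import pandas as pd

# Hypothetical rows standing in for df_CM.
demo = pd.DataFrame({
    'technology': ['OffShoreWind', 'OffShoreWind', 'ExampleTech'],
    'crpyears': ['20', '30', '30'],
})

# Map each technology to the list of distinct crpyears values it has.
unique_by_tech = demo.groupby('technology')['crpyears'].unique()
options_by_tech = {tech: list(values) for tech, values in unique_by_tech.items()}
print(options_by_tech)
```

The resulting dictionary has the same shape as `crpyears_dict`, so it could feed the dependent dropdown in the same way.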