# Functionality walk-through

<div style="width: 600px;">
This notebook walks through how to use the functionality of the PPA analysis tool through its Python APIs.
</div>
<br>
<div style="width: 600px;">
While interface.ipynb provides a convenient way for users to get started using the tool, advanced users 
requiring more flexibility or wishing to automate or batch the analysis may find it better to use the
functionality directly through the provided Python functions.
</div>
<br>
<div style="width: 600px;">
This notebook aims to help the user understand how to complete an analysis using the tool's Python
functions by providing a walk-through of how to use the functions in an analysis pipeline, starting
with data loading and continuing all the way through to bill calculation.
</div>

## Table of contents

1. Data loading
    1. Load data
    2. Generation data
    3. Price data
    4. Emissions data
2. Firming costs
3. LCOE calculations
4. Contract optimisation
5. Battery operation
6. Load flexibility
7. Bill calculation

In [1]:
# Firstly we'll just turn off NEMOSIS warnings to keep things clean

import logging

logging.getLogger("nemosis").setLevel(logging.ERROR)


## 1. Data loading

<div style="width: 600px;">
The sections below (Load data, Generation data, Price data, and Emissions data) 
show how the tool's inbuilt functionality can be used to prepare datasets for
PPA analysis and modelling. The outputs of the examples below also 
serve as a guide to the data formats the tool requires, for users
wishing to prepare their own datasets.
</div>

<br>

<div style="width: 600px;">
Firstly, we define a few common parameters we'll use for the various data preparation steps and import the tool's advanced settings, which define where various types of data are stored.
</div>

In [2]:
from datetime import datetime

from ppa_analysis import advanced_settings, import_data

start_date = '2020-01-01 00:00:00'
end_date = '2020-02-01 00:00:00'

# Times need to be given to PPA analysis functions as datetime objects.
start_date = datetime.strptime(start_date, '%Y-%m-%d %H:%M:%S')
end_date = datetime.strptime(end_date, '%Y-%m-%d %H:%M:%S')

### 1.A Load data

<div style="width: 600px;">
Example load data that ships with the tool can be fetched using get_load_data, 
as per the example below.
</div>
<br>
<div style="width: 600px;">
If the user wishes to use their own data, they can store it in the same format
as the example data and use the get_load_data function. The expected format is a 
CSV with two columns, one specifying the datetime and the 
other the load volume in MWh. The datetime column can be in any of the formats that pandas 
parses automatically. Alternatively, the user can create a DataFrame following the output
format of get_load_data by their own methods.
</div>


In [3]:
load_data, first_load_timestamp, last_load_timestamp = import_data.get_load_data(
    load_file_name='data_caches/c_and_i_customer_loads/(18) Hospital NQ.csv',
    datetime_col_name='TS',
    load_col_name='Load',
    day_first=False
)

load_data.head()

Some missing data found. Filled with zeros.



DateTime,Load
2019-01-02 00:00:00,0.659559
2019-01-02 01:00:00,1.430449
2019-01-02 02:00:00,1.329035
2019-01-02 03:00:00,1.274038
2019-01-02 04:00:00,1.239689
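
For users preparing their own load data without get_load_data, the sketch below builds a DataFrame in the same shape as the output above (a DateTime index and a single Load column) using pandas directly. The CSV text and its column names are illustrative, and the zero fill is an assumption that only mirrors the "Filled with zeros" message; it is not a copy of what get_load_data does internally.

```python
import io

import pandas as pd

# Illustrative CSV content; a real file would be read from disk instead.
csv_text = """TS,Load
2019-01-02 00:00:00,0.659559
2019-01-02 01:00:00,1.430449
2019-01-02 02:00:00,
2019-01-02 03:00:00,1.274038
"""

own_load_data = pd.read_csv(io.StringIO(csv_text), parse_dates=['TS'])

# Match the layout shown above: a datetime index named 'DateTime' and one
# 'Load' column.
own_load_data = own_load_data.rename(columns={'TS': 'DateTime'})
own_load_data = own_load_data.set_index('DateTime')

# Fill gaps with zeros (assumed to mirror the message printed above).
own_load_data['Load'] = own_load_data['Load'].fillna(0.0)

# The first and last timestamps are handy for checking overlap with other data.
first_timestamp = own_load_data.index.min()
last_timestamp = own_load_data.index.max()

print(own_load_data)
```

Whether zero is a sensible fill value depends on the data; for a load that is never truly zero, interpolation may distort the analysis less.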


### 1.B Generation data

<div style="width: 600px;">
Generation data in the format required for the PPA analysis tool is prepared in two steps. 
First, bulk generation data for all generators of the technology types of 
interest is fetched and saved in the parquet file format. Second, the data is retrieved 
from the parquet file and filtered to just the desired units. Note, the step to
save to parquet is primarily included to speed up repeatedly retrieving
data for the same time period, but is also necessary as it processes the data into
the format required by other PPA analysis tool functions.
</div>



In [4]:
# all_gen_data = import_data.get_generation_data(
#     cache='data_caches/gen_data_cache',
#     technology_type_s=['WIND - ONSHORE', 'PHOTOVOLTAIC FLAT PANEL'],
#     start_date=start_date,
#     end_date=end_date
# )

# all_gen_data.to_parquet(
#     'data_caches/examples/gen_data.parquet'
# )

filtered_gen_data = import_data.get_preprocessed_gen_data(
    file='data_caches/examples/gen_data.parquet', 
    regions=['QLD1']
)

filtered_gen_data = filtered_gen_data[[
    'CSPVPS1: PHOTOVOLTAIC FLAT PANEL',
    'COOPGWF1: WIND - ONSHORE'
]]

filtered_gen_data.head()

DateTime,CSPVPS1: PHOTOVOLTAIC FLAT PANEL,COOPGWF1: WIND - ONSHORE
2020-01-01 01:00:00,0.0,82.488179
2020-01-01 02:00:00,0.0,77.129198
2020-01-01 03:00:00,0.0,80.869461
2020-01-01 04:00:00,0.0,68.187557
2020-01-01 05:00:00,0.0,58.687316


### 1.C Price data

<div style="width: 600px;">
Price data in the format required for the PPA analysis tool is prepared by 
fetching pricing data for all regions using the function import_data.get_wholesale_price_data. If desired, the data can be saved to a parquet file and retrieved again using the function get_preprocessed_price_data, which will be faster if the same dataset is reused.
</div>
<br>
<div style="width: 600px;">
Note, if caching is not performed, then the user will need to filter the pricing data
to a single region and drop the REGIONID column, as this step is usually performed in
get_preprocessed_price_data.
</div>

In [5]:
# price_data = import_data.get_wholesale_price_data(
#     cache='data_caches/pricing_cache',
#     start_date=start_date, 
#     end_date=end_date
# )

# price_data.to_parquet(
#     'data_caches/examples/price_data.parquet'
# )

price_data = import_data.get_preprocessed_price_data(
    file='data_caches/examples/price_data.parquet',
    region='QLD1'
)

price_data.head()

DateTime,RRP
2020-01-01 01:00:00,52.119075
2020-01-01 02:00:00,51.180633
2020-01-01 03:00:00,51.841654
2020-01-01 04:00:00,51.313714
2020-01-01 05:00:00,48.306674
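
If the caching step is skipped, the region filtering described above could be done by hand along the following lines. The small DataFrame is made-up stand-in data. That the unfiltered data carries a REGIONID column alongside RRP on a DateTime index is an assumption based on the note above, so check the actual output of get_wholesale_price_data before relying on it.

```python
import pandas as pd

# Made-up stand-in for unfiltered price data covering two regions.
raw_price_data = pd.DataFrame(
    {
        'REGIONID': ['QLD1', 'NSW1', 'QLD1', 'NSW1'],
        'RRP': [52.12, 49.80, 51.18, 48.95],
    },
    index=pd.DatetimeIndex(
        [
            '2020-01-01 01:00', '2020-01-01 01:00',
            '2020-01-01 02:00', '2020-01-01 02:00',
        ],
        name='DateTime',
    ),
)

# Keep a single region, then drop the now-redundant REGIONID column so that
# only the RRP column remains, as in the output above.
is_qld = raw_price_data['REGIONID'] == 'QLD1'
qld_price_data = raw_price_data[is_qld].drop(columns='REGIONID')

print(qld_price_data)
```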


### 1.D Emissions data

<div style="width: 600px;">
Emissions data in the format required for the PPA analysis tool is prepared by 
fetching average emissions intensity data for all regions using the function get_avg_emissions_intensity. If desired, the data can be saved to a parquet file and retrieved again using the function get_preprocessed_average_emissions_intensity_data, which will be faster if the same dataset is reused.
</div>
<br>
<div style="width: 600px;">
Note, if caching is not performed, then the user will need to filter the emissions data
to a single region and drop the REGIONID column, as this step is usually performed in
get_preprocessed_average_emissions_intensity_data.
</div>

In [6]:
# emissions_data = import_data.get_avg_emissions_intensity_data(
#     cache='data_caches/pricing_cache',
#     start_date=start_date, 
#     end_date=end_date
# )

# emissions_data.to_parquet(
#     'data_caches/examples/emissions_data.parquet'
# )

emissions_data = import_data.get_preprocessed_average_emissions_intensity_data(
    file='data_caches/examples/emissions_data.parquet',
    region='QLD1'
)

emissions_data.head()

DateTime,AEI
2020-01-01 01:00:00,0.785886
2020-01-01 02:00:00,0.857561
2020-01-01 03:00:00,0.856539
2020-01-01 04:00:00,0.857571
2020-01-01 05:00:00,0.861483


## 2. Firming costs

<div style="width: 600px;">
The function firming_contracts.choose_firming_type sets the time-varying cost of purchasing 
the energy consumption not covered by the renewable energy generation. It adds a 
'Firming price' column to the time series data it is given.
</div>

In [7]:
from ppa_analysis import firming_contracts

price_data = firming_contracts.choose_firming_type(
        firming_type='Wholesale exposed',
        time_series_data=price_data
)

price_data.head()

DateTime,RRP,Firming price
2020-01-01 01:00:00,52.119075,52.119075
2020-01-01 02:00:00,51.180633,51.180633
2020-01-01 03:00:00,51.841654,51.841654
2020-01-01 04:00:00,51.313714,51.313714
2020-01-01 05:00:00,48.306674,48.306674
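
The output above shows that, for the 'Wholesale exposed' option, the Firming price column equals RRP in every displayed row. A minimal pandas equivalent of that one case, using values copied from the output, is sketched below. It only illustrates the observed result; the other firming types and the internals of choose_firming_type are not covered.

```python
import pandas as pd

prices = pd.DataFrame(
    {'RRP': [52.119075, 51.180633, 51.841654]},
    index=pd.DatetimeIndex(
        ['2020-01-01 01:00', '2020-01-01 02:00', '2020-01-01 03:00'],
        name='DateTime',
    ),
)

# Wholesale exposure: load not covered by the contract is bought at the spot
# price, so the firming price simply copies RRP.
prices['Firming price'] = prices['RRP']

print(prices)
```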


## 3. LCOE calculations

<div style="width: 600px;">
The PPA analysis tool includes functionality for calculating a renewable generator's levelised cost of electricity (LCOE). The LCOE is then used when running the optimisation that underpins the matching functionality, with the cost of procuring energy from a particular generator in the optimisation being set to that generator's LCOE. 
</div>

In [8]:
# There is default generator data for calculating LCOE stored in the tool's
# advanced settings module.

from pprint import pprint

low_cost_scen_data = advanced_settings.GEN_COST_DATA['GenCost 2018 Low']

pprint(low_cost_scen_data)


{'Photovoltaic Flat Panel': {'Capacity Factor': 0.22,
                             'Capital ($/kW)': 1280,
                             'Construction Time (years)': 1.0,
                             'Economic Life (years)': 25,
                             'Fixed O&M ($/kW)': 14.4,
                             'Variable O&M ($/kWh)': 0.0},
 'Wind': {'Capacity Factor': 0.35,
          'Capital ($/kW)': 2005,
          'Construction Time (years)': 1.2,
          'Economic Life (years)': 25,
          'Fixed O&M ($/kW)': 36.0,
          'Variable O&M ($/kWh)': 0.0027}}


In [9]:
# Now, we need to create a generator data dictionary with data for each of
# the generators we want to include in the analysis. Here, we'll just use the 
# same generators we fetched generation data for before, and give them the
# default costs from the advanced_settings module.

generator_cost_data = {}

generator_cost_data['CSPVPS1: PHOTOVOLTAIC FLAT PANEL'] = \
    low_cost_scen_data['Photovoltaic Flat Panel']

generator_cost_data['COOPGWF1: WIND - ONSHORE'] = \
    low_cost_scen_data['Wind']


In [10]:
# We can then call get_all_lcoes from the helper functions to calculate each
# generator's LCOE based on its cost data and capacity factor.

from ppa_analysis import helper_functions

lcoe_data = helper_functions.get_all_lcoes(generator_cost_data)

lcoe_data


{'CSPVPS1: PHOTOVOLTAIC FLAT PANEL': 64.4652667302055,
 'COOPGWF1: WIND - ONSHORE': 70.55717124182475}
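
To make the LCOE figures less of a black box, the sketch below applies a standard simplified LCOE formula: capital cost annualised with a capital recovery factor, plus fixed O&M, divided by annual energy, plus variable O&M. The function name simple_lcoe and the 7 % discount rate are assumptions and are not taken from the tool. The rate was chosen because it closely reproduces the values above, and construction time is ignored. The calculation inside get_all_lcoes may differ.

```python
def simple_lcoe(cost_data, discount_rate=0.07):
    """Simplified LCOE in $/MWh (hypothetical helper, not part of ppa_analysis)."""
    life = cost_data['Economic Life (years)']
    growth = (1 + discount_rate) ** life
    # Capital recovery factor: spreads an upfront cost into equal annual payments.
    crf = discount_rate * growth / (growth - 1)
    annual_cost_per_kw = (
        cost_data['Capital ($/kW)'] * crf + cost_data['Fixed O&M ($/kW)']
    )
    # Energy produced per kW of capacity in one year, in MWh.
    annual_mwh_per_kw = cost_data['Capacity Factor'] * 8760 / 1000
    variable_per_mwh = cost_data['Variable O&M ($/kWh)'] * 1000
    return annual_cost_per_kw / annual_mwh_per_kw + variable_per_mwh


# Cost data copied from the 'GenCost 2018 Low' output shown earlier.
solar = {
    'Capacity Factor': 0.22,
    'Capital ($/kW)': 1280,
    'Economic Life (years)': 25,
    'Fixed O&M ($/kW)': 14.4,
    'Variable O&M ($/kWh)': 0.0,
}
wind = {
    'Capacity Factor': 0.35,
    'Capital ($/kW)': 2005,
    'Economic Life (years)': 25,
    'Fixed O&M ($/kW)': 36.0,
    'Variable O&M ($/kWh)': 0.0027,
}

print(simple_lcoe(solar))
print(simple_lcoe(wind))
```

Changing discount_rate shows how sensitive the LCOE, and therefore the cost assumed in the contract optimisation, is to financing assumptions.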

## 4. Contract optimisation

<div style="width: 600px;">
This section details how to use the PPA analysis functionality to find a mix of 
capacity to contract from a set of renewable energy generators that will result
in a combined generation profile that is well-matched to the load profile and 
minimises costs, assuming that 
generation is bought at LCOE and that load not met by the renewable generators 
is purchased at the wholesale spot price.
</div>
<br>
<div style="width: 600px;">
Just the basic usage of the function hybrid.create_hybrid_generation, which is used 
to run the optimisation, is demonstrated here. The user can read the documentation 
strings of the functions hybrid_pap, hybrid_pac, hybrid_shaped, hybrid_baseload, and hybrid_247 to see the optimisation details for each PPA contract type. The user can 
also read the documentation for hybrid.run_hybrid_optimisation, to see further optimisation implementation details. 
</div>



In [11]:
# First, let's check our load data covers the time period we've got price
# and generation data for.
print(first_load_timestamp)
print(last_load_timestamp)

2019-01-02 00:00:00
2024-01-01 00:00:00


In [12]:
# We loaded price and generation data for a period in 2020, which falls within
# the range of the load data, so the datasets overlap.

In [13]:
# Then, we combine time series load, generation, and price data into a single 
# DataFrame.

import pandas as pd

time_series_data = price_data.join(load_data)
time_series_data = time_series_data.join(filtered_gen_data)

time_series_data.head()

DateTime,RRP,Firming price,Load,CSPVPS1: PHOTOVOLTAIC FLAT PANEL,COOPGWF1: WIND - ONSHORE
2020-01-01 01:00:00,52.119075,52.119075,1.417109,0.0,82.488179
2020-01-01 02:00:00,51.180633,51.180633,1.342925,0.0,77.129198
2020-01-01 03:00:00,51.841654,51.841654,1.387736,0.0,80.869461
2020-01-01 04:00:00,51.313714,51.313714,1.349911,0.0,68.187557
2020-01-01 05:00:00,48.306674,48.306674,1.31116,0.0,58.687316
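
DataFrame.join performs a left join on the index by default, so the combined frame keeps only the timestamps present in price_data even though the load data spans several years. The toy frames below, with made-up values, illustrate that behaviour and add a simple check for unmatched timestamps.

```python
import pandas as pd

prices = pd.DataFrame(
    {'RRP': [52.1, 51.2]},
    index=pd.DatetimeIndex(
        ['2020-01-01 01:00', '2020-01-01 02:00'], name='DateTime'
    ),
)

# The load data covers a wider period than the price data.
load = pd.DataFrame(
    {'Load': [0.7, 1.4, 1.3, 1.2]},
    index=pd.DatetimeIndex(
        [
            '2019-12-31 23:00', '2020-01-01 01:00',
            '2020-01-01 02:00', '2020-01-01 03:00',
        ],
        name='DateTime',
    ),
)

# Left join: rows follow the price index; extra load timestamps are dropped.
combined = prices.join(load)

# Any NaN here would mean a price timestamp had no matching load value.
has_gaps = bool(combined.isna().any().any())

print(combined)
print(has_gaps)
```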


In [14]:
# Now, we are ready to run the optimisation.

from ppa_analysis import hybrid

time_series_with_contracted_energy, contracted_percentages = hybrid.create_hybrid_generation(
        contract_type='Pay as Produced',
        contracted_amount=50,
        time_series_data=time_series_data,
        generator_info=lcoe_data
)


INFO: Using Python-MIP package version 1.16rc0


In [15]:
# The results indicate the best match of generation to load is achieved by 
# contracting about 0.9 % of the wind generator's output and 5.1 % of the solar 
# generator's output. Of the total volume supplied, about 39.5 % is from wind
# and 60.5 % is from solar.

pprint(contracted_percentages)

{'COOPGWF1: WIND - ONSHORE': {'Percent of generator output': 0.9150376894998034,
                              'Percent of hybrid trace': 39.51793150163807},
 'CSPVPS1: PHOTOVOLTAIC FLAT PANEL': {'Percent of generator output': 5.074376901622168,
                                      'Percent of hybrid trace': 60.482068498361954}}


In [16]:
# hybrid.create_hybrid_generation also returns the input time series data with
# additional columns 'Hybrid', indicating the combined renewable energy 
# generation profile, and 'Contracted Energy', indicating the total energy the 
# buyer would purchase under this scenario. In the case of the Pay as Produced
# contract the 'Hybrid' and 'Contracted Energy' volumes are the same.

time_series_with_contracted_energy.head()

DateTime,RRP,Firming price,Load,CSPVPS1: PHOTOVOLTAIC FLAT PANEL,COOPGWF1: WIND - ONSHORE,Hybrid,Contracted Energy
2020-01-01 01:00:00,52.119075,52.119075,1.417109,0.0,82.488179,0.754798,0.754798
2020-01-01 02:00:00,51.180633,51.180633,1.342925,0.0,77.129198,0.705761,0.705761
2020-01-01 03:00:00,51.841654,51.841654,1.387736,0.0,80.869461,0.739986,0.739986
2020-01-01 04:00:00,51.313714,51.313714,1.349911,0.0,68.187557,0.623942,0.623942
2020-01-01 05:00:00,48.306674,48.306674,1.31116,0.0,58.687316,0.537011,0.537011
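
As a cross-check on how the 'Hybrid' column relates to the contracted percentages, the sketch below scales each generator's trace by its 'Percent of generator output' and sums the results. The generation values and percentages are copied from outputs in this notebook. This is a reconstruction from the displayed results, not the optimisation itself.

```python
import pandas as pd

# Generation values copied from the time series outputs of this notebook.
generation = pd.DataFrame(
    {
        'CSPVPS1: PHOTOVOLTAIC FLAT PANEL': [0.0, 0.0, 14.135833],
        'COOPGWF1: WIND - ONSHORE': [82.488179, 77.129198, 61.314652],
    },
    index=pd.DatetimeIndex(
        ['2020-01-01 01:00', '2020-01-01 02:00', '2020-01-01 08:00'],
        name='DateTime',
    ),
)

# 'Percent of generator output' values from contracted_percentages.
percent_of_output = {
    'CSPVPS1: PHOTOVOLTAIC FLAT PANEL': 5.074376901622168,
    'COOPGWF1: WIND - ONSHORE': 0.9150376894998034,
}

# Scale each generator's trace by its contracted share, then sum across
# generators to get the combined (hybrid) profile.
shares = pd.Series(percent_of_output) / 100
hybrid_trace = generation.mul(shares, axis='columns').sum(axis='columns')

print(hybrid_trace.round(6))
```

The results should agree with the 'Hybrid' values reported for the same timestamps, up to rounding of the displayed inputs.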


## 5. Battery operation

<div style="width: 600px;">
The PPA analysis tool also includes functionality for optimising battery operation 
to minimise the cost of purchasing energy not covered by the contracted renewable energy
generation. Here, we demonstrate the use of this functionality using the function
battery.run_battery_optimisation and the resulting time series data from the contract
optimisation performed above.
</div>


In [17]:
from ppa_analysis import battery

time_series_data_with_battery = battery.run_battery_optimisation(
        timeseries_data=time_series_with_contracted_energy,
        rated_power_capacity=1,
        size_in_mwh=4
)
    

In [18]:
# We'll just add an extra column to help the reader see when the battery 
# charges and discharges.
time_series_data_with_battery['Battery Dispatch'] = \
    time_series_data_with_battery['Load'] - \
    time_series_data_with_battery['Load with battery']

time_series_data_with_battery.head(24)

DateTime,RRP,Firming price,Load,CSPVPS1: PHOTOVOLTAIC FLAT PANEL,COOPGWF1: WIND - ONSHORE,Hybrid,Contracted Energy,Load with battery,Battery Dispatch
2020-01-01 01:00:00,52.119075,52.119075,1.417109,0.0,82.488179,0.754798,0.754798,1.417109,0.0
2020-01-01 02:00:00,51.180633,51.180633,1.342925,0.0,77.129198,0.705761,0.705761,1.342925,0.0
2020-01-01 03:00:00,51.841654,51.841654,1.387736,0.0,80.869461,0.739986,0.739986,1.387736,0.0
2020-01-01 04:00:00,51.313714,51.313714,1.349911,0.0,68.187557,0.623942,0.623942,1.349911,0.0
2020-01-01 05:00:00,48.306674,48.306674,1.31116,0.0,58.687316,0.537011,0.537011,1.31116,0.0
2020-01-01 06:00:00,47.867489,47.867489,1.39099,0.21125,57.511213,0.536969,0.536969,1.39099,0.0
2020-01-01 07:00:00,33.514092,33.514092,1.556107,3.532417,60.509716,0.732935,0.732935,1.556107,0.0
2020-01-01 08:00:00,30.953977,30.953977,1.693151,14.135833,61.314652,1.278358,1.278358,1.693151,0.0
2020-01-01 09:00:00,15.922192,15.922192,1.782791,25.173583,44.464554,1.68427,1.68427,1.782791,0.0
2020-01-01 10:00:00,15.939535,15.939535,1.792339,31.875083,32.487943,1.914739,1.914739,1.914739,-0.1224
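
Because 'Battery Dispatch' is defined as 'Load' minus 'Load with battery', a negative value means the battery is adding to demand (charging) and a positive value means it is reducing demand (discharging). The sketch below splits the signed series into two non-negative columns. The first two rows are copied from the output above, while the third row is made up to show a discharging hour.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        'Load': [1.782791, 1.792339, 1.900000],
        'Load with battery': [1.782791, 1.914739, 1.700000],
    },
    index=pd.DatetimeIndex(
        ['2020-01-01 09:00', '2020-01-01 10:00', '2020-01-01 18:00'],
        name='DateTime',
    ),
)

dispatch = frame['Load'] - frame['Load with battery']

# Negative dispatch -> charging; positive dispatch -> discharging.
frame['Charging'] = (-dispatch).clip(lower=0.0)
frame['Discharging'] = dispatch.clip(lower=0.0)

print(frame)
```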


## 6. Load flexibility

<div style="width: 600px;">
The PPA analysis tool also includes functionality for optimising load flexibility 
operation to minimise the cost of purchasing energy not covered by the contracted renewable energy
generation. Here, we demonstrate the use of this functionality using the function
load_flex.daily_load_shifting and the time series data from the contract
optimisation performed above.
</div>
<br>
<div style="width: 600px;">
Load flexibility is modelled by calculating a baseload profile for each day which cannot be shifted, with the
remaining flexible load being dispatched across the day to minimise the cost of purchasing energy at the wholesale
price. The baseload profile is calculated by specifying a quantile. Then, on a monthly basis, a daily baseload profile is 
defined by taking the quantile of the load consumption for each hour in the day across the month, with weekday and 
weekend profiles created separately. For example, if the quantile 0.5 were specified, then for each hour in the day
the median consumption for that hour across the month would be used as the consumption in the monthly baseload profile.
Additionally, before a baseload profile is used to calculate a particular day's
load shifting, if the consumption of the baseload falls below the actual generation for any hour, then the baseload
profile is reset to the actual generation for that hour.
</div>
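
The quantile-based baseload profile described above can be sketched with a pandas groupby. The function baseload_profile and the synthetic load series are hypothetical and exist only for illustration. The sketch covers the monthly, per-hour, weekday/weekend quantile step and leaves out the final per-day reset, so it is not the implementation used by load_flex.

```python
import numpy as np
import pandas as pd


def baseload_profile(load, quantile):
    """Quantile of load per (year, month, day type, hour), aligned to load.index."""
    frame = pd.DataFrame(
        {
            'load': load.to_numpy(),
            'year': load.index.year.to_numpy(),
            'month': load.index.month.to_numpy(),
            # Monday is 0, so 5 and 6 are Saturday and Sunday.
            'day_type': np.where(load.index.dayofweek >= 5, 'weekend', 'weekday'),
            'hour': load.index.hour.to_numpy(),
        },
        index=load.index,
    )
    grouped = frame.groupby(['year', 'month', 'day_type', 'hour'])['load']
    # transform broadcasts each group's quantile back to every row in the group.
    return grouped.transform(lambda values: values.quantile(quantile))


# Synthetic hourly load for two weeks starting Monday 6 January 2020.
index = pd.date_range(
    '2020-01-06', periods=24 * 14, freq=pd.Timedelta(hours=1)
)
day_number = np.arange(len(index)) // 24
load = pd.Series(
    1.0 + 0.1 * index.hour.to_numpy() + 0.01 * day_number, index=index
)

base_load = baseload_profile(load, quantile=0.5)
flexible_load = load - base_load

print(base_load.head())
```

A lower quantile gives a smaller baseload and leaves more of the load available to shift, which is the lever that base_load_quantile controls in the call below.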

In [19]:
# First, let's adjust the time series data to account for the battery charge
# and discharge. This step could be skipped if a battery wasn't modelled.

time_series_data_with_battery['Load'] = \
    time_series_data_with_battery['Load with battery']


In [20]:
# And then we can run the load shifting optimisation.

from ppa_analysis import load_flex

time_series_data_with_load_flex = load_flex.daily_load_shifting(
        timeseries_data=time_series_data_with_battery,
        base_load_quantile=0.10,
        lower_price=0.0,
        ramp_up_price=0.01,
        ramp_down_price=0.01
)

time_series_data_with_load_flex.head(24)

DateTime,Load dispatch,Contracted Energy,Original load,Base load,Firming,Raised load,Ramp up,Ramp down,Load with flex
2020-01-01 01:00:00,0.0,0.754798,1.417109,1.346612,0.591815,2.398841,0.0,-0.022055,1.346612
2020-01-01 02:00:00,0.0,0.705761,1.342925,1.324557,0.618796,2.398841,0.008725,0.0,1.324557
2020-01-01 03:00:00,0.0,0.739986,1.387736,1.333283,0.593297,2.398841,0.0,-0.023372,1.333283
2020-01-01 04:00:00,0.0,0.623942,1.349911,1.309911,0.685969,2.398841,0.001249,0.0,1.309911
2020-01-01 05:00:00,0.0,0.537011,1.31116,1.31116,0.774149,2.398841,0.06174,0.0,1.31116
2020-01-01 06:00:00,0.0,0.536969,1.39099,1.3729,0.835931,2.398841,0.114123,0.0,1.3729
2020-01-01 07:00:00,0.0,0.732935,1.556107,1.487023,0.754088,2.398841,0.911818,0.0,1.487023
2020-01-01 08:00:00,0.802434,1.278358,1.693151,1.596407,1.120484,2.398841,0.0,0.0,2.398841
2020-01-01 09:00:00,0.785021,1.68427,1.782791,1.61382,0.714571,2.398841,0.0,0.0,2.398841
2020-01-01 10:00:00,0.760483,1.914739,1.914739,1.638358,0.484102,2.398841,0.0,0.0,2.398841


## 7. Bill calculation

<div style="width: 600px;">
Finally, the PPA analysis tool provides functionality for calculating the bills associated with a PPA contract.
This functionality is demonstrated below using the output from the contract optimisation and the function
bill_calc.calculate_bill. Note, this function expects the columns 'Load', 'Contracted Energy', 'RRP', 
and 'Firming price', plus the column 'Hybrid' if the contract_type is 'Baseload' or 'Shaped'. Therefore, if the outputs
of the battery or load optimisation are to be used, the 'Load' column may need to be re-calculated, and should
be the original load, net of any battery or load dispatch.
</div>

In [21]:
from ppa_analysis import bill_calc

bill = bill_calc.calculate_bill(
    volume_and_price=time_series_with_contracted_energy,
    settlement_period='Y',
    contract_type='Pay as Produced',
    strike_price=75.0,
    lgc_buy_price=10.0,
    lgc_sell_price=10.0,
    shortfall_penalty=50.0,
    guaranteed_percent=90.0,
    excess_price='Wholesale',
    indexation=1.0,
    index_period='Y',
    floor_price=-1000.0
)

bill

DateTime,PPA Value,PPA Settlement,PPA Final Cost,Firming Costs,Revenue from on-sold RE,Revenue from excess LGCs,Cost of shortfall LGCs,Shortfall Payments Received,Total
2020-12-31,46421.524755,7726.12065,46421.524755,37976.748159,-0.0,0.0,4893.435883,-24467.179416,64824.529381
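
If the bill should reflect the load flexibility results, one way to assemble the input is to take the price and contract columns from the contract optimisation output and substitute the shifted load. The toy frames below use values copied from the outputs above. Treating 'Load with flex' as the net load is an assumption based on the column name, so confirm it against the load_flex documentation before use.

```python
import pandas as pd

index = pd.DatetimeIndex(
    ['2020-01-01 07:00', '2020-01-01 08:00'], name='DateTime'
)

# Stand-in for time_series_with_contracted_energy.
contract_output = pd.DataFrame(
    {
        'RRP': [33.514092, 30.953977],
        'Firming price': [33.514092, 30.953977],
        'Load': [1.556107, 1.693151],
        'Hybrid': [0.732935, 1.278358],
        'Contracted Energy': [0.732935, 1.278358],
    },
    index=index,
)

# Stand-in for the load flexibility output (only the column needed here).
flex_output = pd.DataFrame(
    {'Load with flex': [1.487023, 2.398841]}, index=index
)

# Swap the original load for the shifted load, keeping everything else.
bill_input = contract_output.drop(columns='Load').join(
    flex_output['Load with flex'].rename('Load')
)

# Check the columns that the bill calculation is described as expecting.
required = ['Load', 'Contracted Energy', 'RRP', 'Firming price']
missing = [column for column in required if column not in bill_input.columns]

print(bill_input)
print(missing)
```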
