## Flow for paper modelling different PPA scenarios

Outline:
1. Basic data analysis: quality of data, basic statistics. For both load and gen
2. Select the scenario to be modelled. Variables to define include:
    * PPA delivery structure (PaP, shaped, baseload)
    * Use of (and params around) demand shifting
    * Whether a battery is added and what size
3. Create the 'optimal' hybrid profile according to the delivery structure:
    * For PaP - 'optimal' is best match to customer load
    * For shaped - 'optimal' is best match to customer load (maybe? Or is this just most consistent shape, i.e. most reliable to deliver? Probably this but idk how this would work)
    * For baseload - 'optimal' is best match to contract shape
4. If load shifting and/or battery operation is involved: add that here
5. Compare traces to find key values:
    * Matched and unmatched load
    * Matched and unmatched contracted generation
    * Wholesale value of each match/unmatch load and gen
    * Emissions associated with unmatched load
6. Calculate reasonable strike price(s) including risk-based premiums. Decide and define firming contract
7. Run ppa.calc scripts including firming contract details to receive financial outcomes
8. See what happens!

In [1]:
# ------------------------------ Packages & Files ------------------------------
import pandas as pd
import numpy as np
import seaborn as sns
import seaborn.objects as so
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import nemed
import ppa, residuals, tariffs, hybrid, firming_contracts
import calendar
import holidays
import pprint
from mip import Model, xsum, minimize, CONTINUOUS, BINARY, conflict, OptimizationStatus
from nemosis import dynamic_data_compiler, static_table
from typing import List
from plotly.subplots import make_subplots
from datetime import datetime, timedelta
from getting_data import import_gen_data, import_load_data, import_pricing_data, import_emissions_data
from helper_functions import _check_interval_consistency, _check_missing_data, get_interval_length

pd.set_option('display.max_rows', None)

INFO: Using Python-MIP package version 1.15.0


In [2]:
# -------------------------------- USER INPUTS ---------------------------------

# - - - - - - - - - - - - - - - - - - DATA - - - - - - - - - - - - - - - - - - -

# load data file name
LOAD_FN = '/Users/elliekallmier/Desktop/RA_Work/247_TRUZERO/247_ppa/five_year_data/Engineering-Metal and non-metal fabrications_86_SW_same_year.csv'
LOAD_COL_NAME = 'Load'
LOAD_TIMEZONE = ''

# generation data file name, or, generator DUID(s)
# TODO: add function to take input TYPE of generator and select appropriate trace
GEN_FN = ''
GEN_COL_NAME_S = ''

# get name of datetime column for both load and generation files
LOAD_DATETIME_COL_NAME = 'TS'
GEN_DATETIME_COL_NAME = ''
DAY_FIRST = True
GEN_TECH_TYPE_S = ['WIND - ONSHORE', 'PHOTOVOLTAIC FLAT PANEL']

# Region to fetch generation/pricing/emissions data for:
REGION = 'QLD1'

# NEMOSIS inputs:
RAW_DATA_CACHE = 'data_caches/gen_data_cache'
EMISSIONS_CACHE = 'data_caches/nemed_cache'
PRICING_CACHE = 'data_caches/pricing_cache'


# - - - - - - - - - - - - - - - - - CONTRACT - - - - - - - - - - - - - - - - - -
# TODO: fill out
DELIVERY_STRUCTURE = ''
FIRMING_CONTRACT_TYPE = 'Partially wholesale exposed'

EXPOSURE_BOUND_UPPER = 300
EXPOSURE_BOUND_LOWER = 20
RETAIL_TARIFF_DETAILS = {'a':'b'}

FLOOR_PRICE = 0.0



In [3]:
# -------------------------------- Get Load Data -------------------------------
#   - check dtypes of columns - should all be float, except datetime col.
#   - update colname(s)
#   - set datetime index
#   - get interval length
#   - check for NaN/missing data

load_data, interval, start_date, end_date = import_load_data.get_load_data(LOAD_FN, LOAD_DATETIME_COL_NAME, LOAD_COL_NAME, DAY_FIRST)

Some missing data found. Filled with zeros.



In [4]:
# ----------------------------- Get Generation Data ----------------------------
gen_data = import_gen_data.get_generation_data(RAW_DATA_CACHE, REGION, GEN_TECH_TYPE_S, interval, start_date=start_date, end_date=end_date)

INFO: Retrieving static table Generators and Scheduled Loads
INFO: Downloading data for table Generators and Scheduled Loads
INFO: Compiling data for table DISPATCH_UNIT_SCADA
INFO: Returning DISPATCH_UNIT_SCADA.
Some missing data found. Filled with zeros.



In [5]:
# ------------------------ Get Pricing & Emissions Data ------------------------
emissions_intensity = import_emissions_data.get_avg_emissions_intensity(
    start_date, end_date, EMISSIONS_CACHE, [REGION], period=f'{interval}min'
)

INFO: Processing total emissions from 2019-01-01 to 2019-02-01
INFO: Compiling data for table DISPATCH_UNIT_SCADA
INFO: Returning DISPATCH_UNIT_SCADA.
INFO: Downloading data for table DUDETAILSUMMARY, year 2024, month 01
INFO: Creating feather file for DUDETAILSUMMARY, 2024, 01
INFO: Downloading data for table DUALLOC, year 2024, month 01
INFO: Creating feather file for DUALLOC, 2024, 01
INFO: Compiling Energy from Dispatch
INFO: Processing total emissions from 2019-02-01 to 2019-03-01
INFO: Compiling data for table DISPATCH_UNIT_SCADA
INFO: Returning DISPATCH_UNIT_SCADA.
INFO: Compiling Energy from Dispatch
INFO: Processing total emissions from 2019-03-01 to 2019-04-01
INFO: Compiling data for table DISPATCH_UNIT_SCADA
INFO: Returning DISPATCH_UNIT_SCADA.
INFO: Compiling Energy from Dispatch
INFO: Processing total emissions from 2019-04-01 to 2019-05-01
INFO: Compiling data for table DISPATCH_UNIT_SCADA
INFO: Returning DISPATCH_UNIT_SCADA.
INFO: Compiling Energy from Dispatch
INFO: Pr

In [6]:
price_data = import_pricing_data.get_wholesale_price_data(
    start_date, end_date, PRICING_CACHE, [REGION], period=f'{interval}min'
)

INFO: Compiling data for table DISPATCHPRICE
INFO: Returning DISPATCHPRICE.


In [7]:
load_data = load_data.resample('H').sum(numeric_only=True)
gen_data = gen_data.resample('H').sum(numeric_only=True)
emissions_intensity = emissions_intensity.resample('H').mean(numeric_only=True)
price_data = price_data.resample('H').mean(numeric_only=True)

In [8]:
gen_data_test = gen_data.copy()


one_yr_gen = gen_data.iloc[:24*365].copy().sum()
one_yr_load = load_data.iloc[:24*365].copy().sum()

print(one_yr_gen, one_yr_load)

for gen in gen_data_test.columns:
    print(load_data['Load'].sum() / gen_data_test[gen].sum())
    print((one_yr_load['Load'] / one_yr_gen[gen]))
    gen_data_test[gen] = gen_data_test[gen] * (one_yr_load['Load'] / one_yr_gen[gen])

gen_data_test.head()

CSPVPS1: Photovoltaic Flat Panel    163374.778333
DDSF1: Photovoltaic Flat Panel      474998.502245
KSP1: Photovoltaic Flat panel       225596.866520
MEWF1: Wind - Onshore               906803.669931
SMCSF1: Photovoltaic Flat panel     320754.601092
SRSF1: Photovoltaic Flat panel      241629.809607
dtype: float64 Load    3667506.8
dtype: float64
21.32569716335274
22.448427091464456
8.587477317936358
7.721091293269663
15.63252097672999
16.256904878928637
4.256963065297758
4.044433124403946
12.218804471217107
11.433995919366044
13.2496804629264
15.178205064888697


Unnamed: 0_level_0,CSPVPS1: Photovoltaic Flat Panel,DDSF1: Photovoltaic Flat Panel,KSP1: Photovoltaic Flat panel,MEWF1: Wind - Onshore,SMCSF1: Photovoltaic Flat panel,SRSF1: Photovoltaic Flat panel
DateTime,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2019-01-01 23:00:00,0.0,0.0,0.0,439.913379,0.0,0.0
2019-01-02 00:00:00,0.0,0.0,0.0,845.495404,0.0,0.0
2019-01-02 01:00:00,0.0,0.0,0.0,833.360818,0.0,0.0
2019-01-02 02:00:00,0.0,0.0,0.0,869.872275,0.0,0.0
2019-01-02 03:00:00,0.0,0.0,0.0,785.51268,0.0,0.0


In [9]:
# Now combine all of the data together:
combined_data = pd.concat([load_data, gen_data_test, price_data, emissions_intensity], axis='columns')

In [10]:
combined_data_firming = firming_contracts.choose_firming_type(
    FIRMING_CONTRACT_TYPE, combined_data, [REGION], EXPOSURE_BOUND_UPPER, EXPOSURE_BOUND_LOWER, RETAIL_TARIFF_DETAILS
)

In [11]:
combined_data_firming.columns

Index(['Load', 'CSPVPS1: Photovoltaic Flat Panel',
       'DDSF1: Photovoltaic Flat Panel', 'KSP1: Photovoltaic Flat panel',
       'MEWF1: Wind - Onshore', 'SMCSF1: Photovoltaic Flat panel',
       'SRSF1: Photovoltaic Flat panel', 'RRP: QLD1', 'AEI: QLD1',
       'Firming price: QLD1'],
      dtype='object')

In [12]:
# ------------------------- CONTRACT DELIVERY STRUCTURE ------------------------

# choice between: PaP, Shaped, Baseload, 24/7, PaC
# Shaped and Baseload can both be re-scaled annually/quarterly/monthly as desired

In [13]:
def check_leap_year(
        df:pd.DataFrame,
        intervals_in_day:int
) -> bool:
    day_one = df.index[0]
    day_365 = day_one + timedelta(days=365)

    return ~(day_one.day == day_365.day)

In [32]:
def run_hybrid_optimisation(
        contracted_energy:pd.Series,
        wholesale_prices:pd.Series,
        generation_data:pd.DataFrame,
        excess_penalty:float,
        total_sum:float,
        contract_type:str,
        cfe_score_min:float=None,
        upscale_factor:float=1.0
) -> tuple[pd.Series, dict[str:dict[str:float]]]:

    # TODO: consider if this return structure is actually best/fit for purpose here
    gen_names = {}
    gen_data_series = {}
    wholesale_prices_vals = np.array(wholesale_prices.clip(lower=0.0).values)

    # print(contracted_energy)

    market_cap = 16600  # market price cap value to use as oversupply penalty

    for _, gen in enumerate(generation_data):
        gen_data_series[str(_)] = generation_data[gen].copy()
        gen_names[str(_)] = gen

    # Create the optimisation model and set up constants/variables:
    R = range(len(contracted_energy))       # how many time intervals in total
    G = range(len(generation_data.columns))         # how many columns of generators

    m = Model()
    percent_of_generation = {}
    # Add a 'percentage' variable for each generator
    for g in G:
        percent_of_generation[str(g)] = m.add_var(var_type=CONTINUOUS, lb=0.0, ub=1.0)

    excess = [m.add_var(var_type=CONTINUOUS, lb=0.0) for r in R]
    unmatched = [m.add_var(var_type=CONTINUOUS, lb=0.0, ub = contracted_energy.max()) for r in R]
    hybrid_gen_sum = [m.add_var(var_type=CONTINUOUS, lb=0.0) for r in R]
    oversupply_flip_var = m.add_var(var_type=CONTINUOUS, lb=0.0)

    # add the objective: to minimise firming (unmatched) - add price here?? (Yes)
    m.objective = minimize(xsum((unmatched[r] + unmatched[r]*wholesale_prices_vals[r] + excess[r]*excess_penalty) for r in R) + oversupply_flip_var*market_cap)

    # Add to hybrid_gen_sum variable by adding together each generation trace by the percentage variable
    for r in R:
        m += hybrid_gen_sum[r] <= sum([gen_data_series[str(g)][r] * percent_of_generation[str(g)] for g in G])
        m += hybrid_gen_sum[r] >= sum([gen_data_series[str(g)][r] * percent_of_generation[str(g)] for g in G])

    for r in R:
        m += unmatched[r] >= contracted_energy[r] - hybrid_gen_sum[r]
        m += excess[r] >= hybrid_gen_sum[r] - contracted_energy[r]

    
    m += xsum(hybrid_gen_sum[r] for r in R) >= total_sum
    m += oversupply_flip_var >= xsum(hybrid_gen_sum[r] for r in R) - total_sum

    # Add constraint around CFE matching percent:
    if contract_type == '24/7':
        m += xsum(unmatched[r] for r in R) <= (1 - cfe_score_min) * total_sum

    m.verbose = 0
    status = m.optimize()

    hybrid_trace = pd.DataFrame(generation_data)
    hybrid_trace['Hybrid'] = 0
    
    if status == OptimizationStatus.INFEASIBLE:
        if contract_type == '24/7':
            print('Infeasible problem under current constraints: trying again with no CFE limit.')
            m.clear()
            return run_hybrid_optimisation(contracted_energy, wholesale_prices, generation_data, excess_penalty, total_sum, 'Pay as Produced')
        elif contract_type == 'Shaped':
            print('Infeasible problem under current constraints: trying again with generation scaled up 10%.')
            m.clear()
            return run_hybrid_optimisation(contracted_energy, wholesale_prices, generation_data*(upscale_factor+0.1), excess_penalty, total_sum, 'Shaped', upscale_factor=(upscale_factor+0.1))
        


    if status == OptimizationStatus.OPTIMAL or status == OptimizationStatus.FEASIBLE:
        for g in G:         
            hybrid_trace['Hybrid'] += gen_data_series[str(g)] * percent_of_generation[str(g)].x

        results = {}
        for g in G:
            name = gen_names[str(g)]
            details = {
                'Percent of generator output' : percent_of_generation[str(g)].x,
                'Percent of hybrid trace' : round(
                    sum(percent_of_generation[str(g)].x * gen_data_series[str(g)]) / sum(hybrid_trace['Hybrid']) * 100, 1)
            }

            results[name] = details

        # clear the model at end of run
        # Add some checks to make sure optimisation is running correctly
        check_df = pd.DataFrame()
        check_df['Contracted'] = contracted_energy.copy()
        check_df['Hybrid Gen'] = hybrid_trace['Hybrid'].copy()

        check_df['Unmatched'] = [unmatched[r].x for r in R]
        check_df['Excess'] = [excess[r].x for r in R]

        check_df['Real Unmatched'] = (check_df['Contracted'] - check_df['Hybrid Gen']).clip(lower=0.0)
        check_df['Real Excess'] = (check_df['Hybrid Gen'] - check_df['Contracted']).clip(lower=0.0)

        check_df['Check unmatched'] = (check_df['Real Unmatched'].round(2) == check_df['Unmatched'].round(2))
        check_df['Check excess'] = (check_df['Real Excess'].round(2) == check_df['Excess'].round(2))

        check_df = check_df[(check_df['Check unmatched'] == False) | (check_df['Check excess'] == False)].copy()

        # print(check_df['Real Unmatched'].sum()/total_sum)
        # print(check_df)#[['Contracted', 'Hybrid Gen', 'Excess', 'Real Excess']])

        assert check_df.empty == True, "Unmatched and/or excess variables are not being calculated correctly. Check constraints."

        m.clear()

        return hybrid_trace['Hybrid'], results, upscale_factor

In [33]:
# Helper function to create the "shaped" profile based on the defined period and percentile
def get_percentile_profile(
        period_str:str,
        data:pd.DataFrame,
        percentile:float
) -> pd.DataFrame:
    
    if period_str == 'M':
        percentile_profile_period = data.groupby(
            [data.index.month.rename('Month'), 
             data.index.hour.rename('Hour')]
        ).quantile(percentile)

    if period_str == 'Q':
        percentile_profile_period = data.groupby(
            [data.index.quarter.rename('Quarter'), 
             data.index.hour.rename('Hour')]
        ).quantile(percentile)

    if period_str == 'Y':
        percentile_profile_period = data.groupby(
            data.index.hour.rename('Hour')
        ).quantile(percentile)

    return percentile_profile_period

# Helper function to apply the shaped profile across the whole desired timeseries
def concat_shaped_profiles(
        period_str:str,             # define the re-shaping period (one of 'Y', 'M', 'Q')
        shaped_data:pd.DataFrame,   # df containing the shaped 'percentile profile'
        long_data:pd.DataFrame,     # df containing full datetime index: to apply shaped profiles across
) -> pd.DataFrame:
    
    if period_str == 'M':
        long_data['Month'] = long_data.DateTime.dt.month
        long_data['Hour'] = long_data.DateTime.dt.hour

        long_data = long_data.set_index(['Month', 'Hour'])
        long_data = pd.concat([long_data , shaped_data], axis='columns')
        long_data = long_data.reset_index().drop(columns=['Month', 'Hour'])

    if period_str == 'Q':
        long_data['Quarter'] = long_data.DateTime.dt.quarter
        long_data['Hour'] = long_data.DateTime.dt.hour

        long_data = long_data.set_index(['Quarter', 'Hour'])
        long_data = pd.concat([long_data , shaped_data], axis='columns')
        long_data = long_data.reset_index().drop(columns=['Quarter', 'Hour'])

    if period_str == 'Y':
        long_data['Hour'] = long_data.DateTime.dt.hour

        long_data = long_data.set_index('Hour')
        long_data = pd.concat([long_data , shaped_data], axis='columns')
        long_data = long_data.reset_index().drop(columns=['Hour'])

    long_data = long_data.set_index('DateTime')

    return long_data.copy()

In [34]:
# Function to set up contract delivery structure and pass on to next stage (get
# optimal hybrid of generation traces). This function will take in the contract
# delivery type, and return the necessary input fields to collect information
# about the contract definition.
def select_delivery_structure(
        contract_type:str
) -> pd.DataFrame:
    # TODO: fill this out
    valid_options = {'Pay as Produced', 'Pay as Consumed', 'Shaped', 'Baseload', '24/7'}
    if contract_type not in valid_options:
        raise ValueError(f'contract_type must be one of {valid_options}')
    return

# DOING - NOT FINISHED YET
def hybrid_shaped(
        redef_period:str,
        contracted_amount:float, 
        df:pd.DataFrame,
        generator_list:list[str],
        interval:str,
        percentile_val:float
) -> pd.DataFrame:
    
    if contracted_amount < 0 or contracted_amount > 100:
        raise ValueError('contracted_amount must be a float between 0 - 100')
    
    # if not percentile_profile.isnumeric:
    #     percentile_profile = 0.5
    #     print('percentile_profile not specified. Using P50 profile.')
    
    if percentile_val < 0 or percentile_val > 1.0:
        raise ValueError('percentile_val must be a float between 0 - 1.0.')
    
    percentile_val = 1 - percentile_val

    # also need to find out if it's a leap year:
    leap_year = check_leap_year(df, 24)
    first_year = df.iloc[:24 * (365 + leap_year)].copy()

    # Get the load and gen:
    first_year_load = first_year['Load'].copy()
    first_year_gen = first_year[generator_list].copy()

    # sum of total load in first year:
    first_year_load_sum = first_year_load.sum(numeric_only=True) * (contracted_amount/100)

    # Create a new df to hold the shaped (percentile) profiles, make sure timestamps
    # all line up.
    shaped_first_year = pd.DataFrame()
    shaped_first_year['DateTime'] = pd.date_range(
        first_year_load.index[0], 
        first_year_load.index[-1], 
        freq='H'
    )

    # TODO: add commenting detail here to explain what's going on!!
    resampled_gen_data = get_percentile_profile(redef_period, first_year_gen, percentile_val)
    shaped_first_year = concat_shaped_profiles(redef_period, resampled_gen_data, shaped_first_year)

    hybrid_trace_series, percentages, upscale_factor = run_hybrid_optimisation(
        contracted_energy=first_year_load,
        wholesale_prices=first_year['RRP: QLD1'].copy(),
        generation_data=shaped_first_year.copy(),
        excess_penalty=50,
        total_sum=first_year_load_sum,
        contract_type='Shaped'
    )

    hybrid_trace_whole_length = pd.DataFrame(columns=['DateTime'])
    hybrid_trace_whole_length['DateTime'] = df.index.copy()

    # Now add the hybrid P[x] profile to df as contracted energy
    resampled_gen_data['Contracted Energy'] = 0
    for name, det_dict in percentages.items():
        if name != 'Hybrid':
            contracted_percent_gen = det_dict['Percent of generator output']
            resampled_gen_data['Contracted Energy'] += resampled_gen_data[name] * (contracted_percent_gen)
    
    contracted_gen_full_length = concat_shaped_profiles(redef_period, resampled_gen_data, hybrid_trace_whole_length)

    contracted_gen_full_length *= (upscale_factor + 0.1)

    df = pd.concat([df, contracted_gen_full_length['Contracted Energy']], axis='columns')

    # Now add the 'actual' hybrid profile (each gen * allocated output %)
    df['Hybrid'] = 0

    for name, det_dict in percentages.items():
        hybrid_percent_gen = det_dict['Percent of generator output']
        df['Hybrid'] += df[name] * hybrid_percent_gen
    return df

# I THINK DONE
def hybrid_baseload(
        redef_period:str,
        contracted_amount:float, 
        df:pd.DataFrame,
        generator_list:list[str],
        interval:str,
        percentile_val:float
) -> pd.DataFrame:

    if contracted_amount < 0:
        raise ValueError('contracted_amount must be greater than 0.')
    
    # use only the first year of data to create the contract basis. 
    # if there is only one year?
    # num_intervals_in_day = int(24 / (interval / 60))  # hours in day / minutes / minutes in hour

    # also need to find out if it's a leap year:
    leap_year = check_leap_year(df, 24)
    first_year = df.iloc[:24 * (365 + leap_year)].copy()

    # Resample to hourly load, then take the hourly average per chosen period
    first_year_load = first_year['Load'].copy()

    # Use a map to allocate hourly values across all years of load data:
    if redef_period == 'Y':
        avg_hourly_load = first_year_load.mean(numeric_only=True)

        # the contracted energy needs to be updated by the contracted_amount percentage:
        df['Contracted Energy'] = round(avg_hourly_load) * (contracted_amount / 100)
    
    else:
        # the contracted energy needs to be updated by the contracted_amount percentage:
        avg_hourly_load = pd.DataFrame(first_year_load.resample(redef_period).mean(numeric_only=True) * (contracted_amount / 100))
        avg_hourly_load['Load'] = avg_hourly_load['Load'].round()
        avg_hourly_load['M'] = avg_hourly_load.index.month
        avg_hourly_load['Q'] = avg_hourly_load.index.quarter

        map_dict = dict(zip(avg_hourly_load[redef_period], avg_hourly_load['Load']))

        first_year['M'] = first_year.index.month
        first_year['Q'] = first_year.index.quarter

        first_year['Contracted Energy'] = first_year[redef_period].copy()
        first_year['Contracted Energy'] = first_year['Contracted Energy'].map(map_dict)

        first_year = first_year.drop(columns=['M', 'Q'])
    
    hybrid_trace_series, percentages, upscale_factor = run_hybrid_optimisation(
        contract_type='Baseload',
        contracted_energy=first_year['Contracted Energy'].copy(),
        wholesale_prices=first_year['RRP: QLD1'].copy(),
        generation_data=first_year[generator_list].copy(),
        excess_penalty=0.5,
        total_sum=first_year_load.sum(numeric_only=True)
    )

    first_year = pd.concat([first_year, hybrid_trace_series], axis='columns')

    df['Hybrid'] = 0

    for name, det_dict in percentages.items():
        hybrid_percent_gen = det_dict['Percent of generator output']
        df['Hybrid'] += df[name] * hybrid_percent_gen

    return df

# I THINK DONE
def hybrid_247(
        redef_period:str,
        contracted_amount:float, 
        df:pd.DataFrame,
        generator_list:list[str],
        interval:str,
        percentile_val:float
) -> pd.DataFrame:

    if contracted_amount < 0 or contracted_amount > 100:
        raise ValueError('contracted_amount must be a float between 0-100')

    # use only the first year of data to create the contract basis. 
    # if there is only one year?
    # num_intervals_in_day = int(24 / (interval / 60))  # hours in day / minutes / minutes in hour

    # also need to find out if it's a leap year:
    leap_year = check_leap_year(df, 24)
    first_year = df.iloc[:24 * (365 + leap_year)].copy()

    # Get first year load (and total sum):
    first_year_load = first_year['Load'].copy()
    first_year_load_sum = first_year_load.sum(numeric_only=True) * (contracted_amount/100)

    hybrid_trace_series, percentages, upscale_factor = run_hybrid_optimisation(
        contracted_energy=first_year['Load'].copy(),
        wholesale_prices=first_year['RRP: QLD1'].copy(),
        generation_data=first_year[generator_list].copy(),
        excess_penalty=0.5,
        constrain_total_percent=False,
        total_sum=first_year_load_sum,
        contract_type='24/7',
        cfe_score_min=contracted_amount/100
    )

    df['Hybrid'] = 0

    for name, det_dict in percentages.items():
        hybrid_percent_gen = det_dict['Percent of generator output']
        df['Hybrid'] += df[name] * hybrid_percent_gen

    return df

# I THINK DONE
def hybrid_pap(
        redef_period:str,
        contracted_amount:float, 
        df:pd.DataFrame,
        generator_list:list[str],
        interval:str,
        percentile_val:float
) -> pd.DataFrame:

    if contracted_amount < 0:
        raise ValueError('contracted_amount must be greater than 0.')

    # use only the first year of data to create the contract basis. 
    # if there is only one year?
    # num_intervals_in_day = int(24 / (interval / 60))  # hours in day / minutes / minutes in hour

    # also need to find out if it's a leap year:
    leap_year = check_leap_year(df, 24)
    first_year = df.iloc[:24 * (365 + leap_year)].copy()

    # Get first year load (and total sum):
    first_year_load = first_year['Load'].copy()
    first_year_load_sum = first_year_load.sum(numeric_only=True) * (contracted_amount/100)

    hybrid_trace_series, percentages, upscale_factor = run_hybrid_optimisation(
        contracted_energy=first_year_load,
        wholesale_prices=first_year['RRP: QLD1'].copy(),
        generation_data=first_year[generator_list].copy(),
        excess_penalty=1,     # note: need to add a small (even if negligable) penalty for excess - to enforce calculation of the 'excess' variable in optimisation.
        total_sum=first_year_load_sum,
        contract_type='Pay as Produced'
    )

    df['Hybrid'] = 0

    for name, det_dict in percentages.items():
        hybrid_percent_gen = det_dict['Percent of generator output']
        df['Hybrid'] += df[name] * hybrid_percent_gen

    return df

# I THINK DONE
def hybrid_pac(
        redef_period:str,
        contracted_amount:float, 
        df:pd.DataFrame,
        generator_list:list[str],
        interval:str,
        percentile_val:float
) -> pd.DataFrame:

    if contracted_amount < 0:
        raise ValueError('contracted_amount must be greater than 0.')

    # use only the first year of data to create the contract basis. 
    # if there is only one year?
    # num_intervals_in_day = int(24 / (interval / 60))  # hours in day / minutes / minutes in hour

    # also need to find out if it's a leap year:
    leap_year = check_leap_year(df, 24)
    first_year = df.iloc[:24 * (365 + leap_year)].copy()

    # Get first year load (and total sum):
    first_year_load = first_year['Load'].copy()
    first_year_load_sum = first_year_load.sum(numeric_only=True) * (contracted_amount/100)

    hybrid_trace_series, percentages, upscale_factor = run_hybrid_optimisation(
        contracted_energy=first_year['Load'].copy(),
        wholesale_prices=first_year['RRP: QLD1'].copy(),
        generation_data=first_year[generator_list].copy(),
        excess_penalty=0.5,     # note: need to add a small (even if negligable) penalty for excess - to enforce calculation of the 'excess' variable in optimisation.
        total_sum=first_year_load_sum,
        contract_type='Pay as Consumed'
    )

    df['Hybrid'] = 0

    for name, det_dict in percentages.items():
        hybrid_percent_gen = det_dict['Percent of generator output']
        df['Hybrid'] += df[name] * hybrid_percent_gen

    return df


def create_hybrid_generation(
        contract_type:str, # describes contract delivery structure
        redef_period:str, # one of python's offset strings indicating when the contract gets "redefined"
        contracted_amount:float, # a number 0-100(+) indicating a percentage. Definition depends on contract type.
        df:pd.DataFrame, # df containing Load, all gen profiles, prices, emissions.
        generator_list:list[str],
        interval:str, # time interval in minutes that data is currently in
        percentile_val:float # for Shaped contracts only: to define the percentile of generation profiles to match.
) -> pd.DataFrame:
    
    valid_contracts = {'Pay as Produced', 'Pay as Consumed', 'Shaped', 'Baseload', '24/7'}
    if contract_type not in valid_contracts:
        raise ValueError(f'contract_type must be one of {valid_contracts}')
    
    valid_periods = {'M', 'Q', 'Y'}
    if redef_period not in valid_periods:
        raise ValueError(f'redef_period must be one of {valid_periods}')

    opt_hybrid_funcs = {
        'Pay as Produced' : hybrid_pap, 
        'Pay as Consumed' : hybrid_pac,
        'Shaped' : hybrid_shaped, 
        'Baseload' : hybrid_baseload, 
        '24/7' : hybrid_247
    }

    df_with_hybrid = opt_hybrid_funcs[contract_type](redef_period, contracted_amount, df, generator_list, interval, percentile_val)

    return df_with_hybrid


In [35]:
# TODO: update df to only use useful columns - > this will be filtered by scenario!!
hybrid_profiles_test = create_hybrid_generation('Shaped', 'M',  100, combined_data_firming, ['MEWF1: Wind - Onshore', 'SMCSF1: Photovoltaic Flat panel'], interval, 0.75)
hybrid_profiles_test.describe()

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06


Unnamed: 0,Load,CSPVPS1: Photovoltaic Flat Panel,DDSF1: Photovoltaic Flat Panel,KSP1: Photovoltaic Flat panel,MEWF1: Wind - Onshore,SMCSF1: Photovoltaic Flat panel,SRSF1: Photovoltaic Flat panel,RRP: QLD1,AEI: QLD1,Firming price: QLD1,Contracted Energy,Hybrid
count,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0,43801.0
mean,418.66814,440.709682,376.428934,435.390308,397.766029,391.777266,479.60635,99.163587,0.750544,85.783312,463.11233,644.352806
std,787.945306,583.854944,522.384434,569.394247,370.168473,667.094068,673.685033,276.422363,0.068438,71.283271,461.145484,677.453979
min,2.8,0.0,0.0,0.0,0.0,0.0,0.0,-1000.0,0.479589,20.0,0.1969,0.0
25%,19.6,0.0,0.0,0.0,69.360289,0.0,0.0,37.178843,0.714334,37.178843,115.552364,139.325881
50%,52.2,14.112578,9.290635,1.327647,298.345328,0.0,13.635088,62.967373,0.76477,62.967373,259.264751,418.00614
75%,277.8,936.821501,751.95111,1027.924095,632.177812,583.267189,985.875013,103.426493,0.800326,103.426493,752.177818,858.595732
max,4802.8,1830.807661,1700.608075,1554.160106,1461.297732,2724.847002,2266.586609,15000.0,0.879175,300.0,1892.845303,3369.786707


In [21]:
hybrid_profiles_test.sum()

Load                                1.833808e+07
CSPVPS1: Photovoltaic Flat Panel    1.930352e+07
DDSF1: Photovoltaic Flat Panel      1.648796e+07
KSP1: Photovoltaic Flat panel       1.907053e+07
MEWF1: Wind - Onshore               1.742255e+07
SMCSF1: Photovoltaic Flat panel     1.716024e+07
SRSF1: Photovoltaic Flat panel      2.100724e+07
RRP: QLD1                           4.343464e+06
AEI: QLD1                           3.287460e+04
Firming price: QLD1                 3.757395e+06
Contracted Energy                   2.028478e+07
Hybrid                              2.822330e+07
dtype: float64

In [43]:
hybrid_profiles_r = hybrid_profiles_test[['Load', 'RRP: QLD1', 'AEI: QLD1', 'Firming price: QLD1', 'Contracted Energy', 'Hybrid']].copy()


hybrid_profiles_r = hybrid_profiles_r[hybrid_profiles_r.index > '2019-01-01 23:00:00']
hybrid_profiles_r.head()

Unnamed: 0_level_0,Load,RRP: QLD1,AEI: QLD1,Firming price: QLD1,Contracted Energy,Hybrid
DateTime,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2019-01-02 00:00:00,5.4,65.332603,0.823169,65.332603,141.36575,536.877067
2019-01-02 01:00:00,5.2,50.608119,0.8188,50.608119,138.1555,529.171784
2019-01-02 02:00:00,5.4,47.75901,0.816646,47.75901,89.491823,552.356018
2019-01-02 03:00:00,5.2,56.795396,0.816868,56.795396,100.372065,498.788924
2019-01-02 04:00:00,6.6,60.832191,0.81903,60.832191,102.663743,465.131115


In [44]:
## Load Flexibility here

# Input: add load flex? (True/False)
# Input: flexibility rating? (High/Medium/Low)

# Could: add calculation to rate flexibility here, but I think that adds too many layers
# of calculation!!

add_load_flex = True
flexibility_rating = 'High'

# Optional extra inputs? raise_price: float, ramp_price: float, ?

# Need to add weekday/weekends to df!!
# add weekday and weekend markers, including holidays as a weekend:
# TODO: make this fit in as a function somewhere!!
holiday_dates = holidays.country_holidays('AU', subdiv=REGION[:-1])
for date in hybrid_profiles_r.index:
    hybrid_profiles_r.loc[date, 'Weekend'] = int((date in holiday_dates) | date.dayofweek in [5, 6])

In [53]:
## Load Shifting Optimisation ##

def get_daily_load_sums(
        df:pd.DataFrame     # a pandas df that has DateTime index and 'Load' as a column name
) -> pd.DataFrame:
    return df['Load'].copy().resample('D').sum(numeric_only=True)

def create_base_days(
        df:pd.DataFrame,
        region:str,
        flexibility_rating:str
) -> tuple[pd.DataFrame, pd.DataFrame]:
    
    # flex rating percentile dictionary:
    flex_dict = {
        'High': 0.5,
        'Medium' : 0.75,
        'Low' : 0.95
    }

    # validate flex rating here? Or outside of this function??
    if flexibility_rating not in flex_dict:
        raise ValueError(f'flexibility_rating must be one of {flex_dict.keys}')
    
    quant = flex_dict[flexibility_rating]
    
    # First get just the load profile from df:
    load_profile = df[['Load', 'Weekend']].copy()

    all_weekdays_only = load_profile[load_profile['Weekend'] == 0].copy()
    all_weekends_only = load_profile[load_profile['Weekend'] == 1].copy()

    base_weekday = all_weekdays_only.groupby(all_weekdays_only.index.hour)['Load'].quantile(quant).reset_index(drop=True)

    base_weekend = all_weekends_only.groupby(all_weekends_only.index.hour)['Load'].quantile(quant).reset_index(drop=True)
    
    return base_weekday, base_weekend

def daily_load_shifting(
        df:pd.DataFrame,
        raise_price:float=1,    # price on 'raising' load above original value
        ramp_price:float=100    # price on ramp: acts as penalty against extreme ramps.
) -> pd.DataFrame:
    
    results_df = pd.DataFrame(columns=['Load dispatch','Contract', 'Original load', 'Base load', 'Firming', 'Raised load', 'Ramp up', 'Ramp down'])

    daily_load_sums = get_daily_load_sums(df)
    base_weekday, base_weekend = create_base_days(df, REGION, flexibility_rating)
    all_time_max_load = df['Load'].max(numeric_only=True)
    all_time_min_load = df['Load'].min(numeric_only=True)

    # run optimisation for each day individually to keep constraints:
    for idx, date in enumerate(daily_load_sums.index):
        data_for_one_day = df[df.index.date == date.date()].copy()
        if data_for_one_day['Weekend'].values[0] == 0:
            base_day = base_weekday.values
        else:
            base_day = base_weekend.values

        data_for_one_day['Base Day'] = base_day

        # Use the lower of base_day and load values to form the 'base load' for
        # this day
        data_for_one_day['Base Load'] = np.where(
            data_for_one_day['Base Day'] <= data_for_one_day['Load'], 
            data_for_one_day['Base Day'], 
            data_for_one_day['Load']
        )
        
        # the load sum for this day will be a constraint in optimisation:
        load_sum_for_one_day = daily_load_sums.iloc[idx]

        # Transform all traces to arrays for optimisation:
        original_load = data_for_one_day['Load'].values
        base_load = data_for_one_day['Base Load'].values
        contracted_renewables = data_for_one_day['Contracted Energy'].values
        # emissions_intensities = data_for_one_day[f'AEI: {REGION}'].values
        # wholesale_prices = data_for_one_day[f'RRP: {REGION}'].values
        firming_prices = data_for_one_day[f'Firming price: {REGION}']

        # Start setting up the model:
        I = range(len(base_load))
        m = Model()

        load_dispatch = [m.add_var(var_type=CONTINUOUS, lb=0.0, ub=all_time_max_load) for i in I]
        unmatched = [m.add_var(var_type=CONTINUOUS, lb=0.0, ub=all_time_max_load) for i in I]
        raised_load = [m.add_var(var_type=CONTINUOUS, lb=0.0, ub=all_time_max_load) for i in I]

        # Add 'ramp' constraints applied as a penalty term in the optimisation:
        ramp_up = [m.add_var(var_type=CONTINUOUS, lb=0.0) for i in I]
        ramp_down = [m.add_var(var_type=CONTINUOUS, ub=0.0) for i in I]

        # Set up objective: to minimise unmatched load and associated cost.
        # Included in the objective are ramp penalties to disincentivise big jumps,
        # and a penalty on raising the load above its original value (small, can be set to 0)
        m.objective = minimize(
            xsum(
                (unmatched[i] + unmatched[i]*firming_prices[i] + raised_load[i]*raise_price + (ramp_up[i]-ramp_down[i])*ramp_price) for i in I
            )
        )
        
        # Add defining constraints to optimisation:
        for i in I:
            # total load in any hour is the sum of load_dispatch + base_load
            m += unmatched[i] >= (load_dispatch[i] + base_load[i]) - contracted_renewables[i]

            # raised load is the positive difference between total load and original load:
            m += raised_load[i] >= (load_dispatch[i] + base_load[i]) - original_load[i]

            # Final constraint on the upper limit of total load in any hour:
            m += load_dispatch[i] + base_load[i] <= all_time_max_load

        # Add ramping definition as constraints:
        for j in range(len(base_load) - 1):
            m += ramp_up[j] >= (load_dispatch[j + 1] + base_load[j + 1]) - (load_dispatch[j] + base_load[j])
            m += ramp_down[j] <= (load_dispatch[j + 1] + base_load[j + 1]) - (load_dispatch[j] + base_load[j])
        
        # Add constraint on the sum of daily load (can't change):
        # At the moment: there are no allowances for wiggle room here!!
        m += xsum((load_dispatch[i] + base_load[i]) for i in I) >= load_sum_for_one_day
        m += xsum((load_dispatch[i] + base_load[i]) for i in I) <= load_sum_for_one_day

        # Run the optimisation, suppressing excess outputs:
        m.verbose = 0
        status = m.optimize()
        

        if status == OptimizationStatus.INFEASIBLE:
            print('Load shifting optimisation infeasible.')
            m.clear()
        
        if status == OptimizationStatus.OPTIMAL or status == OptimizationStatus.FEASIBLE:
            # Get results:
            dispatch = [load_dispatch[i].x for i in I]
            firm = [unmatched[i].x for i in I]
            raised = [raised_load[i].x for i in I]
            r_up = [ramp_up[i].x for i in I]
            r_down = [ramp_down[i].x for i in I]

            day_result = pd.DataFrame({'Load dispatch':dispatch,'Contract': contracted_renewables, 'Original load': original_load, 'Base load': base_load, 'Firming':firm, 'Raised load':raised, 'Ramp up':r_up, 'Ramp down':r_down})
            results_df = pd.concat([results_df, day_result], axis='rows')

            # Now check the results to make sure that they make sense:

            day_result['Firm real'] = ((day_result['Load dispatch'] + day_result['Base load']) - day_result['Contract']).clip(lower=0.0)
            day_result['Firm check'] = (round(day_result['Firm real'], 3) == round(day_result['Firming'],3))

            if not day_result[~day_result['Firm check']].empty:
                print(day_result)
                raise ValueError('wrong type of error atm but firming isn\'t right')

            m.clear()

    return results_df

In [54]:
with_shifted_load = daily_load_shifting(hybrid_profiles_r)


Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Load shifting optimisation infeasible.
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Load shifting optimisation infeasible.
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Load shifting optimisation infeasible.
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Load shifting optimisation infeasible.
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter Username
Load shifting optimisation infeasible.
Academic license - for non-commercial use only - expir

In [None]:
## ADDING BATTERY FUNCTIONALITY TOO

