# Time of Use (ToU) Tariffs and Electricity Consumption
## Shiftable Percentages Version

How might total residential electrical usage change if consumers shift a portion of their consumption from peak to off peak hours?

In this version, we examine usage for "anomalies": times when usage deviates from the average. We assume that these may represent shiftable usage, and we move a portion of the consumer's usage to other hours. The interested reader should consult our pdf writeup for details about the reasoning behind our assumptions, as we've given minimal explanation here in the interest of space.

This notebook contains sections that:

* Query and organize the usage data
* Identify peak usage hours for an average winter weekday
    - Demonstrate a potential shift in usage for an average winter weekday
* Perform a week-by-week shift
    - Find "shiftable percentages" from an average winter week in 2021
    - Define a shifting scheme
    - Conduct week-by-week shifts
        * Arrange user-defined testing usage data
        * Shift, assuming consumers will shift usage to overnight hours
        * Shift, assuming consumers are less likely to shift to overnight hours
* Display before and after and compare peak usage
    - Residential usage with the shifts
    - Combining shifted residential usage with industrial and commercial usage


## Setup, Queries, and Cleaning

To reduce the number of queries, we query all of the usage data at once, and then use `pandas` to manage it.

Note: we drop the "extra" daylight saving time (DST) hour in November here. Later, the "missing" DST hour in March is filled with a zero.
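To make the DST handling concrete, here is a minimal sketch on toy data. It assumes that timestamps arrive as naive local times (as the `AT TIME ZONE` conversion in the query below produces) and that the March gap is filled by reindexing onto a full hourly range; the notebook's own zero-filling happens later and may be done differently. The names `toy`, `deduped`, and `filled` are illustrative only.

```python
import pandas as pd

HOUR = pd.Timedelta(hours=1)
TZ = "America/Vancouver"

# November 2021: clocks fall back, so 01:00 local time occurs twice.
utc_nov = pd.date_range("2021-11-07 06:00", "2021-11-07 11:00", freq=HOUR, tz="UTC")
toy = pd.DataFrame({
    "meter_id": "meter_1",
    "timestamp": utc_nov.tz_convert(TZ).tz_localize(None),  # naive local time
    "kWh": 1.0,
})
# Same call as in the notebook: keep the first of the two 01:00 readings.
deduped = toy.drop_duplicates(subset=["meter_id", "timestamp"], keep="first")

# March 2021: clocks spring forward, so 02:00 local time never occurs.
utc_mar = pd.date_range("2021-03-14 08:00", "2021-03-14 12:00", freq=HOUR, tz="UTC")
local_mar = utc_mar.tz_convert(TZ).tz_localize(None)
usage_mar = pd.Series(1.0, index=local_mar)

# Reindex onto a gap-free hourly range and fill the missing hour with zero.
full_hours = pd.date_range(local_mar.min(), local_mar.max(), freq=HOUR)
filled = usage_mar.reindex(full_hours, fill_value=0.0)

print(len(toy), len(deduped))   # rows before and after dropping the duplicate
print(filled)
```

Dropping the repeated hour discards one hour of real consumption per year, and the zero in March slightly understates that hour's average; both effects are small relative to a full winter of data.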

In [1]:
import getpass
import math
import numpy as np
import pandas as pd
import urllib.parse
import plotly.express as px
import datetime as dt

import mograph as mg
import data_management_functions as dmf

pd.set_option('display.max_columns', None)

**Connection**

Enter the EDM server address and the login credentials provided by Awesense. If you do not have the credentials, or have any trouble connecting, please contact api@awesense.com.
<span style='color:red'> **Please do NOT store the credentials in the notebook, nor share them with anyone.** </span>

In [2]:
edm_address = getpass.getpass(prompt='EDM server address: ')

print('\nEDM login information')
edm_name = getpass.getpass(prompt='Username: ')
edm_password = getpass.getpass(prompt='Password: ')
edm_password = urllib.parse.quote(edm_password)

%load_ext sql
%sql postgresql://$edm_name:$edm_password@$edm_address/edm
%config SqlMagic.displaycon = False
%config SqlMagic.feedback = False

# Delete the credential variables for security purposes.
del edm_name, edm_password

EDM server address: ········

EDM login information
Username: ········
Password: ········


In [3]:
# User input for the grid.
grid_id = input('Enter grid ID: ') # awefice

Enter grid ID: awefice


In [4]:
%%sql

SET timezone = 'America/Vancouver'

[]

In [5]:
%%sql result_meter <<

SELECT ge.grid_element_id AS meter_id,
    tdss.timestamp AT TIME ZONE 'America/Vancouver' AS timestamp,
    tdss.value AS "kWh",
    geds.type,
    ge.meta ->> 'type_of_consumer' as type_of_consumer
FROM grid_element AS ge
LEFT JOIN grid_element_data_source AS geds
    ON geds.grid_id = ge.grid_id
    AND geds.grid_element_id = ge.grid_element_id
JOIN ts_data_source_select(geds.grid_element_data_source_id, 'kWh', null) AS tdss
    ON TRUE
WHERE ge.type = 'Meter'
    AND ge.grid_id = '{grid_id}'
    AND geds.type = 'CONSUMER'
ORDER BY cast(split_part(ge.grid_element_id, '_', 2) AS int) asc, tdss.timestamp

Returning data to local variable result_meter


### Setting up dataframes, dropping duplicates, aggregating residential usage

We drop the November DST "extra hour" and aggregate by timestamp for all of the residential meters.

Additionally, because we frequently consider only the winter months, and we consider weekdays and weekends separately, we formulate dataframes containing winter usage and winter weekday usage where winter is defined as January, February, and December 2021. We chose to restrict to a single year for "training" purposes.

In [6]:
df_origin_raw = result_meter.DataFrame()

# Drop duplicate rows to handle Nov DST; 
df_origin = df_origin_raw.drop_duplicates(subset=['meter_id','timestamp'], keep='first')

# Aggregate usage from the df where we drop November duplicates
df_agg = dmf.ds_demand_cat(df_origin)

# Pick out just the residential usage
df_res = df_agg[['ds_kWh_res']].copy().rename(columns={"ds_kWh_res": "kWh"})


# Winter
winter = [1,2,12]
# Restrict to winter 2021
df_res_winter = df_res.loc[(df_res.index.year == 2021) &
                             (df_res.index.month.isin(winter))
                            ].copy()

# Restrict to winter WEEKDAYS 2021
df_res_ww = df_res_winter.loc[(df_res_winter.index.weekday <5)].copy()

# Aggregate to get an average winter weekday
df_res_winter_weekday = df_res_ww.groupby(df_res_ww.index.hour).mean()

## Identifying Peak Hours of Consumption for an Average Winter Weekday

In this section, we analyse the usage data for an average winter weekday and split its 24 hours into three classes, based on the standard deviation ($\sigma$) of the hourly residential consumption and on each hour's anomaly (the difference between that hour's average consumption and the average daily consumption). These classes are:

1. Peak: Where the consumption is above the average by more than one standard deviation.
2. Mid Peak: Where the consumption is at the average or above the average by at most one standard deviation.
3. Off Peak: Where the consumption is below the average.

We then create three indicator arrays reflecting the above information. 

In [7]:
def peaky_finders(df, bigoffpeak = False, st_dev_no = 1, print_result = False):
    """Takes: a dataframe of consumption values (one row per hour)
       Returns: three index arrays in this order: peak hours, mid-peak hours, off peak hours
       print_result: if True, prints the peak, mid-peak, and off-peak data before returning.
       bigoffpeak: if False, off-peak hours lie more than `st_dev_no` standard deviations below the average and
       mid-peak hours lie within that band around the average. If True, every below-average hour is off-peak,
       which increases the number of off-peak hours and decreases the number of mid-peak hours.
       st_dev_no: set by default to 1; the number of standard deviations
       an hour's consumption must exceed the average by in order to be considered peak.
       """
    st_dev = st_dev_no*df.std()
    peak = df[df > st_dev+df.mean()].dropna()
    if bigoffpeak == False:
        offpeak = df[df < -st_dev+df.mean()].dropna()
        midpeak = df[(df-df.mean()).abs() <= st_dev].dropna()
    if bigoffpeak == True:
        offpeak = df[df < df.mean()].dropna()
        midpeak = df[df >= df.mean()].dropna()
        midpeak = midpeak[midpeak <= df.mean()+st_dev].dropna()
    if print_result == True:    
        print ('Peak data is ')
        print (peak)
        print ('Mid-peak data is')
        print (midpeak)
        print ('Off-peak data is ')
        print (offpeak)
    return (peak.index, midpeak.index, offpeak.index)

In [8]:
peak, mid, off = peaky_finders(df_res_winter_weekday, bigoffpeak = True, print_result = True)

Peak data is 
                 kWh
timestamp           
18         61.656953
19         57.592200
20         61.485772
21         58.126307
Mid-peak data is
                 kWh
timestamp           
5          51.000723
6          55.284728
13         51.021101
14         50.549404
15         50.144773
16         53.415167
17         54.719834
22         49.569652
Off-peak data is 
                 kWh
timestamp           
0          45.744351
1          42.689087
2          42.939943
3          43.022561
4          46.399871
7          43.594585
8          40.575690
9          43.651114
10         39.926250
11         48.270291
12         44.758022
23         45.491441


Thus, the `peaky_finders` classification suggests the following power pricing scheme for weekdays:

* **On-Peak:** 6:00 PM to 10:00 PM (18h00 to 22h00)
* **Mid-Peak:** 5:00 AM to 7:00 AM (5h00 to 7h00), 1:00 PM to 6:00 PM (13h00 to 18h00), and 10:00 PM to 11:00 PM (22h00 to 23h00)
* **Off-Peak:** 11:00 PM to 5:00 AM (23h00 to 5h00) and 7:00 AM to 1:00 PM (7h00 to 13h00)

This pricing scheme is somewhat disjointed, and there is an argument for simplifying it based on extrinsic factors. For instance, it is reasonable to expect non-residential consumption to be very low from 5:00 AM to 7:00 AM, so it may make sense to classify these hours as off-peak for residential consumers. Likewise, because we anticipate non-residential demand to drop off late in the evening, it might be optimal to shift the on-peak and mid-peak hours a bit earlier in the day. Unfortunately, the test grid we worked with contained only a small number of non-residential consumers, so the commercial and industrial consumption profiles represent only a few consumers rather than commercial and industrial consumption patterns writ large. We therefore restricted our analysis to residential consumers and, in line with this choice, keep the pricing pattern above. This work should be viewed as a proof of concept: the `peaky_finders` function could just as easily pick out the peaks if we had included industrial and commercial consumption from the start.

With this analysis in mind, we define a `peak_indicator` array, which will have 1s corresponding to the peak hour and 0s otherwise; likewise for `mid_` and `low_` indicator arrays.

In [9]:
peak_indicator = np.zeros(24)
for i in peak:
    peak_indicator[i]+=1
mid_indicator = np.zeros(24)
for i in mid:
    mid_indicator[i]+=1
off_indicator = np.zeros(24)
for i in off:
    off_indicator[i]+=1

### Peak finding functions and a simple shift on a typical winter weekday

We now demonstrate how one can further automate this peak analysis, which is useful if we want the peak and off-peak hours to occur in contiguous chunks. These functions can be incorporated in the shifting functions below, or used to generate a simple simulation that shifts consumption from a peak chunk to an off-peak chunk. The drawback of this shift is that it does not take into account how practical the incentive would be. For example, between 7:00 AM and 11:00 AM, people may be on their way to work and may not be able to either charge their vehicles or do their laundry.

We define the `peak_finder` (`valley_finder`) function that picks out a continuous duration of time where peak (off peak) consumption is noted, based on the `peaky_finders` analysis.

*A limitation is that it cannot give you continuous peak between 11 PM and 1 AM, say, as we have not included modular arithmetic in the function. This can be easily upgraded, but given the way the data behaves, we deemed this unnecessary in this case.*
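For readers who do need chunks that wrap past midnight, the following is a minimal sketch of the modular-arithmetic upgrade mentioned above. It is not the notebook's `peak_finder` or `valley_finder`: the helper name `circular_chunk` is hypothetical, and for simplicity it scores every possible start hour rather than only the hours returned by `peaky_finders`.

```python
import numpy as np

def circular_chunk(hourly, length, find_max=True):
    """Return `length` consecutive hours (mod 24) whose summed anomaly is
    largest (find_max=True) or smallest (find_max=False)."""
    hourly = np.asarray(hourly, dtype=float)
    n = len(hourly)
    anomaly = hourly - hourly.mean()
    # sums[i] is the total anomaly over hours i, i+1, ..., i+length-1, wrapping at n
    sums = np.array([anomaly[np.arange(i, i + length) % n].sum() for i in range(n)])
    start = int(sums.argmax() if find_max else sums.argmin())
    return np.arange(start, start + length) % n

# Toy profile: flat at 50 kWh, an evening peak, and a valley that spans midnight.
profile = np.full(24, 50.0)
profile[17:22] = 60.0
profile[[22, 23, 0, 1, 2]] = 40.0

peak_hours = circular_chunk(profile, 5, find_max=True)
valley_hours = circular_chunk(profile, 5, find_max=False)
print("Peak hours:", peak_hours)
print("Valley hours:", valley_hours)
```

On this toy profile the valley chunk starts at 22h00 and wraps through 2h00, which the non-modular functions cannot return.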

Also defined here is a simple shifting function to demonstrate how usage on an average winter weekday may change based on these peak and off peak hours. See the pdf writeup for an explanation of the reasoning behind the shifting.

In [10]:
def peak_finder(mid_peak, peak, length_of_chunk, df):
    """
    Requires: Mid-peak data, Peak data, duration of peak hours 
                         dataframe with 24 rows and one column for consumption
    Returns: an array with peak hours
    """
    k=0
    anom = df - df.mean()
    anomaly = anom[anom.columns[0]] #convert the dataframe into an array for easy indexing.
    iterant = np.sort(np.concatenate((np.array(mid_peak), np.array(peak))))
    foo = np.zeros(len(iterant))
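    for i in iterant:
        chunk_sum = 0  # total anomaly over the chunk starting at hour i
        # Only chunks that fit before midnight are scored; the others keep a score of 0
        if (23-i >= length_of_chunk-1):
            for j in range(i,i+length_of_chunk):
                chunk_sum+=anomaly[j]
        foo[k]=chunk_sum
        k+=1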
    correct_index = pd.Series(foo).idxmax()
    peak_start = pd.Series(iterant)[correct_index] 
    newpeak = np.arange(peak_start, peak_start+length_of_chunk)
    print ('Peak Hours:', newpeak)
    return newpeak

def valley_finder(mid_peak, off_peak, length_of_chunk, df):
    """
    Requires: Mid-peak data, Off-peak data, duration of off-peak hours 
                         dataframe with 24 rows and one column for consumption
    Returns: an array with off-peak hours
    """
    k=0
    anom = df - df.mean()
    anomaly = anom[anom.columns[0]] #convert the dataframe into a series for easy indexing.
    iterant = np.sort(np.concatenate((np.array(mid_peak), np.array(off_peak))))
    foo = np.zeros(len(iterant))
    for i in iterant:
        chunk_sum = 0  # total anomaly over the chunk starting at hour i
        # Only chunks that fit before midnight are scored, mirroring peak_finder
        if (23-i >= length_of_chunk-1):
            for j in range(i,i+length_of_chunk): #one could do modulo 24 here so have a valley chunk between 11 pm and 4 am, say.
                chunk_sum+=anomaly[j]
        foo[k]=chunk_sum
        k+=1
    correct_index = pd.Series(foo).idxmin()
    valley_start = pd.Series(iterant)[correct_index]
    valley = np.arange(valley_start, valley_start+length_of_chunk)
    print ('Valley Hours:', valley)
    return valley

def simple_shift(original, peak, valley):
    """
    Takes:
    original: a dataframe with 24 rows and one column containing the original data
    peak: Peak hours from range 0 to 23
    valley: off-peak hours from range 0 to 23, same length as peak
    Each peak hour moves shift_coeff (here 0.5) of its excess over the daily mean to the paired valley hour.
    ------------------------------------------------------
    Returns: 1 pandas dataframe of length 24 with two columns. Column 0 is the original column and column 1 is the shifted result.
    """
    data = original.copy()
    shift_coeff = 0.5
    shift_percent = shift_coeff*100*(data - data.mean())/data
    shift_percent=shift_percent[shift_percent.columns[0]]
    data_copy = original.copy()
    data_array = data_copy[data_copy.columns[0]]
    for i,j in zip(valley,peak): 
        shift = (shift_percent[j]/100)*data_array[j]
        data_array[i]+= shift
        data_array[j]-= shift 
    data['Shifted']=data_array
    return data
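The arithmetic inside `simple_shift` simplifies: the percentage `shift_coeff*100*(data - mean)/data`, applied back to `data`, is just `shift_coeff*(data - mean)`. The sketch below restates that on a plain NumPy array, assuming the peak and valley hours do not overlap; `half_excess_shift` is a hypothetical name, not a function from the notebook.

```python
import numpy as np

def half_excess_shift(profile, peak_hours, valley_hours, shift_coeff=0.5):
    """Move shift_coeff of each peak hour's excess over the daily mean
    to the valley hour it is paired with."""
    profile = np.asarray(profile, dtype=float)
    shifted = profile.copy()
    mean = profile.mean()
    for valley_hour, peak_hour in zip(valley_hours, peak_hours):
        moved = shift_coeff * (profile[peak_hour] - mean)
        shifted[peak_hour] -= moved
        shifted[valley_hour] += moved
    return shifted

# Toy day: flat at 50 kWh with a 60 kWh evening peak and a 40 kWh morning valley.
toy_profile = np.full(24, 50.0)
toy_profile[17:22] = 60.0
toy_profile[7:12] = 40.0

toy_shifted = half_excess_shift(toy_profile, range(17, 22), range(7, 12))
print(toy_shifted[17:22], toy_shifted[7:12])
```

Because every kWh removed from a peak hour is added to a valley hour, the daily total is unchanged; only the shape of the profile moves toward the mean.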

In [11]:
# Finding 5 hour chunks of peak and off peak hours
newpeak=peak_finder(mid, peak, 5, df_res_winter_weekday)
newvalley=valley_finder(mid, off, 5, df_res_winter_weekday)

# Shifting usage based on these hours
simple_shift_df = simple_shift(df_res_winter_weekday, newpeak, newvalley)

mg.day_figure(simple_shift_df.reset_index(),\
               'Time of Use Shifting in an Average Winter Weekday',\
               ['kWh', 'Shifted'], ['Original Consumption', 'Shifted Consumption'],\
               t='timestamp', xtitle = 'Hour of Day', ytitle = 'Energy Consumption (kWh)') 

Peak Hours: [17 18 19 20 21]
Valley Hours: [ 7  8  9 10 11]


## A Week-by-Week Shift

Because certain usage (like laundry) isn't restricted to a particular day of the week, we also consider the possibility that consumers may move some of their usage between days.

### Shiftable Percentages

Certain electricity usage "isn't available" to be shifted: consumers need to have heat in the winter, and they need to make dinner at dinner time. Here, we compute a *shiftable percentage* of usage per hour in an average winter week. (Winter is chosen so that these percentages are conservative: we are considering the fluctuation in usage, and because of heating, winter usage has less dramatic fluctuations. In future work, one could compute summer shiftable percentages differently, and shift the usage differently depending on the season.)

To do this, we first consider the "anomaly" usage: that is, the usage that is higher or lower than a threshold determined by a rolling average. We are mainly interested in the usage that is over the threshold, which we name the "overshoot." We use it to compute shiftable percentages, expressed as decimals:

$$\text{shiftable percentage}_\text{hour} = \text{overshoot}_\text{hour}\, / \,\text{usage}_\text{hour}$$

This is stored in a numpy array so we can use matrix multiplication on an array of usage by weekhour to find the usage that is shiftable in each of the 168 hours in a week.
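As a small illustration of why a diagonal matrix is used, the sketch below checks on random stand-in data that multiplying a (weeks × 168) usage array by a diagonal matrix of shiftable fractions is the same as scaling each weekhour column by its own fraction. The values in `usage` and `pct` are invented for the check and are not taken from the grid data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 3 weeks of hourly usage and one shiftable fraction per weekhour.
usage = rng.uniform(30.0, 70.0, size=(3, 168))
pct = rng.uniform(0.0, 0.2, size=168)

# Diagonal matrix with the fractions on the diagonal, as in the notebook.
shiftable_percent_matrix = np.diag(pct)

via_matmul = np.matmul(usage, shiftable_percent_matrix)
via_broadcast = usage * pct  # column h is scaled by pct[h]

print(np.allclose(via_matmul, via_broadcast))
```

The matrix form costs more memory than broadcasting, but it composes directly with the 168 × 168 shifting matrix built later.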

In [12]:
def avg(df, period = 'M'):
    """
    **(Averaging Function)**: 
    Takes 
        a data frame with integer entries and 
        an averaging period (either 'M' for monthly or 'kD' where k is numeric for every k days).
    Returns a data frame of averages which is indexed by the midpoint of each averaging period 
    (or the rough midpoint in the monthly case). 
    This can accept dataframes with any column names, and adds a period+`_avg_` prefix to the columns. 
    Default behaviour (i.e. if we only give the `df` argument) is to return monthly averages.
    """
    if period == 'M': 
        av = df.resample(period).mean().add_prefix(period + '_avg_').apply(lambda x: x.shift(-15, freq='D'))
    if len(period) == 1 and period[-1] == 'D': 
        av = df.resample(period).mean().add_prefix(period + '_avg_')\
                .apply(lambda x: x.shift(12, freq='H'))
    if len(period) > 1 and period[-1] == 'D': 
        av = df.resample(period).mean().add_prefix(period + '_avg_')\
                .apply(lambda x: x.shift(int(period[:-1])*12, freq='H'))
    return av

def avg_ano(df, period = 'M', interpolate = True, percentage = False):
    """
    **(Averages and Anomalies)**: 
    Takes a time-indexed dataframe with `k` columns, 
    and returns a dataframe with a truncated time-index and `3*k` columns. 
    The first `k` columns contain the same entries as the original dataframe. 
    The next `k` columns contain monthly averages of the first `k` columns, where, 
    for times which are not the midpoint of some month, 
    we assign monthly averages by using the `.interpolate()` method 
    to linearly interpolate between the values at the two nearest monthly midpoints. 
    The last `k` columns contain values which I have termed 'anomalies,' 
    and they are obtained by subtracting values in the monthly average part of the new dataframe 
    from those in the original dataframe. 
    """
    k = len(df.columns)
    if (interpolate == True) or (period == 'M'): 
        df = df.join(avg(df, period), how = 'outer').interpolate()
        df = df[df.index <= avg(df, period).last_valid_index()]
        df = df[df.index >= avg(df, period).first_valid_index()]
    else: 
        if len(period) == 1 and period[-1] == 'D': 
            df = df.join(avg(df, period).apply(lambda x: x.shift(-12, freq='H')), how = 'outer')\
                        .ffill()
        if len(period) > 1 and period[-1] == 'D': 
            df = df.join(avg(df, period).apply(lambda x: x.shift(-int(period[:-1])*12, freq='H')), how = 'outer')\
                        .ffill()
    #add an anomaly column
    for i in range(k):
        df['Anomaly_'+df.columns[i]] = df[df.columns[i]]-df[df.columns[i+k]]
    if percentage == True:
        df['Percentage Anomaly (avg)'] = 100*df['Anomaly_kWh']/df['D_avg_kWh']
        df['Percentage Anomaly (act)'] = 100*df['Anomaly_kWh']/df['kWh']
    return df
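For the special case used in this notebook (`period = 'D'`, `interpolate = False`), the anomaly amounts to each hour's usage minus the mean of its calendar day. The sketch below shows that special case on toy data, assuming gap-free hourly readings; it is a compact restatement for intuition, not a replacement for `avg_ano`.

```python
import numpy as np
import pandas as pd

# Two toy days of hourly data: a ramp from 0 to 23 kWh, then a flat 10 kWh day.
idx = pd.date_range("2021-01-04", periods=48, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({"kWh": np.r_[np.arange(24.0), np.full(24, 10.0)]}, index=idx)

# Daily mean broadcast back onto every hour of that day.
daily_mean = toy["kWh"].groupby(toy.index.normalize()).transform("mean")

toy["Anomaly_kWh"] = toy["kWh"] - daily_mean
# Overshoot keeps only the part of the anomaly that is above the daily mean.
toy["Overshoot"] = toy["Anomaly_kWh"].clip(lower=0)

print(toy.head())
```

On a flat day the anomaly, and hence the overshoot, is zero everywhere, so nothing on that day is treated as shiftable.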

In [13]:
# Compute the anomalies
df_res_wp_aa = avg_ano(df_res_winter, period = 'D', interpolate = False, percentage = True)

# Average the usage and anomalies over a week
df_res_wp_aao = dmf.avg_week(df_res_wp_aa)

# Compute the overshoot by considering only where the usage is above expected
# (clip avoids chained assignment, which pandas may not apply to the original frame)
df_res_wp_aao['Overshoot'] = df_res_wp_aao['Anomaly_kWh'].clip(lower=0)

In [14]:
# Build a shiftable percentage matrix: ratio of overshoot to usage on diagonal
shiftable_percent_array = np.zeros((168,168))
for hour in range(168):
    shiftable_percent_array[hour,hour] = df_res_wp_aao.Overshoot[hour]/df_res_wp_aao.kWh[hour]

### Building a shifting scheme
Here we define functions to shift usage week-by-week. We also use the on-, mid-, and off-peak hours found above to define a numpy array of length 168 whose entries represent the price of electricity at each hour of the week.

**Many** of our assumptions about consumer behavior are encoded in these functions and we again refer the interested reader to our pdf writeup for an in-depth discussion of these assumptions. In particular, we draw attention to the "sleepy" option: consumers may be less likely to shift their usage to the overnight hours.

In [15]:
def week_tariff_func(off_days, on_days, ondays_off, ondays_mid, ondays_peak, off_price, mid_price ,peak_price):
    """
    parameters: (np.array , np.array
                each of length 7
                    indicator functions of "off_days" where tariff is flat and "on_days" where tariff varies
                    e.g. the weekend indicator is [0,0,0,0,0,1,1]
                    and the weekday indicator is [1,1,1,1,1,0,0]
                    
                np.array, np.array, np.array,
                each of length 24
                    indicator functions of offpeak, midpeak, and peak pricing within an "on" day
                    e.g. the 7pm to 7am indicator is [1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1]
                
                float, float, float)
                    electricity rates for offpeak, midpeak, and peak, respectively
                    e.g. 7.4, 10.2 , 15.1
                    
    returns: np.array of length 168
        representing the electricity tariff function at each hour of the week 
    """
    off = np.zeros(168)
    mid = np.zeros(168)
    peak = np.zeros(168)
    for d in range(7):
        for h in range(24):
            off[24*d+h] = off_days[d] + on_days[d]*ondays_off[h]
            mid[24*d+h] = on_days[d]*ondays_mid[h]
            peak[24*d+h] = on_days[d]*ondays_peak[h]
    return off_price*off + mid_price*mid + peak_price*peak


def incentive(a, x, n, b):
    return a*(x**n) + b

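def shift_by_tariff_and_dist(wtfunc, h0, sleepy = False):
    """
    parameters: ndarray of length 168 (the tariff at each hour of the week),
                integer between 0 and 167 (the hour h0 being shifted from),
                bool (whether to penalize shifting into the overnight hours)
    
    function which returns a spread distribution at hour h0 for a given tariff function
    the weight on each hour h is proportional to a*(x^n) + b, where x is the price drop from h0 to h,
    and inversely proportional to c*(y^m) + d, where y is the distance in hours between h0 and h
    
    returns: ndarray of length 168 which sums up to 1
    """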
    
    #parameters for price effect
    #a
    price_scaling = 1
    #n
    price_power = 1
    #b
    price_offset = 0
    
    #parameters for distance effect
    #c
    dist_scaling = 1
    #m
    dist_power = 0.5
    #d
    dist_offset = 1 # if dist_offset is <= 0 then division by zero errors can occur

    #Account for sleeping
    if sleepy == True:
        #What hour is the center of the sleeping period
        sleep_centre = 2
        #What's the minimum consumption during sleep.
        sleep_min = 0.3
        #What exponent do we use? (Higher means sharper dropoff overnight)
        sleep_power = 3
        #How long is the sleep period? 
        sleep_length = 10
    
    #initial probability distribution, i.e. the "no change" (dirac at h0) distribution
    shiftprob = np.zeros(168)
    init_weight = 1
    shiftprob[h0] = init_weight
    
    for h in range(168):
        # x = price difference at h0 and at h, if positive, and 0 otherwise
            #if price_offset = 0, this means the spread to any time with same tariff is 0
        pos_price_dif = max(wtfunc[h0]-wtfunc[h], 0)
       
        # y = distance from h_0 (on a 7 day circle)
        distance = min(np.abs(h0-h), 168-np.abs(h0-h))
        
        #computes incentive to shift from h0 to h
        price_inc = incentive(price_scaling, pos_price_dif, price_power, price_offset)
        dist_inc = incentive(dist_scaling, distance, dist_power , dist_offset)
        #Without sleep option, just compute it the normal way
        if sleepy == False:
            shiftprob[h] += (price_inc)/(dist_inc)
        #If base hour is a sleep hour, just compute it the normal way.
        #elif min(np.abs((h0%24)-sleep_centre), np.abs((h0%24)-(24+sleep_centre))) <= 4:
        #   shiftprob[h] += (price_inc)/(dist_inc)
        #Otherwise, penalize shifting into sleep hours
        else:
            sleep_dist = min(np.abs((h%24)-sleep_centre), np.abs((h%24)-(24+sleep_centre)))
            sleep_inc = min(1, (((1-sleep_min)/((0.5*sleep_length)**sleep_power))*(sleep_dist**sleep_power)+sleep_min))
            shiftprob[h]+=(price_inc)/(dist_inc)*(sleep_inc)
    
    out = (1/(sum(shiftprob)))*shiftprob
    
    #error correction to make the integral (i.e. sum) of the distribution closer to 1
        #adds error as a flat additional probability to not move
    #the error can be negative, 
        #but unless the probability to stay put is near absolute zero then this should not be a problem
    sm = sum(out)
    out[h0] += (1-sm)
    
    return out


def shift_many(tariff_scheme, shiftable_usage, sleepy = False):
    """
    Takes tariff_scheme, an ndarray of length 168 representing a tariff scheme and 
    shiftable_usage, an ndarray with 168 columns and rows for each week
    
    Returns an ndarray of the shifted usage
    """
    # build the shifting matrix by making each row a probability distribution array
    shifting_matrix = np.zeros((168,168))
    for hour in range(168):
        shifting_matrix[hour] = shift_by_tariff_and_dist(tariff_scheme, hour, sleepy)
    
    # matrix multiplication to shift the usage
    return np.matmul(shiftable_usage, shifting_matrix)
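To see what the "sleepy" option does, the sleep penalty from `shift_by_tariff_and_dist` can be tabulated on its own. The sketch below copies the constants and formula from the function above into a standalone helper; the name `sleep_weight` is introduced here for illustration only.

```python
import numpy as np

# Constants copied from shift_by_tariff_and_dist
sleep_centre, sleep_min, sleep_power, sleep_length = 2, 0.3, 3, 10

def sleep_weight(hour_of_day):
    """Multiplier applied to the incentive of shifting INTO hour_of_day."""
    # Circular distance (in hours) from the centre of the sleep period
    dist = min(abs(hour_of_day - sleep_centre), abs(hour_of_day - (24 + sleep_centre)))
    scale = (1 - sleep_min) / ((0.5 * sleep_length) ** sleep_power)
    return min(1.0, scale * dist ** sleep_power + sleep_min)

weights = np.array([sleep_weight(h) for h in range(24)])
for hour, weight in enumerate(weights):
    print(f"{hour:02d}h00  {weight:.3f}")
```

With these constants the multiplier bottoms out at 0.3 at 2h00 and returns to 1 five hours away on either side (21h00 and 7h00), so daytime and evening hours are not penalized at all.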

In [16]:
offdays = np.array([0,0,0,0,0,1,1]) #weekends
ondays = np.array([1,1,1,1,1,0,0]) #weekdays

# Prices taken from an Ontario scheme
off_price, mid_price, peak_price = 0.074, 0.102, 0.151

# indicators for off, mid, peak from the peaky_finders analysis above
wt = week_tariff_func(offdays, ondays,
                      off_indicator, mid_indicator, peak_indicator, 
                      off_price, mid_price, peak_price)
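As a sanity check on the tariff array, the sketch below rebuilds it with broadcasting instead of nested loops and spot-checks a few hours. The peak and mid-peak hours are hard-coded from the `peaky_finders` output printed earlier, and `wt_check` is a hypothetical name; the notebook itself uses `week_tariff_func`.

```python
import numpy as np

hours = np.arange(24)
# Hours taken from the peaky_finders output for an average winter weekday
peak_ind = np.isin(hours, [18, 19, 20, 21]).astype(float)
mid_ind = np.isin(hours, [5, 6, 13, 14, 15, 16, 17, 22]).astype(float)
off_ind = 1.0 - peak_ind - mid_ind  # every remaining hour is off-peak

on_days = np.array([1, 1, 1, 1, 1, 0, 0])  # weekdays: tariff varies
off_days = 1 - on_days                     # weekends: flat off-peak tariff
off_price, mid_price, peak_price = 0.074, 0.102, 0.151

# Shape (7, 24) arrays flattened row by row, so entry 24*d + h is day d, hour h
off = (off_days[:, None] + on_days[:, None] * off_ind[None, :]).ravel()
mid = (on_days[:, None] * mid_ind[None, :]).ravel()
peak = (on_days[:, None] * peak_ind[None, :]).ravel()
wt_check = off_price * off + mid_price * mid + peak_price * peak

print(wt_check[18], wt_check[5], wt_check[5 * 24 + 18])
```

Monday at 18h00 gets the peak price, Monday at 5h00 the mid-peak price, and Saturday at 18h00 the off-peak price, which is what makes weekends attractive targets for shifted usage.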

### Conducting week-by-week shifts

We will conduct two different shifts: one without the "sleepy" option (where we assume consumers will move some of their electricity usage to the overnight hours) and one with the "sleepy" option (where consumers move less usage to the overnight hours).

#### Preparing user-defined data for shifting

The user defines a time frame of interest and we show how the above shifting function adjusts (residential) user behavior assuming the winter shiftable percentages.

Ideally, the user will choose a time frame that begins on a Monday and ends on a Sunday, so that it contains only full weeks. If not, the spare days on each end are omitted from the weekly shift and stored, so that a daily shift could later be performed on them and the results added back into the data. This could be implemented in future work; for now, we only consider full weeks.
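One way to check in advance which days would be left over is to trim the requested range to the first Monday on or after the start and the last Sunday on or before the end. The helper below is a hypothetical sketch of that calculation and is independent of `dmf.pivot_strip_spare`, whose internals are not shown here.

```python
import pandas as pd

def trim_to_full_weeks(start, end):
    """Return (first Monday on or after start, last Sunday on or before end)."""
    start = pd.Timestamp(start).normalize()
    end = pd.Timestamp(end).normalize()
    # weekday(): Monday is 0 and Sunday is 6
    first_monday = start + pd.Timedelta(days=(7 - start.weekday()) % 7)
    last_sunday = end - pd.Timedelta(days=(end.weekday() + 1) % 7)
    return first_monday, last_sunday

# The range used in this notebook already consists of full weeks...
aligned = trim_to_full_weeks("2022-01-03", "2022-12-25")
# ...whereas a range starting on a Wednesday and ending on a Friday gets trimmed.
trimmed = trim_to_full_weeks("2022-01-05", "2022-12-23")
print(aligned)
print(trimmed)
```

If the trimmed range is empty (the Monday falls after the Sunday), the requested time frame contains no full week and the weekly shift has nothing to work on.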

In [17]:
# User input for the time frame.
user_start = input('Enter start date: ')
user_end = input('Enter end date (inclusive): ')

start_date = pd.to_datetime(user_start)
end_date = pd.to_datetime(user_end) + dt.timedelta(hours=23)

Enter start date: jan 3, 2022
Enter end date (inclusive): dec 25, 2022


In [18]:
# Get year, week, weekhour columns for just the chosen time frame
res_raw = dmf.timeframe_df(df_res, start_date, end_date)

# Make a pivot table of the usage data (res_pivot), stripping off partial weeks at beginning and end
(spare_days, res_pivot) = dmf.pivot_strip_spare(res_raw)

# Make the pivot table into a numpy array
weekly_usage_array = res_pivot.to_numpy()

# Compute the shiftable usage with the shiftable_percent_array from above
shiftable_array = np.matmul(weekly_usage_array, shiftable_percent_array)

#### Performing a non-sleepy shift

In [19]:
# Shift the shiftable usage, using the tariff function defined above
shifted_overshoot = shift_many(wt,shiftable_array)

# Replace the shiftABLE usage with the shiftED usage
weekly_shifted_array = weekly_usage_array - shiftable_array + shifted_overshoot

# Insert the shifted array into a df that looks like the original pivot df
shifted_pivot = pd.DataFrame(weekly_shifted_array,
                             columns=list(res_pivot.columns),
                             index=res_pivot.index).reset_index()

# Unpivot the df
shifted_dfmelt = pd.melt(shifted_pivot,
                                id_vars=['isoyear','week'],
                                value_vars=list(shifted_pivot.columns),
                               var_name='weekhour', value_name='ToU')\
                            .sort_values(by=['isoyear','week','weekhour'])

# Merge with the unshifted data to retrieve timestamps and to have a comparison
orig_plus_shifted = pd.merge(res_raw, shifted_dfmelt, on=['isoyear','week','weekhour'])
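Each row of the shifting matrix is a probability distribution, so the shift should move energy between hours of a week without creating or destroying any. The sketch below checks this property on stand-in data, using a random row-normalized matrix in place of the one built by `shift_many` and an assumed flat 10% shiftable fraction.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hours = 168

# Stand-in shifting matrix: non-negative entries, each row normalized to sum to 1.
raw = rng.uniform(0.0, 1.0, size=(n_hours, n_hours))
stand_in_matrix = raw / raw.sum(axis=1, keepdims=True)

# Stand-in usage for 4 weeks, with 10% of every hour assumed shiftable.
weekly_usage = rng.uniform(30.0, 70.0, size=(4, n_hours))
shiftable = 0.1 * weekly_usage

# Same recombination as in the notebook: remove the shiftable part, add it back shifted.
weekly_shifted = weekly_usage - shiftable + shiftable @ stand_in_matrix

print(np.allclose(weekly_shifted.sum(axis=1), weekly_usage.sum(axis=1)))
```

Weekly totals are conserved, but the weekly maximum is not constrained: usage that lands on an already busy hour can raise the peak, which is worth keeping in mind when reading the results.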

#### Performing a sleepy shift

We insert this shifted data into a data frame with the non-sleepy shift and the original usage for comparison.

In [20]:
# Shift the shiftable usage, using the tariff function defined above
shifted_overshoot_zzz = shift_many(wt,shiftable_array, sleepy=True)

# Replace the shiftABLE usage with the shiftED usage
weekly_shifted_array_zzz = weekly_usage_array - shiftable_array + shifted_overshoot_zzz

# Insert the shifted array into a df that looks like the original pivot df
shifted_pivot_zzz = pd.DataFrame(weekly_shifted_array_zzz,
                             columns=list(res_pivot.columns),
                             index=res_pivot.index).reset_index()

# Unpivot the df
shifted_dfmelt_zzz = pd.melt(shifted_pivot_zzz,
                                id_vars=['isoyear','week'],
                                value_vars=list(shifted_pivot_zzz.columns),
                               var_name='weekhour', value_name='ToU_zzz')\
                            .sort_values(by=['isoyear','week','weekhour'])

# Merge with the unshifted and non-sleepy data to retrieve timestamps and to have a comparison
orig_plus_shifted_zzz = pd.merge(orig_plus_shifted, shifted_dfmelt_zzz, on=['isoyear','week','weekhour'])\
                        .set_index('timestamp')\
                        .drop(['isoyear','week', 'weekhour'], axis=1)

## Results

### Residential usage

In [21]:
mg.month_figure(orig_plus_shifted_zzz.reset_index(),\
               'Consumption Shifted Weekly',\
               ['kWh', 'ToU', 'ToU_zzz'],\
               ['Original Consumption', 'Shifted Consumption', 'Sleepy Shifted Consumption'],\
               t='timestamp', ytitle = 'Energy Consumption (kWh)')

In [30]:
mg.month_figure(dmf.daily_max(orig_plus_shifted_zzz).reset_index(),\
               'Daily Maximum Consumption Before and After Weekly Shift',\
               ['max_kWh', 'max_ToU', 'max_ToU_zzz'],\
               ['Original Max Consumption', 'Shifted Max Consumption', 'Sleepy Shifted Max Consumption'],\
               t='timestamp', ytitle = 'Energy Consumption (kWh)')

In [28]:
print("Pre shift peak consumption", orig_plus_shifted_zzz.kWh.max())
print("Post shift peak consumption", orig_plus_shifted_zzz.ToU.max())
print("Post sleepy shift peak consumption", orig_plus_shifted_zzz.ToU_zzz.max())

Pre shift peak consumption 72.77423839657268
Post shift peak consumption 73.93161295756295
Post sleepy shift peak consumption 74.12425453462616


### Including the commercial and industrial use with our shifted residential use

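As seen above, our shifts slightly increase the residential peak usage (from about 72.8 kWh to 73.9 kWh, or 74.1 kWh with the sleepy option) by shifting usage to weekends. However, once we also take commercial and industrial usage into account, the picture improves: as the output at the end of this section shows, the total peak falls from about 229.2 kWh to 221.2 kWh (222.2 kWh with the sleepy option). We chose to shift to weekends because we expect commercial and industrial usage to decrease on weekends.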

In [24]:
# Retrieve just the commercial and industrial aggregated usage
df_agg_comm_ind = df_agg.drop(['ds_kWh_res'], axis=1)\
                            .rename(columns={"ds_kWh_comm": "comm",
                                             "ds_kWh_ind": "ind"
                                            }
                                   )\
                            .reset_index().copy()

# Merge with our shifted residential usage
df_agg_shift = pd.merge(orig_plus_shifted_zzz.reset_index(), df_agg_comm_ind, on=['timestamp'])\
                .rename(columns={'kWh': 'res_raw'})\
                .set_index('timestamp')


# Compute totals with the shifted vs unshifted data
df_agg_shift['no_shift']= df_agg_shift['res_raw']\
                            + df_agg_shift['comm']\
                            + df_agg_shift['ind']

df_agg_shift['shifted']= df_agg_shift['ToU']\
                            + df_agg_shift['comm']\
                            + df_agg_shift['ind']

df_agg_shift['sleepy']= df_agg_shift['ToU_zzz']\
                            + df_agg_shift['comm']\
                            + df_agg_shift['ind']

In [25]:
# Plot the totals
mg.month_figure(df_agg_shift.reset_index(),\
               'Total Grid Electrical Consumption with Weekly Shift',\
               ['no_shift', 'shifted', 'sleepy'],\
               ['No Residential Shift', 'Residential Shift', 'Sleepy Residential Shift'],\
               t='timestamp', ytitle = 'Energy Consumption (kWh)')

In [26]:
# Daily maximum usage for the total grid
mg.month_figure(dmf.daily_max(df_agg_shift).reset_index(),\
               'Daily Maximum Total Grid Consumption Before and After Weekly Shift',\
               ['max_no_shift', 'max_shifted', 'max_sleepy'],
               ['Original Max Consumption', 'Shifted Max Consumption', 'Sleepy Shifted Max Consumption'],\
               t='timestamp', ytitle = 'Energy Consumption (kWh)')

In [29]:
print("Pre shift peak consumption", df_agg_shift.no_shift.max())
print("Post shift peak consumption", df_agg_shift.shifted.max())
print("Post sleepy shift peak consumption", df_agg_shift.sleepy.max())

Pre shift peak consumption 229.19793311518887
Post shift peak consumption 221.2148571680717
Post sleepy shift peak consumption 222.18769477857722
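To put the two sets of printed peaks on a common scale, the sketch below converts them to percentage changes relative to the unshifted peak. The numbers are copied from the outputs above; `percent_change` is a helper introduced here for illustration.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# Peak values copied from the printed outputs above (kWh)
residential = {"before": 72.77423839657268,
               "shifted": 73.93161295756295,
               "sleepy": 74.12425453462616}
total = {"before": 229.19793311518887,
         "shifted": 221.2148571680717,
         "sleepy": 222.18769477857722}

changes = {}
for label, peaks in (("Residential", residential), ("Total grid", total)):
    for scenario in ("shifted", "sleepy"):
        changes[(label, scenario)] = percent_change(peaks["before"], peaks[scenario])
        print(f"{label}, {scenario}: {changes[(label, scenario)]:+.2f}%")
```

The residential-only peak rises by a little under 2% in both scenarios, while the total grid peak falls by roughly 3.5% without the sleepy option and roughly 3.1% with it.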
