### Frequency-domain analysis workflow

This workbook contains a workflow to:
1. Explore the frequency-domain features of a sample building electric load profile
2. Cluster the load profiles using frequency-domain features

#### Part 0. Set the paths and import the modules needed for this workflow. Descriptions of each module can be found in README.md

In [1]:
# Run this cell only once, when the kernel first starts
%cd ..

import os
dir_root = %pwd
dir_example = os.path.join(dir_root, "example")
dir_test = os.path.join(dir_root, "tests")

from EULP.LP_explorations import LoadProfileExplorations
from EULP.LP_metrics import LoadProfileMetrics

/mnt/c/Users/hlee9/Documents/GitHub/DOE_EULP/pkg


#### Part 1. Explore the frequency-domain features of a sample building electric load profile

First, we create a LoadProfileMetrics instance from the sample dataframe, and take a look at the data.

In [7]:
import pandas as pd
df_sample = pd.read_csv(os.path.join(dir_example, 'data', 'sample.csv'))
lp_m = LoadProfileMetrics(df_sample)
lp_m.scale('Value', 0, 1)
print(lp_m)


--------------------------------------------------
Load profile between 7/30/2014 0:00 and 7/30/2016 23:45
------------------------------
Top 10 rows:
         Datetime     Value
0  7/30/2014 0:00  0.436189
1  7/30/2014 0:15  0.144474
2  7/30/2014 0:30  0.059231
3  7/30/2014 0:45  0.014092
4  7/30/2014 1:00  0.058048
5  7/30/2014 1:15  0.209871
6  7/30/2014 1:30  0.703102
7  7/30/2014 1:45  0.041681
8  7/30/2014 2:00  0.324800
9  7/30/2014 2:15  0.474346
------------------------------
Summary:
               Value
count  70272.000000
mean       0.498595
std        0.289229
min        0.000010
25%        0.247614
50%        0.497791
75%        0.749121
max        0.999946
--------------------------------------------------


Get frequency-domain features at daily window level.

In [10]:
lp_m.get_fft_w_window(df_sample, get_bins=False, year=2015)

| | Date | 0.0 | 1e-05 | 2e-05 | 4e-05 | 5e-05 | 6e-05 | 7e-05 | 8e-05 | 9e-05 | ... | 0.00045 | 0.00046 | 0.00047 | 0.00048 | 0.0005 | 0.00051 | 0.00052 | 0.00053 | 0.00054 | 0.00056 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-01 | 0.507455 | 0.254522 | 0.037455 | 0.019567 | 0.008935 | 0.029883 | 0.027308 | 0.026434 | 0.033742 | ... | 0.015815 | 0.005229 | 0.014891 | 0.021479 | 0.054441 | 0.030949 | 0.031835 | 0.026659 | 0.026554 | 0.032948 |
| 1 | 2015-01-02 | 0.459509 | 0.228158 | 0.063681 | 0.076442 | 0.034501 | 0.022080 | 0.027290 | 0.019618 | 0.010929 | ... | 0.013112 | 0.019240 | 0.019709 | 0.014751 | 0.019947 | 0.026207 | 0.039488 | 0.016169 | 0.019343 | 0.012988 |
| 2 | 2015-01-03 | 0.461971 | 0.227624 | 0.035373 | 0.027500 | 0.024019 | 0.023538 | 0.012306 | 0.022856 | 0.039545 | ... | 0.022185 | 0.011521 | 0.047813 | 0.063732 | 0.041561 | 0.050104 | 0.069001 | 0.054309 | 0.044009 | 0.005540 |
| 3 | 2015-01-04 | 0.536262 | 0.270827 | 0.003735 | 0.028528 | 0.062341 | 0.060102 | 0.044687 | 0.047415 | 0.051606 | ... | 0.020943 | 0.003307 | 0.023060 | 0.032684 | 0.013846 | 0.030080 | 0.031891 | 0.004827 | 0.028356 | 0.041186 |
| 4 | 2015-01-05 | 0.504221 | 0.275125 | 0.027339 | 0.045389 | 0.086305 | 0.073174 | 0.021654 | 0.040232 | 0.052581 | ... | 0.032456 | 0.019919 | 0.038645 | 0.067264 | 0.047204 | 0.032435 | 0.049786 | 0.044533 | 0.018457 | 0.033197 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 360 | 2015-12-27 | 0.490395 | 0.245181 | 0.038264 | 0.051372 | 0.064346 | 0.045656 | 0.025486 | 0.017629 | 0.024840 | ... | 0.035955 | 0.041043 | 0.023377 | 0.006915 | 0.013760 | 0.048074 | 0.036295 | 0.012008 | 0.023782 | 0.041413 |
| 361 | 2015-12-28 | 0.536678 | 0.282551 | 0.023114 | 0.047575 | 0.066466 | 0.046311 | 0.017210 | 0.028721 | 0.040828 | ... | 0.036347 | 0.040536 | 0.027082 | 0.047736 | 0.034840 | 0.002250 | 0.050491 | 0.031206 | 0.040085 | 0.043561 |
| 362 | 2015-12-29 | 0.478326 | 0.254095 | 0.022520 | 0.043994 | 0.040497 | 0.014589 | 0.040184 | 0.033201 | 0.009107 | ... | 0.016581 | 0.031753 | 0.009812 | 0.009009 | 0.032653 | 0.047478 | 0.018089 | 0.032713 | 0.042364 | 0.028291 |
| 363 | 2015-12-30 | 0.487940 | 0.248433 | 0.005566 | 0.018114 | 0.017528 | 0.016150 | 0.040254 | 0.018207 | 0.029830 | ... | 0.009984 | 0.020644 | 0.028467 | 0.052550 | 0.026307 | 0.025793 | 0.043794 | 0.046471 | 0.054439 | 0.041580 |
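
For readers who want to see roughly what a daily-window FFT involves, the step can be sketched with NumPy and pandas alone. This is an illustration under stated assumptions, not the EULP implementation: the helper name `daily_fft_amplitudes`, the normalisation by window length, and the skipping of incomplete days are all choices made here for the example.

```python
import numpy as np
import pandas as pd

def daily_fft_amplitudes(df, value_col="Value", time_col="Datetime", samples_per_day=96):
    """Return one row of FFT amplitudes per complete day (hypothetical helper)."""
    ts = pd.to_datetime(df[time_col])
    interval_s = 24 * 3600 / samples_per_day            # 900 s for 15-min data
    freqs = np.fft.rfftfreq(samples_per_day, d=interval_s)  # frequencies in Hz
    rows = {}
    for day, values in df[value_col].groupby(ts.dt.date):
        if len(values) != samples_per_day:
            continue  # assumption: days with missing timestamps are skipped
        # Assumption: amplitudes are divided by the window length,
        # so the zero-frequency column equals the daily mean.
        rows[day] = np.abs(np.fft.rfft(values.to_numpy())) / samples_per_day
    return pd.DataFrame.from_dict(rows, orient="index", columns=freqs)

# Synthetic two-day profile: a constant 0.5 plus a 24-hour cosine.
idx = pd.date_range("2015-01-01", periods=2 * 96, freq="15min")
hours = np.arange(2 * 96) * 0.25
demo = pd.DataFrame({"Datetime": idx,
                     "Value": 0.5 + 0.25 * np.cos(2 * np.pi * hours / 24)})
spec = daily_fft_amplitudes(demo)
print(spec.iloc[:, :3].round(3))
```

With 96 samples at 15-minute spacing, `np.fft.rfftfreq` returns 49 frequencies from 0 Hz up to the Nyquist frequency of 1/1800 Hz (about 0.00056 Hz), which matches the range of the column labels in the output above.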


The Discrete Fourier Transform (DFT) of a daily load profile (15-min interval, 96 timestamps in total) yields a spectrum whose components have periods ranging from 0.5 hour to 12 hour. However, different load profiles may have their spectral peaks at different frequencies. Grouping the frequencies into bins lets us compare profiles whose peaks fall at nearby but not identical frequencies. First, we're going to explore the load profiles' variations within each day.

In [9]:
lp_m.get_fft_w_window(df_sample, get_bins=True, year=2015)

| | 0.5hr ~ 0.75hr | 0.75hr ~ 1hr | 1hr ~ 1.25hr | 1.25hr ~ 1.5hr | 1.5hr ~ 1.75hr | 1.75hr ~ 2hr | 2hr ~ 4hr | 4hr ~ 8hr | 8hr ~ 12hr |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.560095 | 0.214315 | 0.133605 | 0.135828 | 0.063123 | 0.081161 | 0.256065 | 0.058384 | 0.037455 |
| 0 | 0.383927 | 0.323965 | 0.117577 | 0.119553 | 0.079034 | 0.139508 | 0.114701 | 0.133024 | 0.063681 |
| 0 | 0.583062 | 0.313226 | 0.132448 | 0.050116 | 0.058875 | 0.032050 | 0.221065 | 0.075057 | 0.035373 |
| 0 | 0.457712 | 0.319624 | 0.201203 | 0.068819 | 0.074594 | 0.073603 | 0.191122 | 0.150972 | 0.003735 |
| 0 | 0.527714 | 0.211221 | 0.083400 | 0.182671 | 0.061812 | 0.024930 | 0.191873 | 0.204868 | 0.027339 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 0 | 0.415694 | 0.207822 | 0.115947 | 0.095497 | 0.061294 | 0.044101 | 0.194305 | 0.161374 | 0.038264 |
| 0 | 0.574101 | 0.208981 | 0.108214 | 0.109460 | 0.052160 | 0.062960 | 0.190028 | 0.160352 | 0.023114 |
| 0 | 0.396542 | 0.213972 | 0.146251 | 0.087010 | 0.061767 | 0.095723 | 0.170710 | 0.099080 | 0.022520 |
| 0 | 0.607827 | 0.233111 | 0.123146 | 0.092385 | 0.051692 | 0.044001 | 0.268461 | 0.051793 | 0.005566 |
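
The binning idea can be sketched as follows. The bin edges are taken from the column labels above. The rest is assumed for illustration: the helper name `bin_spectrum_by_period`, bins closed on their upper edge, and summing the amplitudes within each bin. The sample output appears consistent with summation (for the first day, the `4hr ~ 8hr` value equals the sum of the `4e-05`, `5e-05`, and `6e-05` columns up to rounding), but the package's actual code is not shown in this workbook.

```python
import numpy as np
import pandas as pd

# Period edges in hours, read off the binned column labels above.
PERIOD_EDGES_HR = [0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 4, 8, 12]

def bin_spectrum_by_period(amplitudes, freqs_hz, edges_hr=PERIOD_EDGES_HR):
    """Sum FFT amplitudes into period bins (hypothetical helper)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    keep = freqs_hz > 0  # the zero-frequency term has no finite period
    # Round so that periods such as 12 h land exactly on a bin edge.
    periods_hr = np.round(1.0 / freqs_hz[keep] / 3600.0, 6)
    labels = [f"{lo:g}hr ~ {hi:g}hr"
              for lo, hi in zip(edges_hr[:-1], edges_hr[1:])]
    # Assumption: bins are (lower, upper], with the lowest edge included.
    bins = pd.cut(periods_hr, bins=edges_hr, labels=labels, include_lowest=True)
    # Periods outside the edges (e.g. 24 h) get no bin and are dropped.
    return pd.Series(amplitudes[keep]).groupby(bins, observed=False).sum()

# A flat spectrum makes each bin's sum equal to the number of frequencies in it.
freqs = np.fft.rfftfreq(96, d=900.0)
flat = np.ones_like(freqs)
binned = bin_spectrum_by_period(flat, freqs)
print(binned)
```

Because the DFT frequencies are evenly spaced, the short-period bins collect many more components than the long-period ones, which is worth keeping in mind when comparing bin magnitudes.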


#### Part 2. K-Means clustering with frequency-domain features