Export Controlled: ECCN EAR1E998

Warning: This document contains technical data whose export is restricted by the Bureau of Industry & Security’s Export Administration Regulations and cannot be exported or re-exported without the authorization of the U.S. government. Violations of these export laws are subject to severe criminal penalties. Diversion contrary to U.S. law is prohibited.

Start with:
•supply fan power if different chillers

Objective:
• Generate features from chiller start/stop events.
  Features include: start/stop counts, start hour, stop hour, hour per start, hour per stop


In [1]:
from datetime import timedelta
import pandas as pd
import matplotlib.pyplot as plt
from UTCDAL.Design.Preprocessing import start_stop_stats as ss
from UTCDAL.Design.FeatureEngineering import distristat as dist
from UTCDAL.Design.Preprocessing import pca

In [2]:
def get_chiller_state(data, idd, date):
    '''
    Return the on/off flag of chiller `idd` at timestamp `date`,
    or -1 if there is no reading at that timestamp.
    '''
    data = data[data['id'] == idd]
    data = data[data['DT_SENSOR_READTIME'] == date]
    if data.shape[0] == 0:  # no reading at this timestamp (e.g. no previous-day data)
        result = -1
    else:
        result = data['flag'].values[0]
    return result

## Load data

In [3]:
chiller_all = pd.read_csv('line_kw_in.csv', index_col=0)  # load chiller power data
chiller_all['DT_SENSOR_READTIME'] = pd.to_datetime(chiller_all['DT_SENSOR_READTIME'])
chiller_all['year_month_day'] = pd.to_datetime(chiller_all['year_month_day'])
start_date = pd.to_datetime('2017-02-02 00:00:00')
chiller_data = chiller_all[chiller_all['year_month_day'] > start_date]

In [4]:
chiller_data.head()

      id  Line_KW  DT_SENSOR_READTIME year_month_day
1453   1      213 2017-02-03 00:00:00     2017-02-03
1454   1      176 2017-02-03 00:15:00     2017-02-03
1455   1      204 2017-02-03 00:30:00     2017-02-03
1456   1      218 2017-02-03 00:45:00     2017-02-03
1457   1      176 2017-02-03 01:00:00     2017-02-03


## Getting sampling rate

In [5]:
ch_ids = chiller_data['id'].unique().tolist()
time_list = chiller_data[chiller_data['id'] == ch_ids[0]]['DT_SENSOR_READTIME'].tolist()
time_list[0:5]

[Timestamp('2017-02-03 00:00:00'),
 Timestamp('2017-02-03 00:15:00'),
 Timestamp('2017-02-03 00:30:00'),
 Timestamp('2017-02-03 00:45:00'),
 Timestamp('2017-02-03 01:00:00')]

In [6]:
sampling_rate = ss.get_sampling_rate(time_list)
sampling_rate

96
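The value 96 matches the 15-minute spacing of the timestamps above (24 hours / 15 minutes = 96 samples per day). The internals of `ss.get_sampling_rate` are not shown here, so the following is only a minimal sketch of one way such a value could be derived with plain pandas. The helper name `samples_per_day` is hypothetical.

```python
import pandas as pd

def samples_per_day(timestamps):
    """Hypothetical helper: infer samples per day from the most common gap between readings."""
    times = pd.Series(pd.to_datetime(timestamps)).sort_values()
    # Most frequent interval between consecutive readings (robust to a few gaps)
    step = times.diff().dropna().value_counts().idxmax()
    return int(pd.Timedelta(days=1) / step)

# Example with the same 15-minute spacing as the data above
times = pd.date_range("2017-02-03", periods=5, freq="15min")
rate = samples_per_day(times)
print(rate)
```

Using the most common interval rather than the first difference keeps the estimate stable when a few readings are missing.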

## Getting the on/off state of each chiller

In [7]:
# chiller_data was created by filtering chiller_all; take an explicit copy
# so that adding a column does not raise pandas' SettingWithCopyWarning
chiller_data = chiller_data.copy()
chiller_data['flag'] = ss.prepro_start_stop(chiller_data['Line_KW'].tolist(), 1)


In [8]:
chiller_id = 1
ch_days = chiller_data[chiller_data['id'] == chiller_id]
f_date = ch_days['year_month_day'].iloc[0]
ch_days = ch_days[ch_days['year_month_day'] == f_date]
print(ch_days)
print(f_date)

      id  Line_KW  DT_SENSOR_READTIME year_month_day  flag
1453   1      213 2017-02-03 00:00:00     2017-02-03     1
1454   1      176 2017-02-03 00:15:00     2017-02-03     1
1455   1      204 2017-02-03 00:30:00     2017-02-03     1
1456   1      218 2017-02-03 00:45:00     2017-02-03     1
1457   1      176 2017-02-03 01:00:00     2017-02-03     1
1458   1      212 2017-02-03 01:15:00     2017-02-03     1
1459   1      216 2017-02-03 01:30:00     2017-02-03     1
1460   1      173 2017-02-03 01:45:00     2017-02-03     1
1461   1      205 2017-02-03 02:00:00     2017-02-03     1
1462   1      237 2017-02-03 02:15:00     2017-02-03     1
1463   1      193 2017-02-03 02:30:00     2017-02-03     1
1464   1      199 2017-02-03 02:45:00     2017-02-03     1
1465   1      226 2017-02-03 03:00:00     2017-02-03     1
1466   1      194 2017-02-03 03:15:00     2017-02-03     1
1467   1      204 2017-02-03 03:30:00     2017-02-03     1
1468   1      221 2017-02-03 03:45:00     2017-02-03    
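The `flag` column printed above comes from `ss.prepro_start_stop`, whose implementation is not part of this notebook. As a rough illustration only, assuming the second argument (`1`) is a power threshold in kW above which the chiller counts as running, an equivalent on/off flag could be sketched as follows. The function `power_to_flag` and the parameter `threshold_kw` are hypothetical names.

```python
import pandas as pd

def power_to_flag(power_kw, threshold_kw=1.0):
    """Hypothetical stand-in: 1 when the reading exceeds the threshold, else 0."""
    return [1 if p > threshold_kw else 0 for p in power_kw]

# Small made-up series: running, running, off, off, running
df = pd.DataFrame({"Line_KW": [213, 176, 0, 0, 204]})
df["flag"] = power_to_flag(df["Line_KW"].tolist(), 1)
print(df["flag"].tolist())
```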

## Getting standard sampling time for each day

In [9]:
timelist_standard = ss.get_timestamp_perday(f_date, sampling_rate)
print(timelist_standard)

[Timestamp('2017-02-03 00:00:00'), Timestamp('2017-02-03 00:15:00'), Timestamp('2017-02-03 00:30:00'), Timestamp('2017-02-03 00:45:00'),
 Timestamp('2017-02-03 01:00:00'), Timestamp('2017-02-03 01:15:00'), Timestamp('2017-02-03 01:30:00'), Timestamp('2017-02-03 01:45:00'),
 Timestamp('2017-02-03 02:00:00'), Timestamp('2017-02-03 02:15:00'), Timestamp('2017-02-03 02:30:00'), Timestamp('2017-02-03 02:45:00'),
 Timestamp('2017-02-03 03:00:00'), Timestamp('2017-02-03 03:15:00'), Timestamp('2017-02-03 03:30:00'), Timestamp('2017-02-03 03:45:00'),
 Timestamp('2017-02-03 04:00:00'), Timestamp('2017-02-03 04:15:00'), Timestamp('2017-02-03 04:30:00'), Timestamp('2017-02-03 04:45:00'),
 Timestamp('2017-02-03 05:00:00'), Timestamp('2017-02-03 05:15:00'), Timestamp('2017-02-03 05:30:00'), Timestamp('2017-02-03 05:45:00'),
 Timestamp('2017-02-03 06:00:00'), Timestamp('2017-02-03 06:15:00'), Timestamp('2017-02-03 06:30:00'), Timestamp('2017-02-03 06:45:00'),
 Timestamp('2017-02-03 07:00:00'), ...
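The standard grid printed above is an evenly spaced list of timestamps covering one day. How `ss.get_timestamp_perday` builds it is not shown, so this is a sketch of the same idea in plain pandas, assuming the grid starts at midnight and has `samples_per_day` equally spaced entries. The helper name `daily_grid` is hypothetical.

```python
import pandas as pd

def daily_grid(day, samples_per_day):
    """Hypothetical helper: evenly spaced timestamps covering one calendar day."""
    day = pd.Timestamp(day).normalize()           # midnight of that day
    step = pd.Timedelta(days=1) / samples_per_day  # 15 minutes when samples_per_day == 96
    return [day + i * step for i in range(samples_per_day)]

grid = daily_grid("2017-02-03", 96)
print(grid[0], grid[1], grid[-1])
```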

## Check whether data is missing

In [10]:
missflag = ss.day_time_check(ch_days['DT_SENSOR_READTIME'].tolist(), timelist_standard)
print(missflag)

False
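A `False` result here is read as "no expected timestamp is missing for this day". The exact semantics of `ss.day_time_check` are an assumption on that point. A minimal sketch of such a check, using a set comparison between observed and expected timestamps, could look like this (the name `has_missing_samples` is hypothetical):

```python
import pandas as pd

def has_missing_samples(observed, expected):
    """Hypothetical check: True if any expected timestamp has no matching reading."""
    return not set(expected).issubset(observed)

expected = list(pd.date_range("2017-02-03", periods=4, freq="15min"))
complete = expected.copy()
gappy = expected[:2] + expected[3:]  # drop the 00:30 reading

print(has_missing_samples(complete, expected))
print(has_missing_samples(gappy, expected))
```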


## Align data to the standard timestamps

In [11]:
ch_days = ch_days[ch_days['DT_SENSOR_READTIME'].isin(timelist_standard)]

## Last timestamp before the first date

In [12]:
p_date = f_date - timedelta(minutes=24*60/sampling_rate)
print(p_date)

2017-02-02 23:45:00


## Get chiller state at the end of the previous day

In [13]:
pre_state = get_chiller_state(chiller_data, chiller_id, p_date)  # -1 means no reading exists at the previous timestamp
print(pre_state)

-1


## Create a Startstopinfo object for the day

In [14]:
ss_obj = ss.Startstopinfo(ch_days['flag'].tolist(), missingflag=missflag)

## Get start/stop jumps for each day

In [15]:
ss_obj.get_jump(update=True, p_state=pre_state)  # if update=True, update the jump info using the previous day's state

## Get start/stop info for each day

In [16]:
[daily_ss_stat, jump2start_index, start_info] = ss_obj.get_start_info()
daily_ss_stat['id'] = chiller_id
daily_ss_stat['year_month_day'] = f_date
print(daily_ss_stat)

   cycle  start  stop  start_hour  stop_hour  hour_per_start  hour_per_stop  \
0     11      5     6        15.5        8.5             3.1       1.416667   

   id year_month_day  
0   1     2017-02-03  
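The printed row is internally consistent with a simple reading of the columns: `cycle = start + stop` (5 + 6 = 11), `start_hour + stop_hour = 24` (15.5 + 8.5), `hour_per_start = start_hour / start` (15.5 / 5 = 3.1) and `hour_per_stop = stop_hour / stop` (8.5 / 6 ≈ 1.4167). These relationships are inferred from this single output row, not from the `Startstopinfo` source, so the sketch below is an illustration under that assumption rather than the library's method. The function `start_stop_summary` is a hypothetical name.

```python
import numpy as np

def start_stop_summary(flags, samples_per_day=96, prev_state=-1):
    """Hypothetical re-derivation of daily start/stop statistics from a 0/1 flag list."""
    flags = np.asarray(flags, dtype=int)
    # If the previous day's last state is known, prepend it so a transition
    # at midnight is counted; -1 means unknown and is ignored.
    seq = flags if prev_state == -1 else np.concatenate(([prev_state], flags))
    jumps = np.diff(seq)
    starts = int((jumps == 1).sum())   # off -> on transitions
    stops = int((jumps == -1).sum())   # on -> off transitions
    hours_per_sample = 24 / samples_per_day
    on_hours = float(flags.sum() * hours_per_sample)
    off_hours = float((len(flags) - flags.sum()) * hours_per_sample)
    return {
        "cycle": starts + stops,
        "start": starts,
        "stop": stops,
        "start_hour": on_hours,
        "stop_hour": off_hours,
        "hour_per_start": on_hours / starts if starts else float("nan"),
        "hour_per_stop": off_hours / stops if stops else float("nan"),
    }

# Toy day with 8 samples (3 hours each): on, on, off, off, on, on, on, off
flags = [1, 1, 0, 0, 1, 1, 1, 0]
stats = start_stop_summary(flags, samples_per_day=8)
print(stats)
```

With an unknown previous state, a day that begins with the chiller already running does not count that initial segment as a start, which is one way to end up with more stops than starts, as in the row above.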
