<a href="https://colab.research.google.com/github/BitKnitting/transfer_nilm_exploration/blob/master/ExploreNILM_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Overview
I am teaching myself Deep Learning methods using TensorFlow 2.0 and Keras.  I wanted to learn by applying these methods to a scenario that interests me.  The scenario I chose is detecting when a microwave is on from a house's aggregated electricity readings.  This field of research is called NILM (Non-Intrusive Load Monitoring).  It is called this because it takes readings the way we typically measure a house's electricity, with all appliances captured together, and disaggregates them into the individual appliances that used the electricity.  Typical appliances are microwaves, washing machines, dryers, heaters, and air conditioners.

## Specific Scenario
To get predictions from input data fed into a Deep Learning model, I need to identify two things:
- The Deep Learning model
- The input data

### Deep Learning Model
I chose to replicate the Deep Learning model described in the paper [ _Transfer Learning for Non-Intrusive Load Monitoring_ ](https://arxiv.org/abs/1902.08835). Figure 1 on page 4 shows the Deep Learning model used in their research:

![Figure 1 of the paper: the Deep Learning model](https://drive.google.com/uc?id=1fkgxjqm0UzTKisCIyKJV9BNQhuAiBt6E)  

The researchers call this _seq2point Learning_ because the input is a sequence of 599 samples of aggregate electricity readings and the output is a single point: the reading of the target appliance (in this case a microwave) at the midpoint of that window.
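
To make the seq2point idea concrete, here is a minimal NumPy sketch on made-up numbers. The arrays and the helper name `seq2point_pair` are my own assumptions for illustration; they are not taken from the researchers' code.

```python
import numpy as np

WINDOW_LENGTH = 599  # window size used in the paper


def seq2point_pair(aggregate, appliance, start, window_length=WINDOW_LENGTH):
    """Return (input window, midpoint target) for the window beginning at `start`."""
    window = aggregate[start:start + window_length]
    # With 599 samples, the midpoint sits 299 samples after the window start.
    midpoint = start + (window_length - 1) // 2
    return window, appliance[midpoint]


# Made-up readings, only to show the shapes involved.
aggregate = np.arange(1000, dtype=np.float32)
appliance = np.arange(1000, dtype=np.float32) * 10

window, target = seq2point_pair(aggregate, appliance, start=0)
print(window.shape)  # (599,)
print(target)        # 2990.0
```

One window of 599 aggregate readings goes in, and a single appliance reading comes out.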

### Input Data

Section IV of [the research paper](https://arxiv.org/abs/1902.08835) discusses the data sets.

I decided to use the REDD dataset.

#### Building the Input Data
The researchers kindly provided [a GitHub repo with their code](https://github.com/MingjunZhong/transferNILM).  I used:
- a modified version of [`create_trainset_redd.py`](https://github.com/MingjunZhong/transferNILM/blob/master/dataset_management/redd/create_trainset_redd.py).  The code uses the parameters file to get the data out of the other researchers' files stored on the internet.  It creates training, validation, and test data sets that have been normalized using the method the researchers discuss in their paper.
- the parameters provided in [`redd_parameters.py`](https://github.com/MingjunZhong/transferNILM/blob/master/dataset_management/redd/redd_parameters.py).

My [modified version of `create_trainset_redd.py`](https://github.com/BitKnitting/transfer_nilm_exploration/blob/master/code/create_trainset_redd.py) creates [three files](https://github.com/BitKnitting/transfer_nilm_exploration/tree/master/created_data/REDD/microwave) - a training, validation, and test file.  

The files are saved as zipped pickle files because I find this format to be the easiest and fastest way of loading this type of data into Pandas.  
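
As a rough illustration of the zipped pickle round trip, the sketch below writes a tiny made-up dataframe and reads it back using only pandas' own `to_pickle` and `read_pickle`. The file name and the data are hypothetical, and this is not the code in my helper file.

```python
import os
import tempfile

import pandas as pd

# Small made-up frame with the same column names as the created files.
df_small = pd.DataFrame({"aggregate": [0.1, 0.2, 0.3],
                         "microwave": [0.0, 0.0, 1.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.pkl.zip")  # hypothetical file name
    df_small.to_pickle(path, compression="zip")          # write a zipped pickle
    df_loaded = pd.read_pickle(path, compression="zip")  # read it back

print(df_loaded.equals(df_small))  # True
```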



# Load The Data
I wrote a Python file - [ `ziptodf.py` ](https://github.com/BitKnitting/transfer_nilm_exploration/blob/master/code/python_lib/ziptodf.py) - that loads zipped pickle files and can then show basic stats about the data.  

In [1]:
!wget https://raw.githubusercontent.com/BitKnitting/transfer_nilm_exploration/master/code/python_lib/ziptodf.py

--2019-12-08 13:14:34--  https://raw.githubusercontent.com/BitKnitting/transfer_nilm_exploration/master/code/python_lib/ziptodf.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1277 (1.2K) [text/plain]
Saving to: ‘ziptodf.py’


2019-12-08 13:14:39 (278 MB/s) - ‘ziptodf.py’ saved [1277/1277]



In [0]:
import ziptodf

In [0]:
df = ziptodf.get_dataframe_from_pkl_zip('https://raw.githubusercontent.com/BitKnitting/transfer_nilm_exploration/master/created_data/REDD/microwave/microwave_training_.pkl.zip')

In [4]:
ziptodf.print_stats(df,'REDD Microwave training data')

REDD Microwave training data
**************************
Start index: 0
--------------------------
End index: 300350
--------------------------
Shape: (300351, 2)
--------------------------
Data types: 
aggregate    float64
microwave    float64
dtype: object
--------------------------
Number of missing values:
aggregate    0
microwave    0
dtype: int64
--------------------------
Summary Stats:
            aggregate      microwave
count  300351.000000  300351.000000
mean       -0.208512      -0.610165
std         0.736927       0.130792
min        -0.601689      -0.625000
25%        -0.489411      -0.622500
50%        -0.329694      -0.620000
75%        -0.239884      -0.619167
max         8.755021       1.848750
--------------------------
Head:
    aggregate  microwave
0  -0.594218  -0.620000
1  -0.594218  -0.620000
2  -0.594211  -0.619583
3  -0.594240  -0.620000
4  -0.594295  -0.620000
--------------------------


The dataframe has two columns. The first column ('aggregate') holds the normalized aggregate electricity readings from the houses measured during the REDD project.  This is the input data.  The second column ('microwave') holds the microwave's normalized electricity readings.  This is the target data.
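
To relate the normalized values back to watts, here is a small sketch. It assumes the normalization is the usual "subtract a mean, divide by a standard deviation" and that the microwave constants are 500 W and 800 W. Those two numbers are my assumption and are not stated in this notebook, so check them against `redd_parameters.py`. They are at least consistent with the stats above, where the microwave minimum of -0.625 would correspond to 0 W.

```python
# Assumed normalization constants for the microwave column, in watts.
# They are NOT stated in this notebook; verify them in redd_parameters.py.
MICROWAVE_MEAN = 500.0
MICROWAVE_STD = 800.0


def normalize(watts, mean=MICROWAVE_MEAN, std=MICROWAVE_STD):
    """Map a reading in watts to its normalized value."""
    return (watts - mean) / std


def denormalize(value, mean=MICROWAVE_MEAN, std=MICROWAVE_STD):
    """Map a normalized value back to watts."""
    return value * std + mean


print(normalize(0.0))          # -0.625, the same as the column's minimum
print(denormalize(1.848750))   # the column's maximum, expressed in watts
```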

The researchers specify a window size of 599.  I will set the input_shape=(599,1).

# Get Data Ready for the Deep Learning Model

I'll start by loading TensorFlow (2.0). 


In [5]:
# Install TensorFlow
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

import tensorflow as tf
tf.__version__

TensorFlow 2.x selected.


'2.0.0'

In [0]:
dataset = tf.data.Dataset.from_tensor_slices((df['aggregate'].values, df['microwave'].values))

# Make the Model
The researchers note in their paper: _...a fixed-length window of aggregate active power consumption signal is given as input. For the neural networks employed in this paper, a sample window has 599 data points...In our experiments, a sample window was generated by sliding the window forward by a single data point, and so all the possible sample windows were used for training. The windows of the mains were used as the inputs to the neural networks, whilst the midpoints of the corresponding windows of the appliances were used as the targets._
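
The sliding-window scheme in the quote can be sketched with plain NumPy indexing. The function name `make_windows` and the toy arrays are assumptions of mine for illustration; the researchers' own code may build the windows differently.

```python
import numpy as np


def make_windows(mains, appliance, window_length):
    """Slide a window forward one sample at a time.

    Returns every possible window of `mains` together with the appliance
    reading at the midpoint of each window.
    """
    n_windows = len(mains) - window_length + 1
    offset = (window_length - 1) // 2
    # Row i of `index` holds the positions i, i+1, ..., i+window_length-1.
    index = np.arange(n_windows)[:, None] + np.arange(window_length)[None, :]
    inputs = mains[index]
    targets = appliance[offset:offset + n_windows]
    return inputs, targets


# Toy data with a window of 5 instead of 599 so the result is easy to read.
mains = np.array([1, 2, 3, 4, 5, 6, 7], dtype=np.float32)
appliance = np.array([10, 20, 30, 40, 50, 60, 70], dtype=np.float32)

inputs, targets = make_windows(mains, appliance, window_length=5)
print(inputs.shape)  # (3, 5)
print(targets)       # [30. 40. 50.]
```

Materializing every window this way would be memory hungry on the real training file (about 180 million values for 300351 rows and a window of 599), so generating windows batch by batch is likely the more practical route.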
  
My goal is to replicate the model used in [`cnnModel.py`](https://github.com/MingjunZhong/transferNILM/blob/master/cnnModel.py).  That model is built with a functional approach that is more advanced than my current knowledge affords, so I will replicate it as closely as possible using Keras's `Sequential` approach.  
Here is a copy of the model within `cnnModel.py`:  
```python
def get_model(appliance, input_tensor, window_length, transfer_dense=False, transfer_cnn=False,
              cnn='kettle', n_dense=1, pretrainedmodel_dir='./models/'):

    reshape = Reshape((-1, window_length, 1),
                      )(input_tensor)

    cnn1 = Conv2D(filters=30,
                  kernel_size=(10, 1),
                  strides=(1, 1),
                  padding='same',
                  activation='relu',
                  )(reshape)

    cnn2 = Conv2D(filters=30,
                  kernel_size=(8, 1),
                  strides=(1, 1),
                  padding='same',
                  activation='relu',
                  )(cnn1)

    cnn3 = Conv2D(filters=40,
                  kernel_size=(6, 1),
                  strides=(1, 1),
                  padding='same',
                  activation='relu',
                  )(cnn2)

    cnn4 = Conv2D(filters=50,
                  kernel_size=(5, 1),
                  strides=(1, 1),
                  padding='same',
                  activation='relu',
                  )(cnn3)

    cnn5 = Conv2D(filters=50,
                  kernel_size=(5, 1),
                  strides=(1, 1),
                  padding='same',
                  activation='relu',
                  )(cnn4)

    flat = Flatten(name='flatten')(cnn5)

    d = Dense(1024, activation='relu', name='dense')(flat)

    if n_dense == 1:
        label = d
    elif n_dense == 2:
        d1 = Dense(1024, activation='relu', name='dense1')(d)
        label = d1
    elif n_dense == 3:
        d1 = Dense(1024, activation='relu', name='dense1')(d)
        d2 = Dense(1024, activation='relu', name='dense2')(d1)
        label = d2

    d_out = Dense(1, activation='linear', name='output')(label)

    model = Model(inputs=input_tensor, outputs=d_out)

    session = K.get_session()

    if transfer_dense:
        log("Transfer learning...")
        log("...loading an entire pre-trained model")
        weights_loader(model, pretrainedmodel_dir+'/cnn_s2p_' + appliance + '_pointnet_model')
        model_def = model
    elif transfer_cnn and not transfer_dense:
        log("Transfer learning...")
        log('...loading a ' + appliance + ' pre-trained-cnn')
        cnn_weights_loader(model, cnn, pretrainedmodel_dir)
        model_def = model
        for idx, layer1 in enumerate(model_def.layers):
            if hasattr(layer1, 'kernel_initializer') and 'conv2d' not in layer1.name and 'cnn' not in layer1.name:
                log('Re-initialize: {}'.format(layer1.name))
                layer1.kernel.initializer.run(session=session)

    elif not transfer_dense and not transfer_cnn:
        log("Standard training...")
        log("...creating a new model.")
        model_def = model
    else:
        raise argparse.ArgumentTypeError('Model selection error.')

    # Printing, logging and plotting the model
    print_summary(model_def)
    # plot_model(model, to_file='./model.png', show_shapes=True, show_layer_names=True, rankdir='TB')

    # Adding network structure to both the log file and output terminal
    files = [x for x in os.listdir('./') if x.endswith(".log")]
    with open(max(files, key=os.path.getctime), 'a') as fh:
        # Pass the file handle in as a lambda function to make it callable
        model_def.summary(print_fn=lambda x: fh.write(x + '\n'))

    # Check weights slice
    for v in tf.trainable_variables():
        if v.name == 'conv2d_1/kernel:0':
            cnn1_weights = session.run(v)
    return model_def, cnn1_weights

```





In [0]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D,Flatten,Dense,Reshape

In [0]:
model = Sequential()
# The first layer of a Sequential model is the one that declares the input shape:
# one window of 599 aggregate readings.
model.add(Reshape((-1,599,1),input_shape=(599,1)))
model.add(Conv2D(filters=30,kernel_size=(10,1),strides=(1,1),padding='same',activation='relu'))
model.add(Conv2D(filters=30,kernel_size=(8,1),strides=(1,1),padding='same',activation='relu'))
model.add(Conv2D(filters=40,kernel_size=(6,1),strides=(1,1),padding='same',activation='relu'))
model.add(Conv2D(filters=50,kernel_size=(5,1),strides=(1,1),padding='same',activation='relu'))
model.add(Conv2D(filters=50,kernel_size=(5,1),strides=(1,1),padding='same',activation='relu'))
model.add(Flatten())
model.add(Dense(1024,activation='relu'))
# A single linear unit: the predicted microwave reading at the window's midpoint.
model.add(Dense(1, activation='linear', name='output'))

# Compile the Model


In [0]:
# This is a regression problem, so track mean absolute error rather than accuracy.
model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['mae'])

# Fit the Model


In [21]:
model.fit(dataset, epochs=10)



To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.



ValueError: ignored

In [38]:
import numpy as np
offset = int(0.5*(599-1.0))
display(offset)
max_batchsize = 33372 - 2 * offset
display(max_batchsize)

299

32774

In [36]:
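# Assumed definition: the cell that created skip_idx is missing, and the
# output below shows the floats 0 through 29.
skip_idx = np.arange(30.0)
np.random.shuffle(skip_idx)
skip_idx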

array([ 9., 23., 20., 19., 11.,  6.,  8., 12., 27., 17., 10.,  3., 26.,
       25.,  4., 14.,  5., 15., 18., 24.,  1.,  7.,  2.,  0., 16., 29.,
       22., 21., 28., 13.])

In [41]:
indices = np.arange(max_batchsize)
print('indices prior to shuffling: {}'.format(indices))
np.random.shuffle(indices)
print('indices after shuffling: {}'.format(indices))

indices prior to shuffling: [    0     1     2 ... 32771 32772 32773]
indices after shuffling: [15694 11232  7599 ... 21021 21856  2652]


In [31]:
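# Assumed definition: the cell that created arr is missing, and the
# output below shows the integers 0 through 2.
arr = np.arange(3)
np.random.shuffle(arr)
arr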

array([1, 2, 0])

In [57]:

# Batch before shuffle.
dataset = tf.data.Dataset.from_tensor_slices([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])
dataset = dataset.batch(2)
dataset = dataset.shuffle(10)

for elem in dataset:
  print(elem)

tf.Tensor([0 1], shape=(2,), dtype=int32)
tf.Tensor([0 0], shape=(2,), dtype=int32)
tf.Tensor([2 2], shape=(2,), dtype=int32)
tf.Tensor([2 3], shape=(2,), dtype=int32)
tf.Tensor([1 1], shape=(2,), dtype=int32)


In [46]:

# Shuffle before batch.
dataset = tf.data.Dataset.from_tensor_slices([0, 0, 0, 1, 1, 1, 2, 2, 2])
dataset = dataset.shuffle(9)
dataset = dataset.batch(3)

for elem in dataset:
  print(elem)

tf.Tensor([1 1 0], shape=(3,), dtype=int32)
tf.Tensor([2 2 0], shape=(3,), dtype=int32)
tf.Tensor([1 2 0], shape=(3,), dtype=int32)


In [16]:
help(np.arange)

Help on built-in function arange in module numpy:

arange(...)
    arange([start,] stop[, step,], dtype=None)
    
    Return evenly spaced values within a given interval.
    
    Values are generated within the half-open interval ``[start, stop)``
    (in other words, the interval including `start` but excluding `stop`).
    For integer arguments the function is equivalent to the Python built-in
    `range` function, but returns an ndarray rather than a list.
    
    When using a non-integer step, such as 0.1, the results will often not
    be consistent.  It is better to use `numpy.linspace` for these cases.
    
    Parameters
    ----------
    start : number, optional
        Start of interval.  The interval includes this value.  The default
        start value is 0.
    stop : number
        End of interval.  The interval does not include this value, except
        in some cases where `step` is not an integer and floating point
        round-off affects the length of `out`.
   

In [75]:
indices = np.arange(max_batchsize)
indices

array([    0,     1,     2, ..., 32771, 32772, 32773])

In [78]:
for start_idx in range(0, max_batchsize, 1000):
    excerpt = indices[start_idx:start_idx + 1000]
    display(start_idx)

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

11000

12000

13000

14000

15000

16000

17000

18000

19000

20000

21000

22000

23000

24000

25000

26000

27000

28000

29000

30000

31000

32000
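
The index experiments above (the offset of 299, shuffling `indices`, and stepping through them in chunks) can be combined into a mini-batch generator. This is a minimal sketch under my own assumptions: the function name `batch_generator`, its parameters, and the toy data are hypothetical, and it is not the researchers' data feeder.

```python
import numpy as np


def batch_generator(mains, appliance, window_length, batch_size,
                    shuffle=True, seed=None):
    """Yield (inputs, targets) mini-batches of windows and midpoint targets."""
    offset = (window_length - 1) // 2
    # Number of valid window start positions (odd window_length assumed).
    n_windows = len(mains) - 2 * offset
    indices = np.arange(n_windows)
    if shuffle:
        np.random.default_rng(seed).shuffle(indices)
    for start_idx in range(0, n_windows, batch_size):
        excerpt = indices[start_idx:start_idx + batch_size]
        # A window starting at s covers samples s .. s + window_length - 1.
        inputs = np.stack([mains[s:s + window_length] for s in excerpt])
        # The target is the appliance reading at each window's midpoint.
        targets = appliance[excerpt + offset]
        yield inputs, targets


# Toy data: 20 readings, a window of 5, batches of 6 windows.
mains = np.arange(20, dtype=np.float32)
appliance = mains * 10

batches = list(batch_generator(mains, appliance, window_length=5,
                               batch_size=6, seed=0))
for inputs, targets in batches:
    print(inputs.shape, targets.shape)
```

With 20 readings and a window of 5 there are 16 windows, so the generator yields two batches of 6 and a final batch of 4. Shuffling the window start positions before slicing them into batches gives every batch a random mix of windows, which is the "shuffle before batch" behaviour seen in the `tf.data` experiment above.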