<a href="https://colab.research.google.com/github/lisaong/diec/blob/tf2/day4/edge_online_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Edge Online Learning

In this notebook, we will train a simple model that will be updated on the device using incoming data.

TensorFlow Lite does not support training on the device CPU (yet), so we have to use the full version of TensorFlow. This means the model should be kept simple so that incremental training is not prohibitively expensive on the device.

1. Train a simple model incrementally using Keras (TensorFlow backend)
2. Deploy the full TensorFlow model on a Raspberry Pi 3
3. Continue training with incoming data on the Raspberry Pi 3

The model we will train is a binary classification gesture detector. The gesture data collection is covered separately. This notebook covers the machine learning and deployment portions.

The gesture data consists of accelerometer and compass readings collected from the BBC micro:bit. The target is True or False, depending on whether the gesture is happening.

In [0]:
%tensorflow_version 2.x

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pickle
import seaborn as sns

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Conv1D, Flatten
from tensorflow.keras.callbacks import ModelCheckpoint

sns.set(style='whitegrid')
%matplotlib inline

## Load initial dataset

Upload your `data.csv` to the Files tab of your Colab session before running the cell below.

Do not put it in the `sample_data` folder: the cell below reads the file from `./data.csv`.

In [0]:
url = './data.csv'

df = pd.read_csv(url, names=['Gesture', 'AccX', 'AccY', 'AccZ', 'Heading'])
df.head()

In [0]:
# convert True to 1 and False to 0
# in this case we are just converting boolean types
# if converting non boolean types, use df['Gesture'].map(...)
df['Gesture_enc'] = df['Gesture'].astype('int8')
df.head()

In [0]:
# plot the raw features (top) and the encoded gesture label (bottom)

fig, ax = plt.subplots(figsize=(15, 8), nrows=2)

df[['AccX', 'AccY', 'AccZ', 'Heading']].plot(ax=ax[0])
df[['Gesture_enc']].plot(ax=ax[1])
plt.show()

In [0]:
# scale the data and plot again to see more patterns

scaler = MinMaxScaler(feature_range=(-1, 1))
scaled_data = scaler.fit_transform(df[['AccX', 'AccY', 'AccZ', 'Heading']])

scaled_df = pd.DataFrame(scaled_data,
                         columns=['AccX_scaled', 'AccY_scaled', 'AccZ_scaled', 'Heading_scaled'])
scaled_df.head()
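
For reference, scaling to the range (-1, 1) maps each column's minimum to -1 and its maximum to 1. The sketch below reproduces that arithmetic with plain NumPy on made-up readings. The array values and variable names are illustrative assumptions and are not taken from `data.csv`.

```python
import numpy as np

# Made-up readings (rows = samples, columns = two features); illustrative only.
readings = np.array([[0.0, 100.0],
                     [5.0, 200.0],
                     [10.0, 300.0]])

lo, hi = -1.0, 1.0                  # target feature range
col_min = readings.min(axis=0)      # per-column minimum
col_max = readings.max(axis=0)      # per-column maximum

# Map each column linearly so that min -> lo and max -> hi.
scaled = (readings - col_min) / (col_max - col_min) * (hi - lo) + lo
print(scaled)
```

Because the minimum and maximum are learned from the initial dataset, readings collected later can fall slightly outside (-1, 1) when the same scaler is reused.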

### Exercise 2: Inspecting the pre-processed data
<p>
<font color="red">
Run the cell below to generate a plot of your gesture data.
</font>
</p>
<p>
<font color="green">
Submission:
Paste your plot into the submission worksheet
</font>
</p>

In [0]:
# plot the scaled features (top) and the encoded gesture label (bottom)

fig, ax = plt.subplots(figsize=(15, 8), nrows=2)

scaled_df.plot(ax=ax[0])
df[['Gesture_enc']].plot(ax=ax[1])
plt.show()

## Preparing data for training

From the plot above, you should be able to find some consistent patterns that are associated with your gesture. 

This is a multivariate time series dataset, so let's apply windowing: each prediction will use the most recent "window" of samples.

Estimate the width of the patterns to determine what window to use for prediction.

Change the `timesteps` parameter to match your window size, for best results.
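
To make the windowing step concrete, here is a minimal sketch on a toy array. The array, the window size of 3, and the `toy_` names are assumptions chosen for illustration; the real data uses four features and your own `timesteps` value.

```python
import numpy as np

# Toy series: 6 rows, 2 features (values are illustrative only).
toy = np.arange(12).reshape(6, 2)
toy_timesteps = 3  # hypothetical window size for this example

# Each entry lists the row indexes that make up one window.
toy_indexes = [list(range(i, i + toy_timesteps))
               for i in range(toy.shape[0] - toy_timesteps)]

# Gather the rows for every window in a single call.
toy_windows = np.take(toy, toy_indexes, axis=0)
print(toy_windows.shape)   # (rows - timesteps, timesteps, features)
print(toy_windows[0])      # first window: rows 0, 1 and 2
```

Six rows with a window of 3 give three windows, so the output has fewer rows than the input.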

In [0]:
# convert the time series so that each entry contains a series of timesteps.
# Before: rows, features
# After: rows, timesteps, features
# Note that some rows will be removed because we are taking a window of values.

# TODO: Change this to match your gesture width.
timesteps = 20

# we will use the scaled version of the features
print('Before', scaled_df.shape) # (rows, features)

rolling_indexes = [(range(i, i+timesteps))
                   for i in range(scaled_df.shape[0]-timesteps)]

X_sequence = np.take(scaled_df.values, rolling_indexes, axis=0)
print('After', X_sequence.shape) # (rows, timesteps, features)

In [0]:
# compute y based on rolling average of window values
# make sure y is the same length as X_sequence

target = 'Gesture_enc'
print('Before', df[target].shape)

# rolling average, using numpy to avoid a dependency on pandas
# https://stackoverflow.com/questions/13728392/moving-average-or-running-mean
y = np.convolve(df[target].values, np.ones((timesteps,))/timesteps, mode='valid')

# shift forward by 1 (this also makes len(y) equal to len(X_sequence))
y = y[1:]

# apply a threshold to convert to 1 and 0
y = np.where(y >= 0.5, 1, 0)

print('After', y.shape)
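
The same label computation can be traced by hand on a short series. This is a sketch with made-up labels and a hypothetical window size of 3, not output from the real dataset.

```python
import numpy as np

# Toy 0/1 labels, one per row (illustrative only).
toy_labels = np.array([0, 0, 0, 1, 1, 1, 0])
toy_timesteps = 3  # hypothetical window size for this example

# Mean of every run of `toy_timesteps` consecutive labels.
rolling_mean = np.convolve(toy_labels,
                           np.ones(toy_timesteps) / toy_timesteps,
                           mode='valid')

# Drop the first mean, then threshold at 0.5 to get one label per window.
toy_y = np.where(rolling_mean[1:] >= 0.5, 1, 0)
print(rolling_mean)
print(toy_y)
```

Seven labels with a window of 3 give five rolling means; dropping the first leaves four labels, which is `rows - timesteps`, the same count that the windowing cell produces.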

In [0]:
# check for imbalance
# imbalanced classes will make it harder to train.
# Ideally, a 50:50 ratio is preferred. Slight imbalance is okay, but no more than 5x.
plt.hist(y)
plt.show()
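
If you prefer a number to reading the histogram, the ratio can be computed directly. This is a small sketch using made-up labels; in the notebook you would pass `y` instead of the hypothetical `example_y`.

```python
import numpy as np

# Illustrative 0/1 labels; in the notebook this would be `y`.
example_y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 0, 1])

counts = np.bincount(example_y, minlength=2)   # [count of 0s, count of 1s]
ratio = counts.max() / max(counts.min(), 1)    # guard against division by zero

print('class counts:', counts)
print(f'majority:minority ratio = {ratio:.1f}')
```

A ratio above 5 suggests collecting more examples of the rarer class before training.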

In [0]:
# create training and validation set
X_train, X_val, y_train, y_val = train_test_split(X_sequence, y, test_size=0.1,
                                                  stratify=y)

X_train.shape, X_val.shape, y_train.shape, y_val.shape

In [0]:
# visualise a sampling of windows

n = 5
sample_X, sample_y = X_train[:n], y_train[:n]

fig, axes = plt.subplots(nrows=n, figsize=(10, 10))
plt.subplots_adjust(hspace=1)

for i in range(n):
  axes[i].plot(sample_X[i])
  axes[i].set_title(f'y={sample_y[i]}')
  
plt.show()

## Training



In [0]:
model = Sequential()

# input_shape=(timesteps, features)
model.add(Conv1D(64, kernel_size=3,
                 input_shape=(X_train.shape[1], X_train.shape[2]),
                 activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
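
Since the model has to stay small enough to train on the device, it helps to know where its parameters come from. The arithmetic below is a sketch that assumes an input of 20 timesteps and 4 features and the default `'valid'` padding of `Conv1D`; if you changed `timesteps`, the totals will differ, so compare against your own `model.summary()` output.

```python
# Assumed input shape: 20 timesteps, 4 features (AccX, AccY, AccZ, Heading).
timesteps_in, features_in = 20, 4
filters, kernel_size, dense_units = 64, 3, 64

# Conv1D: one kernel of size (kernel_size x features) per filter, plus biases.
conv_params = kernel_size * features_in * filters + filters

# 'valid' padding shortens the sequence by kernel_size - 1.
conv_steps = timesteps_in - kernel_size + 1
flat_units = conv_steps * filters              # size after Flatten

dense_params = flat_units * dense_units + dense_units   # weights + biases
output_params = dense_units * 1 + 1                     # single sigmoid unit

total_params = conv_params + dense_params + output_params
print(conv_params, dense_params, output_params, total_params)
```

Almost all of the parameters sit in the first `Dense` layer, so a wider window makes the model larger mainly through the flattened output.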

In [0]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])

In [0]:
mc = ModelCheckpoint('./cnn_online.h5', save_best_only=True)

In [0]:
history = model.fit(X_train, y_train, epochs=20,
                    validation_data=(X_val, y_val), callbacks=[mc])

In [0]:
fig, ax = plt.subplots(figsize=(10, 8))
ax.plot(history.history['loss'], label='train')
ax.plot(history.history['val_loss'], label='val')
ax.legend()
ax.set_title('Learning curve')
ax.set_xlabel('Epochs')
ax.set_ylabel('Loss')
plt.show()

## Training from checkpoint

We will load a checkpoint of the model and use it for incremental training.

In [0]:
# load the checkpoint
checkpoint = load_model('./cnn_online.h5')
checkpoint.summary()

In [0]:
# continue training for 1 epoch (just to test)
history = checkpoint.fit(X_train, y_train, epochs=1,
                         validation_data=(X_val, y_val))

## Preparing for Deployment

We will define/save the following for deployment on a Raspberry Pi:

1. Pre-processors: scaler, windowing, rolling average
2. Checkpoint
3. Validation dataset (for overfitting check)
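
One way the validation dataset could serve as an overfitting check is to compare the validation loss before and after each incremental update. The helper below is a hypothetical sketch: the function name, the rule, and the 5% tolerance are assumptions for illustration, not part of this notebook's deployment code.

```python
def accept_update(val_loss_before, val_loss_after, tolerance=0.05):
    """Return True if validation loss did not worsen by more than `tolerance`.

    Hypothetical helper: the rule and the default tolerance are assumptions.
    """
    return val_loss_after <= val_loss_before * (1.0 + tolerance)


# Example with made-up loss values.
print(accept_update(0.30, 0.31))   # small change, within tolerance
print(accept_update(0.30, 0.45))   # large increase, update looks harmful
```

On the device, the two loss values could come from evaluating the model on the saved validation set before and after an incremental `fit`, with the previous checkpoint kept when the check fails.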

In [0]:
# These are actually not pickled, but will be defined on the Raspberry Pi

def create_windows(X, timesteps):
  """convert the time series so that each entry contains a series of timesteps.
  Before: rows, features
  After: rows, timesteps, features
  """  
  rolling_indexes = [(range(i, i+timesteps))
                     for i in range(X.shape[0]-timesteps)]

  X_out = np.take(X, rolling_indexes, axis=0)
    
  return X_out

def create_rolling_average(y, timesteps):
  """compute y based on rolling average of window values
  https://stackoverflow.com/questions/13728392/moving-average-or-running-mean
  """
  y_out = np.convolve(y, np.ones((timesteps,))/timesteps, mode='valid')

  # shift forward by 1 (this also matches the number of windows from create_windows)
  y_out = y_out[1:]

  # apply a threshold to convert to 1 and 0
  y_out = np.where(y_out >= 0.5, 1, 0)

  return y_out

In [0]:
# Save scaler, validation set

for_training = {
    'scaler': scaler,
    'X_val': X_val,
    'y_val': y_val
}

# use a context manager so the file is closed after writing
with open('./preprocessors_and_data.pkl', 'wb') as f:
  pickle.dump(for_training, f)

### Testing deployment artifacts

The following cells sample some test data and check that the checkpoint can still be trained incrementally.

In [0]:
# load the dataset again, picking the last 2*timesteps samples
df_test = pd.read_csv(url, names=['Gesture', 'AccX', 'AccY', 'AccZ', 'Heading'])

sample = df_test.iloc[-(timesteps*2):]
sample

In [0]:
y_sample = np.where(sample['Gesture'], 1, 0)
y_sample

In [0]:
X_sample = sample.loc[:, sample.columns != 'Gesture'].values
X_sample

In [0]:
# use a context manager so the file is closed after reading
with open('./preprocessors_and_data.pkl', 'rb') as f:
  for_training1 = pickle.load(f)

In [0]:
X_sample_scaled = for_training1['scaler'].transform(X_sample)
X_sample_train = create_windows(X_sample_scaled, timesteps)
y_sample_train = create_rolling_average(y_sample, timesteps)

X_sample_train.shape, y_sample_train.shape

In [0]:
# continue training with sample data (just to test)
# use the original validation data
checkpoint.fit(X_sample_train, y_sample_train, epochs=1,
               validation_data=(for_training1['X_val'], for_training1['y_val']))

We are now ready to deploy on the Raspberry Pi for online training.
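
As a starting point for the device side, incoming readings need to be collected into batches of at least `2*timesteps` rows, like the test sample above. The sketch below shows one possible buffer. The function name, the window size of 20, and the assumption that each reading arrives already labelled are all hypothetical.

```python
from collections import deque

import numpy as np

buffer_timesteps = 20                                 # assumed window size
sample_buffer = deque(maxlen=2 * buffer_timesteps)    # holds one batch of rows


def add_reading(gesture, acc_x, acc_y, acc_z, heading):
    """Store one reading; return an (X, y) batch once the buffer is full, else None."""
    sample_buffer.append((gesture, acc_x, acc_y, acc_z, heading))
    if len(sample_buffer) < sample_buffer.maxlen:
        return None
    rows = np.array(list(sample_buffer), dtype=float)
    sample_buffer.clear()                             # start collecting the next batch
    return rows[:, 1:], rows[:, 0].astype(int)        # features, 0/1 labels


# Feed made-up readings; only the 40th call returns a batch.
batch = None
for i in range(40):
    batch = add_reading(i % 2 == 0, 1.0 * i, 2.0 * i, 3.0 * i, 90.0)

X_new, y_new = batch
print(X_new.shape, y_new.shape)
```

Each returned batch would then go through the saved scaler, `create_windows` and `create_rolling_average` before being passed to `fit`.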