<a href="https://colab.research.google.com/github/PhilippMatthes/diplom/blob/master/src/shl-deep-learning-timeseries.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using a deep CNN to directly classify SHL timeseries data

The following notebook contains code to classify SHL timeseries data with deep convolutional neural networks. This is divided into the following steps:

1. Download the SHL dataset.
2. Preprocess the SHL dataset into features and store them in a format that our training engine can read efficiently.
3. Define one or multiple ML models.
4. Train the model(s) and utilize grid search to find the best configuration.
5. Export the models and their training parameters for later analysis.

## Step 1: Download the SHL Dataset

The SHL dataset is very large, so we first need to free up some disk space on Colab.

In [None]:
!rm -rf /usr/local/lib/python2.7
!rm -rf /swift
!rm -rf /usr/local/lib/python3.6/dist-packages/torch
!rm -rf /usr/local/lib/python3.6/dist-packages/pystan
!rm -rf /usr/local/lib/python3.6/dist-packages/spacy
!rm -rf /tensorflow-1.15.2/
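
To see whether the cleanup above freed enough room, the remaining disk space can be checked from Python. The following is a minimal sketch using only the standard library; the helper name `free_gib` is hypothetical and not part of the base repo.

```python
import shutil


def free_gib(path: str = '/') -> float:
    """Return the free space on the filesystem containing `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


print(f'Free disk space: {free_gib():.1f} GiB')
```

Running this before and after the cleanup cell shows how much space was actually reclaimed on the current instance.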

Next, get our base repo so that we can use predefined architectures and pretrained scalers.

In [None]:
!git clone https://github.com/philippmatthes/diplom

Cloning into 'diplom'...
remote: Enumerating objects: 1833, done.
remote: Counting objects: 100% (1170/1170), done.
remote: Compressing objects: 100% (799/799), done.
remote: Total 1833 (delta 591), reused 842 (delta 311), pack-reused 663
Receiving objects: 100% (1833/1833), 39.82 MiB | 26.44 MiB/s, done.
Resolving deltas: 100% (967/967), done.


Switch to our src dir for further processing. This command is specific to Google Colab, so it might not work on your local Jupyter Notebook instance.

Additionally, we create the dataset dir in which our dataset will be downloaded next.

In [None]:
%cd /content/diplom/src
!mkdir shl-dataset

/content/diplom/src


Download the SHL dataset from the SHL server. This might take some time; on Google Colab it takes approximately 45 minutes. You can also mount your Google Drive if you have enough space available.

In [None]:
!wget -nc -O shl-dataset/challenge-2019-user1_torso.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_torso.zip
!wget -nc -O shl-dataset/challenge-2019-user1_bag.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_bag.zip
!wget -nc -O shl-dataset/challenge-2019-user1_hips.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_hips.zip
!wget -nc -O shl-dataset/challenge-2020-user1_hand.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2020/challenge-2020-train_hand.zip
!wget -nc -O shl-dataset/challenge-2020-users23_torso_bag_hips_hand.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2020/challenge-2020-validation.zip

--2021-08-23 09:18:07--  http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_torso.zip
Resolving www.shl-dataset.org (www.shl-dataset.org)... 37.187.125.22
Connecting to www.shl-dataset.org (www.shl-dataset.org)|37.187.125.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5852446972 (5.5G) [application/zip]
Saving to: ‘shl-dataset/challenge-2019-user1_torso.zip’


2021-08-23 09:26:38 (10.9 MB/s) - ‘shl-dataset/challenge-2019-user1_torso.zip’ saved [5852446972/5852446972]

--2021-08-23 09:26:39--  http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_bag.zip
Resolving www.shl-dataset.org (www.shl-dataset.org)... 37.187.125.22
Connecting to www.shl-dataset.org (www.shl-dataset.org)|37.187.125.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5628524721 (5.2G) [application/zip]
Saving to: ‘shl-dataset/challenge-2019-user1_bag.zip’


2021-08-23 09:34:50 (10.9 MB/s) - ‘shl-datas

Next, we unzip our dataset into the running instance's file storage. *Note that this will probably not work for free subscriptions of Google Colab, since the data is approximately 90-100 GB when extracted.*

In [None]:
# Unzip training datasets
!unzip -n -d shl-dataset/challenge-2019-user1_torso shl-dataset/challenge-2019-user1_torso.zip
!rm shl-dataset/challenge-2019-user1_torso.zip
!unzip -n -d shl-dataset/challenge-2019-user1_bag shl-dataset/challenge-2019-user1_bag.zip
!rm shl-dataset/challenge-2019-user1_bag.zip
!unzip -n -d shl-dataset/challenge-2019-user1_hips shl-dataset/challenge-2019-user1_hips.zip
!rm shl-dataset/challenge-2019-user1_hips.zip
!unzip -n -d shl-dataset/challenge-2020-user1_hand shl-dataset/challenge-2020-user1_hand.zip
!rm shl-dataset/challenge-2020-user1_hand.zip
!unzip -n -d shl-dataset/challenge-2020-users23_torso_bag_hips_hand shl-dataset/challenge-2020-users23_torso_bag_hips_hand.zip
!rm shl-dataset/challenge-2020-users23_torso_bag_hips_hand.zip

Archive:  shl-dataset/challenge-2019-user1_torso.zip
   creating: shl-dataset/challenge-2019-user1_torso/train/Torso/
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Acc_x.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Acc_y.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Acc_z.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gra_x.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gra_y.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gra_z.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gyr_x.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gyr_y.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Gyr_z.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/Label.txt  
  inflating: shl-dataset/challenge-2019-user1_torso/train/Torso/LAcc_x.txt  
  inflating: shl-dataset/challenge-2019-user1

## Step 2: Preprocess the data

From now on, explanations are placed inside the code as comments, so that you can copy it without losing the contextual information.

In [None]:
# Check the CUDA version

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0


In [None]:
# Change into our project src directory
# Note: use this as an entrypoint when you already downloaded the dataset

%cd /content/diplom/src
%tensorflow_version 2.x

/content/diplom/src


In [None]:
# Check configuration and hardware resources

import distutils.version # `import distutils` alone does not guarantee that the `version` submodule is loaded

import tensorflow as tf

print(f'Using TensorFlow: {tf.__version__}')

if distutils.version.LooseVersion(tf.__version__) < '2.0':
    raise Exception('This notebook is compatible with TensorFlow 2.0 or higher.')

print('GPU Devices:')
tf.config.list_physical_devices('GPU')

Using TensorFlow: 2.6.0
GPU Devices:


[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [None]:
# Define all datasets to train our model on

from pathlib import Path

TRAIN_DATASET_DIRS = [
    Path('shl-dataset/challenge-2019-user1_torso/train/Torso'),
    Path('shl-dataset/challenge-2019-user1_bag/train/Bag'),
    Path('shl-dataset/challenge-2019-user1_hips/train/Hips'),
    Path('shl-dataset/challenge-2020-user1_hand/train/Hand'),
    Path('shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Torso'),
    Path('shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Bag'),
    Path('shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hips'),
    Path('shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hand'),
]

In [None]:
# Define more useful constants about our dataset

LABEL_ORDER = [
    'Null',
    'Still',
    'Walking',
    'Run',
    'Bike',
    'Car',
    'Bus',
    'Train',
    'Subway',
]

SAMPLE_LENGTH = 500

In [None]:
# Results from data analysis

CLASS_WEIGHTS = {
    0: 0.0, # NULL label
    1: 1.0021671573438011, 
    2: 0.9985739895697523, 
    3: 2.8994439843842423, 
    4: 1.044135815617944, 
    5: 0.7723505499007343, 
    6: 0.8652474758172704, 
    7: 0.7842127155793044, 
    8: 1.0283208861290594
}
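
The notebook only states that these weights are results from data analysis. One common way to obtain weights of this kind is the "balanced" heuristic, where each class is weighted by `n_samples / (n_classes * class_count)` and the NULL label is ignored. The sketch below illustrates that heuristic on toy labels; whether the values above were derived exactly this way is an assumption, and `balanced_class_weights` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable


def balanced_class_weights(
    labels: Iterable[int],
    ignore_label: int = 0
) -> Dict[int, float]:
    """Weight each class inversely to its frequency, ignoring one label."""
    counts = Counter(label for label in labels if label != ignore_label)
    n_total = sum(counts.values())
    n_classes = len(counts)
    weights = {ignore_label: 0.0} # The ignored label contributes no loss
    for label, count in counts.items():
        weights[label] = n_total / (n_classes * count)
    return weights


# Toy labels: class 1 is twice as frequent as classes 2 and 3
toy_labels = [0, 1, 1, 1, 1, 2, 2, 3, 3]
toy_weights = balanced_class_weights(toy_labels)
print(toy_weights)
```

With this heuristic, frequent classes receive weights below 1 and rare classes receive weights above 1, which matches the pattern in `CLASS_WEIGHTS` (for example, the comparatively large weight for label 3, `Run`).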

In [None]:
# Define features for our dataset

from collections import OrderedDict

import numpy as np

# Attributes to load from our dataset
X_attributes = [
    'acc_x', 'acc_y', 'acc_z',
    'mag_x', 'mag_y', 'mag_z',
    'gyr_x', 'gyr_y', 'gyr_z',
    # Parts that are not needed:
    # 'gra_x', 'gra_y', 'gra_z',
    # 'lacc_x', 'lacc_y', 'lacc_z',
    # 'ori_x', 'ori_y', 'ori_z', 'ori_w',
]

# Files within the dataset that contain our attributes
X_files = [
    'Acc_x.txt', 'Acc_y.txt', 'Acc_z.txt',
    'Mag_x.txt', 'Mag_y.txt', 'Mag_z.txt',
    'Gyr_x.txt', 'Gyr_y.txt', 'Gyr_z.txt',
    # Parts that are not needed:
    # 'Gra_x.txt', 'Gra_y.txt', 'Gra_z.txt',
    # 'LAcc_x.txt', 'LAcc_y.txt', 'LAcc_z.txt',
    # 'Ori_x.txt', 'Ori_y.txt', 'Ori_z.txt', 'Ori_w.txt',
]

# Features to generate from our loaded attributes
# Note that `a` is going to be a dict of attribute tracks
X_features = OrderedDict({
    'acc_mag': lambda a: np.sqrt(a['acc_x']**2 + a['acc_y']**2 + a['acc_z']**2),
    'mag_mag': lambda a: np.sqrt(a['mag_x']**2 + a['mag_y']**2 + a['mag_z']**2),
    'gyr_mag': lambda a: np.sqrt(a['gyr_x']**2 + a['gyr_y']**2 + a['gyr_z']**2),
})

# Define where to find our labels for supervised learning
y_file = 'Label.txt'
y_attribute = 'labels'
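
To make the feature definitions above more tangible, here is a small sketch of the magnitude computation on toy data. It assumes that each attribute track is a 2D array of shape `(n_samples, sample_length)`; the toy values and the helper name `magnitude` are made up for illustration.

```python
import numpy as np


def magnitude(x: np.ndarray, y: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Compute the elementwise Euclidean norm of three axis tracks."""
    return np.sqrt(x**2 + y**2 + z**2)


# Two toy samples with two timesteps each (the real tracks use 500 timesteps)
toy_attrs = {
    'acc_x': np.array([[3.0, 0.0], [1.0, 2.0]], dtype=np.float32),
    'acc_y': np.array([[4.0, 0.0], [2.0, 3.0]], dtype=np.float32),
    'acc_z': np.array([[0.0, 9.81], [2.0, 6.0]], dtype=np.float32),
}

toy_acc_mag = magnitude(toy_attrs['acc_x'], toy_attrs['acc_y'], toy_attrs['acc_z'])
print(toy_acc_mag.shape) # Same shape as each input track
print(toy_acc_mag)
```

The magnitude collapses the three axes into one orientation-independent track, so nine raw attributes become three feature channels per sample.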

In [None]:
# Load pretrained power transformers for feature scaling

import joblib

X_feature_scalers = OrderedDict({})
for feature_name, _ in X_features.items():
    scaler_dir = f'models/shl-scalers/{feature_name}.scaler.joblib'
    scaler = joblib.load(scaler_dir)
    scaler.copy = False # Save memory
    X_feature_scalers[feature_name] = scaler
    print(f'Loaded scaler from {scaler_dir}.')

Loaded scaler from models/shl-scalers/acc_mag.scaler.joblib.
Loaded scaler from models/shl-scalers/mag_mag.scaler.joblib.
Loaded scaler from models/shl-scalers/gyr_mag.scaler.joblib.




In [None]:
# Load the training and validation data into a high performance datatype

import os
import shutil

from typing import Generator, List, Tuple

from tqdm import tqdm

import pandas as pd

def read_chunks(
    n_chunks: int, 
    X_attr_readers: List[pd.io.parsers.TextFileReader], 
    y_attr_reader: pd.io.parsers.TextFileReader
) -> Generator[Tuple[np.ndarray, np.ndarray], None, None]:
    """
    Read chunks of attribute data and yield it to the caller as tuples of X, y.
    
    This function returns a generator which can be iterated.
    """
    for _ in range(n_chunks):
        # Load raw attribute tracks
        X_raw_attrs = OrderedDict({})
        for X_attribute, X_attr_reader in zip(X_attributes, X_attr_readers):
            X_attr_track = next(X_attr_reader)
            X_attr_track = np.nan_to_num(X_attr_track.to_numpy())
            X_raw_attrs[X_attribute] = X_attr_track

        # Calculate features
        X_feature_tracks = None
        for X_feature_name, X_feature_func in X_features.items():
            X_feature_track = X_feature_func(X_raw_attrs)
            X_feature_track = X_feature_scalers[X_feature_name] \
                .transform(X_feature_track)
            if X_feature_tracks is None:
                X_feature_tracks = X_feature_track
            else:
                X_feature_tracks = np.dstack((X_feature_tracks, X_feature_track))

        # Load labels
        y_attr_track = next(y_attr_reader) # dim (None, sample_length)
        y_attr_track = np.nan_to_num(y_attr_track.to_numpy()) # dim (None, sample_length)
        y_attr_track = y_attr_track[:, 0] # dim (None,), keep the first label of each sample

        yield X_feature_tracks, y_attr_track

def count_samples(dataset_dir: Path) -> int:
    """Count the total amount of samples in a shl dataset."""
    n_samples = 0
    # Every file in the dataset has the same length, use the labels file
    with open(dataset_dir / y_file) as f:
        for _ in tqdm(f, desc=f'Counting samples in {dataset_dir}'):
            n_samples += 1
    return n_samples

def create_chunked_readers(
    dataset_dir: Path,
    chunksize: int, 
    xdtype=np.float32, # Use np.float16 with caution, can lead to overflows
    ydtype=np.int64 # The `np.int` alias was removed in NumPy 1.24
) -> Tuple[List[pd.io.parsers.TextFileReader], pd.io.parsers.TextFileReader]:
    """Initialize chunked csv readers and return them to the caller as a tuple."""
    read_csv_kwargs = { 'sep': ' ', 'header': None, 'chunksize': chunksize }

    X_attr_readers = [] # (dim datasets x readers)
    for filename in X_files:
        X_reader = pd.read_csv(dataset_dir / filename, dtype=xdtype, **read_csv_kwargs)
        X_attr_readers.append(X_reader)
    y_attr_reader = pd.read_csv(dataset_dir / y_file, dtype=ydtype, **read_csv_kwargs)

    return X_attr_readers, y_attr_reader

def export_tfrecords(
    dataset_dir: Path,
    n_chunks=16, # Load dataset in parts to not overload memory
):
    """Transform the given shl dataset into a memory efficient TFRecord."""
    target_dir = f'{dataset_dir}.tfrecord'
    if os.path.isfile(target_dir):
        print(f'{target_dir} already exists.')
        return

    print(f'Exporting to {target_dir}.')

    n_samples = count_samples(dataset_dir)
    # Note: the remainder (n_samples % n_chunks) is not read and is therefore dropped
    chunksize = int(np.floor(n_samples / n_chunks))
    X_attr_readers, y_attr_reader = create_chunked_readers(dataset_dir, chunksize)    

    with tf.io.TFRecordWriter(str(target_dir)) as file_writer:
        with tqdm(total=n_samples, desc=f'Reading samples to {target_dir}') as pbar:
            for X_feature_tracks, y_attr_track in read_chunks(
                n_chunks, X_attr_readers, y_attr_reader
            ):
                for X, y in zip(X_feature_tracks, y_attr_track):
                    X_flat = X.flatten() # TFRecords don't support multidimensional arrays
                    record_bytes = tf.train.Example(features=tf.train.Features(feature={
                        'X': tf.train.Feature(float_list=tf.train.FloatList(value=X_flat)),
                        'y': tf.train.Feature(int64_list=tf.train.Int64List(value=[y])) 
                    })).SerializeToString()
                    file_writer.write(record_bytes)
                pbar.update(chunksize)

for dataset_dir in TRAIN_DATASET_DIRS:
    export_tfrecords(dataset_dir)

Exporting to shl-dataset/challenge-2019-user1_torso/train/Torso.tfrecord.


Counting samples in shl-dataset/challenge-2019-user1_torso/train/Torso: 196072it [00:02, 73033.40it/s]
Reading samples to shl-dataset/challenge-2019-user1_torso/train/Torso.tfrecord: 100%|█████████▉| 196064/196072 [04:17<00:00, 762.78it/s]


Exporting to shl-dataset/challenge-2019-user1_bag/train/Bag.tfrecord.


Counting samples in shl-dataset/challenge-2019-user1_bag/train/Bag: 196072it [00:02, 72501.94it/s]
Reading samples to shl-dataset/challenge-2019-user1_bag/train/Bag.tfrecord: 100%|█████████▉| 196064/196072 [04:19<00:00, 756.81it/s]


Exporting to shl-dataset/challenge-2019-user1_hips/train/Hips.tfrecord.


Counting samples in shl-dataset/challenge-2019-user1_hips/train/Hips: 196072it [00:02, 72529.54it/s]
Reading samples to shl-dataset/challenge-2019-user1_hips/train/Hips.tfrecord: 100%|█████████▉| 196064/196072 [04:21<00:00, 749.29it/s]


Exporting to shl-dataset/challenge-2020-user1_hand/train/Hand.tfrecord.


Counting samples in shl-dataset/challenge-2020-user1_hand/train/Hand: 196072it [00:02, 72518.70it/s]
Reading samples to shl-dataset/challenge-2020-user1_hand/train/Hand.tfrecord: 100%|█████████▉| 196064/196072 [04:23<00:00, 743.00it/s]


Exporting to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Torso.tfrecord.


Counting samples in shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Torso: 28789it [00:00, 90447.34it/s]
Reading samples to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Torso.tfrecord: 100%|█████████▉| 28784/28789 [00:42<00:00, 675.12it/s]


Exporting to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Bag.tfrecord.


Counting samples in shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Bag: 28789it [00:00, 91870.51it/s]
Reading samples to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Bag.tfrecord: 100%|█████████▉| 28784/28789 [00:42<00:00, 677.82it/s]


Exporting to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hips.tfrecord.


Counting samples in shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hips: 28789it [00:00, 89658.63it/s]
Reading samples to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hips.tfrecord: 100%|█████████▉| 28784/28789 [00:42<00:00, 675.75it/s]


Exporting to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hand.tfrecord.


Counting samples in shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hand: 28789it [00:00, 91186.93it/s]
Reading samples to shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hand.tfrecord: 100%|█████████▉| 28784/28789 [00:42<00:00, 674.52it/s]
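
The progress bars above stop at 196064/196072 and 28784/28789 because `export_tfrecords` reads exactly `n_chunks` chunks of `floor(n_samples / n_chunks)` rows. The following sketch reproduces that arithmetic; `chunk_plan` is a hypothetical helper used only for illustration.

```python
import math
from typing import Tuple


def chunk_plan(n_samples: int, n_chunks: int) -> Tuple[int, int, int]:
    """Return (chunksize, samples read, samples dropped) for a chunked export."""
    chunksize = math.floor(n_samples / n_chunks)
    n_read = chunksize * n_chunks
    return chunksize, n_read, n_samples - n_read


for n_samples in (196072, 28789):
    chunksize, n_read, n_dropped = chunk_plan(n_samples, n_chunks=16)
    print(f'{n_samples} samples: chunksize={chunksize}, read={n_read}, dropped={n_dropped}')
```

With 16 chunks, 8 samples of each training dataset and 5 samples of each validation dataset are left out of the TFRecord.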


In [None]:
def decode_tfrecord(record_bytes) -> Tuple[tf.Tensor, tf.Tensor]:
    """Decode a TFRecord example to X, y from its serialized representation."""
    example = tf.io.parse_single_example(record_bytes, {
        'X': tf.io.FixedLenFeature([SAMPLE_LENGTH, len(X_features)], tf.float32),
        'y': tf.io.FixedLenFeature([1], tf.int64)
    })
    return example['X'], example['y']

def create_train_validation_datasets(
    dataset_dirs: List[Path], 
    batch_size=64,
    shuffle_size=20_000, # Must be larger than batch_size
    test_size=256 # In batches
) -> Tuple[tf.data.Dataset, tf.data.Dataset]:
    """
    Create interleaved, shuffled and batched train and 
    validation datasets from the dataset dirs.
    
    Note that this function reads previously generated TFRecords under 
    `dataset_dir.tfrecord` -> use `export_tfrecords` for that.
    """
    tfrecord_dirs = [f'{d}.tfrecord' for d in dataset_dirs]
    print(f'Creating train and validation dataset over {tfrecord_dirs}.')

    # Create a strategy to interleave the datasets
    dataset = tf.data.Dataset.from_tensor_slices(tfrecord_dirs) \
        .interleave(
            lambda x: tf.data.TFRecordDataset(x), 
            cycle_length=batch_size, # Number of input elements that are processed concurrently
            block_length=1 # Return only one element at a time, batching is done later
        ) \
        .shuffle(shuffle_size) \
        .map(decode_tfrecord, num_parallel_calls=tf.data.AUTOTUNE) \
        .batch(batch_size)
    count = sum(1 for _ in dataset)
    print(f'Counted {count * batch_size} samples in combined dataset.')
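    # Note: `shuffle` reshuffles on every iteration by default, so the
    # `skip`/`take` split below is not fixed across epochs
    training_dataset = dataset.skip(test_size)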
    count = sum(1 for _ in training_dataset)
    print(f'Counted {count * batch_size} samples in training dataset.')
    validation_dataset = dataset.take(test_size)
    count = sum(1 for _ in validation_dataset)
    print(f'Counted {count * batch_size} samples in validation dataset.')
    return training_dataset, validation_dataset

train_dataset, validation_dataset = create_train_validation_datasets(TRAIN_DATASET_DIRS)

Creating train and validation dataset over ['shl-dataset/challenge-2019-user1_torso/train/Torso.tfrecord', 'shl-dataset/challenge-2019-user1_bag/train/Bag.tfrecord', 'shl-dataset/challenge-2019-user1_hips/train/Hips.tfrecord', 'shl-dataset/challenge-2020-user1_hand/train/Hand.tfrecord', 'shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Torso.tfrecord', 'shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Bag.tfrecord', 'shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hips.tfrecord', 'shl-dataset/challenge-2020-users23_torso_bag_hips_hand/validation/Hand.tfrecord'].
Counted 899392 samples in combined dataset.
Counted 883008 samples in training dataset.
Counted 16384 samples in validation dataset.
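
The counts above can be checked by hand. The sketch below assumes that each TFRecord holds the number of samples shown in the export progress bars (196064 for the four user 1 datasets, 28784 for the four users 2 and 3 datasets) and uses the default `batch_size=64` and `test_size=256` from `create_train_validation_datasets`.

```python
import math

BATCH_SIZE = 64
TEST_SIZE = 256 # In batches

# Samples per TFRecord, taken from the export progress bars (assumption)
samples_per_record = [196064] * 4 + [28784] * 4

n_samples = sum(samples_per_record)
n_batches = math.ceil(n_samples / BATCH_SIZE)
n_validation = TEST_SIZE * BATCH_SIZE
n_training = (n_batches - TEST_SIZE) * BATCH_SIZE

print(f'combined: {n_samples}, training: {n_training}, validation: {n_validation}')
```

Note that `count * batch_size` would overestimate the sample count if the last batch were only partially filled. Here the combined count happens to be an exact multiple of 64, so the printed numbers are exact.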


## Steps 3-5: Defining, training and evaluating models

In [None]:
# We will use the Keras Tuner package for the hyperparameter search

import sys

!{sys.executable} -m pip install keras-tuner -q

[?25l[K     |███▍                            | 10 kB 30.5 MB/s eta 0:00:01[K     |██████▉                         | 20 kB 35.6 MB/s eta 0:00:01[K     |██████████▏                     | 30 kB 22.5 MB/s eta 0:00:01[K     |█████████████▋                  | 40 kB 17.8 MB/s eta 0:00:01[K     |█████████████████               | 51 kB 9.3 MB/s eta 0:00:01[K     |████████████████████▍           | 61 kB 9.6 MB/s eta 0:00:01[K     |███████████████████████▊        | 71 kB 9.4 MB/s eta 0:00:01[K     |███████████████████████████▏    | 81 kB 10.5 MB/s eta 0:00:01[K     |██████████████████████████████▋ | 92 kB 10.7 MB/s eta 0:00:01[K     |████████████████████████████████| 96 kB 4.5 MB/s 
[?25h

In [None]:
# Mount Google Drive for progress logging
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import tempfile

from datetime import datetime

from keras_tuner import Hyperband
from keras_tuner.engine import hypermodel as hm_module
from keras_tuner.engine.logger import Logger

class Tuner(Hyperband):
    """
    A custom hyperband tuner.
    """
    def __init__(self, gridsearch_dir: Path, *init_args, **init_kwargs):
        self.gridsearch_dir = gridsearch_dir
        super().__init__(*init_args, **init_kwargs)

    def run_trial(self, trial, *fit_args, **fit_kwargs):
        """
        Zip our progress and save to Google Drive every time a trial is run.
        """
        with tempfile.TemporaryDirectory() as tempdir:
            # Copy all files (except the checkpoints which become very large)
            # to a temporary directory and zip them, then copy the archive to Google Drive
            files_to_ignore = shutil.ignore_patterns('checkpoints*')
            target = f'{tempdir}/gridsearch'
            shutil.copytree(self.gridsearch_dir, target, ignore=files_to_ignore)
            shutil.make_archive('models/gridsearch', 'zip', target)
            datestr = datetime.today().strftime('%Y-%m-%d')
            shutil.copyfile('models/gridsearch.zip', f'/content/drive/MyDrive/gridsearch-{datestr}.zip')
        super().run_trial(trial, *fit_args, **fit_kwargs)

    def _on_train_begin(self, model, hp, *fit_args, **fit_kwargs):
        """
        Circumvent an issue in the implementation of the Hyperband keras tuner:
        models seem to start from cold every new epoch, which is clearly unwanted.

        See: https://github.com/keras-team/keras-tuner/issues/372
        And: https://arxiv.org/pdf/1603.06560.pdf
        """
        prev_trial_id = hp.values['tuner/trial_id'] if 'tuner/trial_id' in hp else None
        if prev_trial_id:
            prev_trial = self.oracle.trials[prev_trial_id]
            best_epoch = prev_trial.best_step
            # the code below is from load_model method of Tuner class
            with hm_module.maybe_distribute(self.distribution_strategy):
                model.load_weights(self._get_checkpoint_fname(
                    prev_trial.trial_id, best_epoch
                ))
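
The backup logic in `run_trial` can be tried in isolation, without a tuner or Google Drive. The following sketch applies the same `shutil` calls to a throwaway directory tree; the function name `archive_without_checkpoints` and the toy file names are hypothetical.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_without_checkpoints(source_dir: Path, archive_base: Path) -> Path:
    """Zip `source_dir` to `<archive_base>.zip`, skipping `checkpoints*` entries."""
    with tempfile.TemporaryDirectory() as tempdir:
        target = Path(tempdir) / 'gridsearch'
        files_to_ignore = shutil.ignore_patterns('checkpoints*')
        shutil.copytree(source_dir, target, ignore=files_to_ignore)
        return Path(shutil.make_archive(str(archive_base), 'zip', target))


# Build a toy search directory with one trial and one (large) checkpoint
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / 'search'
    (source / 'trial_0' / 'checkpoints').mkdir(parents=True)
    (source / 'trial_0' / 'trial.json').write_text('{}')
    (source / 'trial_0' / 'checkpoints' / 'weights.bin').write_text('x')

    archive = archive_without_checkpoints(source, Path(workdir) / 'gridsearch')
    with zipfile.ZipFile(archive) as zf:
        archived_names = zf.namelist()

print(archived_names)
```

Copying to a temporary directory first keeps the original search directory untouched and lets `ignore_patterns` drop the checkpoint folders before zipping.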

In [None]:
# We will use the kapre contribution package to include STFT layers

!{sys.executable} -m pip install kapre -q
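
The 1D ResNet defined in the next cell starts with a padded, strided convolution followed by a padded max pooling. As a rough sketch of how these layers shorten the time axis, the snippet below computes the output lengths for an input of 500 timesteps. The input length is an assumption based on `SAMPLE_LENGTH`, and `conv_out_len` is a hypothetical helper that mirrors the usual output-length formula for unpadded ("valid") convolutions after explicit zero padding.

```python
def conv_out_len(n: int, kernel: int, stride: int, pad: int = 0) -> int:
    """Output length of a 'valid' 1D conv or pooling after symmetric zero padding."""
    return (n + 2 * pad - kernel) // stride + 1


n_timesteps = 500 # Assumed input length (SAMPLE_LENGTH)

# ZeroPadding1D(3) followed by Conv1D(64, 7, strides=2)
after_conv1 = conv_out_len(n_timesteps, kernel=7, stride=2, pad=3)

# ZeroPadding1D(1) followed by MaxPooling1D(3, strides=2)
after_pool1 = conv_out_len(after_conv1, kernel=3, stride=2, pad=1)

print(f'{n_timesteps} -> {after_conv1} -> {after_pool1}')
```

Under these assumptions the stem reduces the time axis by a factor of four before the residual stacks are applied.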

In [None]:
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras import backend
from tensorflow.keras import layers

from keras_tuner import HyperParameters
from keras_tuner.applications import HyperResNet

class HyperResNet2D(HyperResNet):
    """
    A ResNet hypermodel based on 2D convolutions.
    """
    pass

class HyperResNet1D(HyperResNet):
    """
    A ResNet hypermodel based on 1D convolutions.
    
    The code of this class is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0
    """

    def build(self, hp: HyperParameters):
        version = hp.Choice("version", ["v1", "v2", "next"], default="v2")
        conv3_depth = hp.Choice("conv3_depth", [4, 8])
        conv4_depth = hp.Choice("conv4_depth", [6, 23, 36])

        # Version-conditional fixed parameters
        preact = version == "v2"
        use_bias = version != "next"

        # Model definition.
        bn_axis = 2 # normalize feature axis, i.e. (n_batches, n_timesteps, n_features)

        if self.input_tensor is not None:
            inputs = tf.keras.utils.get_source_inputs(self.input_tensor)
            x = self.input_tensor
        else:
            inputs = layers.Input(shape=self.input_shape)
            x = inputs

        # Initial conv1d block.
        x = layers.ZeroPadding1D(padding=3, name="conv1_pad")(x)
        x = layers.Conv1D(64, 7, strides=2, use_bias=use_bias, name="conv1_conv")(x)
        if preact is False:
            x = layers.BatchNormalization(
                axis=bn_axis, epsilon=1.001e-5, name="conv1_bn"
            )(x)
            x = layers.Activation("relu", name="conv1_relu")(x)
        x = layers.ZeroPadding1D(padding=1, name="pool1_pad")(x)
        x = layers.MaxPooling1D(3, strides=2, name="pool1_pool")(x)

        # Middle hypertunable stack.
        if version == "v1":
            x = stack1(x, 64, 3, stride1=1, name="conv2")
            x = stack1(x, 128, conv3_depth, name="conv3")
            x = stack1(x, 256, conv4_depth, name="conv4")
            x = stack1(x, 512, 3, name="conv5")
        elif version == "v2":
            x = stack2(x, 64, 3, name="conv2")
            x = stack2(x, 128, conv3_depth, name="conv3")
            x = stack2(x, 256, conv4_depth, name="conv4")
            x = stack2(x, 512, 3, stride1=1, name="conv5")
        elif version == "next":
            x = stack3(x, 64, 3, name="conv2")
            x = stack3(x, 256, conv3_depth, name="conv3")
            x = stack3(x, 512, conv4_depth, name="conv4")
            x = stack3(x, 1024, 3, stride1=1, name="conv5")

        # Top of the model.
        if preact is True:
            x = layers.BatchNormalization(
                axis=bn_axis, epsilon=1.001e-5, name="post_bn"
            )(x)
            x = layers.Activation("relu", name="post_relu")(x)

        pooling = hp.Choice("pooling", ["avg", "max"], default="avg")
        if pooling == "avg":
            x = layers.GlobalAveragePooling1D(name="avg_pool")(x)
        elif pooling == "max":
            x = layers.GlobalMaxPooling1D(name="max_pool")(x)

        if self.include_top:
            x = layers.Dense(self.classes, activation="softmax", name="probs")(x)
            model = keras.Model(inputs, x, name="ResNet")
            optimizer_name = hp.Choice(
                "optimizer", ["adam", "rmsprop", "sgd"], default="adam"
            )
            optimizer = keras.optimizers.get(optimizer_name)
            optimizer.learning_rate = hp.Choice(
                "learning_rate", [0.1, 0.01, 0.001], default=0.01
            )
            model.compile(
                optimizer=optimizer,
                loss="categorical_crossentropy",
                metrics=["accuracy"],
            )
            return model
        else:
            return keras.Model(inputs, x, name="ResNet")


def block1(x, filters, kernel_size=3, stride=1, conv_shortcut=True, name=None):
    """
    A residual block.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer.
        kernel_size: default 3, kernel size of the bottleneck layer.
        stride: default 1, stride of the first layer.
        conv_shortcut: default True, use convolution shortcut if True,
            otherwise identity shortcut.
        name: string, block label.

    Returns:
        Output tensor for the residual block.
    """
    bn_axis = 2 # normalize feature axis, i.e. (n_batches, n_timesteps, n_features)

    if conv_shortcut is True:
        shortcut = layers.Conv1D(
            4 * filters, 1, strides=stride, name=name + "_0_conv"
        )(x)
        shortcut = layers.BatchNormalization(
            axis=bn_axis, epsilon=1.001e-5, name=name + "_0_bn"
        )(shortcut)
    else:
        shortcut = x

    x = layers.Conv1D(filters, 1, strides=stride, name=name + "_1_conv")(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_1_bn"
    )(x)
    x = layers.Activation("relu", name=name + "_1_relu")(x)

    x = layers.Conv1D(filters, kernel_size, padding="same", name=name + "_2_conv")(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_2_bn"
    )(x)
    x = layers.Activation("relu", name=name + "_2_relu")(x)

    x = layers.Conv1D(4 * filters, 1, name=name + "_3_conv")(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_3_bn"
    )(x)

    x = layers.Add(name=name + "_add")([shortcut, x])
    x = layers.Activation("relu", name=name + "_out")(x)
    return x


def stack1(x, filters, blocks, stride1=2, name=None):
    """
    A set of stacked residual blocks.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer in a block.
        blocks: integer, blocks in the stacked blocks.
        stride1: default 2, stride of the first layer in the first block.
        name: string, stack label.

    Returns:
        Output tensor for the stacked blocks.
    """
    x = block1(x, filters, stride=stride1, name=name + "_block1")
    for i in range(2, blocks + 1):
        x = block1(x, filters, conv_shortcut=False, name=name + "_block" + str(i))
    return x


def block2(x, filters, kernel_size=3, stride=1, conv_shortcut=False, name=None):
    """
    A residual block.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer.
        kernel_size: default 3, kernel size of the bottleneck layer.
        stride: default 1, stride of the first layer.
        conv_shortcut: default False, use convolution shortcut if True,
            otherwise identity shortcut.
        name: string, block label.

    Returns:
        Output tensor for the residual block.
    """
    bn_axis = 2 # normalize feature axis, i.e. (n_batches, n_timesteps, n_features)

    preact = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_preact_bn"
    )(x)
    preact = layers.Activation("relu", name=name + "_preact_relu")(preact)

    if conv_shortcut is True:
        shortcut = layers.Conv1D(
            4 * filters, 1, strides=stride, name=name + "_0_conv"
        )(preact)
    else:
        shortcut = layers.MaxPooling1D(1, strides=stride)(x) if stride > 1 else x

    x = layers.Conv1D(filters, 1, strides=1, use_bias=False, name=name + "_1_conv")(
        preact
    )
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_1_bn"
    )(x)
    x = layers.Activation("relu", name=name + "_1_relu")(x)

    x = layers.ZeroPadding1D(padding=1, name=name + "_2_pad")(x)
    x = layers.Conv1D(
        filters, kernel_size, strides=stride, use_bias=False, name=name + "_2_conv"
    )(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_2_bn"
    )(x)
    x = layers.Activation("relu", name=name + "_2_relu")(x)

    x = layers.Conv1D(4 * filters, 1, name=name + "_3_conv")(x)
    x = layers.Add(name=name + "_out")([shortcut, x])
    return x


def stack2(x, filters, blocks, stride1=2, name=None):
    """
    A set of stacked residual blocks.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer in a block.
        blocks: integer, blocks in the stacked blocks.
        stride1: default 2, stride of the last block (all other blocks use stride 1).
        name: string, stack label.

    Returns:
        Output tensor for the stacked blocks.
    """
    x = block2(x, filters, conv_shortcut=True, name=name + "_block1")
    for i in range(2, blocks):
        x = block2(x, filters, name=name + "_block" + str(i))
    x = block2(x, filters, stride=stride1, name=name + "_block" + str(blocks))
    return x


def block3(
    x, filters, kernel_size=3, stride=1, groups=32, conv_shortcut=True, name=None
):
    """
    A residual block.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer.
        kernel_size: default 3, kernel size of the bottleneck layer.
        stride: default 1, stride of the first layer.
        groups: default 32, number of groups (cardinality) of the grouped convolution.
        conv_shortcut: default True, use convolution shortcut if True,
            otherwise identity shortcut.
        name: string, block label.

    Returns:
        Output tensor for the residual block.
    """
    bn_axis = 2  # Normalize over the feature axis of (n_batches, n_timesteps, n_features)

    if conv_shortcut is True:
        shortcut = layers.Conv1D(
            (64 // groups) * filters,
            1,
            strides=stride,
            use_bias=False,
            name=name + "_0_conv",
        )(x)
        shortcut = layers.BatchNormalization(
            axis=bn_axis, epsilon=1.001e-5, name=name + "_0_bn"
        )(shortcut)
    else:
        shortcut = x

    x = layers.Conv1D(filters, 1, use_bias=False, name=name + "_1_conv")(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_1_bn"
    )(x)
    x = layers.Activation("relu", name=name + "_1_relu")(x)

    c = filters // groups
    x = layers.ZeroPadding1D(padding=1, name=name + "_2_pad")(x)
    # DepthwiseConv1D is only available in the nightly build of TensorFlow
    # (as of August 23, 2021), so we use SeparableConv1D instead. It must emit
    # filters * c channels, otherwise the reshape to (groups, c, c) below fails.
    x = layers.SeparableConv1D(
        filters * c,
        kernel_size,
        strides=stride,
        depth_multiplier=c,
        use_bias=False,
        name=name + "_2_conv",
    )(x)
    x_shape = backend.int_shape(x)[1:-1]
    x = layers.Reshape(x_shape + (groups, c, c))(x)
    output_shape = x_shape + (groups, c) if backend.backend() == "theano" else None

    x = layers.Lambda(
        lambda x: sum([x[:, :, :, :, i] for i in range(c)]),
        output_shape=output_shape,
        name=name + "_2_reduce",
    )(x)

    x = layers.Reshape(x_shape + (filters,))(x)

    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_2_bn"
    )(x)

    x = layers.Activation("relu", name=name + "_2_relu")(x)

    x = layers.Conv1D(
        (64 // groups) * filters, 1, use_bias=False, name=name + "_3_conv"
    )(x)

    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + "_3_bn"
    )(x)

    x = layers.Add(name=name + "_add")([shortcut, x])
    x = layers.Activation("relu", name=name + "_out")(x)
    return x


def stack3(x, filters, blocks, stride1=2, groups=32, name=None):
    """
    A set of stacked residual blocks.

    The code of this function is based on https://github.com/keras-team/keras-tuner
    which is licensed under Apache License 2.0, see https://www.apache.org/licenses/LICENSE-2.0

    Args:
        x: input tensor.
        filters: integer, filters of the bottleneck layer in a block.
        blocks: integer, blocks in the stacked blocks.
        stride1: default 2, stride of the first layer in the first block.
        groups: default 32, number of groups (cardinality) of the grouped convolution.
        name: string, stack label.

    Returns:
        Output tensor for the stacked blocks.
    """
    x = block3(x, filters, stride=stride1, groups=groups, name=name + "_block1")

    for i in range(2, blocks + 1):
        x = block3(
            x,
            filters,
            groups=groups,
            conv_shortcut=False,
            name=name + "_block" + str(i),
        )
    return x
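
The residual blocks above only work if the shortcut and the main branch arrive at `layers.Add` with the same number of timesteps. The following sketch re-derives both lengths for `block2` with plain integer arithmetic. It is an illustration and not part of the training pipeline: the helper names `conv_out_len` and `block2_lengths` are made up for this example, "valid" padding (the Keras default for `Conv1D` and `MaxPooling1D`) is assumed, and 500 timesteps is just an example input length.

```python
def conv_out_len(n_timesteps: int, kernel_size: int, stride: int, padding: int = 0) -> int:
    """Output length of a 'valid' 1D convolution or pooling after symmetric zero padding."""
    return (n_timesteps + 2 * padding - kernel_size) // stride + 1


def block2_lengths(n_timesteps: int, kernel_size: int = 3, stride: int = 1):
    """Return (shortcut_length, main_branch_length) for block2 (hypothetical helper)."""
    # Shortcut: a kernel-size-1 convolution or MaxPooling1D(1), both applied with `stride`.
    shortcut = conv_out_len(n_timesteps, kernel_size=1, stride=stride)
    # Main branch: 1x1 conv (length unchanged), ZeroPadding1D(1), strided conv, 1x1 conv.
    main = conv_out_len(n_timesteps, kernel_size=1, stride=1)
    main = conv_out_len(main, kernel_size=kernel_size, stride=stride, padding=1)
    main = conv_out_len(main, kernel_size=1, stride=1)
    return shortcut, main


for stride in (1, 2):
    shortcut, main = block2_lengths(500, kernel_size=3, stride=stride)
    print(f"stride={stride}: shortcut={shortcut}, main={main}")
```

With `kernel_size=3` both branches agree for every stride. For other kernel sizes the fixed padding of 1 no longer fits: `block2_lengths(500, kernel_size=5)` returns `(500, 498)`, so the padding would have to be adapted together with the kernel size.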

In [None]:
# Create a combined hypermodel

import kapre

from tensorflow.keras import layers, models, optimizers

def combined_hypermodel(hp: HyperParameters) -> models.Model:
    """
    Build a combined ResNet hypermodel.

    Use either a 1D model to classify the timeseries data directly,
    or a 2D model on top of a preceding STFT layer.
    """

    model_type = hp.Choice('model_type', ['1d', '2d'])

    input_shape = (SAMPLE_LENGTH, len(X_features))
    n_outputs = len(LABEL_ORDER)

    model = models.Sequential()

    if model_type == '1d':
        with hp.conditional_scope('model_type', ['1d']):
            # Direct timeseries classification
            model.add(HyperResNet1D(
                include_top=True,
                input_shape=input_shape,
                input_tensor=None,
                classes=n_outputs
            ).build(hp))
    elif model_type == '2d':
        with hp.conditional_scope('model_type', ['2d']):
            # Short-time Fourier transform
            model.add(kapre.STFT(
                n_fft=100,
                hop_length=5,
                pad_end=False,
                input_data_format='channels_last', 
                output_data_format='channels_last',
                input_shape=input_shape,
                name='stft-layer'
            ))
            # Convert the complex STFT output to magnitudes, then to decibels
            model.add(kapre.Magnitude())
            model.add(kapre.MagnitudeToDecibel())
            # Normalize magnitudes
            model.add(layers.LayerNormalization())
            model.add(layers.UpSampling2D(2))
            # Add our ResNet classifier hypermodel
            model.add(HyperResNet2D(
                include_top=True, 
                input_shape=(162, 102, 3), # Output shape of our upsampled STFT layer
                input_tensor=None, 
                classes=n_outputs
            ).build(hp))
    else:
        raise ValueError('Unknown meta architecture!')

    model.compile(
        loss='sparse_categorical_crossentropy', # No OHE necessary
        optimizer=optimizers.Adam(learning_rate=0.001),
        metrics=['acc']
    )

    return model
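
The hard-coded `input_shape=(162, 102, 3)` of the 2D branch follows from the STFT parameters. The sketch below re-derives it under a few assumptions: the STFT frames the signal without padding the end (as requested by `pad_end=False`), it keeps `n_fft // 2 + 1` frequency bins, and the input windows are 500 samples long with 3 feature channels. The last two values are inferred from the hard-coded shape, not read from `SAMPLE_LENGTH` or `X_features`. The function name `stft_output_shape` is made up for this example.

```python
def stft_output_shape(sample_length, n_channels, n_fft=100, hop_length=5, upsampling=2):
    """Shape (time, frequency, channels) after STFT and UpSampling2D, without the batch axis."""
    # Number of full frames that fit into the signal when the end is not padded.
    n_frames = 1 + (sample_length - n_fft) // hop_length
    # A real-valued FFT of size n_fft yields n_fft // 2 + 1 non-redundant frequency bins.
    n_bins = n_fft // 2 + 1
    # UpSampling2D repeats rows and columns; the channel axis is left untouched.
    return (n_frames * upsampling, n_bins * upsampling, n_channels)


print(stft_output_shape(sample_length=500, n_channels=3))  # (162, 102, 3)
```

The magnitude, decibel and normalization layers do not change the shape, so only the STFT and the upsampling enter the calculation. If the window length or the number of features changes, the hard-coded shape has to be adjusted accordingly.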

In [None]:
tuner = Tuner(
    gridsearch_dir=Path('models/shl-resnet-gridsearch'),
    hypermodel=combined_hypermodel, 
    objective='val_acc', 
    max_epochs=15,
    overwrite=False,
    directory='models',
    project_name='shl-resnet-gridsearch',
)

tuner.search_space_summary()

Search space summary
Default search space size: 7
model_type (Choice)
{'default': '1d', 'conditions': [], 'values': ['1d', '2d'], 'ordered': False}
version (Choice)
{'default': 'v2', 'conditions': [{'class_name': 'Parent', 'config': {'name': 'model_type', 'values': ['1d']}}], 'values': ['v1', 'v2', 'next'], 'ordered': False}
conv3_depth (Choice)
{'default': 4, 'conditions': [{'class_name': 'Parent', 'config': {'name': 'model_type', 'values': ['1d']}}], 'values': [4, 8], 'ordered': True}
conv4_depth (Choice)
{'default': 6, 'conditions': [{'class_name': 'Parent', 'config': {'name': 'model_type', 'values': ['1d']}}], 'values': [6, 23, 36], 'ordered': True}
pooling (Choice)
{'default': 'avg', 'conditions': [{'class_name': 'Parent', 'config': {'name': 'model_type', 'values': ['1d']}}], 'values': ['avg', 'max'], 'ordered': False}
optimizer (Choice)
{'default': 'adam', 'conditions': [{'class_name': 'Parent', 'config': {'name': 'model_type', 'values': ['1d']}}], 'values': ['adam', 'rmsprop', '

In [None]:
# Define callbacks for our training

from tensorflow.keras import callbacks

decay_lr = callbacks.ReduceLROnPlateau(
    monitor='val_acc',
    factor=0.5, 
    patience=5, # Epochs
    min_lr=0.00001, 
    verbose=1
)

stop_early = callbacks.EarlyStopping(
    monitor='val_acc', 
    patience=10, # Epochs
    verbose=1
)

update_tensorboard = callbacks.TensorBoard('logs/gridsearch')
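
To get a feeling for the `decay_lr` settings, the following sketch lists the learning rates that would be visited if every plateau triggered a reduction. It assumes that training starts at the learning rate of 0.001 used in the `compile` call of `combined_hypermodel`, and it only mirrors the update rule (multiply by `factor`, never go below `min_lr`), not the plateau detection itself. The function name `plateau_lr_schedule` is made up for this example.

```python
def plateau_lr_schedule(initial_lr=0.001, factor=0.5, min_lr=0.00001):
    """List the learning rates visited when every plateau triggers a reduction."""
    schedule = [initial_lr]
    lr = initial_lr
    while lr > min_lr:
        # Update rule of the callback: scale by `factor`, but clip at `min_lr`.
        lr = max(lr * factor, min_lr)
        schedule.append(lr)
    return schedule


schedule = plateau_lr_schedule()
print(len(schedule) - 1)  # Number of reductions until the floor is reached
print(schedule)
```

Since each reduction requires 5 epochs without improvement, reaching the floor would take at least 35 such epochs. That is more than the 15 epochs configured for the tuner, so `min_lr` is unlikely to be reached in this setup.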

In [None]:
# Activate TensorBoard

%load_ext tensorboard
%tensorboard --logdir logs/gridsearch

In [None]:
# Keras tuner grid search training

tuner.search(
    train_dataset,
    epochs=15,
    callbacks=[decay_lr, stop_early],
    validation_data=validation_dataset,
    verbose=1,
    shuffle=False, # Shuffling doesn't work with our prefetching
    class_weight=CLASS_WEIGHTS,
)


Search: Running Trial #7

Hyperparameter    |Value             |Best Value So Far 
model_type        |2d                |2d                
version           |next              |v2                
conv3_depth       |4                 |4                 
conv4_depth       |6                 |6                 
pooling           |avg               |avg               
optimizer         |sgd               |adam              
learning_rate     |0.001             |0.01              
tuner/epochs      |2                 |2                 
tuner/initial_e...|0                 |0                 
tuner/bracket     |2                 |2                 
tuner/round       |0                 |0                 

Epoch 1/2
   4017/Unknown - 1633s 403ms/step - loss: 0.5477 - acc: 0.7942
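
The `tuner/bracket` and `tuner/round` entries in the trial log suggest a Hyperband-style schedule. Assuming that the custom `Tuner` follows Keras Tuner's Hyperband rule with a reduction factor of 3 (neither is confirmed in this part of the notebook), the epoch budget per round could be sketched as follows. The function name `hyperband_epochs` is hypothetical.

```python
import math


def hyperband_epochs(max_epochs, bracket, round_, factor=3):
    """Epoch budget of a trial in the given bracket and round (assumed Hyperband rule)."""
    return math.ceil(max_epochs / factor ** (bracket - round_))


for round_ in range(3):
    print(round_, hyperband_epochs(max_epochs=15, bracket=2, round_=round_))
```

Under these assumptions, a trial in bracket 2 and round 0 is trained for 2 epochs, which is consistent with the `tuner/epochs` value and the `Epoch 1/2` line in the log above. Later rounds of the same bracket would continue the surviving trials for 5 and finally 15 epochs.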