<a href="https://colab.research.google.com/github/PhilippMatthes/diplom/blob/master/src/shl-deep-learning-timeseries.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using a deep CNN to directly classify SHL timeseries data

In [1]:
# Get needed auxiliary files for colab
!git clone https://github.com/philippmatthes/diplom
%cd /content/diplom/src
!mkdir -p shl-dataset
!wget -nc -O shl-dataset/challenge-2019-train_torso.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_torso.zip
!wget -nc -O shl-dataset/challenge-2019-train_bag.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_bag.zip
!wget -nc -O shl-dataset/challenge-2019-train_hips.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_hips.zip
!wget -nc -O shl-dataset/challenge-2020-train_hand.zip http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2020/challenge-2020-train_hand.zip

Cloning into 'diplom'...
remote: Enumerating objects: 1480, done.
remote: Counting objects: 100% (817/817), done.
remote: Compressing objects: 100% (523/523), done.
remote: Total 1480 (delta 400), reused 657 (delta 254), pack-reused 663
Receiving objects: 100% (1480/1480), 28.14 MiB | 26.58 MiB/s, done.
Resolving deltas: 100% (776/776), done.
/content/diplom/src
--2021-08-09 21:48:57--  http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/challenge-2019-train_torso.zip
Resolving www.shl-dataset.org (www.shl-dataset.org)... 37.187.125.22
Connecting to www.shl-dataset.org (www.shl-dataset.org)|37.187.125.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5852446972 (5.5G) [application/zip]
Saving to: ‘shl-dataset/challenge-2019-train_torso.zip’


2021-08-09 21:58:03 (10.2 MB/s) - ‘shl-dataset/challenge-2019-train_torso.zip’ saved [5852446972/5852446972]

--2021-08-09 21:58:03--  http://www.shl-dataset.org/wp-content/uploads/SHLChallenge2019/

In [2]:
# Switch to src dir and select tensorflow
%cd /content/diplom/src
%tensorflow_version 2.x

/content/diplom/src


In [3]:
# Define all datasets to train our model on

from pathlib import Path

DATASET_DIRS = [
    Path('shl-dataset/challenge-2019-train_torso.zip'),
    Path('shl-dataset/challenge-2019-train_bag.zip'),
    Path('shl-dataset/challenge-2019-train_hips.zip'),
    Path('shl-dataset/challenge-2020-train_hand.zip'),
]

In [4]:
# Create scalers that we will fit on our data

from sklearn.preprocessing import PowerTransformer

acc_scaler = PowerTransformer()
mag_scaler = PowerTransformer()
gyr_scaler = PowerTransformer()
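
A minimal sketch of how a `PowerTransformer` like the ones above could be fitted and persisted with `joblib`. The data is synthetic and the file name `example-scaler.joblib` is hypothetical; this is not the procedure that produced the scaler files in the repository. Note that `PowerTransformer` treats every column (here: every time step of a window) as a separate feature.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Synthetic stand-in for magnitude windows: 200 windows x 5 time steps,
# positive and right-skewed like a sensor magnitude
windows = rng.exponential(scale=2.0, size=(200, 5))

# Defaults: method='yeo-johnson', standardize=True
scaler = PowerTransformer()
scaled = scaler.fit_transform(windows)

# Persist and restore the fitted scaler (hypothetical file name)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'example-scaler.joblib'
    joblib.dump(scaler, path)
    restored = joblib.load(path)

# The restored scaler reproduces the same transformation
restored_scaled = restored.transform(windows)
```

With `standardize=True`, each column of `scaled` ends up with approximately zero mean and unit variance.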

In [5]:
from tensorflow import keras

# Check that we can use our GPU, to not wait forever during training
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 12240503479611988157, name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 16183459840
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 13667176603490109317
 physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"]
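
The check above imports from `tensorflow.python`, which is an internal module. As an alternative sketch, assuming TensorFlow 2.1 or newer, the public `tf.config` API can list the visible GPUs:

```python
import tensorflow as tf

# Returns a list of PhysicalDevice objects; empty if no GPU is visible
gpus = tf.config.list_physical_devices('GPU')
print(f'{len(gpus)} GPU(s) visible to TensorFlow')
```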

In [6]:
import numpy as np

from tools.dataset import load_zipped_shl_dataset

from tqdm import tqdm

# Join all datasets
acc_mag_conc = None
mag_mag_conc = None
gyr_mag_conc = None
y_conc = None

for dataset_dir in DATASET_DIRS:
    # Load dataset from zip file into temporary directory
    dataset = load_zipped_shl_dataset(dataset_dir, tqdm=tqdm)
    if acc_mag_conc is None:
        acc_mag_conc = dataset.acc_mag
    else:
        acc_mag_conc = np.concatenate((acc_mag_conc, dataset.acc_mag), axis=0)
    if mag_mag_conc is None:
        mag_mag_conc = dataset.mag_mag
    else:
        mag_mag_conc = np.concatenate((mag_mag_conc, dataset.mag_mag), axis=0)
    if gyr_mag_conc is None:
        gyr_mag_conc = dataset.gyr_mag
    else:
        gyr_mag_conc = np.concatenate((gyr_mag_conc, dataset.gyr_mag), axis=0)
    if y_conc is None:
        y_conc = dataset.labels
    else:
        y_conc = np.concatenate((y_conc, dataset.labels), axis=0)
    del dataset
    

Extracting shl-dataset/challenge-2019-train_torso.zip: 100%|██████████| 22/22 [03:44<00:00, 10.18s/it]


Acc_x Import Done
Acc_y Import Done
Acc_z Import Done
Acc_mag Import Done
Mag_x Import Done
Mag_y Import Done
Mag_z Import Done
Mag_mag Import Done
Gyr_x Import Done
Gyr_y Import Done
Gyr_z Import Done
Gyr_mag Import Done
Labels Import Done


Extracting shl-dataset/challenge-2019-train_bag.zip: 100%|██████████| 22/22 [03:47<00:00, 10.34s/it]


Acc_x Import Done
Acc_y Import Done
Acc_z Import Done
Acc_mag Import Done
Mag_x Import Done
Mag_y Import Done
Mag_z Import Done
Mag_mag Import Done
Gyr_x Import Done
Gyr_y Import Done
Gyr_z Import Done
Gyr_mag Import Done
Labels Import Done


Extracting shl-dataset/challenge-2019-train_hips.zip: 100%|██████████| 22/22 [03:47<00:00, 10.34s/it]


Acc_x Import Done
Acc_y Import Done
Acc_z Import Done
Acc_mag Import Done
Mag_x Import Done
Mag_y Import Done
Mag_z Import Done
Mag_mag Import Done
Gyr_x Import Done
Gyr_y Import Done
Gyr_z Import Done
Gyr_mag Import Done
Labels Import Done


Extracting shl-dataset/challenge-2020-train_hand.zip: 100%|██████████| 23/23 [03:44<00:00,  9.76s/it]


Acc_x Import Done
Acc_y Import Done
Acc_z Import Done
Acc_mag Import Done
Mag_x Import Done
Mag_y Import Done
Mag_z Import Done
Mag_mag Import Done
Gyr_x Import Done
Gyr_y Import Done
Gyr_z Import Done
Gyr_mag Import Done
Labels Import Done
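
The loop above calls `np.concatenate` once per dataset, which copies the already-joined data again on every iteration. A possible variant, sketched here with small hypothetical stand-in arrays instead of the real `dataset.acc_mag` values, collects the pieces in a list and concatenates once:

```python
import numpy as np

# Hypothetical stand-ins for the per-dataset arrays (e.g. dataset.acc_mag)
chunks = [np.full((2, 4), i, dtype=np.float32) for i in range(3)]

# Collect first, then join in a single call
parts = []
for chunk in chunks:
    parts.append(chunk)
joined = np.concatenate(parts, axis=0)

# Reference result using the incremental pattern from the cell above
incremental = None
for chunk in chunks:
    if incremental is None:
        incremental = chunk
    else:
        incremental = np.concatenate((incremental, chunk), axis=0)
```

Both patterns yield the same array; the single call only avoids the repeated intermediate copies.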


In [7]:
# Check that we don't have NaNs in our dataset
assert not np.isnan(acc_mag_conc).any()
assert not np.isnan(mag_mag_conc).any()
assert not np.isnan(gyr_mag_conc).any()

In [8]:
import joblib

from sklearn.preprocessing import PowerTransformer

acc_scaler = joblib.load('models/acc-scaler.joblib')
mag_scaler = joblib.load('models/mag-scaler.joblib')
gyr_scaler = joblib.load('models/gyr-scaler.joblib')



In [9]:
# Apply the pre-fitted scalers loaded above (transform only, no fitting)
acc_mag_scaled = acc_scaler.transform(acc_mag_conc)
del acc_mag_conc
mag_mag_scaled = mag_scaler.transform(mag_mag_conc)
del mag_mag_conc
gyr_mag_scaled = gyr_scaler.transform(gyr_mag_conc)
del gyr_mag_conc

# Prepare training data
X = np.stack([
    acc_mag_scaled,
    mag_mag_scaled, 
    gyr_mag_scaled,
], axis=2)

del acc_mag_scaled
del mag_mag_scaled
del gyr_mag_scaled
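
To illustrate what `np.stack(..., axis=2)` does here, the following sketch uses small dummy arrays. The window length of 500 matches the model input shape `(None, 500, 3)` reported by `model.summary()`; the number of windows is arbitrary.

```python
import numpy as np

n_windows, n_steps = 4, 500

# One 2D array per sensor: (windows, time steps)
acc = np.zeros((n_windows, n_steps))
mag = np.ones((n_windows, n_steps))
gyr = np.full((n_windows, n_steps), 2.0)

# Stacking along a new last axis turns the sensors into channels:
# (windows, time steps, channels)
X_demo = np.stack([acc, mag, gyr], axis=2)
print(X_demo.shape)
```

Each time step of each window now holds one value per sensor, in the order accelerometer, magnetometer, gyroscope.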

In [10]:
from sklearn.utils.class_weight import compute_class_weight
# Compute class weights for unbiased training
class_weights = compute_class_weight('balanced', classes=np.unique(y_conc), y=y_conc)
class_weights = dict(zip(np.unique(y_conc), class_weights)) # Keras adaption
# Keras in TF 2.x expects a weight for every class index starting at 0,
# so give the unused NULL class (label 0) a weight of zero
class_weights[0] = 0
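
As a small worked example of the `'balanced'` heuristic, scikit-learn computes each weight as `n_samples / (n_classes * count_of_class)`. The labels below are made up for illustration and are not SHL data.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: class 1 appears four times, classes 2 and 3 twice each
y_demo = np.array([1, 1, 1, 1, 2, 2, 3, 3])
classes = np.unique(y_demo)

weights = compute_class_weight('balanced', classes=classes, y=y_demo)

# The same values computed by hand: n_samples / (n_classes * count)
counts = np.bincount(y_demo)[classes]
manual = len(y_demo) / (len(classes) * counts)

# Dictionary form as passed to Keras, with the unused class 0 filled in
weights_by_class = dict(zip(classes.tolist(), weights.tolist()))
weights_by_class[0] = 0.0
```

Here the frequent class 1 gets a weight of 8 / (3 * 4) ≈ 0.67, while the rarer classes 2 and 3 each get 8 / (3 * 2) ≈ 1.33.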

In [11]:
# Create our model

from tensorflow.keras import layers

from architectures.resnet import make_resnet
from tools.dataset import shl_dataset_label_order

MODEL_DIR = Path('models/shl-resnet')
OVERRIDE_MODEL = True

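# The checkpoint is written as a SavedModel directory, so check for existence
# rather than for a regular file
if MODEL_DIR.exists() and not OVERRIDE_MODEL: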
    # Load model for transfer learning
    model = keras.models.load_model(MODEL_DIR)
else:
    model = make_resnet(
        input_shape=[X.shape[1], X.shape[2]], 
        output_classes=len(shl_dataset_label_order)
    )

    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer='adam',
        metrics=['acc']
    )

In [12]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 500, 3)]     0                                            
__________________________________________________________________________________________________
conv1d (Conv1D)                 (None, 500, 64)      1600        input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 500, 64)      256         conv1d[0][0]                     
__________________________________________________________________________________________________
activation (Activation)         (None, 500, 64)      0           batch_normalization[0][0]        
______________________________________________________________________________________________
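
The parameter counts in the summary allow a small sanity check: a `Conv1D` layer with 3 input channels, 64 filters and kernel size 8 has 3 * 8 * 64 + 64 = 1600 parameters, and batch normalization over 64 channels has 4 * 64 = 256. The sketch below rebuilds such a first stage with one residual connection. It is not the repository's `make_resnet`; the function `make_tiny_resnet`, the layer names, the ReLU activation, the 1x1 shortcut and the class count of 9 are all assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def make_tiny_resnet(input_shape, output_classes):
    """Hypothetical one-block residual CNN, for illustration only."""
    inputs = keras.Input(shape=input_shape)

    # First stage, chosen to match the parameter counts in the summary
    x = layers.Conv1D(64, kernel_size=8, padding='same', name='stem_conv')(inputs)
    x = layers.BatchNormalization(name='stem_bn')(x)
    x = layers.Activation('relu')(x)

    # Shortcut: a 1x1 convolution lifts the 3 input channels to 64
    shortcut = layers.Conv1D(64, kernel_size=1, padding='same')(inputs)
    x = layers.Add()([x, shortcut])
    x = layers.Activation('relu')(x)

    # Collapse the time axis and classify
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(output_classes, activation='softmax')(x)
    return keras.Model(inputs, outputs)


# output_classes=9 is an assumed value for this demo
demo_model = make_tiny_resnet((500, 3), output_classes=9)
stem_conv_params = demo_model.get_layer('stem_conv').count_params()
stem_bn_params = demo_model.get_layer('stem_bn').count_params()
```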

In [13]:
# Train model
callbacks = [
    keras.callbacks.CSVLogger('train.log', append=False),
    keras.callbacks.ModelCheckpoint(
        str(MODEL_DIR), save_best_only=True, monitor='val_loss', verbose=1
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss', factor=0.25, patience=5, min_lr=0.0001, verbose=1
    ),
    keras.callbacks.EarlyStopping(monitor='val_loss', patience=30, verbose=1),
]
model.fit(
    X, 
    y_conc,
    epochs=200, 
    batch_size=32,
    callbacks=callbacks,
    validation_split=0.1,
    verbose=1,
    class_weight=class_weights,
    shuffle=True
)

Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
Epoch 1/200

Epoch 00001: val_loss improved from inf to 0.82826, saving model to models/shl-resnet
INFO:tensorflow:Assets written to: models/shl-resnet/assets
Epoch 2/200

Epoch 00002: val_loss improved from 0.82826 to 0.65121, saving model to models/shl-resnet
INFO:tensorflow:Assets written to: models/shl-resnet/assets
Epoch 3/200

Epoch 00003: val_loss did not improve from 0.65121
Epoch 4/200

Epoch 00004: val_loss did not improve from 0.65121
Epoch 5/200

Epoch 00005: val_loss improved from 0.65121 to 0.63679, saving model to models/shl-resnet
INFO:tensorflow:Assets written to: models/shl-resnet/assets
Epoch 6/200

Epoch 00006: val_loss did not improve from 0.63679
Epoch 7/200

Epoch 00007: val_loss improved from 0.63679 to 0.62989, saving model to models/shl-resnet
INFO:tensorflow:Assets written to: models/shl-resnet/assets
Epoch 8/200

Epoch 00008: val_loss did not improve from 0.62989
Epoch 9/200

Epoch 00009: val_loss improved from 0.62989 to 0.62696, saving model to models/shl-resnet
INFO:tensorflow:Assets written to: models/shl-resnet/assets
Epoch 10/200

Epoch 00010: val_loss did not improve from 0.62696
Epoch 11/200

Epoch 00011: val_loss did not improve from 0.62696
Epoch 12/200

Epoch 00012: val_loss did not improve from 0.62696
Epoch 13/200

Epoch 00013: val_loss did not improve from 0.62696
Epoch 14/200

Epoch 00014: val_loss did not improve from 0.62696

Epoch 00014: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 15/200

Epoch 00015: val_loss did not improve from 0.62696
Epoch 16/200

Epoch 00016: val_loss did not improve from 0.62696
Epoch 17/200

Epoch 00017: val_loss did not improve from 0.62696
Epoch 18/200

Epoch 00018: val_loss did not improve from 0.62696
Epoch 19/200

Epoch 00019: val_loss did not improve from 0.62696

Epoch 00019: ReduceLROnPlateau reducing learning rate to 0.0001.
Epoch 20/200

Epoch 00020: val_loss did not improve from 0.62696
Epoch 21/200

Epoch 00021: val_loss did not improve from 0.62696
Epoch 22/200

Epoch 00022: val_loss did not improve from 0.62696
Epoch 23

<tensorflow.python.keras.callbacks.History at 0x7f7dda6bad50>
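
The learning-rate reductions in the log can be reproduced with a simplified simulation of the `ReduceLROnPlateau` settings used above (`factor=0.25`, `patience=5`, `min_lr=0.0001`). It assumes the Adam default starting rate of 0.001 and takes the improving epochs (1, 2, 5, 7, 9) from the log. The function `simulate_reduce_lr` is a hypothetical helper that ignores cooldown and `min_delta`; it is not the Keras implementation.

```python
def simulate_reduce_lr(improved_epochs, n_epochs, lr=1e-3,
                       factor=0.25, patience=5, min_lr=1e-4):
    """Return (epoch, new_lr) pairs for every simulated reduction."""
    wait = 0
    reductions = []
    for epoch in range(1, n_epochs + 1):
        if epoch in improved_epochs:
            wait = 0  # val_loss improved, reset the patience counter
            continue
        wait += 1
        if wait >= patience:
            if lr > min_lr:
                # Multiply by the factor, but never go below min_lr
                lr = max(lr * factor, min_lr)
                reductions.append((epoch, lr))
            wait = 0
    return reductions


# Epochs at which val_loss improved, taken from the training log
reductions = simulate_reduce_lr({1, 2, 5, 7, 9}, n_epochs=22)
for epoch, new_lr in reductions:
    print(f'epoch {epoch}: learning rate reduced to {new_lr}')
```

The simulated reductions land at epochs 14 and 19, as in the log: 0.001 * 0.25 = 0.00025, and the next step 0.00025 * 0.25 = 0.0000625 is clipped to `min_lr`. The value `0.0002500000118743628` printed by Keras is the float32 representation of 0.00025.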