# Example: Opportunity challenge


#### Download and preprocess data

First, download the dataset for the Opportunity challenge and move it into data/raw, the directory that holds the raw data:

In [1]:
! wget https://archive.ics.uci.edu/ml/machine-learning-databases/00226/OpportunityUCIDataset.zip
! mkdir -p data/raw
! mv OpportunityUCIDataset.zip data/raw

--2021-09-29 12:50:26--  https://archive.ics.uci.edu/ml/machine-learning-databases/00226/OpportunityUCIDataset.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 306636009 (292M) [application/x-httpd-php]
Saving to: ‘OpportunityUCIDataset.zip’


2021-09-29 12:51:33 (4.44 MB/s) - ‘OpportunityUCIDataset.zip’ saved [306636009/306636009]



Then run the following script to preprocess the data. It collates all the training data into one file and does the same for the validation and test data.

In [4]:
! python3 preprocess_opportunity.py

Checking dataset data/raw/OpportunityUCIDataset.zip
Processing dataset files ...
Generating training files
... file OpportunityUCIDataset/dataset/S1-Drill.dat -> train_data
... file OpportunityUCIDataset/dataset/S1-ADL1.dat -> train_data
... file OpportunityUCIDataset/dataset/S1-ADL2.dat -> train_data
... file OpportunityUCIDataset/dataset/S1-ADL3.dat -> train_data
... file OpportunityUCIDataset/dataset/S1-ADL4.dat -> train_data
... file OpportunityUCIDataset/dataset/S2-Drill.dat -> train_data
... file OpportunityUCIDataset/dataset/S2-ADL1.dat -> train_data
... file OpportunityUCIDataset/dataset/S2-ADL2.dat -> train_data
... file OpportunityUCIDataset/dataset/S3-Drill.dat -> train_data
... file OpportunityUCIDataset/dataset/S3-ADL1.dat -> train_data
... file OpportunityUCIDataset/dataset/S3-ADL2.dat -> train_data
... file OpportunityUCIDataset/dataset/S2-ADL3.dat -> train_data
... file OpportunityUCIDataset/dataset/S3-ADL3.dat -> train_data
Generating validation files
... file Opportun

#### Dataset Config

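We choose the sliding-window parameters and specify the name and location of the target dataset. With a window of 24 samples and a step of 12, consecutive windows overlap by 50%. stride_test is set to 1, so windows built with it advance one sample at a time.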

In [5]:
target_dataset = 'opportunity'
window_size = 24
window_step = 12
n_classes = 18

config_dataset = {
        "dataset": target_dataset,
        "window": window_size,
        "stride": window_step,
        "stride_test": 1,
        "path_processed": f"data/{target_dataset}",
    }
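
To make the window parameters concrete, here is a minimal sketch of how a sliding window with a size of 24 and a stride of 12 cuts a recording into overlapping segments. The helper `window_starts` and the recording length of 100 samples are hypothetical and are not taken from datasets.py, which may handle segment boundaries and labels differently.

```python
def window_starts(n_samples, window, stride):
    """Start indices of all full windows that fit in a recording of n_samples."""
    if n_samples < window:
        return []
    return list(range(0, n_samples - window + 1, stride))

# window and stride match config_dataset; n_samples is a made-up example length.
starts = window_starts(n_samples=100, window=24, stride=12)
print(starts)       # [0, 12, 24, 36, 48, 60, 72]
print(len(starts))  # 7

# With a stride of 1 (the value of stride_test) almost every sample starts a window.
print(len(window_starts(n_samples=100, window=24, stride=1)))  # 77
```

A smaller stride yields many more, heavily overlapping windows from the same recording.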

Load the training and validation data (see datasets.py for the SensorDataset implementation). The number of channels is read from the training dataset.

In [6]:
from datasets import SensorDataset
dataset = SensorDataset(**config_dataset, prefix="train")
dataset_val = SensorDataset(**config_dataset, prefix="val")
n_channels = dataset.n_channels

Creating opportunity train HAR dataset of size 43985 ...
Creating opportunity val HAR dataset of size 2509 ...


### Import DeepConvLSTM class

See DeepConvLSTM_py3.py for implementation.

In [7]:
from DeepConvLSTM_py3 import DeepConvLSTM

#### Create an instance of DeepConvLSTM

Pass the number of channels and classes defined earlier. The dataset arg determines where the results and training checkpoints are saved.


In [8]:
deepconv = DeepConvLSTM(n_channels=n_channels, n_classes=n_classes, dataset=target_dataset).cuda()

### Train the model


We'll train the model for 300 epochs with Adam, starting from a learning rate of 1e-3. A learning rate scheduler multiplies the learning rate of all parameters by 0.9 every 10 epochs. See the model_train docstring for an explanation of the config keys.

In [9]:
from DeepConvLSTM_py3 import model_train

# Define train config options
config_train = {'batch_size': 256,
                'optimizer': 'Adam',
                'lr': 1e-3,
                'lr_step': 10,
                'lr_decay': 0.9,
                'init_weights': 'orthogonal',
                'epochs': 300,
                'print_freq': 100
               }

model_train(deepconv, dataset, dataset_val, config_train, verbose=True)

Running HAR training loop ...
[-] Initializing weights (orthogonal)...
----------------------------------------------------------------------------------------------------
[-] Learning rate:  0.001
[-] Batch 0/172	 Loss: 2.890290
[-] Batch 100/172	 Loss: 1.374507
[-] Epoch 0/300	Train loss: 0.99 	acc: 73.92(%)	fm: 12.87(%)	fw: 68.17(%)
[-] Epoch 0/300	Val loss: 0.63 	acc: 83.30(%)	fm: 10.82(%)	fw: 80.32(%)
[*] Saving checkpoint... (0.0->0.10818871616427658)
----------------------------------------------------------------------------------------------------
[-] Learning rate:  0.001
[-] Batch 0/172	 Loss: 0.973741
[-] Batch 100/172	 Loss: 0.880867
[-] Epoch 1/300	Train loss: 0.63 	acc: 80.04(%)	fm: 33.59(%)	fw: 75.72(%)
[-] Epoch 1/300	Val loss: 0.47 	acc: 84.02(%)	fm: 21.69(%)	fw: 82.00(%)
[*] Saving checkpoint... (0.10818871616427658->0.216920304039408)
-----------------------------------------------------------
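
The learning rate schedule described above can be sketched in a few lines. This assumes a plain step schedule in which `lr_step` and `lr_decay` behave as their names suggest; the function `lr_at_epoch` is hypothetical, and the actual scheduler inside model_train may differ in detail.

```python
def lr_at_epoch(epoch, base_lr=1e-3, lr_step=10, lr_decay=0.9):
    """Learning rate under an assumed step schedule.

    The rate is multiplied by lr_decay once every lr_step epochs.
    Default values mirror config_train.
    """
    return base_lr * lr_decay ** (epoch // lr_step)

# Print the assumed learning rate at a few points in the 300-epoch run.
for epoch in (0, 9, 10, 100, 299):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.6f}")
```

Under this assumption the rate stays at 0.001 for the first ten epochs, which agrees with the log above, and decays to roughly 4.7e-5 by the final epoch.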

Load the test data to evaluate the trained model, and set up the test configuration.

In [10]:
dataset_test = SensorDataset(**config_dataset, prefix="test")
test_config = {'batch_size': 256,
              'train_mode': False,
              'dataset': target_dataset,
              'num_batches_eval': 212}

Creating opportunity test HAR dataset of size 118726 ...


Import model_eval and test the model.

In [11]:
from DeepConvLSTM_py3 import model_eval
model_eval(deepconv, dataset_test, test_config, return_results=False)

Running HAR evaluation loop ...
[-] Loading checkpoint ...
[-] Test loss: 0.93	acc: 88.55(%)	fm: 62.12(%)	fw: 88.93(%)
[Finished HAR evaluation loop (h:m:s): 0:01:18
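
The abbreviations fm and fw in the log are not defined in this notebook. Assuming they stand for the macro-averaged and the support-weighted F1 score, the sketch below shows how the two can diverge when one class dominates, which would be one possible explanation for an accuracy near 89% alongside a much lower fm. The function `f1_scores` and the toy labels are illustrative only and are not the code used by model_eval.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) over the classes present in y_true."""
    labels = sorted(set(y_true))
    support = Counter(y_true)  # number of true samples per class
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally. Weighted: classes count by support.
    macro = sum(per_class.values()) / len(labels)
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted

# Toy example (assumption): 8 samples of a majority class, 2 of a rare class,
# and a model that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
macro, weighted = f1_scores(y_true, y_pred)
print(f"macro F1: {macro:.3f}, weighted F1: {weighted:.3f}")
```

In this toy case the accuracy is 80%, yet the macro score is pulled down by the rare class that is never predicted.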


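We can also use [thop](https://pypi.org/project/thop/) to estimate the cost of a forward pass on one batch of synthetic data. Note that profile returns multiply-accumulate operations (MACs) rather than floating point operations, so the value stored in flops below is the MAC count; counting each MAC as a multiply plus an add gives roughly twice that number of FLOPs. We divide by the batch size to get the cost per window.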

In [12]:
# Count multiply-accumulate operations (MACs) and parameters with thop
from thop import profile
import torch
deepconv = deepconv.train()

x = torch.ones([config_train['batch_size'], config_dataset['window'], n_channels]).cuda()
macs, params = profile(deepconv, inputs=(x,), verbose=True)
flops = macs / config_train['batch_size']
print(f'Number of floating point operations per forward pass of one sliding window segment: {flops:,}')
print(f'Number of parameters: {params:,}')

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.dropout.Dropout'>.
[INFO] Register count_lstm() for <class 'torch.nn.modules.rnn.LSTM'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[WARN] Cannot find rule for <class 'DeepConvLSTM_py3.DeepConvLSTM'>. Treat it as zero Macs and zero Params.
Number of floating point operations per forward pass of one sliding window segment: 115,671,040.0
Number of parameters: 3,965,778.0
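
Because thop's profile returns MACs, the per-window figure printed above is a MAC count. A rough conversion to FLOPs, assuming the common convention that one MAC equals two floating point operations, looks like this. The helper `macs_to_flops` is hypothetical and not part of thop.

```python
def macs_to_flops(macs):
    """Approximate FLOPs, counting each multiply-accumulate as 2 operations."""
    return 2 * macs

# Per-window value copied from the output above (reported there as "flops").
macs_per_window = 115_671_040
print(f"Approximate FLOPs per window: {macs_to_flops(macs_per_window):,}")
# Approximate FLOPs per window: 231,342,080
```

Other conventions exist, so when comparing against published numbers it is worth checking whether they report MACs or FLOPs.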
