# Time Series Classification Using Deep Learning - Part 1
> A gentle introduction to time series classification using a state-of-the-art neural network architecture.

- toc: true 
- badges: true
- comments: true
- categories: [jupyter]
- author: Farid Hassainia, PhD
- categories: [time series, deep learning, AI, classification, regression, segmentation, forecasting]

## Introduction
Unlike Computer Vision (CV), Time Series (TS) analysis is not the hottest topic in the AI / Deep Learning world. Lately, CV has started to lose a bit of its luster, partly because of the controversies surrounding abusive applications of facial detection. Meanwhile, one may hope that innovation in time series analysis will get more attention and traction. Compared to both CV and NLP, there are not many publications on time series analysis using Deep Learning. This opens the door for talented people to step into this domain and start creating new models that are specifically designed for time series. Those new models could be inspired by already established Neural Network models such as ResNet in CV and LSTM in NLP.

This brings us to the fastai2 library. It has a tremendous advantage over other DL frameworks: a unified API that already spans several domains such as Vision, NLP, and Tabular. It therefore offers a singular opportunity for DL practitioners interested in creating complementary libraries that mimic the well-established fastai2 structure. Doing so not only accelerates the development of new libraries but also shortens the learning curve for the users of those extension libraries. In other words, we are taking advantage of transfer learning between the different modules that constitute the fastai2 library.


## Audience
This post is part 1 of a 3-part series that targets a large, eclectic audience, ranging from individuals with limited knowledge of deep learning to developers looking for a walkthrough of developing a fastai2 extension using the fastai2, fastcore, and nbdev libraries.

One of the goals of this series is to break the fear that some may feel when they start playing with deep learning code: the fear of developing new modules, libraries, or pieces of code, and of building interesting things with the new fastai2.


## What is in this series of articles?
First, let’s introduce the `timeseries` package: an unofficial extension for fastai version 2 (fastai2). This package is still under development (github), yet it is very functional and comes with detailed documentation (docs). The `timeseries` package focuses on classification and regression tasks.

The 3-post series is organized as follows:
- Part 1: it introduces 1) the `timeseries` package, 2) data loading, 3) Datasets, 4) DataLoaders, 5) the InceptionTime model, and 6) training and validating the model. In part 1, we will learn how to create DataLoaders objects using the Datasets class. Datasets is considered a mid-level API.

- Part 2: we will create DataLoaders objects using DataBlock objects, which are considered a high-level API.

- Part 3: we will learn how to create TSDataLoaders, a class that derives from the fastai2 DataLoaders. Its purpose is to abstract away the different steps involved in using the DataBlock class. We will also dig deeper into the guts of the beast by exploring how the `timeseries` package is internally structured.


## What is a time series?
First, let’s introduce what time series are in order to level the ground for those who are not familiar with time series analysis. Those already familiar with the subject can skip the following part.

Time series data have a natural temporal ordering. This makes time series analysis distinct from tabular data (Spreadsheet-like data) in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). 
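
To make the role of ordering concrete, here is a minimal sketch in plain Python. The toy values and the helper name `step_changes` are made up for illustration. Reordering the rows of a table leaves a summary such as the mean unchanged, whereas reordering a time series changes its dynamics, measured here by the differences between consecutive time steps.

```python
# Toy values, invented for illustration only.
wages = [30, 45, 52, 61]          # tabular column: row order carries no meaning
signal = [0.0, 1.0, 3.0, 6.0]     # time series: order is part of the data


def mean(values):
    return sum(values) / len(values)


def step_changes(series):
    """Differences between consecutive time steps."""
    return [b - a for a, b in zip(series, series[1:])]


# Reordering tabular rows does not change the summary statistic.
same_mean = mean(wages) == mean(list(reversed(wages)))

# Reordering a time series changes its temporal behaviour.
forward = step_changes(signal)
backward = step_changes(list(reversed(signal)))
```

Here `forward` is a steadily increasing sequence of steps, while `backward` contains only negative steps: the same values, but a different series.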


Time series tasks can be divided into 4 categories:

1. Time Series Classification (TSC): We present a time series to a Neural Network (NN) model, and the latter predicts its label (class). This tutorial focuses on this category.

2. Time Series Regression (TSR): This is quite similar to TSC and shares the same data processing. It can be considered a special case where the number of labels (classes) is reduced to 1 and the label is represented by a float instead of a category. If this is confusing, rest assured that we all had the same feeling at some point in our learning journey. Every bit of information will become clearer as we advance in these tutorials, so hang on and we will reach our destination in a short time.

3. Time Series Annotation: It includes anomaly detection and segmentation.

4. Time Series Forecasting


## The fastai way
Following a long fastai tradition, I will show how we can train an NN model and reach 98.6% accuracy in fewer than 20 epochs on a time series classification task, using one of the datasets used to benchmark different NN models. This accuracy was considered a state-of-the-art (SOTA) result just a couple of months ago. It is worth stressing that this result was achieved by writing 4 lines of code:


In [None]:
#collapse-show
path = unzip_data(URLs_TS.NATOPS)
dls = TSDataLoaders.from_files(bs=32,fnames=[path/'NATOPS_TRAIN.arff', path/'NATOPS_TEST.arff'], batch_tfms=[Normalize()]) 
learn = ts_learner(dls)
learn.fit_one_cycle(25, lr_max=1e-3) 

If you find this too much magic, I would agree with you if it were the only way we offered to train a model. Far from it: thanks to the fastai2 APIs, a practitioner can decide at which level (of magic) of the APIs they would like to operate. It means they can use TfmdLists, Datasets, DataBlock, and customized DataLoaders. If this is very cryptic for you, that is quite normal. Going through this series of blog posts, you will grasp the underlying foundation of the APIs and become acquainted with those concepts. So hang on, and your efforts will be rewarded very soon.

First, let’s decipher the code above line by line:
1. In line 1, we download a dataset (the NATOPS dataset, to be precise) hosted on the http://www.timeseriesclassification.com/ website (a Time Series Classification Repository), unzip it, and save it in a separate folder under the local `~/.fastai/data` folder. In the case of NATOPS, we have 360 samples (small by big data standards, but we are using it for testing purposes).

2. In line 2, there is a lot of action that can be summarized as follows:
- Create 2 Dataset objects: a train dataset (288 samples) and a valid dataset (72 samples)
- Create 2 DataLoader objects that allow us to create mini-batches for each dataset. A batch is just a small number of samples (e.g. 32 samples)

3. In line 3, we create a learner, where we basically do the following:
- Create an NN model, InceptionTime (ref) in our case. It was published in September 2019 and achieved SOTA results. Oh, did I tell you that the researcher who introduced this new model, [Hassan Fawaz](https://forums.fast.ai/u/hfawaz/summary), is a member of the fastai community? He also open sourced his Keras implementation. What is also amazing is that another fastai member, [Ignacio Oguiza](https://forums.fast.ai/u/oguiza/summary), ported that code to fastai version 1 in record time and also open sourced it, along with many other NN models.
- Create a Learner using some state-of-the-art defaults, such as the Ranger optimizer

4. In line 4, we reach the high point: we train our model using one of fastai's secret ingredients, the fast-converging training algorithm called fit_one_cycle(). Running this last line, we achieve an accuracy higher than 98% in fewer than 20 epochs, and this is one of the reasons fastai has the adjective fast in its name, I guess.

Now that we know how to achieve a SOTA result using 4 lines, let’s further break down the explanation offered above. There are many ways to tackle that, depending on which level we use to explain to a fastai2 newcomer how this machinery works. Different people would choose different levels. My intuition is to use the Datasets objects as the starting block. I hope it will transparently reveal the different steps necessary to build the datasets (train and valid) and the dataloaders in charge of creating the mini-batches that feed our learner in order to train our NN model.


## Example
In the example presented below, we have a multivariate time series. The data is generated by sensors on the hands, elbows, wrists and thumbs. The data are the x, y, z coordinates for each of the eight locations. For each sensor and each axis, we have a time series that represents the value of x (or y, or z) during the execution of a command. For instance, channel 3 (ch3) on the graph shows the X coordinate of the right hand tip at different time steps.
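
The channel count can be worked out with a small sketch: eight sensor locations times three axes gives 24 channels. The location names and their ordering below are assumptions made for illustration (chosen so that channel 3 lands on the right hand tip X coordinate, as in the text); the actual dataset may name or order its channels differently.

```python
from itertools import product

# Assumed names and ordering, for illustration only.
locations = [
    "hand tip left", "hand tip right",
    "elbow left", "elbow right",
    "wrist left", "wrist right",
    "thumb left", "thumb right",
]
axes = ["x", "y", "z"]

# One channel per (location, axis) pair: 8 locations * 3 axes = 24 channels.
channels = list(product(locations, axes))

n_channels = len(channels)
ch3 = channels[3]  # ("hand tip right", "x") under the assumed ordering
```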

![](images/NATOPS.jpg)

**Right arm vs Left arm recordings**

*#3 represents the 'Not clear' Command (see picture here above)*

![](images/ts-right-arm.png)
![](images/ts-left-arm.png)

## What is the goal of the time series classification?
Suppose we ask our operator to execute command #4 (Spread wings), collect the sensor data (24 channels), and save it as one sample. Our goal is to train our Neural Network model with many such examples (in our case, we will see that we have 360 samples). Then, when we feed the model the sensor data of a given sample (24 channels) without telling it which command it corresponds to (for example #2: All clear), the model will be able to predict that this sensor data corresponds to the #2: All clear command.

## Downloading data

In [None]:
path = unzip_data(URLs_TS.NATOPS)

As you may have noticed, the dataset has been downloaded, unzipped, and stored in a new `NATOPS` folder in the default fastai data folder, all in one single line. There are many `.arff` and `.ts` files. Both arff and ts files are simply text files that store either the data of an individual channel (like `NATOPSDimension12_TRAIN.arff`, which contains the data of channel 12; channels are called dimensions) or the data of all channels (`NATOPS_TRAIN.arff`, which contains all 24 channels in the same file). Our dataset is split into `TRAIN` and `TEST` files.

> Note: `arff` and `ts` are just ASCII data formats used to store time series data. The `ts` format is both more compact and contains more metadata. You can open these files in any text editor and explore them.

## Using `Datasets` class

In [None]:
#collapse-show
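# NOTE: `items` ((series, label) tuples) and `splits` (train/valid indices) must be defined beforehand
# x: element 0 (series) -> tensor; y: element 1 (label) -> category
tfms = [[ItemGetter(0), ToTensorTS()], [ItemGetter(1), Categorize()]]
# Create a dataset
ds = Datasets(items, tfms, splits=splits)
# Show the sample at index 2
ax = show_at(ds, 2, figsize=(1,1))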
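
Conceptually, each inner list in `tfms` is a pipeline applied to every item: the first produces the input (the series), the second produces the target (the label turned into a category). The sketch below imitates that idea in plain Python, without fastai. The toy `items`, the helper names (`get_x`, `get_y`, `label_to_id`), the seed, and the 80/20 split ratio are all assumptions made for illustration; the library's transforms do more than this (for example, they convert the series to tensors).

```python
import random
from operator import itemgetter

# Toy items invented for illustration: (series, label) pairs.
items = [([0.1 * i, 0.2 * i], f"command {i % 6 + 1}") for i in range(360)]

# Pipeline for x: pick element 0 of the item (the series).
get_x = itemgetter(0)

# Pipeline for y: pick element 1 (the label), then map it to an integer id.
get_label = itemgetter(1)
vocab = sorted({get_label(item) for item in items})
label_to_id = {label: idx for idx, label in enumerate(vocab)}


def get_y(item):
    return label_to_id[get_label(item)]


# Random 80% / 20% split of the item indices (the ratio is an assumption).
indices = list(range(len(items)))
random.Random(42).shuffle(indices)
cut = len(indices) * 8 // 10
splits = (indices[:cut], indices[cut:])

train = [(get_x(items[i]), get_y(items[i])) for i in splits[0]]
valid = [(get_x(items[i]), get_y(items[i])) for i in splits[1]]
```

With 360 items, this split yields 288 training samples and 72 validation samples, and every label is encoded as an integer between 0 and 5.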

## Creating a `DataLoaders` object

In [None]:
#collapse-show
bs = 32                            
# Normalize at batch time
tfm_norm = Normalize(scale_subtype = 'per_sample_per_channel', scale_range=(0, 1)) 
dls1 = ds.dataloaders(bs=bs, val_bs=bs * 2, after_batch=[tfm_norm]) 
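
The name `per_sample_per_channel` suggests that each channel of each sample is rescaled on its own. As a rough sketch of that idea (not the package's actual implementation, which works on batches at training time), the plain-Python function below min-max scales every channel of one sample to `scale_range`. The function name and the handling of constant channels are choices made for this example.

```python
def scale_per_sample_per_channel(sample, scale_range=(0.0, 1.0)):
    """Min-max scale each channel of one sample independently.

    `sample` is a list of channels; each channel is a list of values over time.
    """
    low, high = scale_range
    scaled = []
    for channel in sample:
        c_min, c_max = min(channel), max(channel)
        span = c_max - c_min
        if span == 0:
            # Constant channel: map every value to the lower bound (a choice made here).
            scaled.append([low for _ in channel])
        else:
            scaled.append([low + (v - c_min) / span * (high - low) for v in channel])
    return scaled


# Toy sample with 2 channels and 4 time steps (values invented for illustration).
sample = [[2.0, 4.0, 6.0, 10.0], [-1.0, 0.0, 1.0, 1.0]]
scaled = scale_per_sample_per_channel(sample)
```

After scaling, each channel spans the full range from 0 to 1 regardless of its original units, so channels with large raw values do not dominate the others.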

In [None]:
dls1.show_batch(max_n=9, chs=range(0,12,3))
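
To illustrate what mini-batching means for the batch sizes used here, the sketch below groups samples into batches in plain Python. It assumes an 80/20 split of the 360 NATOPS samples (288 for training, 72 for validation) and uses a hypothetical helper, `make_batches`. A real dataloader also shuffles the training samples and collates them into tensors, which is left out.

```python
def make_batches(samples, bs):
    """Group consecutive samples into lists of at most `bs` elements."""
    return [samples[i:i + bs] for i in range(0, len(samples), bs)]


# Stand-ins for the datasets: only the counts matter here.
train_samples = list(range(288))
valid_samples = list(range(72))

bs = 32
train_batches = make_batches(train_samples, bs)
valid_batches = make_batches(valid_samples, bs * 2)

train_sizes = [len(batch) for batch in train_batches]
valid_sizes = [len(batch) for batch in valid_batches]
```

Under these assumptions, one epoch consists of 9 training batches of 32 samples, and validation runs over one batch of 64 samples followed by a final batch of 8.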

## Training a Model

In [None]:
#collapse-show
# Number of channels
c_in = get_n_channels(dls1.train) # 24 for NATOPS dataset
# Number of classes
c_out = dls1.c  # 6 for NATOPS dataset

### Create the InceptionTime model

In [None]:
model = inception_time(c_in, c_out).to(device=default_device())

### Creating a Learner object

In [None]:
learn = Learner(dls1, model, opt_func=Ranger, loss_func=LabelSmoothingCrossEntropy(), metrics=accuracy)
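
The learner above uses `LabelSmoothingCrossEntropy` as its loss function. The sketch below shows a common formulation of label smoothing for a single sample in plain Python: the target puts `1 - eps` on the true class and spreads `eps` uniformly over all classes. The value `eps=0.1` and the helper names are assumptions for this example, and the library's implementation may differ in details such as batching and reduction.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def label_smoothing_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target for one sample."""
    log_probs = log_softmax(logits)
    n_classes = len(logits)
    nll = -log_probs[target]                # standard cross-entropy term
    uniform = -sum(log_probs) / n_classes   # cross-entropy against a uniform target
    return (1 - eps) * nll + eps * uniform


# Toy logits for a 6-class problem such as NATOPS (values invented for illustration).
logits = [2.0, 0.5, 0.1, -1.0, 0.0, 0.3]
plain = label_smoothing_cross_entropy(logits, target=0, eps=0.0)
smoothed = label_smoothing_cross_entropy(logits, target=0, eps=0.1)
```

With `eps=0.0` the function reduces to the ordinary cross-entropy. With `eps > 0`, a confident and correct prediction is penalized slightly more, which discourages the model from becoming overconfident.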

### Finding the optimal learning rate

In [None]:
#collapse-show
lr_min, lr_steep = learn.lr_find()
lr_min, lr_steep

### Training model

In [None]:
learn.fit_one_cycle(30, lr_max=lr_steep)
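
As a rough picture of what a one-cycle policy does to the learning rate, the sketch below warms up from a small value to `lr_max` and then anneals down to a much smaller value, using cosine interpolation for both phases. The parameter names and default values (`div`, `div_final`, `pct_start`) are assumptions chosen for illustration and may not match the library's exact settings; momentum scheduling is left out.

```python
import math


def cosine_anneal(start, end, pos):
    """Cosine interpolation from `start` (pos=0) to `end` (pos=1)."""
    return end + (start - end) / 2 * (1 + math.cos(math.pi * pos))


def one_cycle_lr(pos, lr_max, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at training progress `pos` in [0, 1].

    The default values are assumptions made for this sketch.
    """
    if pos < pct_start:
        # Warm-up phase: lr_max / div -> lr_max
        return cosine_anneal(lr_max / div, lr_max, pos / pct_start)
    # Annealing phase: lr_max -> lr_max / div_final
    return cosine_anneal(lr_max, lr_max / div_final, (pos - pct_start) / (1 - pct_start))


lr_max = 1e-3
# Sample the schedule at 101 evenly spaced points of the training run.
schedule = [one_cycle_lr(step / 100, lr_max) for step in range(101)]
peak_index = schedule.index(max(schedule))
```

In this sketch the learning rate peaks a quarter of the way through training and then decays for the remaining three quarters.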

### Plotting the loss function

### Showing the results

In [None]:
learn.show_results(max_n=9, chs=range(0,12,3))

### Showing the confusion matrix

In [None]:
#collapse-show
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
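
For readers new to confusion matrices, here is a minimal plain-Python sketch of how one is built and how accuracy can be read from it. The toy labels and the helper names are invented for illustration; this is not how the fastai interpretation class is implemented.

```python
def confusion_matrix(actual, predicted, n_classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for a 3-class problem (invented for illustration).
actual = [0, 0, 1, 1, 2, 2, 2, 1]
predicted = [0, 1, 1, 1, 2, 2, 0, 1]

matrix = confusion_matrix(actual, predicted, n_classes=3)
acc = accuracy_from_matrix(matrix)
```

Off-diagonal cells show which classes get confused with each other: here one sample of class 0 was predicted as class 1, and one sample of class 2 was predicted as class 0.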

## Appendix A: Installation

One straightforward and convenient way to install the `timeseries` package is to install the 3 necessary packages from their corresponding github repositories.


In [None]:
#collapse-show
# Install the latest version of fastai shared on github
!pip install git+https://github.com/fastai/fastai2.git

# Install the latest version of fastcore shared on github
!pip install git+https://github.com/fastai/fastcore.git

# Install the latest version of timeseries shared on github
!pip install git+https://github.com/ai-fast-track/timeseries.git

> Note: One can also install the editable version of the 3 packages. Check the [documentation](https://ai-fast-track.github.io/timeseries/)

## Conclusion
I hope this first article convinces you to try the `timeseries` package. If you give it a try and find it interesting or helpful, please let others know by starring it on github, and share it with friends and colleagues who might be interested in this topic.

Until the next post, I wish everybody well, and I hope you and your family, friends, and loved ones will stay safe from this COVID-19 threat. I hope that something good will come out of this disruptive situation. As a positive outcome, we have already seen how more and more people are reaching out to each other. I hope the fastai community will use this adversity to build something even greater. I can already see this impact in the growing online user groups and their video conferences, which will have a positive impact on sharing and spreading knowledge and, ultimately, on further democratizing AI and Deep Learning.


## Appendix B: Existing packages

[sktime](https://github.com/alan-turing-institute/sktime) is a new scikit-learn compatible Python library for time series using machine learning techniques. It provides a unified interface for several time series learning tasks (univariate / multivariate classification, forecasting).

[documentation](https://alan-turing-institute.github.io/sktime/index.html)

It also features a Deep Learning extension called [sktime-dl](https://github.com/sktime/sktime-dl).
sktime-dl is written using Keras. Presently, its classification models are based on the NN models found in dl-4-tsc, a library open sourced by [Hassan Fawaz](https://github.com/hfawaz/dl-4-tsc).


![](images/tree.jpg)