# Time series classification

Time series classification (TSC) operates on time series data: sequences of values ordered by time. Each data sample is labelled as belonging to a particular class, and a TSC system is trained on these labelled samples so that it can classify unlabelled ones. TSC has a wide range of applications. Smartwatch data is used to classify human activities (walking, running, ascending stairs, etc.). In environmental studies, accelerometers on tagged wild animals are used to monitor animal behaviour (hunting, sleeping). Sensors on industrial machines are used to classify time series samples as either normal or preceding a failure, which informs machine maintenance schedules.

This exercise uses the SonyAIBORobotSurface1 dataset from the UEA & UCR Time Series Classification Repository (Dau et al., 2018). The dataset was collected by Vail and Veloso (2004) at Carnegie Mellon University, from an accelerometer on a Sony AIBO robot. Their aim was to detect the surface that the robot was walking on, in order to optimise its gait for that surface. The robots competed in the RoboCup League, a football game played on a carpeted field.

![The Sony AIBO Robot is a robot dog. It is pictured with a ball.](https://i1.wp.com/www.techdigest.tv/wp-content/uploads/2015/06/aibo-560.jpg "Sony AIBO Robot")

## References
Dau, H. A., Bagnall, A., Kamgar, K., Yeh, C.-C. M., Zhu, Y., Gharghabi, S., Ratanamahatana, C. A. and Keogh, E. (2018) ‘The UCR Time Series Archive’, [Online]. Available at http://arxiv.org/abs/1810.07758 (Accessed 4 May 2019).

Vail, D. and Veloso, M. (2004) ‘Learning from accelerometer data on a legged robot’, *IFAC Proceedings*, vol. 37, no. 8, pp. 822–827 [Online]. Available at https://www.cs.cmu.edu/~mmv/papers/04iav-doug.pdf (Accessed 4 May 2019).



 


# Load Python packages
Import the Python packages that we will need.

In [None]:
from pathlib import Path
import time

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import seaborn as sns

# General settings
sns.set_style('whitegrid')

# User settings

In [None]:
load_from_web = False

# Load the data
The robot data provided is the x-axis accelerometer data, sampled at 125 Hz (125 times per second). A positive value indicates acceleration in the forward direction. Each data sample has 70 data points (0.56 s) and is labelled as either cement or carpet. The original data had a positive mean, because the robot leans forwards slightly, and lay approximately in the range [0, 0.4] g (multiples of gravitational acceleration). The dataset provided has been normalised.
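
As a quick check of the timing figures above, the duration of a sample follows from the number of data points and the sampling rate. The sketch below uses the hypothetical names `sampling_rate_hz` and `n_points` (they are not part of the dataset) and also builds a time axis in seconds that could be used when plotting a sample.

```python
import numpy as np

sampling_rate_hz = 125   # samples per second, as stated above
n_points = 70            # data points in each sample, as stated above

# Duration of one sample in seconds: 70 / 125
duration_s = n_points / sampling_rate_hz

# Time stamp (in seconds) of each data point, starting at zero
time_s = np.arange(n_points) / sampling_rate_hz

print(f'Duration of one sample: {duration_s:.2f} s')
print('Last time stamp:', time_s[-1])
```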

The machine learning approach that Vail and Veloso took was to extract statistical features from a one-second window of data from all three accelerometer axes. Six features were calculated: variance in acceleration and correlation between the accelerations. A decision tree was used for learning. The paper reports on three classes: walking on cement, on carpet in their laboratory and on carpet on the football field. The overall classification accuracy was 84.9%.
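
To make the feature extraction step more concrete, here is a minimal sketch. It is not the authors' code. It assumes that the six features are the variance of each of the three axes and the correlation between each pair of axes, and it uses synthetic random numbers in place of real accelerometer readings. The function name `extract_features` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one second of three-axis accelerometer data:
# 125 time steps (125 Hz) by 3 axes (x, y, z). This is not real robot data.
window = rng.normal(size=(125, 3))

def extract_features(window):
    """Return six features: three variances and three pairwise correlations."""
    variances = window.var(axis=0)             # one variance per axis
    corr = np.corrcoef(window, rowvar=False)   # 3 x 3 correlation matrix
    upper = np.triu_indices(3, k=1)            # axis pairs (x,y), (x,z), (y,z)
    return np.concatenate([variances, corr[upper]])

features = extract_features(window)
print('Feature vector shape:', features.shape)
```

A feature vector like this, computed for each window, could then be passed to a classifier such as a decision tree.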

A balanced version of the dataset is split into two datasets, each balanced: one for model development and one for the final test that evaluates the finished model.

In [None]:
if load_from_web:
    url = 'https://raw.githubusercontent.com/Withington/deepscent/master/data/SonyAIBORobotSurface1_IoC/SonyAIBORobotSurface1_IoC_ALL.txt'
    robot_df = pd.read_csv(url, sep='\t', header=None)
    print('Loaded from', url)
    robot_data = robot_df.values
else:
    data_dir = Path('../../data')
    data_name = 'SonyAIBORobotSurface1_IoC'
    data_filename = data_dir / data_name / (data_name + '_ALL.txt')
    robot_data = np.loadtxt(data_filename)
    print('Loaded from', data_filename)
print('The shape of robot_data is', robot_data.shape)
print('robot_data:', robot_data)

Extract the labels, y, and the data samples, x. For convenience, we will relabel the classes as 0 and 1 instead of 1 and 2.

class 0 : cement

class 1 : carpet

In [None]:
y = robot_data[:,0]
x = robot_data[:,1:]
print('The shape of x is', x.shape)
print('The shape of y is', y.shape)

# Change from classes 1 and 2 to classes 0 and 1
y = (y - y.min())/(y.max()-y.min())
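
The relabelling uses min-max scaling, which can be checked on a small example. The sketch below uses a toy array, `y_toy`, rather than the robot labels, and assumes that both classes are present (otherwise the denominator would be zero). The conversion to integers is an optional extra, not something the notebook itself does.

```python
import numpy as np

# Toy labels using the original class numbering, 1 and 2
y_toy = np.array([1., 2., 2., 1., 2.])

# Min-max scaling maps the smallest label to 0 and the largest to 1
y_scaled = (y_toy - y_toy.min()) / (y_toy.max() - y_toy.min())

# Optional: integer labels print as 0 and 1 rather than 0.0 and 1.0
y_int = y_scaled.astype(int)

print(y_scaled)
print(y_int)
```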

## Plot the data

In [None]:
sample_number = 0 ### CHANGE PARAMETER HERE ###
plt.plot(x[sample_number], label='class'+str(y[sample_number]))
plt.legend(loc='upper right', frameon=False)

In [None]:
sample_a = 0 ### CHANGE PARAMETER HERE ###
sample_b = 1 ### CHANGE PARAMETER HERE ###
plt.plot(x[sample_a], label='class'+str(y[sample_a]))
plt.plot(x[sample_b], label='class'+str(y[sample_b]))
plt.legend(loc='upper right', frameon=False)

In [None]:
y[:17]

In [None]:
samples = [1, 4, 9, 11, 16]
for i in samples:
    plt.plot(x[i], label='class'+str(y[i]))
plt.legend(loc='upper right', frameon=False)
plt.ylim([-3.5, 3.5])
plt.title('Walking on cement')

In [None]:
samples = [0, 2, 3, 5, 6]
for i in samples:
    plt.plot(x[i], label='class'+str(y[i]))
plt.legend(loc='upper right', frameon=False)
plt.ylim([-3.5, 3.5])
plt.title('Walking on carpet')

# Examine the balance of the dataset

In [None]:
print('Number of samples of class 0', (y == 0).sum())
print('Number of samples of class 1', (y == 1).sum())
y_df = pd.DataFrame(y)
y_df[0].value_counts().plot(kind='bar')
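
Class balance can also be summarised numerically. The following sketch uses a toy label array, `y_toy`, in place of `y`, and reports the count and proportion of each class along with a simple imbalance ratio. The name `imbalance_ratio` is introduced here for illustration only.

```python
import numpy as np

# Toy label array standing in for y
y_toy = np.array([0, 1, 1, 0, 1, 1, 1, 0])

# Unique class labels and the number of samples in each class
classes, counts = np.unique(y_toy, return_counts=True)
proportions = counts / counts.sum()

for c, n, p in zip(classes, counts, proportions):
    print(f'class {c}: {n} samples ({p:.1%})')

# Ratio of the largest class to the smallest; 1.0 means perfectly balanced
imbalance_ratio = counts.max() / counts.min()
print('Imbalance ratio:', imbalance_ratio)
```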

# Load a pre-prepared balanced dataset

In [None]:
if load_from_web:
    url = 'https://raw.githubusercontent.com/Withington/deepscent/master/data/SonyAIBORobotSurface1_IoC/SonyAIBORobotSurface1_IoC_BALANCED.txt'
    robot_df = pd.read_csv(url, sep='\t', header=None)
    print('Loaded from', url)
    robot_data = robot_df.values
else:
    data_dir = Path('../../data')
    data_name = 'SonyAIBORobotSurface1_IoC'
    data_filename = data_dir / data_name / (data_name + '_BALANCED.txt')
    robot_data = np.loadtxt(data_filename)
    print('Loaded from', data_filename)
print('The shape of robot_data is', robot_data.shape)

In [None]:
y = robot_data[:,0]
x = robot_data[:,1:]
y = (y - y.min())/(y.max()-y.min())

print('Number of samples of class 0', (y == 0).sum())
print('Number of samples of class 1', (y == 1).sum())
y_df = pd.DataFrame(y)
y_df[0].value_counts().plot(kind='bar')
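
This notebook does not describe how the balanced file was prepared. One common way to balance a dataset is random undersampling of the larger class, sketched below on toy data. This is an assumption for illustration and not necessarily the method used to create the balanced file. The names `x_toy`, `y_toy` and `n_keep` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(21)

# Toy imbalanced data: 6 samples of class 0 and 10 of class 1
x_toy = rng.normal(size=(16, 4))
y_toy = np.array([0] * 6 + [1] * 10)

# Size of the smallest class
n_keep = np.bincount(y_toy).min()

# Randomly keep n_keep samples from each class, without replacement
keep_idx = np.concatenate([
    rng.choice(np.flatnonzero(y_toy == c), size=n_keep, replace=False)
    for c in np.unique(y_toy)
])
rng.shuffle(keep_idx)  # mix the classes so they are not grouped together

x_bal, y_bal = x_toy[keep_idx], y_toy[keep_idx]
print('Class counts after undersampling:', np.bincount(y_bal))
```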

# Split the dataset into development and final test datasets

In [None]:
x_dev, x_finaltest, y_dev, y_finaltest = train_test_split(x, y, test_size=100, random_state=21, stratify=y)
print('The shape of x_dev is', x_dev.shape)
print('The shape of x_finaltest is', x_finaltest.shape)
print('Development data:')
print('Number of samples of class 0', (y_dev == 0).sum())
print('Number of samples of class 1', (y_dev == 1).sum())
print('Final test data:')
print('Number of samples of class 0', (y_finaltest == 0).sum())
print('Number of samples of class 1', (y_finaltest == 1).sum())
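
The effect of `stratify` and of an integer `test_size` can be seen on a small toy dataset. In this sketch, `x_toy` and `y_toy` are made-up arrays. An integer `test_size` is read as an absolute number of samples, and stratification keeps the class proportions the same in both parts of the split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy balanced dataset: 20 samples, 10 of each class
x_toy = np.arange(40).reshape(20, 2)
y_toy = np.array([0, 1] * 10)

# An integer test_size is an absolute number of samples, not a fraction
x_a, x_b, y_a, y_b = train_test_split(
    x_toy, y_toy, test_size=6, random_state=21, stratify=y_toy)

print('Held-out class counts:', np.bincount(y_b))
print('Remaining class counts:', np.bincount(y_a))
```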