# CNN Model for 1D
## Develop 1D Convolutional Neural Network

Reference: https://machinelearningmastery.com/cnn-models-for-human-activity-recognition-time-series-classification/


We start with what convolution is, in particular convolution in 1D.

We define an input 1D column matrix, called the feature vector `fv`, and another column matrix of filter weights `w`.

The result is another column matrix, which we call the feature map `fm`.

For each element $fm_i$ of the feature map, at index $i$, we can write

$$fm_i = \sum_{k=0}^{K-1} fv_{i+k} \, w_k$$

or, in shorter terms, $fm = fv * w$ (convolution).

Here $K$ is `w.size`, the filter index $k$ runs from $0$ to $K-1$, and the feature-vector index $j = i + k$ runs from $i$ to $i + K - 1$.

This is the elementary definition of the convolution of two column matrices, the feature vector and the weights.
It describes a single convolutional layer, that is, a CNN that is not deep.
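
As a quick check of the sum above, here is a minimal NumPy sketch with an explicit loop. The function name `conv1d_valid` and the example values are made up for illustration. The filter is not flipped, so strictly speaking this is cross-correlation, which is the operation deep learning libraries usually call convolution.

```python
import numpy as np

def conv1d_valid(fv, w):
    """Feature map fm[i] = sum_k fv[i + k] * w[k], with no padding."""
    fv = np.asarray(fv, dtype=float)
    w = np.asarray(w, dtype=float)
    n_out = fv.size - w.size + 1  # positions where the filter fits entirely
    fm = np.empty(n_out)
    for i in range(n_out):
        # j = i + k runs over i .. i + w.size - 1
        fm[i] = np.sum(fv[i:i + w.size] * w)
    return fm

fv = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical feature vector
w = np.array([1.0, 0.0, -1.0])            # hypothetical filter
fm = conv1d_valid(fv, w)
print(fm.tolist())  # [-2.0, -2.0, -2.0]
```

The feature map is shorter than the input by `w.size - 1` elements, because the filter is only placed where it fits completely.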

With multiple layers, the feature map of one layer becomes the feature vector
of the next layer, which may use a different filter/weights column vector.
The same can be repeated for several layers.
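
A rough sketch of this stacking, using `np.correlate` in `'valid'` mode as the single-layer operation. The input and both filters are hypothetical values, and activation functions are left out to keep the focus on how one feature map feeds the next layer.

```python
import numpy as np

def conv_layer(x, w):
    # sliding dot product of the filter over the input, no padding
    return np.correlate(x, w, mode='valid')

fv = np.arange(10, dtype=float)   # hypothetical input of length 10
w1 = np.array([0.5, 0.5, 0.5])    # hypothetical first-layer filter
w2 = np.array([1.0, -1.0, 1.0])   # hypothetical second-layer filter

fm1 = conv_layer(fv, w1)    # feature map of layer 1, length 10 - 3 + 1 = 8
fm2 = conv_layer(fm1, w2)   # layer 2 takes fm1 as its feature vector, length 6
print(fm1.shape, fm2.shape)  # (8,) (6,)
```

Each layer with a size-3 filter shortens the sequence by 2, which is one reason deep stacks often add padding or pooling.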


![Alternative text](convolution.png)

Reference for explanation 
https://www.youtube.com/watch?v=yd_j_zdLDWs

Next we describe what has been implemented: a CNN for activity recognition.

The idea is to recognize the activity being performed and
output it as a number between 1 and 6, denoting:

1 WALKING

2 WALKING_UPSTAIRS

3 WALKING_DOWNSTAIRS

4 SITTING

5 STANDING

6 LAYING

We have 9 input features: body acceleration along all 3 axes, body angular velocity along all 3 axes, and total acceleration along all 3 axes.

Each series has been partitioned into overlapping windows of 2.56 seconds of data, or 128 time steps,
and the training set contains 7352 such windows.

For each window, the label is the type of activity being performed, one of the 6 activities listed above.
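
The dataset files used below are already windowed, so nothing like this is needed to run the notebook. Still, a sketch of how overlapping windows could be cut from a raw signal may make the shapes clearer. The 50 Hz sampling rate and the 50% overlap are assumptions here, and `make_windows` and the synthetic `raw` array are hypothetical.

```python
import numpy as np

SAMPLING_HZ = 50   # assumption: sampling rate of the sensors
WINDOW = 128       # time steps per window
STEP = 64          # assumption: 50% overlap between consecutive windows

def make_windows(signal, window=WINDOW, step=STEP):
    """Split a (n_samples, n_features) array into overlapping windows."""
    starts = range(0, signal.shape[0] - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

# synthetic stand-in for 1000 raw samples of the 9 features
raw = np.random.default_rng(0).normal(size=(1000, 9))
X = make_windows(raw)
print(X.shape)                # (14, 128, 9)
print(WINDOW / SAMPLING_HZ)   # 2.56
```

The resulting array has the layout (windows, time steps, features), which is the same layout the model input uses.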

In [210]:
# cnn model
from numpy import mean
from numpy import std
from numpy import dstack
from pandas import read_csv
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
# removed .convolutional from the above 2 imports
from keras.utils import to_categorical
import pandas as pd
import matplotlib.pyplot as plt

In [211]:
# load a list of files and return as a 3d numpy array
def load_group(filenames, prefix=''):
	loaded = list()
	for name in filenames:
		dataframe = read_csv(prefix + name, header=None, sep=r'\s+')
		data = dataframe.values
		loaded.append(data)
	# stack group so that features are the 3rd dimension
	loaded = dstack(loaded)
	return loaded

# load a dataset group, such as train or test
def load_dataset_group(group, prefix='./'):
	filepath = prefix + group + '/InertialSignals/'
	filenames = list()
	# total acceleration
	filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
	# body acceleration
	filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
	# body gyroscope
	filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
	# load input data
	X = load_group(filenames, filepath)
	print('filenames', filenames)
	# load class output
	y_dataframe = read_csv(prefix + group + '/y_' + group + '.txt', header=None, sep=r'\s+')
	y = y_dataframe.values
	return X, y

In [212]:
# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
	# load all train
	trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
	# load all test
	testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
	# zero-offset class values
	trainy = trainy - 1
	testy = testy - 1
	# one hot encode y
	trainy = to_categorical(trainy)
	testy = to_categorical(testy)
	return trainX, trainy, testX, testy
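
To make the zero-offset and one-hot steps concrete, here is a small NumPy-only sketch with made-up labels. It is meant to mirror what `to_categorical` produces when all 6 classes are present, not to replace it. Note that `to_categorical` infers the number of classes from the largest label unless `num_classes` is passed.

```python
import numpy as np

# hypothetical labels in the 1-6 numbering, shape (n, 1) like the y file
y = np.array([[5], [1], [6]])

y0 = y - 1                         # zero-offset: classes 0..5
one_hot = np.eye(6)[y0.ravel()]    # one row per example, a single 1 per row

print(one_hot.shape)                        # (3, 6)
print((one_hot.argmax(axis=1) + 1).tolist())  # [5, 1, 6]
```

Taking `argmax` and adding 1 recovers the original activity numbers, which is also how a predicted probability vector can be mapped back to a label.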

In [213]:
scores = list()
repeats = 5
# TODO: change repeats to 10 while submitting
trainX, trainy, testX, testy = load_dataset()

filenames ['total_acc_x_train.txt', 'total_acc_y_train.txt', 'total_acc_z_train.txt', 'body_acc_x_train.txt', 'body_acc_y_train.txt', 'body_acc_z_train.txt', 'body_gyro_x_train.txt', 'body_gyro_y_train.txt', 'body_gyro_z_train.txt']
filenames ['total_acc_x_test.txt', 'total_acc_y_test.txt', 'total_acc_z_test.txt', 'body_acc_x_test.txt', 'body_acc_y_test.txt', 'body_acc_z_test.txt', 'body_gyro_x_test.txt', 'body_gyro_y_test.txt', 'body_gyro_z_test.txt']


In [214]:
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
	verbose, epochs, batch_size = 0, 10, 32
	n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
	print(n_timesteps, n_features, n_outputs)
	model = Sequential()
	# two stacked Conv1D layers for feature extraction
	model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps,n_features)))
	model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
	model.add(Dropout(0.5))
	model.add(MaxPooling1D(pool_size=2))
	model.add(Flatten())
	model.add(Dense(100, activation='relu'))
	model.add(Dense(n_outputs, activation='softmax'))
	# the softmax layer above outputs class probabilities; compile for multi-class classification
	model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	# fit network
	model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
	# evaluate model
	_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
	return accuracy
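
The following sketch traces the tensor sizes through the layers of `evaluate_model` by hand, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv1D` and a pooling stride equal to `pool_size`. The helper names `conv_len` and `pool_len` are hypothetical.

```python
def conv_len(n, kernel_size):
    # 'valid' padding, stride 1: the kernel must fit entirely
    return n - kernel_size + 1

def pool_len(n, pool_size):
    # non-overlapping pooling; a leftover step would be dropped
    return n // pool_size

n_timesteps, n_features, n_outputs = 128, 9, 6
filters, kernel_size, pool_size, dense_units = 64, 3, 2, 100

n = conv_len(n_timesteps, kernel_size)   # 126 after the first Conv1D
n = conv_len(n, kernel_size)             # 124 after the second Conv1D
n = pool_len(n, pool_size)               # 62 after MaxPooling1D
flat = n * filters                       # 3968 values after Flatten

params = [
    kernel_size * n_features * filters + filters,  # first Conv1D
    kernel_size * filters * filters + filters,     # second Conv1D
    flat * dense_units + dense_units,              # Dense(100)
    dense_units * n_outputs + n_outputs,           # Dense(n_outputs)
]
print(n, flat, params, sum(params))
# 62 3968 [1792, 12352, 396900, 606] 411650
```

Under these assumptions most of the trainable weights sit in the first dense layer rather than in the convolutions. Dropout, pooling and flattening add no weights.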

In [215]:
# keep references to the raw arrays for inspection in the cells below
df_trainX, df_trainY = trainX, trainy
df_testX, df_testY = testX, testy
for r in range(repeats):
    score = evaluate_model(trainX, trainy, testX, testy)
    score = score * 100.0
    print('>#%d: %.3f' % (r+1, score))
    scores.append(score)
# summarize results
print(scores)
m, s = mean(scores), std(scores)
print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))

128 9 6




>#1: 90.940
128 9 6
>#2: 91.313
128 9 6
>#3: 91.619
128 9 6
>#4: 90.702
128 9 6
>#5: 91.076
[90.93993902206421, 91.31320118904114, 91.6185975074768, 90.7024085521698, 91.07567071914673]
Accuracy: 91.130% (+/-0.314)
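
The summary line can be reproduced from the printed scores. One detail worth knowing: `numpy.std` defaults to the population standard deviation (`ddof=0`). With only 5 repeats, the sample standard deviation (`ddof=1`) is somewhat larger. The sketch below recomputes both from the scores shown above.

```python
import numpy as np

# scores copied from the output above
scores = [90.93993902206421, 91.31320118904114, 91.6185975074768,
          90.7024085521698, 91.07567071914673]

m = np.mean(scores)
s_pop = np.std(scores)              # ddof=0, the value reported above
s_sample = np.std(scores, ddof=1)   # sample standard deviation

print('Accuracy: %.3f%% (+/-%.3f)' % (m, s_pop))
# Accuracy: 91.130% (+/-0.314)
print('sample std: %.3f' % s_sample)
```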


In [216]:
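# Keep only the first window (and its one-hot label) as a DataFrame.
# NOTE: reusing the same names for a different type is bad practice.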
df_trainX = pd.DataFrame(df_trainX[0])
df_trainY = pd.DataFrame(df_trainY[0])
df_testX = pd.DataFrame(df_testX[0])
df_testY = pd.DataFrame(df_testY[0])

In [217]:
df_trainX.rename(columns = {
    0:'total_acc_x_train',
    1:'total_acc_y_train',
    2:'total_acc_z_train',
    3:'body_acc_x_train',
    4:'body_acc_y_train',
    5:'body_acc_z_train',
    6:'body_gyro_x_train',
    7:'body_gyro_y_train',
    8:'body_gyro_z_train'
}, inplace=True)
df_trainX

| | total_acc_x_train | total_acc_y_train | total_acc_z_train | body_acc_x_train | body_acc_y_train | body_acc_z_train | body_gyro_x_train | body_gyro_y_train | body_gyro_z_train |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.012817 | -0.123217 | 0.102934 | 0.000181 | 0.010767 | 0.055561 | 0.030191 | 0.066014 | 0.022859 |
| 1 | 1.022833 | -0.126876 | 0.105687 | 0.010139 | 0.006579 | 0.055125 | 0.043711 | 0.042699 | 0.010316 |
| 2 | 1.022028 | -0.124004 | 0.102102 | 0.009276 | 0.008929 | 0.048405 | 0.035688 | 0.074850 | 0.013250 |
| 3 | 1.017877 | -0.124928 | 0.106553 | 0.005066 | 0.007489 | 0.049775 | 0.040402 | 0.057320 | 0.017751 |
| 4 | 1.023680 | -0.125767 | 0.102814 | 0.010810 | 0.006141 | 0.043013 | 0.047097 | 0.052343 | 0.002553 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 123 | 1.019815 | -0.127010 | 0.094843 | 0.000228 | -0.002929 | -0.003412 | 0.025197 | -0.005166 | 0.007355 |
| 124 | 1.019290 | -0.126185 | 0.098350 | -0.000300 | -0.002023 | 0.000359 | 0.032328 | -0.001298 | 0.002669 |
| 125 | 1.018445 | -0.124070 | 0.100385 | -0.001147 | 0.000171 | 0.002648 | 0.039852 | 0.001909 | -0.002170 |
| 126 | 1.019372 | -0.122745 | 0.099874 | -0.000222 | 0.001574 | 0.002381 | 0.037449 | -0.000080 | -0.005643 |


The idea is that the model scans the 9 feature values across the 128 time steps of a window and outputs a probability for each of the 6 activities. The predicted activity is the class with the highest probability.

The function `evaluate_model` works as follows:
The activation function is ReLU (Rectified Linear Unit), which outputs its input directly if it is positive and zero otherwise.
The kernel size is set to 3, so each filter looks at 3 consecutive time steps at a time.
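
A small NumPy sketch of ReLU, and of how the values of a final softmax layer can be turned into a single activity number. The `logits` values are made up for illustration. In the notebook, Keras applies both functions inside the layers.

```python
import numpy as np

def relu(x):
    # pass positive values through, clamp everything else to zero
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - np.max(z))   # shift by the max for numerical stability
    return e / e.sum()

print(relu(np.array([-2.0, 0.0, 3.5])).tolist())  # [0.0, 0.0, 3.5]

# hypothetical final-layer values for one window, one per activity
logits = np.array([0.1, 2.0, -1.0, 0.3, 0.0, 0.5])
probs = softmax(logits)
predicted_label = int(np.argmax(probs)) + 1  # back to the 1-6 numbering
print(predicted_label)  # 2
```

Here the largest value is at index 1, so the predicted label is 2, which is WALKING_UPSTAIRS in the list above.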

![Alternative text](ConvTime.png)