# Indoor User Movement Prediction

### Importing the necessary libraries.

In [0]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from os import listdir

In [0]:
from keras.preprocessing import sequence
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

from keras.optimizers import Adam
from keras.models import load_model
from keras.callbacks import ModelCheckpoint

##  About and Understanding the Data



The dataset comprises the following files:

- 314 MovementAAL csv files containing the readings from motion sensors placed in the environment.
- A Target csv file that contains the target variable for each MovementAAL file.
- A Group Data csv file that identifies which setup group each MovementAAL file belongs to.
- A Path csv file that contains the path the object took.

### Loading the Data

In [67]:
!gdown https://drive.google.com/uc?id=1XW3q_tluk3XiXJ-ABSnCmOUgStnOn40l
!unzip -o MovementAAL.zip

Downloading...
From: https://drive.google.com/uc?id=1XW3q_tluk3XiXJ-ABSnCmOUgStnOn40l
To: /content/MovementAAL.zip
100% 333k/333k [00:00<00:00, 47.5MB/s]
Archive:  MovementAAL.zip

In [0]:
df1 = pd.read_csv("MovementAAL/MovementAAL_RSS_1.csv")
df2 = pd.read_csv("MovementAAL/MovementAAL_RSS_2.csv")

In [69]:
df1.head()

|   | #RSS_anchor1 | RSS_anchor2 | RSS_anchor3 | RSS_anchor4 |
|---|---|---|---|---|
| 0 | -0.90476 | -0.48 | 0.28571 | 0.3 |
| 1 | -0.57143 | -0.32 | 0.14286 | 0.3 |
| 2 | -0.38095 | -0.28 | -0.14286 | 0.35 |
| 3 | -0.28571 | -0.2 | -0.47619 | 0.35 |
| 4 | -0.14286 | -0.2 | 0.14286 | -0.2 |


In [70]:
df2.head()

|   | #RSS_anchor1 | RSS_anchor2 | RSS_anchor3 | RSS_anchor4 |
|---|---|---|---|---|
| 0 | -0.57143 | -0.2 | 0.71429 | 0.5 |
| 1 | -0.7619 | -0.48 | 0.7619 | -0.25 |
| 2 | -0.85714 | -0.6 | 0.85714 | 0.55 |
| 3 | -0.7619 | -0.4 | 0.71429 | 0.6 |
| 4 | -0.7619 | -0.84 | 0.85714 | 0.45 |


In [71]:
df1.shape, df2.shape


((27, 4), (26, 4))

The files contain normalized data from the four sensors: A1, A2, A3 and A4.
Each row is one reading, so the first file holds 27 readings and the second holds 26; the recordings vary in length. If we assume one reading per second, that corresponds to durations of 27 and 26 seconds.

### Read and store the values from the sensors 

In [72]:
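path = 'MovementAAL/MovementAAL_RSS_'
sequences = list()
# The sensor files are numbered 1 to 314
for i in range(1,315):
    file_path = path + str(i) + '.csv'
    print(file_path)
    df = pd.read_csv(file_path, header=0)
    # Keep only the readings as a NumPy array of shape (time steps, 4)
    values = df.values
    sequences.append(values)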

MovementAAL/MovementAAL_RSS_1.csv
MovementAAL/MovementAAL_RSS_2.csv
MovementAAL/MovementAAL_RSS_3.csv
...

In [0]:
targets = pd.read_csv('MovementAAL/MovementAAL_target.csv')
targets = targets.values[:,1]

We now have a list, `sequences`, that holds the sensor readings from each csv file, and an array, `targets`, that holds the label for each csv file. Printing `sequences[0]` shows the sensor values from the first csv file:



In [74]:
sequences[0]

array([[-0.90476 , -0.48    ,  0.28571 ,  0.3     ],
       [-0.57143 , -0.32    ,  0.14286 ,  0.3     ],
       [-0.38095 , -0.28    , -0.14286 ,  0.35    ],
       [-0.28571 , -0.2     , -0.47619 ,  0.35    ],
       [-0.14286 , -0.2     ,  0.14286 , -0.2     ],
       [-0.14286 , -0.2     ,  0.047619,  0.      ],
       [-0.14286 , -0.16    , -0.38095 ,  0.2     ],
       [-0.14286 , -0.04    , -0.61905 , -0.2     ],
       [-0.095238, -0.08    ,  0.14286 , -0.55    ],
       [-0.047619,  0.04    , -0.095238,  0.05    ],
       [-0.19048 , -0.04    ,  0.095238,  0.4     ],
       [-0.095238, -0.04    , -0.14286 ,  0.35    ],
       [-0.33333 , -0.08    , -0.28571 , -0.2     ],
       [-0.2381  ,  0.04    ,  0.14286 ,  0.35    ],
       [ 0.      ,  0.08    ,  0.14286 ,  0.05    ],
       [-0.095238,  0.04    ,  0.095238,  0.1     ],
       [-0.14286 , -0.2     ,  0.14286 ,  0.5     ],
       [-0.19048 ,  0.04    , -0.42857 ,  0.3     ],
       [-0.14286 , -0.08    , -0.2381  ,  0.15

### Loading the DatasetGroup csv file now

The dataset was collected in three different pairs of rooms, hence the three groups. This information can be used to divide the dataset into train, validation and test sets.

In [0]:
groups = pd.read_csv('MovementAAL/MovementAAL_DatasetGroup.csv', header=0)
groups = groups.values[:,1]

### Preprocessing Steps

In [76]:
len_sequences = []
for one_seq in sequences:
    len_sequences.append(len(one_seq))
pd.Series(len_sequences).describe()

count    314.000000
mean      42.028662
std       16.185303
min       19.000000
25%       26.000000
50%       41.000000
75%       56.000000
max      129.000000
dtype: float64

In [0]:
# Pad every sequence to the maximum length (129) by repeating its last row
to_pad = 129
new_seq = []
for one_seq in sequences:
    last_val = one_seq[-1]
    n = to_pad - len(one_seq)
    # Build an (n, 4) block in which every row is a copy of the last reading
    to_concat = np.repeat(last_val, n).reshape(4, n).transpose()
    new_one_seq = np.concatenate([one_seq, to_concat])
    new_seq.append(new_one_seq)
final_seq = np.stack(new_seq)

# Truncate every sequence to length 60, keeping the first 60 rows
# (`sequence` was already imported from keras.preprocessing at the top)
seq_len = 60
final_seq = sequence.pad_sequences(final_seq, maxlen=seq_len, padding='post', dtype='float', truncating='post')
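
To make the padding and truncation step concrete, here is a minimal sketch on a toy sequence. It uses `np.pad` with `mode='edge'`, which is an alternative way to repeat the last row and not what the cell above runs; the array `toy_seq` and the lengths `target_len` and `keep` are made up for illustration. Note that with the 75th percentile of the lengths at 56, most sequences in this dataset end up padded to 60 rather than truncated.

```python
import numpy as np

# Hypothetical toy sequence: 3 time steps, 4 sensor readings each
toy_seq = np.array([[0.1, 0.2, 0.3, 0.4],
                    [0.5, 0.6, 0.7, 0.8],
                    [0.9, 1.0, 1.1, 1.2]])

target_len = 5  # illustrative padded length (the notebook uses 129)
n_missing = target_len - len(toy_seq)

# mode='edge' repeats the edge values, so the last row is copied n_missing
# times along the time axis; the sensor axis is left unpadded
padded = np.pad(toy_seq, ((0, n_missing), (0, 0)), mode='edge')

# Truncation keeps the first `keep` rows, like truncating='post'
keep = 4  # illustrative truncated length (the notebook uses 60)
truncated = padded[:keep]

print(padded.shape)     # (5, 4)
print(padded[-1])       # same values as toy_seq[-1]
print(truncated.shape)  # (4, 4)
```

Repeating the last reading, rather than padding with zeros, avoids introducing a value that could be mistaken for a real sensor reading.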

### Preparing the train, validation and test sets:

In [0]:
train = [final_seq[i] for i in range(len(groups)) if (groups[i]==2)]
validation = [final_seq[i] for i in range(len(groups)) if groups[i]==1]
test = [final_seq[i] for i in range(len(groups)) if groups[i]==3]

train_target = [targets[i] for i in range(len(groups)) if (groups[i]==2)]
validation_target = [targets[i] for i in range(len(groups)) if groups[i]==1]
test_target = [targets[i] for i in range(len(groups)) if groups[i]==3]
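
The same group-based split can be sketched with NumPy boolean masks instead of list comprehensions. The arrays below are small made-up stand-ins for `final_seq`, `targets` and `groups`, and `split_by_group` is a hypothetical helper; the group-to-split mapping (2 for train, 1 for validation, 3 for test) follows the cell above.

```python
import numpy as np

# Hypothetical stand-ins: 6 sequences of shape (60, 4), with labels and groups
toy_seq = np.zeros((6, 60, 4))
toy_targets = np.array([1, -1, 1, -1, 1, -1])
toy_groups = np.array([1, 2, 3, 2, 1, 3])

def split_by_group(data, labels, groups, group_id):
    """Return the sequences and labels that belong to one group."""
    mask = groups == group_id
    return data[mask], labels[mask]

train_x, train_y = split_by_group(toy_seq, toy_targets, toy_groups, 2)
val_x, val_y = split_by_group(toy_seq, toy_targets, toy_groups, 1)
test_x, test_y = split_by_group(toy_seq, toy_targets, toy_groups, 3)

print(train_x.shape, val_x.shape, test_x.shape)  # (2, 60, 4) (2, 60, 4) (2, 60, 4)
print(train_y)  # [-1 -1]
```

Splitting by group keeps all recordings from one pair of rooms in the same set, so the validation and test scores reflect how the model does on rooms it has not seen during training.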


In [0]:
train = np.array(train)
validation = np.array(validation)
test = np.array(test)

# Rescale the labels with (t + 1) / 2, which maps -1 to 0 and 1 to 1
train_target = np.array(train_target)
train_target = (train_target+1)/2

validation_target = np.array(validation_target)
validation_target = (validation_target+1)/2

test_target = np.array(test_target)
test_target = (test_target+1)/2
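
The `(target + 1) / 2` step rescales the labels to the 0/1 range that a sigmoid output with binary cross-entropy expects. A small sketch with made-up labels, assuming the target file encodes the two classes as -1 and 1 (which the rescaling implies):

```python
import numpy as np

# Hypothetical labels, assuming the two classes are encoded as -1 and 1
raw_labels = np.array([1, -1, -1, 1])

# (-1 + 1) / 2 = 0 and (1 + 1) / 2 = 1
binary_labels = (raw_labels + 1) / 2

print(binary_labels)  # [1. 0. 0. 1.]
```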

### Building a Time Series Classification model

In [0]:
model = Sequential()
model.add(LSTM(256, input_shape=(seq_len, 4)))
model.add(Dense(1, activation='sigmoid'))

In [83]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_3 (LSTM)                (None, 256)               267264    
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 257       
Total params: 267,521
Trainable params: 267,521
Non-trainable params: 0
_________________________________________________________________
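
The parameter counts in the summary can be checked by hand. A short sketch, using the standard LSTM parameter formula (four gates, each with input weights, recurrent weights and a bias); the variable names are only for illustration:

```python
units = 256      # LSTM units, from LSTM(256, ...)
input_dim = 4    # one value per sensor at each time step

# Each of the 4 gates has input weights (input_dim x units),
# recurrent weights (units x units) and a bias vector (units)
lstm_params = 4 * (units * (input_dim + units) + units)

# Dense(1): one weight per LSTM output plus one bias
dense_params = units * 1 + 1

print(lstm_params)                 # 267264
print(dense_params)                # 257
print(lstm_params + dense_params)  # 267521
```

The count does not depend on `seq_len`, because the same LSTM weights are reused at every time step.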


In [88]:
adam = Adam(lr=0.0001)
chk = ModelCheckpoint('best_model.pkl', monitor='val_acc', save_best_only=True, mode='max', verbose=1)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
model.fit(train, train_target, epochs=500, batch_size=128, callbacks=[chk], validation_data=(validation,validation_target))


Train on 106 samples, validate on 104 samples
Epoch 1/500

Epoch 00001: val_acc improved from -inf to 0.72115, saving model to best_model.pkl
Epoch 2/500

Epoch 00002: val_acc did not improve from 0.72115
Epoch 3/500

Epoch 00003: val_acc did not improve from 0.72115
Epoch 4/500

Epoch 00004: val_acc did not improve from 0.72115
Epoch 5/500

Epoch 00005: val_acc did not improve from 0.72115
Epoch 6/500

Epoch 00006: val_acc did not improve from 0.72115
Epoch 7/500

Epoch 00007: val_acc did not improve from 0.72115
Epoch 8/500

Epoch 00008: val_acc did not improve from 0.72115
Epoch 9/500

Epoch 00009: val_acc did not improve from 0.72115
Epoch 10/500

Epoch 00010: val_acc did not improve from 0.72115
Epoch 11/500

Epoch 00011: val_acc did not improve from 0.72115
Epoch 12/500

Epoch 00012: val_acc did not improve from 0.72115
Epoch 13/500

Epoch 00013: val_acc did not improve from 0.72115
Epoch 14/500

Epoch 00014: val_acc did not improve from 0.72115
Epoch 15/500

Epoch 00015: val_acc

<keras.callbacks.History at 0x7f7122cd7ac8>

In [87]:
#loading the model and checking accuracy on the test data
model = load_model('best_model.pkl')

from sklearn.metrics import accuracy_score
test_preds = model.predict_classes(test)
accuracy_score(test_target, test_preds)

0.7115384615384616

I got an accuracy score of about **0.71** on the test set. It’s a reasonable start, and the performance of the LSTM model can likely be improved by tuning the hyperparameters, for example the **learning rate** and/or the **number of epochs**.
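
One practical note on the prediction cell: `predict_classes` was removed from more recent TensorFlow/Keras releases, so that cell may fail there. A minimal sketch of the equivalent step for a single sigmoid output is to threshold the predicted probabilities at 0.5. The probabilities and labels below are made up so the sketch runs without a trained model; with a real model, `probs` would come from `model.predict(test)`.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_samples, 1) like model.predict(...)
probs = np.array([[0.91], [0.12], [0.55], [0.40]])
true_labels = np.array([1.0, 0.0, 0.0, 0.0])  # made-up 0/1 labels

# Threshold at 0.5 and flatten to a 1-D vector of class ids
pred_labels = (probs > 0.5).astype(int).ravel()

# Accuracy is the fraction of predictions that match the labels
accuracy = np.mean(pred_labels == true_labels)

print(pred_labels)  # [1 0 1 0]
print(accuracy)     # 0.75
```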

