This notebook uses smartphone sensor data labelled by user activity, available on Kaggle: <br>https://www.kaggle.com/uciml/human-activity-recognition-with-smartphones/data
<br>This type of data is a great example of what we can feed a neural network: a large number of opaque numeric features (561 values per data point) that correlate in some way with a class label.


In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import os
import cv2
import sys, re
import shutil

In [None]:
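# Remove any model directory left over from a previous run.
# ignore_errors=True avoids an exception when the directory does not exist yet.
shutil.rmtree("examplemodel", ignore_errors=True)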

In [None]:
po = pd.read_csv("./smartphone/train.csv")
print(po.head(5))

As you can see, we have a lot of data to look at. We're not going to try to understand the data ourselves; instead, we'll let a deep learning model make sense of it for us.
<br>First, let's load the data and separate it into labels and inputs. As the output above shows, column index 1 contains the activity labels, and the rest of each row (from index 2 onwards) contains the sensor data from the smartphone.

In [None]:
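# Load the training data and split it into inputs (sensor values) and labels (activities)
po = pd.read_csv("./smartphone/train.csv")
sensors = po[po.columns[2:]]   # every column from index 2 onwards
actions = po[po.columns[1]]    # column index 1 holds the activity strings
print(sensors.shape)
print(actions.shape)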

For the neural network model, the labels need to be integers: the loss function tf.losses.sparse_softmax_cross_entropy requires labels of type int32 or int64.<br>We can get integers by replacing each activity string with its index in a list of the unique activity strings. Let's build that list and record the corresponding index for every sample.

In [None]:
action_labels = []
action_indices = []

#Go through all action strings and store the unique strings into action_labels
print("Collecting all the action labels: ")
for s in range(0,len(actions)):
    lbln = actions[s]
    if lbln not in action_labels:
        action_labels.append(lbln)
        print(lbln)
    action_indices.append(action_labels.index(lbln))
    
#Convert into numpy arrays for tensorflow
input = np.asarray(sensors, np.float32)
output = np.asarray(action_indices, np.int32)

print("Input/output pairs prepared: ")
print("Input: "+str(input.shape))
print("Output: "+str(output.shape))
print("The first 20 labels look like this now: "+str(output[0:20]))
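
The same string-to-index encoding can also be sketched with a dictionary, which avoids the repeated list search done by `action_labels.index`. This is only an illustrative alternative, not part of the original notebook; the function name `encode_labels` and the sample activity strings are made up for the example.

```python
def encode_labels(labels):
    """Map each distinct label to an integer index, in order of first appearance."""
    label_to_index = {}   # label string -> integer index
    index_to_label = []   # integer index -> label string (the inverse mapping)
    indices = []
    for label in labels:
        if label not in label_to_index:
            label_to_index[label] = len(index_to_label)
            index_to_label.append(label)
        indices.append(label_to_index[label])
    return indices, index_to_label

# Hypothetical activity strings, for illustration only
sample = ["STANDING", "SITTING", "STANDING", "WALKING", "SITTING"]
indices, names = encode_labels(sample)
print(indices)  # [0, 1, 0, 2, 1]
print(names)    # ['STANDING', 'SITTING', 'WALKING']
```

Keeping the list of names around makes it easy to turn a predicted class index back into a readable activity string later.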



With 3609 data samples, we need to decide on a training strategy: how much of the data to use for training and how much for evaluation. Remember that the model may overfit the training data and fail to generalize to new data, so it should be evaluated on samples it has not seen during training. <br>We will reserve the first 1000 samples, about 28% of the data, for evaluation later.

In [None]:
eval_input  =  input[0:1000,:]
eval_output = output[0:1000]
input  =  input[1000:,:]
output = output[1000:]
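
The cell above takes the first 1000 rows as they appear in the file. If the rows happen to be ordered in some way (this is an assumption, not something checked here), the evaluation set may not be representative. A minimal sketch of a shuffled split follows; the function name `shuffled_split_indices` and the seed value are hypothetical choices for illustration.

```python
import random

def shuffled_split_indices(num_samples, eval_count, seed=0):
    """Return (eval_indices, train_indices) taken from a seeded random permutation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    return indices[:eval_count], indices[eval_count:]

eval_idx, train_idx = shuffled_split_indices(num_samples=3609, eval_count=1000)
print(len(eval_idx), len(train_idx))          # 1000 2609
print(round(100 * len(eval_idx) / 3609, 1))   # 27.7
```

The two index lists can be used to select rows from NumPy arrays, so the sensor array and the label array stay aligned as long as both are indexed with the same list.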

On to the exciting stuff! Next, we'll define the model function. Add the hidden layers as follows:
<li>2D convolutional layer
<li>Max pooling layer with strides and pool size 2
<li>2D convolutional layer
<li>Dense layer with CLASSESCNT nodes, applied to the flattened output of the previous layer (this will be the output)
<p>If you like, you can also use 1D layers, such as tf.layers.conv1d and tf.layers.max_pooling1d.
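
Before writing the layers, it can help to work out how the length of each sample changes as it passes through them. The sketch below applies TensorFlow's "same" and "valid" padding rules along the data axis. The kernel size of 5, the padding choices and the filter count are assumptions picked for illustration, not values from the original.

```python
import math

def output_length(length, window, stride, padding):
    """Output size along one axis for a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(length / stride)
    if padding == "valid":
        return (length - window) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

DATALEN = 561   # values per sample, as stated in the introduction
FILTERS = 32    # hypothetical filter count for the last convolution

after_conv1 = output_length(DATALEN, window=5, stride=1, padding="same")
after_pool = output_length(after_conv1, window=2, stride=2, padding="valid")
after_conv2 = output_length(after_pool, window=5, stride=1, padding="same")
print(after_conv1, after_pool, after_conv2)  # 561 280 280
print(after_conv2 * FILTERS)                 # 8960 values per sample after flattening
```

Under these assumptions, the dense layer receives 8960 flattened values per sample and maps them to CLASSESCNT logits.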

In [None]:
CLASSESCNT = len(action_labels)
DATALEN = input.shape[1]#number of values per sample (second dimension of the shape)
def model_function(features, labels, mode):
  global DATALEN
  global CLASSESCNT
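  # Reshape each sample to [batch, DATALEN, 1, 1]: a DATALEN x 1 "image" with one channel
  inpu = tf.reshape(features["x"], [-1, DATALEN, 1, 1])

  # One possible set of hidden layers following the list above.
  # Filter counts and kernel sizes are illustrative assumptions, not values from the original.
  conv1 = tf.layers.conv2d(inputs=inpu, filters=16, kernel_size=[5, 1], padding="same", activation=tf.nn.relu)
  # Pool along the data axis only, because the second spatial dimension has size 1
  pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 1], strides=[2, 1])
  conv2 = tf.layers.conv2d(inputs=pool1, filters=32, kernel_size=[5, 1], padding="same", activation=tf.nn.relu)
  # Flatten so the dense layer yields exactly one row of CLASSESCNT logits per sample
  flat = tf.layers.flatten(conv2)
  dense = tf.layers.dense(inputs=flat, units=CLASSESCNT)

  logits = tf.reshape(dense, [-1, CLASSESCNT])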

  predictions = {
    "classes": tf.argmax(input=logits, axis=1),
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
  }

  if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
  loss = tf.losses.sparse_softmax_cross_entropy( labels = labels, logits=logits )
  tf.summary.scalar("loss", loss)

  if mode == tf.estimator.ModeKeys.TRAIN:
      optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
      gradients, variables = zip(*optimizer.compute_gradients(loss))
      gradients, _ = tf.clip_by_global_norm(gradients, 5.0)
      optimize = optimizer.apply_gradients(zip(gradients, variables), global_step=tf.train.get_global_step())
      return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=optimize)

  eval_metric_ops = { "accuracy": tf.metrics.accuracy(labels=labels, predictions=predictions["classes"])}
  return tf.estimator.EstimatorSpec( mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
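
To see why the loss in the model function needs integer labels, here is a plain-Python sketch of what a sparse softmax cross-entropy computes for a single sample: the label is used as an index into the softmax probabilities. The function names and logit values below are made up for illustration; TensorFlow's own implementation works on whole batches and aggregates the per-sample values.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_softmax_cross_entropy(logits, label):
    """Loss for one sample: negative log-probability of the true class index."""
    return -math.log(softmax(logits)[label])

logits = [2.0, 0.5, -1.0]   # hypothetical scores for three classes
loss_if_class_0 = sparse_softmax_cross_entropy(logits, 0)
loss_if_class_2 = sparse_softmax_cross_entropy(logits, 2)
print(loss_if_class_0)  # smaller: class 0 has the highest score
print(loss_if_class_2)  # larger: class 2 has the lowest score
```

The loss shrinks as the network assigns more probability to the correct class, which is what gradient descent pushes it towards.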


In [None]:
my_checkpointing_config = tf.estimator.RunConfig(
    save_checkpoints_secs = 60*1,
    keep_checkpoint_max = 10,
)
estimator = tf.estimator.Estimator(model_fn=model_function, model_dir="examplemodel", config=my_checkpointing_config)

train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": input},
    y=output,
    batch_size=100,
    num_epochs=None,
    shuffle=True)

estimator.train( input_fn=train_input_fn, steps = 10000 )
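
During training, the model function clips the gradients to a global norm of 5.0 before applying them. The sketch below shows the idea behind global-norm clipping in plain Python: all gradients are scaled by one shared factor, so their direction is preserved. The gradient values are hypothetical, and representing each gradient as a flat list of floats is a simplification.

```python
import math

def clip_by_global_norm(gradients, clip_norm):
    """Scale all gradients by one factor so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in gradients for g in grad))
    scale = clip_norm / max(global_norm, clip_norm)   # 1.0 when already within the limit
    clipped = [[g * scale for g in grad] for grad in gradients]
    return clipped, global_norm

# Hypothetical gradients for two variables
grads = [[3.0, 4.0], [12.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=5.0)
print(norm)      # 13.0
print(clipped)   # every value scaled by 5/13
```

Gradients whose global norm is already below the limit pass through unchanged, so clipping only intervenes on unusually large updates.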


In [None]:
print("Eval")
eval_input_fn = tf.estimator.inputs.numpy_input_fn( x={"x":eval_input}, y=eval_output, num_epochs=1, shuffle=False )
# evaluate() returns a dict of metrics, here accuracy, loss and global_step
eval_results = estimator.evaluate(input_fn=eval_input_fn)
print("Eval results: "+str(eval_results))
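
The accuracy reported by the evaluation is the fraction of samples whose predicted class matches the label. As a rough sketch of that computation outside TensorFlow, the plain-Python functions below compute overall and per-class accuracy. The function names and the prediction and label lists are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted class index equals the true class index."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

def per_class_accuracy(predicted, actual, num_classes):
    """Accuracy for each true class separately (None if the class never occurs)."""
    results = []
    for cls in range(num_classes):
        pairs = [(p, a) for p, a in zip(predicted, actual) if a == cls]
        results.append(accuracy(*zip(*pairs)) if pairs else None)
    return results

# Hypothetical predictions and labels for eight samples
predicted = [0, 1, 2, 1, 0, 2, 1, 0]
actual = [0, 1, 1, 1, 0, 0, 1, 0]
print(accuracy(predicted, actual))               # 0.75
print(per_class_accuracy(predicted, actual, 3))  # [0.75, 0.75, None]
```

A per-class breakdown can reveal activities the model confuses with each other even when the overall accuracy looks good.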