# Sensor data analysis

**Note**: This report is generated from a Jupyter notebook.

In this notebook, we will absolve some basic data analysis and machine learning tasks using a range of datasets, tools, and algorithms.

## Basic machine learning with TensorFlow

### 1) Basic structure
We have the following TensorFlow program:

In [1]:
import tensorflow

x = 3
y = 5
a = tensorflow.add(x, y)
print(a)
with tensorflow.Session() as session:
    writer = tensorflow.summary.FileWriter('./graphs', session.graph)
    print(session.run(a))
writer.close()

Tensor("Add:0", shape=(), dtype=int32)
8


1) What value does the program compute?

$$(2 + 3)^{(2 * 3)} = 5^6 = 15625$$

2) What is the computational graph of the program?
```
x-|
  |- add      - |
y-|             |
                |- power
x-|             |
  |- multiply - |
y-|
```

### Basic parameters of neural networks
We run the following 1-layer neural network with different parameters:

In [4]:
"""
Simple python script to train a 1-layer neural network to classify cifar10 images use the TensorFlow library
Code adapted from:
https://kth.instructure.com/courses/4962/files/806181/download?verifier=9keHpBCsp2CAtZtVSKE8F4XsLLvqOu1zwgWkuRw2&wrap=1
"""

import tensorflow as tf

# class written to replicate input_data from tensorflow.examples.tutorials.mnist for CIFAR-10
from examples import cifar10_read


def run_network(path, batch_size, iterations, learning_rate):
    # read in the dataset
    print('reading in the CIFAR10 dataset')
    dataset = cifar10_read.read_data_sets(path, one_hot=True, reshape=True)

    using_tensorboard = True

    ##################################################
    # PHASE 1  - ASSEMBLE THE GRAPH

    # 1.1) define the placeholders for the input data and the ground truth labels

    # x_input can handle an arbitrary number of input vectors of length input_dim = d
    # y_  are the labels (each label is a length 10 one-hot encoding) of the inputs in x_input
    # If x_input has shape [N, input_dim] then y_ will have shape [N, 10]

    input_dim = 32 * 32 * 3  # d
    x_input = tf.placeholder(tf.float32, shape=[None, input_dim])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])

    # 1.2) define the parameters of the network
    # W: 3072 x 10 weight matrix,  b: bias vector of length 10

    W = tf.Variable(tf.truncated_normal([input_dim, 10], stddev=.01))
    b = tf.Variable(tf.constant(0.1, shape=[10]))

    # 1.3) define the sequence of operations in the network to produce the output
    # y = W *  x_input + b
    # y will have size [N, 10]  if x_input has size [N, input_dim]
    y = tf.matmul(x_input, W) + b

    # 1.4) define the loss funtion
    # cross entropy loss:
    # Apply softmax to each output vector in y to give probabilities for each class then compare to the ground truth labels via the cross-entropy loss and then compute the average loss over all the input examples
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

    # (optional) definiton of performance measures
    # definition of accuracy, count the number of correct predictions where the predictions are made by choosing the class with highest score
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # 1.6) Add an op to initialize the variables.
    init = tf.global_variables_initializer()

    ##################################################


    # If using TENSORBOARD
    if using_tensorboard:
        # keep track of the loss and accuracy for the training set
        tf.summary.scalar('training loss', cross_entropy, collections=['training'])
        tf.summary.scalar('training accuracy', accuracy, collections=['training'])
        # merge the two quantities
        tsummary = tf.summary.merge_all('training')

        # keep track of the loss and accuracy for the validation set
        tf.summary.scalar('validation loss', cross_entropy, collections=['validation'])
        tf.summary.scalar('validation accuracy', accuracy, collections=['validation'])
        # merge the two quantities
        vsummary = tf.summary.merge_all('validation')

    ##################################################


    ##################################################
    # PHASE 2  - PERFORM COMPUTATIONS ON THE GRAPH

    n_iter = iterations

    # 2.1) start a TensorFlow session
    with tf.Session() as sess:
        ##################################################
        # If using TENSORBOARD
        if using_tensorboard:
            # set up a file writer and directory to where it should write info +
            # attach the assembled graph
            summary_writer = tf.summary.FileWriter(
                '/Users/timotheuskampik/Desktop/github/sensing_perception/graphs/network1/results/test', sess.graph)
        ##################################################

        # 2.2)  Initialize the network's parameter variables
        # Run the "init" op (do this when training from a random initialization)
        sess.run(init)

        # 2.3) loop for the mini-batch training of the network's parameters
        for i in range(n_iter):

            # grab a random batch (size nbatch) of labelled training examples
            nbatch = batch_size
            batch = dataset.train.next_batch(nbatch)

            # create a dictionary with the batch data
            # batch data will be fed to the placeholders for inputs "x_input" and labels "y_"
            batch_dict = {
                x_input: batch[0],  # input data
                y_: batch[1],  # corresponding labels
            }

            # run an update step of mini-batch by calling the "train_step" op
            # with the mini-batch data. The network's parameters will be updated after applying this operation
            sess.run(train_step, feed_dict=batch_dict)

            # periodically evaluate how well training is going
            if i % 50 == 0:

                # compute the performance measures on the training set by
                # calling the "cross_entropy" loss and "accuracy" ops with the training data fed to the placeholders "x_input" and "y_"

                tr = sess.run([cross_entropy, accuracy],
                              feed_dict={x_input: dataset.train.images, y_: dataset.train.labels})

                # compute the performance measures on the validation set by
                # calling the "cross_entropy" loss and "accuracy" ops with the validation data fed to the placeholders "x_input" and "y_"

                val = sess.run([cross_entropy, accuracy],
                               feed_dict={x_input: dataset.validation.images, y_: dataset.validation.labels})

                info = [i] + tr + val
                print(info)

                ##################################################
                # If using TENSORBOARD
                if using_tensorboard:
                    # compute the summary statistics and write to file
                    summary_str = sess.run(tsummary, feed_dict={x_input: dataset.train.images, y_: dataset.train.labels})
                    summary_writer.add_summary(summary_str, i)

                    summary_str1 = sess.run(vsummary,
                                            feed_dict={x_input: dataset.validation.images, y_: dataset.validation.labels})
                    summary_writer.add_summary(summary_str1, i)
                ##################################################

        # evaluate the accuracy of the final model on the test data
        test_acc = sess.run(accuracy, feed_dict={x_input: dataset.test.images, y_: dataset.test.labels})
        final_msg = 'test accuracy:' + str(test_acc)
        print(final_msg)

    ##################################################

First, we investigate how the network performs with different learning rates:

In [10]:
data_dir = 'Datasets/cifar-10-batches-py/'
run_network(data_dir, 200, 1000, 0.01)
run_network(data_dir, 200, 1000, 0.0001)
run_network(data_dir, 200, 1000, 0.1)

reading in the CIFAR10 dataset
Reading the training images


Reading the test images


KeyboardInterrupt: 

When the learning rate is too high...

 1) ...the training accuracy does not improve continuously:

 2) ...the loss function does not converge.

When the learning rate is too low...

 1) ...the training accuracy increases too slowly.

 2) ...the loss function diverges too slowly.

Then, we take a look at different batch sizes:

In [5]:
run_network(data_dir, 10, 1000, 0.01)
run_network(data_dir, 2000, 1000, 0.01)

reading in the CIFAR10 dataset
Reading the training images


Reading the test images


INFO:tensorflow:Summary name training loss is illegal; using training_loss instead.


INFO:tensorflow:Summary name training accuracy is illegal; using training_accuracy instead.


INFO:tensorflow:Summary name validation loss is illegal; using validation_loss instead.


INFO:tensorflow:Summary name validation accuracy is illegal; using validation_accuracy instead.


[0, 2.5848298, 0.10031111, 2.5824816, 0.0972]


InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float and shape [?,3072]
	 [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[?,3072], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Placeholder', defined at:
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 486, in start
    self.io_loop.start()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
    if self.run_code(code, result):
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-d8cece9a7770>", line 2, in <module>
    run_network(data_dir, 200, 1000, 0.01)
  File "<ipython-input-2-835974be7300>", line 29, in run_network
    x_input = tf.placeholder(tf.float32, shape=[None, input_dim])
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1680, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3141, in _placeholder
    "Placeholder", dtype=dtype, shape=shape, name=name)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder' with dtype float and shape [?,3072]
	 [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[?,3072], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]


When the batch size is too low (10), the resulting model underfits.

When the batch size is too high (2000), the resulting model overfits.

When running the network with a batch size of 500, a learning rate of 0.05, and 50000 iterations, the accuracy seems to converge to ~42%:

In [11]:
run_network(data_dir, 300, 50000, 0.05)

reading in the CIFAR10 dataset
Reading the training images


Reading the test images


INFO:tensorflow:Summary name training loss is illegal; using training_loss instead.


INFO:tensorflow:Summary name training accuracy is illegal; using training_accuracy instead.


INFO:tensorflow:Summary name validation loss is illegal; using validation_loss instead.


INFO:tensorflow:Summary name validation accuracy is illegal; using validation_accuracy instead.


[0, 2.571825, 0.13855556, 2.5629606, 0.1446]


InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_9' with dtype float and shape [?,10]
	 [[Node: Placeholder_9 = Placeholder[dtype=DT_FLOAT, shape=[?,10], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Placeholder_9', defined at:
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 486, in start
    self.io_loop.start()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2856, in run_ast_nodes
    if self.run_code(code, result):
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-04a6403534dc>", line 2, in <module>
    run_network(data_dir, 200, 1000, 0.01)
  File "<ipython-input-4-ceb68947a6f2>", line 31, in run_network
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1680, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3141, in _placeholder
    "Placeholder", dtype=dtype, shape=shape, name=name)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/Users/timotheuskampik/Desktop/github/sensing_perception/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_9' with dtype float and shape [?,10]
	 [[Node: Placeholder_9 = Placeholder[dtype=DT_FLOAT, shape=[?,10], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
