In this chapter, we will go over the first _artificial neural networks_ (ANNs), then present _Multi-Layer Perceptrons_ (MLPs) and implement one in TensorFlow to tackle the MNIST dataset.

## The Perceptron
The Perceptron is based on an artificial neuron called a _linear threshold unit_ (LTU): it computes a weighted sum of its inputs and then applies a step function to that sum.
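
As a minimal sketch of what a single LTU computes, the snippet below applies a step function to a weighted sum. The function name `ltu` and the weights are illustrative assumptions, hand-picked so that the unit behaves like a logical OR; they are not taken from any library.

```python
import numpy as np

def ltu(x, w, b):
    """Linear threshold unit: step(w . x + b), returning 0 or 1."""
    z = np.dot(w, x) + b      # weighted sum of the inputs plus a bias term
    return int(z >= 0)        # step function: output 1 if z is non-negative

# Hypothetical weights, chosen by hand so the unit computes a logical OR.
w = np.array([1.0, 1.0])
b = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", ltu(np.array(x), w, b))
```

Only the input `(0, 0)` falls below the threshold, so it is the only one mapped to 0.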

An example of using a Perceptron with Scikit-Learn:

In [1]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

iris = load_iris()
X = iris.data[:, (2, 3)]  # petal length and width
y = (iris.target == 0).astype(int)  # Iris setosa?
per_clf = Perceptron(random_state=42)
per_clf.fit(X, y)

y_pred = per_clf.predict([[2, 0.5]])

print("Prediction: ", y_pred)

Prediction:  [1]




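This is equivalent to using an `SGDClassifier` with `loss="perceptron"`, `learning_rate="constant"`, `eta0=1` (the learning rate), and `penalty=None` (no regularization). Recall that unlike Logistic Regression classifiers, which output class probabilities, Perceptrons make predictions based on a hard threshold.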
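
To make the hard-threshold behaviour concrete, here is a rough sketch of the textbook perceptron learning rule in plain NumPy. It is not Scikit-Learn's implementation, which differs in details such as shuffling and stopping criteria. The function `train_perceptron`, the toy AND dataset, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np

def train_perceptron(X, y, eta=1.0, max_epochs=50):
    """Textbook perceptron rule: w += eta * (target - prediction) * x."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, target in zip(X, y):
            pred = int(np.dot(w, xi) + b >= 0)   # hard-threshold prediction
            update = eta * (target - pred)       # zero when the prediction is right
            w += update * xi
            b += update
            errors += int(update != 0)
        if errors == 0:                          # a full pass with no mistakes
            break
    return w, b

# Toy, linearly separable data (logical AND); not the iris data used above.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_toy = np.array([0, 0, 0, 1])

w, b = train_perceptron(X_toy, y_toy)
preds = (X_toy @ w + b >= 0).astype(int)
print("predictions:", preds)
```

Note that the model only ever outputs 0 or 1. There is no probability to inspect, which is the practical difference from Logistic Regression.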

Because of the Perceptron's limitations, much of the research on _connectionism_ (the study of neural networks) was abandoned for a time. It turns out, however, that some of these limitations can be eliminated by stacking multiple Perceptrons. The resulting network is called a _Multi-Layer Perceptron_ (MLP).
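
The classic example of such a limitation is the XOR function, which no single LTU can compute because its classes are not linearly separable. The sketch below stacks two layers of LTUs to compute it anyway. All weights are hand-picked for illustration (an OR unit and a NAND unit feeding an AND unit), not learned.

```python
import numpy as np

def step(z):
    """Element-wise step function used by every LTU."""
    return (z >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: two LTUs with hand-picked (illustrative) weights.
W_hidden = np.array([[1.0, -1.0],    # column 0: OR unit, column 1: NAND unit
                     [1.0, -1.0]])
b_hidden = np.array([-0.5, 1.5])

# Output layer: one LTU computing AND of the two hidden outputs.
w_out = np.array([1.0, 1.0])
b_out = -1.5

hidden = step(X @ W_hidden + b_hidden)
xor = step(hidden @ w_out + b_out)
print(xor)   # [0 1 1 0]
```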

## Multi-Layer Perceptron and Backpropagation

An MLP is composed of one passthrough input layer, one or more layers of LTUs called hidden layers, and a final layer of LTUs called the output layer. When an _artificial neural network_ (ANN) has two or more hidden layers, it is called a _deep neural network_ (DNN).

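For many years, researchers struggled to find a way to train MLPs, until the _backpropagation_ training algorithm was formulated. Backpropagation is not a synonym for Gradient Descent: it is the procedure that computes the gradient of the network's error with respect to every weight by applying the chain rule backward from the output layer. Gradient Descent then uses those gradients to update the weights.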
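
The following is a small sketch of backpropagation for a network with one hidden layer, written in NumPy. The layer sizes, the logistic activations, the mean squared error loss, the learning rate, and the XOR toy data are all assumptions chosen to keep the example short. One backpropagated gradient is compared with a finite-difference estimate as a sanity check, and the gradients are then used for a few Gradient Descent steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    W1, b1, W2, b2 = params
    h = sigmoid(X @ W1 + b1)        # hidden layer activations
    out = sigmoid(h @ W2 + b2)      # output layer activations
    return h, out

def loss(params, X, y):
    _, out = forward(params, X)
    return 0.5 * np.mean((out - y) ** 2)

def backprop(params, X, y):
    """Gradients of the loss w.r.t. every parameter, via the chain rule."""
    W1, b1, W2, b2 = params
    h, out = forward(params, X)
    n = X.shape[0]
    d_out = (out - y) / n * out * (1 - out)    # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)         # propagated back to the hidden layer
    return [X.T @ d_h, d_h.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
params = [rng.normal(size=(2, 3)), np.zeros(3), rng.normal(size=(3, 1)), np.zeros(1)]

# Sanity check: compare one backprop gradient with a finite-difference estimate.
analytic = backprop(params, X, y)[0][0, 0]
eps = 1e-6
params[0][0, 0] += eps
loss_plus = loss(params, X, y)
params[0][0, 0] -= 2 * eps
loss_minus = loss(params, X, y)
params[0][0, 0] += eps                          # restore the original weight
numeric = (loss_plus - loss_minus) / (2 * eps)
print("backprop:", analytic, "numeric:", numeric)

# A few Gradient Descent steps driven by the backprop gradients.
initial_loss = loss(params, X, y)
for _ in range(1000):
    grads = backprop(params, X, y)
    for p, g in zip(params, grads):
        p -= 0.5 * g                            # in-place update, learning rate 0.5
final_loss = loss(params, X, y)
print("loss before:", initial_loss, "after:", final_loss)
```

The two printed gradient values should agree to several decimal places, and the loss after training should be lower than before.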

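For backpropagation to work, the step function of the LTU has to be replaced with an activation function that has a useful gradient, since the step function is flat everywhere and gives Gradient Descent nothing to work with. The classic choice is the logistic function, σ(z) = 1 / (1 + exp(-z)). Two other popular choices are the hyperbolic tangent function, tanh(z), and the ReLU function, ReLU(z) = max(0, z).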
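
For reference, the three activation functions can be sketched in a few lines of NumPy. The helper names `logistic` and `relu` are just illustrative; `np.tanh` is NumPy's built-in.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) function: squashes z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: max(0, z), element-wise."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print("logistic:", logistic(z))   # values in (0, 1), 0.5 at z = 0
print("tanh:    ", np.tanh(z))    # values in (-1, 1), 0 at z = 0
print("relu:    ", relu(z))       # 0 for negative inputs, z otherwise
```

The logistic and tanh functions are closely related: tanh(z) = 2σ(2z) - 1, so tanh is a rescaled logistic function centered on 0.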

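An MLP is often used for classification, with each output neuron corresponding to a different binary class. When the classes are exclusive (such as the digits 0 to 9), the output layer typically uses a shared softmax function instead of individual activation functions, so that the outputs are class probabilities that sum to 1. Because the signal flows in only one direction, from the inputs to the outputs, this architecture is called a _feedforward neural network_ (FNN).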
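
A minimal NumPy sketch of the softmax function is shown below. The `logits` values are made up for illustration, and subtracting the row maximum is a common numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: exponentiate, then normalize so each row sums to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

# Two made-up rows of output-layer scores for a 3-class problem.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)
print(probs)
print("row sums:", probs.sum(axis=1))
print("predicted classes:", probs.argmax(axis=1))
```

Each row is a probability distribution over the classes. Equal scores, as in the second row, yield a uniform distribution.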

## Training an MLP with TensorFlow's High-Level API

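The easiest way to train an MLP with TensorFlow is to use its high-level API, TF.Learn. The `DNNClassifier` class makes it trivial to train a DNN with any number of hidden layers and a softmax output layer. Note that `tf.contrib.learn` is deprecated, as the warnings in the output below show, and the `tf.contrib` module was removed in TensorFlow 2.0, so this code requires TensorFlow 1.x. For example, let's train a DNN for classification with two hidden layers (300 and 100 neurons) and a softmax output layer with 10 neurons: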

In [7]:
# Load MNIST dataset
import os
import pandas as pd

def load_MNIST_data(path='.'):    
    csv_path = os.path.join(path, "mnist_784.csv")
    return pd.read_csv(csv_path)

mnist_pd = load_MNIST_data()
mnist = mnist_pd.values

In [8]:
# Separate the features (first 784 columns) from the labels (last column)
X, y = mnist[:, :784], mnist[:, 784]

X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

In [10]:
import tensorflow as tf

feature_columns = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[300, 100], n_classes=10,
                                        feature_columns=feature_columns)
dnn_clf.fit(x=X_train, y=y_train, batch_size=50, steps=40000)

Instructions for updating:
Please switch to tf.contrib.estimator.*_head.
Instructions for updating:
Please replace uses of any Estimator from tf.contrib.learn with an Estimator from tf.estimator.*
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f947fe28d68>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/tmp/tmpqaz

INFO:tensorflow:global_step/sec: 213.491
INFO:tensorflow:loss = 0.3777306, step = 3201 (0.468 sec)
INFO:tensorflow:global_step/sec: 242.188
INFO:tensorflow:loss = 0.20093453, step = 3301 (0.414 sec)
INFO:tensorflow:global_step/sec: 214.31
INFO:tensorflow:loss = 0.12598404, step = 3401 (0.465 sec)
INFO:tensorflow:global_step/sec: 211.739
INFO:tensorflow:loss = 0.50144124, step = 3501 (0.472 sec)
INFO:tensorflow:global_step/sec: 231.036
INFO:tensorflow:loss = 0.54182696, step = 3601 (0.437 sec)
INFO:tensorflow:global_step/sec: 226.952
INFO:tensorflow:loss = 0.6525817, step = 3701 (0.438 sec)
INFO:tensorflow:global_step/sec: 216.168
INFO:tensorflow:loss = 0.3483754, step = 3801 (0.462 sec)
INFO:tensorflow:global_step/sec: 214.653
INFO:tensorflow:loss = 0.43588853, step = 3901 (0.466 sec)
INFO:tensorflow:global_step/sec: 224.031
INFO:tensorflow:loss = 0.15058702, step = 4001 (0.446 sec)
INFO:tensorflow:global_step/sec: 234.098
INFO:tensorflow:loss = 0.1812924, step = 4101 (0.429 sec)
INFO:

INFO:tensorflow:loss = 0.18160439, step = 11401 (0.415 sec)
INFO:tensorflow:global_step/sec: 243.164
INFO:tensorflow:loss = 0.03378515, step = 11501 (0.411 sec)
INFO:tensorflow:global_step/sec: 241.99
INFO:tensorflow:loss = 0.26104668, step = 11601 (0.413 sec)
INFO:tensorflow:global_step/sec: 243.618
INFO:tensorflow:loss = 0.13653375, step = 11701 (0.412 sec)
INFO:tensorflow:global_step/sec: 194.186
INFO:tensorflow:loss = 0.10583084, step = 11801 (0.514 sec)
INFO:tensorflow:global_step/sec: 244.588
INFO:tensorflow:loss = 0.21622434, step = 11901 (0.409 sec)
INFO:tensorflow:global_step/sec: 237.411
INFO:tensorflow:loss = 0.1935614, step = 12001 (0.421 sec)
INFO:tensorflow:global_step/sec: 246.062
INFO:tensorflow:loss = 0.26123866, step = 12101 (0.409 sec)
INFO:tensorflow:global_step/sec: 217.353
INFO:tensorflow:loss = 0.26927236, step = 12201 (0.457 sec)
INFO:tensorflow:global_step/sec: 242.139
INFO:tensorflow:loss = 0.28119397, step = 12301 (0.414 sec)
INFO:tensorflow:global_step/sec: 

INFO:tensorflow:global_step/sec: 239.677
INFO:tensorflow:loss = 0.25415817, step = 19601 (0.418 sec)
INFO:tensorflow:global_step/sec: 240.469
INFO:tensorflow:loss = 0.23967786, step = 19701 (0.415 sec)
INFO:tensorflow:global_step/sec: 246.438
INFO:tensorflow:loss = 0.06210378, step = 19801 (0.406 sec)
INFO:tensorflow:global_step/sec: 237.673
INFO:tensorflow:loss = 0.059527006, step = 19901 (0.420 sec)
INFO:tensorflow:global_step/sec: 233.649
INFO:tensorflow:loss = 0.15518238, step = 20001 (0.428 sec)
INFO:tensorflow:global_step/sec: 226.836
INFO:tensorflow:loss = 0.06099398, step = 20101 (0.440 sec)
INFO:tensorflow:global_step/sec: 235.1
INFO:tensorflow:loss = 0.15587409, step = 20201 (0.426 sec)
INFO:tensorflow:global_step/sec: 242.379
INFO:tensorflow:loss = 0.19869548, step = 20301 (0.413 sec)
INFO:tensorflow:global_step/sec: 234.102
INFO:tensorflow:loss = 0.047822677, step = 20401 (0.427 sec)
INFO:tensorflow:global_step/sec: 243.04
INFO:tensorflow:loss = 0.04219195, step = 20501 (0.

INFO:tensorflow:loss = 0.044660844, step = 27701 (0.423 sec)
INFO:tensorflow:global_step/sec: 230.724
INFO:tensorflow:loss = 0.02414952, step = 27801 (0.435 sec)
INFO:tensorflow:global_step/sec: 237.831
INFO:tensorflow:loss = 0.051004022, step = 27901 (0.419 sec)
INFO:tensorflow:global_step/sec: 230.385
INFO:tensorflow:loss = 0.08218043, step = 28001 (0.434 sec)
INFO:tensorflow:global_step/sec: 242.393
INFO:tensorflow:loss = 0.043567184, step = 28101 (0.413 sec)
INFO:tensorflow:global_step/sec: 233.277
INFO:tensorflow:loss = 0.020149836, step = 28201 (0.429 sec)
INFO:tensorflow:global_step/sec: 239.069
INFO:tensorflow:loss = 0.040030416, step = 28301 (0.418 sec)
INFO:tensorflow:global_step/sec: 225.235
INFO:tensorflow:loss = 0.2103261, step = 28401 (0.444 sec)
INFO:tensorflow:global_step/sec: 234.762
INFO:tensorflow:loss = 0.088026196, step = 28501 (0.426 sec)
INFO:tensorflow:global_step/sec: 230.041
INFO:tensorflow:loss = 0.020665465, step = 28601 (0.434 sec)
INFO:tensorflow:global_st

INFO:tensorflow:loss = 0.11828631, step = 35801 (0.554 sec)
INFO:tensorflow:global_step/sec: 216.234
INFO:tensorflow:loss = 0.08582124, step = 35901 (0.462 sec)
INFO:tensorflow:global_step/sec: 232.332
INFO:tensorflow:loss = 0.114619635, step = 36001 (0.430 sec)
INFO:tensorflow:global_step/sec: 219.133
INFO:tensorflow:loss = 0.015041902, step = 36101 (0.460 sec)
INFO:tensorflow:global_step/sec: 201
INFO:tensorflow:loss = 0.040207185, step = 36201 (0.495 sec)
INFO:tensorflow:global_step/sec: 226.414
INFO:tensorflow:loss = 0.10164454, step = 36301 (0.441 sec)
INFO:tensorflow:global_step/sec: 213.88
INFO:tensorflow:loss = 0.065593235, step = 36401 (0.467 sec)
INFO:tensorflow:global_step/sec: 205.757
INFO:tensorflow:loss = 0.08369605, step = 36501 (0.486 sec)
INFO:tensorflow:global_step/sec: 204.982
INFO:tensorflow:loss = 0.08001676, step = 36601 (0.487 sec)
INFO:tensorflow:global_step/sec: 211.545
INFO:tensorflow:loss = 0.05954801, step = 36701 (0.473 sec)
INFO:tensorflow:global_step/sec:

DNNClassifier(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._MultiClassHead object at 0x7f948069efd0>, 'hidden_units': [300, 100], 'feature_columns': (_RealValuedColumn(column_name='', dimension=784, default_value=None, dtype=tf.int64, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x7f9491575510>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

Evaluating the trained model on the MNIST test set gives an accuracy of about 94.7%:

In [11]:
from sklearn.metrics import accuracy_score

y_pred = list(dnn_clf.predict(X_test))
print("Accuracy is: ", accuracy_score(y_test, y_pred))

Instructions for updating:
Please switch to predict_classes, or set `outputs` argument.
Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpqazjp8wa/model.ckpt-40000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Accuracy is:  0.9466
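
What `predict` computes for each image can be sketched as a forward pass through the layers followed by an argmax. The NumPy snippet below uses the same layer sizes as the classifier above (784 inputs, hidden layers of 300 and 100 ReLU units, 10 softmax outputs). The weights are random placeholders rather than the trained parameters, the initialization scale is arbitrary, and the input batch is random noise standing in for flattened images, so the predicted classes are meaningless. Only the shapes and the flow of computation are the point.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [784, 300, 100, 10]

# Random placeholder parameters (NOT the weights learned by DNNClassifier).
weights = [rng.normal(scale=0.01, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(X):
    """Forward pass: ReLU hidden layers followed by a softmax output layer."""
    a = X
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0.0, a @ W + b)                 # ReLU hidden layers
    logits = a @ weights[-1] + biases[-1]
    exps = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)      # softmax output

batch = rng.random((50, 784))          # stand-in for a batch of 50 flattened images
probs = forward(batch)
pred_classes = probs.argmax(axis=1)    # most probable class for each image
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(probs.shape, pred_classes.shape, n_params)
```

With these layer sizes the network has 266,610 trainable parameters: 235,500 in the first hidden layer, 30,100 in the second, and 1,010 in the output layer.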


The TF.Learn library also provides convenience functions for evaluating models:

In [12]:
print("Evaluation is: ", dnn_clf.evaluate(X_test, y_test))

Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))
INFO:tensorflow:Starting evaluation at 2019-06-06-15:09:42
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpqazjp8wa/model.ckpt-40000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-06-06-15:09:42
INFO:tensorflow:Saving dict for global step 40000: accuracy = 0.9466, global_s

## Training a DNN Using Plain TensorFlow

We will now use TensorFlow's lower-level API to build the network ourselves and train it on the MNIST dataset. The first step is the construction phase, in which the graph is built. The execution phase, in which the graph is run to train the model, comes later.

## Construction Phase

