Implementation of a single-LTU network using the Perceptron class

In [1]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

In [2]:
iris = load_iris()
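x = iris.data[:, (2,3)]    # petal length, petal width
y = (iris.target == 0).astype(int)    # 1 if Iris setosa, else 0 (np.int has been removed from recent NumPy releases)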

In [3]:
percep_clf = Perceptron()
percep_clf.fit(x, y)



Perceptron(alpha=0.0001, class_weight=None, eta0=1.0, fit_intercept=True,
      max_iter=None, n_iter=None, n_jobs=1, penalty=None, random_state=0,
      shuffle=True, tol=None, verbose=0, warm_start=False)

In [4]:
y_pred = percep_clf.predict([[2, 0.5]])
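For intuition, a linear threshold unit (LTU) computes a weighted sum of its inputs plus a bias and applies a step function to it. The following is a minimal NumPy sketch of that computation only. The function name ltu_predict and the weights and bias are hypothetical values picked by hand for illustration; they are not the parameters learned by percep_clf above.

```python
import numpy as np

def ltu_predict(X, weights, bias):
    """Linear threshold unit: step(X @ w + b), giving 0 or 1 per row."""
    z = X @ weights + bias
    return (z >= 0).astype(int)

# Hypothetical parameters, chosen by hand for illustration only
w = np.array([-1.0, -1.0])
b = 3.0

samples = np.array([[1.4, 0.2],    # short, narrow petal
                    [5.0, 1.8]])   # long, wide petal
ltu_out = ltu_predict(samples, w, b)
print(ltu_out)
```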

Training a DNN using TensorFlow API

In [5]:
import tensorflow as tf

In [6]:
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
X_train = X_train.astype(np.float32).reshape(-1, 28*28) / 255.0
X_test = X_test.astype(np.float32).reshape(-1, 28*28) / 255.0
y_train = y_train.astype(np.int32)
y_test = y_test.astype(np.int32)
X_valid, X_train = X_train[:5000], X_train[5000:]
y_valid, y_train = y_train[:5000], y_train[5000:]

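Here, we have loaded the MNIST dataset, which comes already split into training and test sets, scaled the pixel values to the range 0 to 1, and held out the first 5,000 training images as a validation set.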

Now, we will train a DNN for classification with two hidden layers (one with 300 neurons, and the other with 100 neurons) and a softmax output layer with 10 neurons.
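As a rough sanity check on the size of this architecture, the number of trainable parameters can be counted by hand: each fully connected layer has one weight per input-neuron pair plus one bias per neuron. The sketch below assumes plain dense layers with biases and nothing else; the variable names are ours.

```python
# Layer widths: 784 inputs -> 300 -> 100 -> 10 outputs
layer_sizes = [28 * 28, 300, 100, 10]

n_params = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    layer_params = n_in * n_out + n_out   # weights + biases
    print(f"{n_in} -> {n_out}: {layer_params} parameters")
    n_params += layer_params

print("Total:", n_params)
```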

In [7]:
feature_cols = [tf.feature_column.numeric_column("X", shape = [28*28])]
dnn_clf = tf.estimator.DNNClassifier(hidden_units = [300,100], n_classes = 10, feature_columns = feature_cols)
input_fn = tf.estimator.inputs.numpy_input_fn(x = {"X": X_train}, y = y_train, num_epochs = 40, batch_size = 50, shuffle = True)
dnn_clf.train(input_fn = input_fn)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\manog\\AppData\\Local\\Temp\\tmps050b1z7', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000016268AD2748>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
To 

INFO:tensorflow:global_step/sec: 314.077
INFO:tensorflow:loss = 2.9387155, step = 5401 (0.319 sec)
INFO:tensorflow:global_step/sec: 279.629
INFO:tensorflow:loss = 1.133555, step = 5501 (0.357 sec)
INFO:tensorflow:global_step/sec: 277.705
INFO:tensorflow:loss = 0.35938102, step = 5601 (0.361 sec)
INFO:tensorflow:global_step/sec: 267.391
INFO:tensorflow:loss = 6.7884827, step = 5701 (0.374 sec)
INFO:tensorflow:global_step/sec: 275.427
INFO:tensorflow:loss = 0.281543, step = 5801 (0.363 sec)
INFO:tensorflow:global_step/sec: 276.563
INFO:tensorflow:loss = 3.821343, step = 5901 (0.362 sec)
INFO:tensorflow:global_step/sec: 276.091
INFO:tensorflow:loss = 2.2740083, step = 6001 (0.362 sec)
INFO:tensorflow:global_step/sec: 280.112
INFO:tensorflow:loss = 0.26894513, step = 6101 (0.357 sec)
INFO:tensorflow:global_step/sec: 277.322
INFO:tensorflow:loss = 1.0954212, step = 6201 (0.361 sec)
INFO:tensorflow:global_step/sec: 277.652
INFO:tensorflow:loss = 4.882656, step = 6301 (0.360 sec)
INFO:tensorf

INFO:tensorflow:global_step/sec: 276.183
INFO:tensorflow:loss = 0.01145824, step = 13601 (0.362 sec)
INFO:tensorflow:global_step/sec: 275.804
INFO:tensorflow:loss = 0.020249365, step = 13701 (0.362 sec)
INFO:tensorflow:global_step/sec: 275.805
INFO:tensorflow:loss = 0.16038454, step = 13801 (0.363 sec)
INFO:tensorflow:global_step/sec: 276.182
INFO:tensorflow:loss = 0.16520247, step = 13901 (0.362 sec)
INFO:tensorflow:global_step/sec: 274.677
INFO:tensorflow:loss = 0.26072243, step = 14001 (0.365 sec)
INFO:tensorflow:global_step/sec: 275.428
INFO:tensorflow:loss = 0.12963904, step = 14101 (0.363 sec)
INFO:tensorflow:global_step/sec: 275.051
INFO:tensorflow:loss = 0.084393926, step = 14201 (0.364 sec)
INFO:tensorflow:global_step/sec: 279.631
INFO:tensorflow:loss = 0.022450002, step = 14301 (0.358 sec)
INFO:tensorflow:global_step/sec: 276.94
INFO:tensorflow:loss = 0.22454497, step = 14401 (0.361 sec)
INFO:tensorflow:global_step/sec: 275.429
INFO:tensorflow:loss = 0.065046296, step = 14501

INFO:tensorflow:global_step/sec: 283.962
INFO:tensorflow:loss = 0.009586998, step = 21701 (0.353 sec)
INFO:tensorflow:global_step/sec: 278.856
INFO:tensorflow:loss = 0.0149701, step = 21801 (0.358 sec)
INFO:tensorflow:global_step/sec: 275.427
INFO:tensorflow:loss = 0.022572953, step = 21901 (0.364 sec)
INFO:tensorflow:global_step/sec: 277.703
INFO:tensorflow:loss = 0.119284905, step = 22001 (0.360 sec)
INFO:tensorflow:global_step/sec: 279.244
INFO:tensorflow:loss = 0.0053054616, step = 22101 (0.358 sec)
INFO:tensorflow:global_step/sec: 278.471
INFO:tensorflow:loss = 0.08018149, step = 22201 (0.359 sec)
INFO:tensorflow:global_step/sec: 274.304
INFO:tensorflow:loss = 0.029272603, step = 22301 (0.365 sec)
INFO:tensorflow:global_step/sec: 278.856
INFO:tensorflow:loss = 0.032424074, step = 22401 (0.359 sec)
INFO:tensorflow:global_step/sec: 279.241
INFO:tensorflow:loss = 0.05384083, step = 22501 (0.359 sec)
INFO:tensorflow:global_step/sec: 275.805
INFO:tensorflow:loss = 0.008523341, step = 2

INFO:tensorflow:global_step/sec: 281.582
INFO:tensorflow:loss = 0.0056290166, step = 29801 (0.355 sec)
INFO:tensorflow:global_step/sec: 275.428
INFO:tensorflow:loss = 0.0032763504, step = 29901 (0.363 sec)
INFO:tensorflow:global_step/sec: 278.856
INFO:tensorflow:loss = 0.001667137, step = 30001 (0.359 sec)
INFO:tensorflow:global_step/sec: 277.322
INFO:tensorflow:loss = 0.016456064, step = 30101 (0.360 sec)
INFO:tensorflow:global_step/sec: 278.856
INFO:tensorflow:loss = 0.038457714, step = 30201 (0.359 sec)
INFO:tensorflow:global_step/sec: 276.183
INFO:tensorflow:loss = 0.04173205, step = 30301 (0.362 sec)
INFO:tensorflow:global_step/sec: 276.561
INFO:tensorflow:loss = 0.032262243, step = 30401 (0.362 sec)
INFO:tensorflow:global_step/sec: 277.322
INFO:tensorflow:loss = 0.0125072375, step = 30501 (0.361 sec)
INFO:tensorflow:global_step/sec: 274.303
INFO:tensorflow:loss = 0.009326413, step = 30601 (0.365 sec)
INFO:tensorflow:global_step/sec: 277.705
INFO:tensorflow:loss = 0.06461491, step

INFO:tensorflow:global_step/sec: 280.798
INFO:tensorflow:loss = 0.0018186925, step = 37901 (0.356 sec)
INFO:tensorflow:global_step/sec: 274.303
INFO:tensorflow:loss = 0.0075440085, step = 38001 (0.365 sec)
INFO:tensorflow:global_step/sec: 276.942
INFO:tensorflow:loss = 0.000476402, step = 38101 (0.361 sec)
INFO:tensorflow:global_step/sec: 266.674
INFO:tensorflow:loss = 0.041034214, step = 38201 (0.375 sec)
INFO:tensorflow:global_step/sec: 273.199
INFO:tensorflow:loss = 0.0048142658, step = 38301 (0.366 sec)
INFO:tensorflow:global_step/sec: 273.188
INFO:tensorflow:loss = 0.03812752, step = 38401 (0.366 sec)
INFO:tensorflow:global_step/sec: 269.176
INFO:tensorflow:loss = 0.0014221391, step = 38501 (0.372 sec)
INFO:tensorflow:global_step/sec: 273.559
INFO:tensorflow:loss = 0.006695306, step = 38601 (0.365 sec)
INFO:tensorflow:global_step/sec: 274.304
INFO:tensorflow:loss = 0.006142506, step = 38701 (0.365 sec)
INFO:tensorflow:global_step/sec: 272.82
INFO:tensorflow:loss = 0.021008682, ste

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifier at 0x16268ad2400>

In [8]:
test_input_fn = tf.estimator.inputs.numpy_input_fn(x = {"X": X_test}, y = y_test, shuffle = False)
eval_results = dnn_clf.evaluate(input_fn = test_input_fn)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-06-12T19:13:21Z
INFO:tensorflow:Graph was finalized.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from C:\Users\manog\AppData\Local\Temp\tmps050b1z7\model.ckpt-44000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-06-12-19:13:21
INFO:tensorflow:Saving dict for global step 44000: accuracy = 0.9794, average_loss = 0.1085036, global_step = 44000, loss = 13.734633
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 44000: C:\Users\manog\AppData\Local\Temp\tmps050b1z7\model.ckpt-44000


In [9]:
eval_results

{'accuracy': 0.9794,
 'average_loss': 0.1085036,
 'loss': 13.734633,
 'global_step': 44000}

Running this DNN classifier on the MNIST dataset loaded earlier gives a model with an accuracy of about 97.9% on the test set.

In [10]:
y_pred_iter = dnn_clf.predict(input_fn = test_input_fn)
y_pred = list(y_pred_iter)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\manog\AppData\Local\Temp\tmps050b1z7\model.ckpt-44000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


In [11]:
y_pred[0]

{'logits': array([ -7.886765 ,   0.1859046,  -1.1919025,   5.61782  ,  -5.2724557,
         -6.1540008, -18.582842 ,  22.397022 ,  -5.5561256,   1.4554625],
       dtype=float32),
 'probabilities': array([7.0456085e-14, 2.2585751e-10, 5.6945615e-11, 5.1628007e-08,
        9.6227366e-13, 3.9851811e-13, 1.5946681e-18, 1.0000000e+00,
        7.2460636e-13, 8.0388984e-10], dtype=float32),
 'class_ids': array([7], dtype=int64),
 'classes': array([b'7'], dtype=object)}
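The 'probabilities' entry is the softmax of the 'logits' entry, and 'class_ids' is the index of the largest value. A minimal NumPy sketch of that relationship, using the logits printed above (the softmax helper is our own, not part of the estimator API):

```python
import numpy as np

# Logits copied from the output of y_pred[0] above
logits = np.array([-7.886765, 0.1859046, -1.1919025, 5.61782, -5.2724557,
                   -6.1540008, -18.582842, 22.397022, -5.5561256, 1.4554625])

def softmax(z):
    # Subtracting the maximum keeps the exponentials numerically stable
    shifted = z - np.max(z)
    exp = np.exp(shifted)
    return exp / exp.sum()

probs = softmax(logits)
print("Predicted class:", np.argmax(probs))
```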

Now, we will train a DNN for classification with the same MLP architecture using plain TensorFlow (the lower-level Python API).

We will again use the MNIST dataset. The first step is the Construction phase, where the TensorFlow graph is built.

In [12]:
import tensorflow as tf

In [13]:
n_inputs = 28*28     # number of inputs in MNIST dataset
n_hidden1 = 300
n_hidden2 = 100
n_outputs = 10      # number of outputs in the softmax output layer

We create placeholders for the inputs and the targets.

In [14]:
X = tf.placeholder(tf.float32, shape = (None, n_inputs), name = "X")
y = tf.placeholder(tf.int64, shape = (None), name = "y")

Now, we will create a neuron_layer() function that will be used to create one layer at a time. It will need parameters to specify the inputs, the number of neurons, the activation function, and the name of the layer.

In [15]:
def neuron_layer(X, n_neurons, name, activation = None):
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])    # X has shape (instances, inputs), so the 2nd dimension is the number of inputs
        stddev = 2 / np.sqrt(n_inputs)     # scaling the initial weights by the number of inputs helps the algorithm converge faster
        init = tf.truncated_normal((n_inputs, n_neurons), stddev = stddev)    # random initial values with the shape of W
        W = tf.Variable(init, name="weights")  # weights matrix: 2D tensor of connection weights between each input and each neuron
        b = tf.Variable(tf.zeros([n_neurons]), name = "bias")    # one bias parameter per neuron
        z = tf.matmul(X, W) + b
        if activation == "relu":
            return tf.nn.relu(z)
        else:
            return z
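To see what one such layer computes without building a TensorFlow graph, here is a small NumPy sketch of the same forward pass, z = XW + b followed by an optional ReLU. The name dense_forward and the toy sizes are ours, and the weights are drawn from a plain normal distribution rather than a truncated one, so this only approximates the initialization above.

```python
import numpy as np

rng = np.random.default_rng(42)

def dense_forward(X, W, b, activation=None):
    """Forward pass of one fully connected layer."""
    z = X @ W + b
    if activation == "relu":
        return np.maximum(z, 0)
    return z

n_inputs, n_neurons = 4, 3                 # toy sizes for illustration
stddev = 2 / np.sqrt(n_inputs)             # same scaling as neuron_layer()
W = rng.normal(scale=stddev, size=(n_inputs, n_neurons))  # plain normal, not truncated
b = np.zeros(n_neurons)

X_demo = rng.random((5, n_inputs))         # 5 instances, 4 features each
out = dense_forward(X_demo, W, b, activation="relu")
print(out.shape)
```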

Now, we will use this neuron_layer() function to create the DNN with the MLP architecture described above.

In [16]:
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, n_hidden1, "hidden1", activation = "relu")
    hidden2 = neuron_layer(hidden1, n_hidden2, "hidden2", activation = "relu")
    logits = neuron_layer(hidden2, n_outputs, "outputs")

Now, we will define the cost function used to train the neural network: the cross entropy of each instance, averaged over all instances.

In [17]:
with tf.name_scope("loss"):
    # 1D tensor containing the cross entropy for each instance
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y, logits = logits)
    loss = tf.reduce_mean(cross_entropy, name = "loss")
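For reference, the quantity computed here can be sketched in NumPy: for each instance, take the log-softmax of the logits and pick out the negative log-probability of the true class. This is our own illustration of the formula with made-up toy logits, not TensorFlow's implementation.

```python
import numpy as np

def sparse_softmax_cross_entropy(labels, logits):
    """Per-instance cross entropy from integer labels and raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

# Toy values for illustration only
demo_logits = np.array([[2.0, 1.0, 0.1],
                        [0.0, 0.0, 0.0]])
demo_labels = np.array([0, 2])

per_instance = sparse_softmax_cross_entropy(demo_labels, demo_logits)
mean_loss = per_instance.mean()    # counterpart of tf.reduce_mean above
print(per_instance, mean_loss)
```

The second row has equal logits, so its cross entropy is log(3), the loss of a uniform guess over three classes.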

Now, we need to define a "GradientDescentOptimizer" that will tweak the model parameters to minimize the cost function.

In [18]:
learning_rate = 0.01

with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_out = optimizer.minimize(loss)
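The update rule behind gradient descent is simply: parameter = parameter - learning_rate * gradient. As a toy sketch, here it is applied to a one-dimensional cost f(w) = (w - 3)**2, which we chose only for illustration; the real optimizer applies the same rule to every weight and bias in the graph.

```python
learning_rate = 0.01   # same value as in the cell above

def grad(w):
    # Gradient of the toy cost f(w) = (w - 3)**2
    return 2 * (w - 3)

w = 0.0                # arbitrary starting point
for step in range(2000):
    w = w - learning_rate * grad(w)   # one gradient descent step

print(w)               # approaches the minimum at w = 3
```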

The last important step in the Construction phase is to specify how to evaluate the model.

In [19]:
with tf.name_scope("eval"):
    # Determine whether the neural network's prediction is correct by checking
    # whether the highest logit corresponds to the target class, then cast the
    # resulting 1D tensor of booleans to floats.
    correct_pred = tf.cast(tf.nn.in_top_k(logits, y, 1), tf.float32)
    accuracy = tf.reduce_mean(correct_pred)     # network's overall accuracy as our performance measure
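With k = 1, this check amounts to comparing the argmax of the logits with the target. A NumPy sketch of the same idea on made-up values (ties between logits may be handled differently than in tf.nn.in_top_k, so treat this as an approximation):

```python
import numpy as np

# Toy logits for 4 instances and 3 classes, for illustration only
demo_logits = np.array([[0.1, 2.0, -1.0],
                        [1.5, 0.2,  0.3],
                        [0.0, 0.1,  3.0],
                        [2.0, 2.5,  0.0]])
demo_targets = np.array([1, 0, 2, 0])

# 1.0 where the highest logit matches the target class, 0.0 otherwise
demo_correct = (np.argmax(demo_logits, axis=1) == demo_targets).astype(np.float32)
demo_accuracy = demo_correct.mean()
print(demo_correct, demo_accuracy)
```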

We need to create a node to initialize all variables, and we will also create a Saver to save our trained model parameters to disk.

In [20]:
init = tf.global_variables_initializer()
saver = tf.train.Saver()

The next step is the Execution phase, where we run the above built graph to train the model.

In [21]:
n_epochs = 400    # number of epochs to be run
batch_size = 50    # size of the mini-batches

Now, we train the model using Mini-batch Gradient Descent.

In [22]:
def shuffle_batch(X, y, batch_size):
    rnd_idx = np.random.permutation(len(X))
    n_batches = len(X) // batch_size
    for batch_idx in np.array_split(rnd_idx, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        yield X_batch, y_batch
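A quick way to check what shuffle_batch() yields is to run it on a tiny array. The sketch below repeats the function so it runs on its own; the toy data are ours. Note that because np.array_split divides the indices into len(X) // batch_size groups, some batches can be slightly larger than batch_size when the data size is not an exact multiple of it.

```python
import numpy as np

def shuffle_batch(X, y, batch_size):
    rnd_idx = np.random.permutation(len(X))
    n_batches = len(X) // batch_size
    for batch_idx in np.array_split(rnd_idx, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        yield X_batch, y_batch

# Toy data: 10 instances with 2 features each
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

batch_sizes = []
seen_labels = []
for X_batch, y_batch in shuffle_batch(X_demo, y_demo, batch_size=3):
    batch_sizes.append(len(X_batch))
    seen_labels.extend(y_batch.tolist())

print(batch_sizes)            # 10 instances split into 10 // 3 = 3 batches
print(sorted(seen_labels))    # every instance appears exactly once per pass
```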

In [24]:
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        # iterate over shuffled mini-batches that together cover the training set once per epoch
        for X_batch, y_batch in shuffle_batch(X_train, y_train, batch_size):
            sess.run(train_out, feed_dict = {X: X_batch, y: y_batch})
        acc_batch = accuracy.eval(feed_dict = {X: X_batch, y: y_batch})    # accuracy on the last mini-batch of the epoch
        acc_test = accuracy.eval(feed_dict = {X: X_valid, y: y_valid})     # measured on the validation set, despite the "Test" label
        print(epoch, "Batch accuracy:", acc_batch, "Test accuracy:", acc_test)

    save_path = saver.save(sess, "./my_model_final.ckpt")     # model parameters are saved to disk.

0 Batch accuracy: 0.88 Test accuracy: 0.9126
1 Batch accuracy: 0.98 Test accuracy: 0.9322
2 Batch accuracy: 0.92 Test accuracy: 0.9418
3 Batch accuracy: 0.92 Test accuracy: 0.9482
4 Batch accuracy: 0.94 Test accuracy: 0.9518
5 Batch accuracy: 0.98 Test accuracy: 0.9572
6 Batch accuracy: 0.98 Test accuracy: 0.9584
7 Batch accuracy: 1.0 Test accuracy: 0.9626
8 Batch accuracy: 0.98 Test accuracy: 0.9642
9 Batch accuracy: 0.98 Test accuracy: 0.9658
10 Batch accuracy: 1.0 Test accuracy: 0.9686
11 Batch accuracy: 0.98 Test accuracy: 0.9682
12 Batch accuracy: 0.98 Test accuracy: 0.9686
13 Batch accuracy: 0.98 Test accuracy: 0.971
14 Batch accuracy: 0.98 Test accuracy: 0.9722
15 Batch accuracy: 1.0 Test accuracy: 0.9722
16 Batch accuracy: 1.0 Test accuracy: 0.9742
17 Batch accuracy: 0.98 Test accuracy: 0.974
18 Batch accuracy: 0.94 Test accuracy: 0.9762
19 Batch accuracy: 0.98 Test accuracy: 0.975
20 Batch accuracy: 0.98 Test accuracy: 0.9764
21 Batch accuracy: 1.0 Test accuracy: 0.9758
22 Bat

181 Batch accuracy: 1.0 Test accuracy: 0.9798
182 Batch accuracy: 1.0 Test accuracy: 0.98
183 Batch accuracy: 1.0 Test accuracy: 0.9798
184 Batch accuracy: 1.0 Test accuracy: 0.9796
185 Batch accuracy: 1.0 Test accuracy: 0.9798
186 Batch accuracy: 1.0 Test accuracy: 0.9794
187 Batch accuracy: 1.0 Test accuracy: 0.98
188 Batch accuracy: 1.0 Test accuracy: 0.9798
189 Batch accuracy: 1.0 Test accuracy: 0.9794
190 Batch accuracy: 1.0 Test accuracy: 0.9798
191 Batch accuracy: 1.0 Test accuracy: 0.9796
192 Batch accuracy: 1.0 Test accuracy: 0.98
193 Batch accuracy: 1.0 Test accuracy: 0.9796
194 Batch accuracy: 1.0 Test accuracy: 0.9798
195 Batch accuracy: 1.0 Test accuracy: 0.9798
196 Batch accuracy: 1.0 Test accuracy: 0.98
197 Batch accuracy: 1.0 Test accuracy: 0.9798
198 Batch accuracy: 1.0 Test accuracy: 0.9798
199 Batch accuracy: 1.0 Test accuracy: 0.9798
200 Batch accuracy: 1.0 Test accuracy: 0.98
201 Batch accuracy: 1.0 Test accuracy: 0.9796
202 Batch accuracy: 1.0 Test accuracy: 0.98


363 Batch accuracy: 1.0 Test accuracy: 0.98
364 Batch accuracy: 1.0 Test accuracy: 0.9802
365 Batch accuracy: 1.0 Test accuracy: 0.9802
366 Batch accuracy: 1.0 Test accuracy: 0.9802
367 Batch accuracy: 1.0 Test accuracy: 0.98
368 Batch accuracy: 1.0 Test accuracy: 0.98
369 Batch accuracy: 1.0 Test accuracy: 0.98
370 Batch accuracy: 1.0 Test accuracy: 0.98
371 Batch accuracy: 1.0 Test accuracy: 0.98
372 Batch accuracy: 1.0 Test accuracy: 0.98
373 Batch accuracy: 1.0 Test accuracy: 0.9798
374 Batch accuracy: 1.0 Test accuracy: 0.98
375 Batch accuracy: 1.0 Test accuracy: 0.9804
376 Batch accuracy: 1.0 Test accuracy: 0.9804
377 Batch accuracy: 1.0 Test accuracy: 0.9802
378 Batch accuracy: 1.0 Test accuracy: 0.98
379 Batch accuracy: 1.0 Test accuracy: 0.98
380 Batch accuracy: 1.0 Test accuracy: 0.98
381 Batch accuracy: 1.0 Test accuracy: 0.98
382 Batch accuracy: 1.0 Test accuracy: 0.98
383 Batch accuracy: 1.0 Test accuracy: 0.9798
384 Batch accuracy: 1.0 Test accuracy: 0.98
385 Batch accura

Now, we will use this trained neural network model to make predictions.

In [25]:
with tf.Session() as sess:
    saver.restore(sess, "./my_model_final.ckpt")    # loads the model parameters from the disk
    X_new_scaled = X_test[:30]                      # takes some new images to be classified (features already scaled from 0 to 1)
    Z = logits.eval(feed_dict = {X: X_new_scaled})  # evaluates the logits node
    y_pred = np.argmax(Z, axis = 1)         # class prediction by simply picking the class that has the highest logit value

INFO:tensorflow:Restoring parameters from ./my_model_final.ckpt


In [26]:
print("Actual classes:   ", y_test[:30])
print("Predicted classes:", y_pred)

Actual classes:    [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1]
Predicted classes: [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1]
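To turn this side-by-side comparison into a number, the fraction of matching entries can be computed directly. The sketch below copies the two arrays printed above; 30 images are far too few for a reliable accuracy estimate, so this only confirms what the printout shows.

```python
import numpy as np

# Arrays copied from the output above
actual = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1,
                   5, 9, 7, 3, 4, 9, 6, 6, 5, 4, 0, 7, 4, 0, 1])
predicted = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1,
                      5, 9, 7, 3, 4, 9, 6, 6, 5, 4, 0, 7, 4, 0, 1])

n_correct = int((actual == predicted).sum())
sample_accuracy = n_correct / len(actual)
print(n_correct, "of", len(actual), "correct, accuracy =", sample_accuracy)
```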
