# Basic Classification Example with TensorFlow

This notebook is a companion to [A Visual and Interactive Guide to the Basics of Neural Networks](https://jalammar.github.io/visual-interactive-guide-basics-neural-networks/).

This is an example of how to do classification on a simple dataset in TensorFlow. We're building a model to help a friend choose a house to buy. She has given us the table below of houses and whether she likes each one. Our model takes a house's area and number of bathrooms as input and outputs a prediction of whether she would like the house.

| Area (sq ft) (x1) | Bathrooms (x2) | Label (y) |
| --- | --- | --- |
| 2,104 | 3 | Good |
| 1,600 | 3 | Good |
| 2,400 | 3 | Good |
| 1,416 | 2 | Bad |
| 3,000 | 4 | Bad |
| 1,985 | 4 | Good |
| 1,534 | 3 | Bad |
| 1,427 | 3 | Good |
| 1,380 | 3 | Good |
| 1,494 | 3 | Good |

We'll start by loading our favorite libraries:

In [2]:
%matplotlib inline               
import pandas as pd              # A beautiful library to help us work with data as tables
import numpy as np               # So we can use number matrices. Both pandas and TensorFlow need it. 
import matplotlib.pyplot as plt  # Visualize the things
import tensorflow as tf          # Fire from the gods


We'll then load the feature data from a CSV file. Pandas is an incredible library that gives us great flexibility in dealing with table-like data. We load tables (CSV files or Excel sheets) into a "data frame" and process them however we like. You can think of it as a programmatic way to do a lot of the things you previously did with Excel.

In [28]:
dataframe = pd.read_csv("input.csv") # Let's have Pandas load our dataset as a dataframe
dataframe = dataframe.drop(["Date", "Floors"], axis=1) # Remove columns we don't care about
dataframe = dataframe[0:10] # We'll only use the first 10 rows of the dataset in this example
dataframe = dataframe.replace({',': ''}, regex=True) # Strip commas (thousands separators) from the values
dataframe = dataframe.apply(pd.to_numeric) # Convert every column from text to numbers
dataframe # Let's have the notebook show us how the dataframe looks now

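|  | Calories Burned | Steps | Distance | Minutes Sedentary | Minutes Lightly Active | Minutes Fairly Active | Minutes Very Active | Activity Calories |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2615 | 5527 | 4.41 | 739 | 153 | 42 | 31 | 1134 |
| 1 | 2372 | 5175 | 3.61 | 806 | 118 | 21 | 27 | 816 |
| 2 | 3045 | 9914 | 7.52 | 577 | 238 | 43 | 41 | 1659 |
| 3 | 2245 | 2245 | 1.56 | 897 | 174 | 0 | 0 | 709 |
| 4 | 2940 | 6629 | 5.23 | 622 | 171 | 77 | 46 | 1532 |
| 5 | 3114 | 7061 | 4.92 | 528 | 110 | 116 | 68 | 1710 |
| 6 | 2642 | 6293 | 4.39 | 598 | 183 | 25 | 31 | 1132 |
| 7 | 2229 | 1941 | 1.35 | 697 | 132 | 21 | 5 | 670 |
| 8 | 2481 | 4583 | 3.77 | 803 | 86 | 44 | 38 | 939 |
| 9 | 2313 | 1804 | 1.26 | 762 | 119 | 45 | 14 | 856 |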
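
The `replace` and `pd.to_numeric` calls in the cell above are there because a number written with a thousands separator is read in as text rather than as a number. Here is a minimal plain-Python sketch of the same cleanup. The `raw_cells` values and the `to_number` helper are made up for illustration and are not part of the notebook or of pandas.

```python
# Hypothetical raw cells, as they might look in a CSV that uses thousands separators.
raw_cells = ["2,615", "5,527", "4.41", "739"]

def to_number(cell):
    """Strip thousands separators, then parse the cell as a float."""
    return float(cell.replace(",", ""))

cleaned = [to_number(cell) for cell in raw_cells]
print(cleaned)  # [2615.0, 5527.0, 4.41, 739.0]
```

Pandas does the same thing for every cell of every column at once, which is why the notebook needs only two lines for it.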


The dataframe now only has the features. Let's introduce the labels.

In [29]:
labels = pd.read_csv("output.csv") # Let's have Pandas load the labels as a dataframe
labels = labels.drop(["Date"], axis=1) # Remove columns we don't care about
labels = labels[0:10] # We'll only use the first 10 rows of the dataset in this example
labels # Let's have the notebook show us how the dataframe looks now

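|  | Minutes Asleep | Minutes Awake | Number of Awakenings | Time in Bed |
| --- | --- | --- | --- | --- |
| 0 | 397 | 20 | 2 | 417 |
| 1 | 482 | 20 | 1 | 502 |
| 2 | 508 | 33 | 2 | 541 |
| 3 | 353 | 14 | 1 | 369 |
| 4 | 428 | 11 | 0 | 439 |
| 5 | 503 | 36 | 3 | 539 |
| 6 | 502 | 42 | 0 | 544 |
| 7 | 506 | 33 | 1 | 539 |
| 8 | 484 | 31 | 0 | 516 |
| 9 | 485 | 15 | 0 | 500 |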


Now that we have all our data in dataframes, we'll need to shape it into matrices to feed to TensorFlow:

In [30]:
inputX = dataframe.values  # as_matrix() was removed in pandas 1.0; .values returns the same NumPy array
inputY = labels.values

So now our input matrix looks like this:

In [31]:
inputX

array([[  2.61500000e+03,   5.52700000e+03,   4.41000000e+00,
          7.39000000e+02,   1.53000000e+02,   4.20000000e+01,
          3.10000000e+01,   1.13400000e+03],
       [  2.37200000e+03,   5.17500000e+03,   3.61000000e+00,
          8.06000000e+02,   1.18000000e+02,   2.10000000e+01,
          2.70000000e+01,   8.16000000e+02],
       [  3.04500000e+03,   9.91400000e+03,   7.52000000e+00,
          5.77000000e+02,   2.38000000e+02,   4.30000000e+01,
          4.10000000e+01,   1.65900000e+03],
       [  2.24500000e+03,   2.24500000e+03,   1.56000000e+00,
          8.97000000e+02,   1.74000000e+02,   0.00000000e+00,
          0.00000000e+00,   7.09000000e+02],
       [  2.94000000e+03,   6.62900000e+03,   5.23000000e+00,
          6.22000000e+02,   1.71000000e+02,   7.70000000e+01,
          4.60000000e+01,   1.53200000e+03],
       [  3.11400000e+03,   7.06100000e+03,   4.92000000e+00,
          5.28000000e+02,   1.10000000e+02,   1.16000000e+02,
          6.80000000e+01,   1.7

And our labels matrix looks like this:

In [32]:
inputY

array([[397,  20,   2, 417],
       [482,  20,   1, 502],
       [508,  33,   2, 541],
       [353,  14,   1, 369],
       [428,  11,   0, 439],
       [503,  36,   3, 539],
       [502,  42,   0, 544],
       [506,  33,   1, 539],
       [484,  31,   0, 516],
       [485,  15,   0, 500]])

Let's prepare some parameters for the training process:

In [33]:
# Parameters
learning_rate = 0.000001
training_epochs = 2000
display_step = 50
n_samples = inputY.size  # Note: .size counts every element (10 rows x 4 columns = 40), not the number of rows

And now to define the TensorFlow operations. Notice that this is a declaration step where we tell TensorFlow how the prediction is calculated. If we execute it, no calculation would be made. It would just acknowledge that it now knows how to do the operation.

In [55]:
x = tf.placeholder(tf.float32, [None, 8])   # Okay TensorFlow, we'll feed you an array of examples. Each example will
                                            # be an array of eight float values (one per feature column).
                                            # "None" means we can feed you any number of examples
                                            # Notice we haven't fed it the values yet

W = tf.Variable(tf.zeros([8, 4]))           # Maintain an 8 x 4 float matrix for the weights that we'll keep updating
                                            # through the training process (make them all zero to begin with)

b = tf.Variable(tf.zeros([4]))              # Also maintain four bias values

y_values = tf.add(tf.matmul(x, W), b)       # The first step in calculating the prediction would be to multiply
                                            # the inputs matrix by the weights matrix then add the biases
    
y = tf.nn.softmax(y_values)                 # Then we use softmax as an "activation function" that translates the
                                            # numbers outputted by the previous layer into probability form
    
y_ = tf.placeholder(tf.float32, [None,4])   # For training purposes, we'll also feed you a matrix of labels
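
To make the declared operations concrete, here is a rough plain-Python sketch of what `y = softmax(x · W + b)` evaluates to for a single example before any training. The `softmax` and `predict` helpers are illustrative stand-ins written for this sketch, not TensorFlow functions. The example row is the first row of `inputX`, and `W` and `b` start at zero as in the cell above.

```python
import math

def softmax(row):
    """Turn a list of scores into probabilities that sum to 1."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    """Compute softmax(x . W + b) for a single example x."""
    scores = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(scores)

# First row of inputX; weights and biases are all zero before training.
x = [2615.0, 5527.0, 4.41, 739.0, 153.0, 42.0, 31.0, 1134.0]
W = [[0.0] * 4 for _ in range(8)]
b = [0.0] * 4

initial_prediction = predict(x, W, b)
print(initial_prediction)  # [0.25, 0.25, 0.25, 0.25]
```

Each output row of a softmax sums to 1, so every predicted value lies between 0 and 1. Keep that in mind when comparing predictions with labels measured in hundreds of minutes.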

Let's specify our cost function and use gradient descent to minimize it:

In [56]:

# Cost function: Mean squared error
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
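
For a feel of what the cost expression computes, here is a small plain-Python sketch of the same formula: the sum of squared differences between labels and predictions, divided by `2 * n_samples`. The `half_mse` name and the toy values are made up for illustration and are not taken from the dataset.

```python
def half_mse(labels, predictions, n_samples):
    """Sum of squared differences divided by 2 * n_samples, as in the cost above."""
    total = 0.0
    for label_row, pred_row in zip(labels, predictions):
        for label, pred in zip(label_row, pred_row):
            total += (label - pred) ** 2
    return total / (2 * n_samples)

# Hypothetical toy values: two examples with two outputs each.
toy_labels = [[1.0, 0.0], [0.0, 1.0]]
toy_predictions = [[0.5, 0.5], [0.5, 0.5]]

# n_samples=4 mirrors the notebook's use of .size, which counts every element.
toy_cost = half_mse(toy_labels, toy_predictions, n_samples=4)
print(toy_cost)  # 0.125
```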


In [57]:
# Initialize variables and the TensorFlow session
init = tf.initialize_all_variables()  # Deprecated; see the warning printed below
sess = tf.Session()
sess.run(init)

Instructions for updating:
Use `tf.global_variables_initializer` instead.


*Drum roll*

And now for the actual training:

In [58]:
for i in range(training_epochs):
    sess.run(optimizer, feed_dict={x: inputX, y_: inputY}) # Take a gradient descent step using our inputs and labels

    # That's all! The rest of the cell just outputs debug messages.
    # Display logs every display_step steps
    if i % display_step == 0:
        cc = sess.run(cost, feed_dict={x: inputX, y_: inputY})
        print("Training step: %04d cost= %.9f" % (i, cc))

print("Optimization Finished!")
training_cost = sess.run(cost, feed_dict={x: inputX, y_: inputY})
print("Training cost= %s W= %s b= %s\n" % (training_cost, sess.run(W), sess.run(b)))


Training step: 0000 cost= 4141163.250000000
Training step: 0050 cost= nan
Training step: 0100 cost= nan
Training step: 0150 cost= nan
Training step: 0200 cost= nan
Training step: 0250 cost= nan
Training step: 0300 cost= nan
Training step: 0350 cost= nan
Training step: 0400 cost= nan
Training step: 0450 cost= nan
Training step: 0500 cost= nan
Training step: 0550 cost= nan
Training step: 0600 cost= nan
Training step: 0650 cost= nan
Training step: 0700 cost= nan
Training step: 0750 cost= nan
Training step: 0800 cost= nan
Training step: 0850 cost= nan
Training step: 0900 cost= nan
Training step: 0950 cost= nan
Training step: 1000 cost= nan
Training step: 1050 cost= nan
Training step: 1100 cost= nan
Training step: 1150 cost= nan
Training step: 1200 cost= nan
Training step: 1250 cost= nan
Training step: 1300 cost= nan
Training step: 1350 cost= nan
Training step: 1400 cost= nan
Training step: 1450 cost= nan
Training step: 1500 cost= nan
Training step: 1550 cost= nan
Training step: 1600 cost= 
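
A cost that turns into `nan` usually means the numbers inside the optimizer blew up somewhere along the way. The toy sketch below shows one way that can happen: gradient descent on `f(w) = w**2` with a step size that is too large for the problem. It is only an illustration of the general mechanism, not a diagnosis of this notebook, and `gradient_descent` is a helper written for this sketch.

```python
import math

def gradient_descent(learning_rate, steps, w=1.0):
    """Minimize f(w) = w**2 by repeatedly stepping against the gradient 2*w."""
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

converged = gradient_descent(learning_rate=0.1, steps=2000)
diverged = gradient_descent(learning_rate=1.5, steps=2000)

print(abs(converged) < 1e-6)  # True: small steps shrink w toward the minimum at 0
print(math.isnan(diverged))   # True: w doubles in size each step, overflows to inf, then inf - inf gives nan
```

Whether a step size is "too large" depends on the scale of the inputs and targets, so a tiny learning rate such as 0.000001 does not by itself rule this out.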

Now the training is done. TensorFlow is now holding on to our trained model (which is basically just the defined operations, plus the variables W and b that resulted from the training process).

Is the final cost good or bad? Judging by the log above, bad: the cost is 4141163.25 at step 0 and `nan` from step 50 onward, so the optimization broke down numerically instead of converging. Let's use the model on our dataset to see how it does, though:

In [59]:
sess.run(y, feed_dict={x: inputX })

array([[ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,  nan]], dtype=float32)

So every prediction is `nan`, which means the model gives no usable output for any of the ten rows. Not terribly impressive. A model with a hidden layer should do better, I guess.
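
The outputs printed above are all `nan`. One common thing to try in that situation, which this notebook does not do, is to rescale each feature column to zero mean and unit variance before training. The sketch below does this for the "Steps" column from the feature table. The `standardize` helper is written for this sketch, and whether rescaling alone would fix this particular model is untested here.

```python
def standardize(column):
    """Rescale a list of numbers to zero mean and unit (population) standard deviation."""
    mean = sum(column) / len(column)
    variance = sum((v - mean) ** 2 for v in column) / len(column)
    std = variance ** 0.5  # assumes the column is not constant, otherwise std is 0
    return [(v - mean) / std for v in column]

# The "Steps" column from the feature table above.
steps = [5527, 5175, 9914, 2245, 6629, 7061, 6293, 1941, 4583, 1804]
scaled_steps = standardize(steps)

scaled_mean = sum(scaled_steps) / len(scaled_steps)
scaled_variance = sum((v - scaled_mean) ** 2 for v in scaled_steps) / len(scaled_steps)
print(abs(scaled_mean) < 1e-9, abs(scaled_variance - 1.0) < 1e-9)  # True True
```

After rescaling, every feature lives on a comparable scale, which tends to make a single learning rate behave more predictably across columns.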

Btw, this is how I calculated the softmax values in the post:

In [14]:
sess.run(tf.nn.softmax([1., 2.]))

array([ 0.26894143,  0.7310586 ], dtype=float32)
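
For reference, the same two numbers can be worked out by hand from the standard softmax definition:

```latex
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}},
\qquad
\mathrm{softmax}([1, 2])
  = \left[ \frac{e^{1}}{e^{1} + e^{2}},\; \frac{e^{2}}{e^{1} + e^{2}} \right]
  = \left[ \frac{1}{1 + e},\; \frac{e}{1 + e} \right]
  \approx [0.2689,\; 0.7311]
```

The two values sum to 1, which is what lets us read softmax outputs as probabilities.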