# Basic Classification Example with TensorFlow

This notebook is a companion to [A Visual and Interactive Guide to the Basics of Neural Networks](https://jalammar.github.io/visual-interactive-guide-basics-neural-networks/).

This is an example of how to do classification on a simple dataset in TensorFlow. Basically, we're building a model to help a friend choose a house to buy. She has given us the table below of houses and whether she likes them or not. We're to build a model that takes a house area and number of bathrooms as input, and outputs a prediction of whether she would like the house or not.

| Area (sq ft) (x1) | Bathrooms (x2) | Label (y) |
| --- | --- | --- |
| 2,104 | 3 | Good |
| 1,600 | 3 | Good |
| 2,400 | 3 | Good |
| 1,416 | 2 | Bad |
| 3,000 | 4 | Bad |
| 1,985 | 4 | Good |
| 1,534 | 3 | Bad |
| 1,427 | 3 | Good |
| 1,380 | 3 | Good |
| 1,494 | 3 | Good |

We'll start by loading our favorite libraries.

In [1]:
%matplotlib inline               
import pandas as pd              # A beautiful library to help us work with data as tables
import numpy as np               # So we can use number matrices. Both pandas and TensorFlow need it. 
import matplotlib.pyplot as plt  # Visualize the things
import tensorflow as tf          # Fire from the gods


We'll then load the house data CSV. Pandas is an incredible library that gives us great flexibility in dealing with table-like data. We load tables (or CSV files, or Excel sheets) into a "data frame" and process it however we like. You can think of it as a programmatic way to do a lot of the things you previously did with Excel.

In [2]:
dataframe = pd.read_csv("data.csv") # Let's have Pandas load our dataset as a dataframe
dataframe = dataframe.drop(["index", "price", "sq_price"], axis=1) # Remove columns we don't care about
dataframe = dataframe[0:30] # We'll only use the first 30 rows of the dataset in this example
dataframe.head() # Let's have the notebook show us how the dataframe looks now

Unnamed: 0,area,bathrooms
0,2104.0,3.0
1,1600.0,3.0
2,2400.0,3.0
3,1416.0,2.0
4,3000.0,4.0
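
Since `data.csv` isn't reproduced here, the following is a minimal sketch of the same drop-and-slice steps on a tiny hand-built frame. The `area` and `bathrooms` values come from the table at the top; the `index`, `price`, and `sq_price` values are made-up placeholders, not the real dataset.

```python
import pandas as pd

# Placeholder frame: only "area" and "bathrooms" are taken from the table above.
# The "index", "price", and "sq_price" values are invented for illustration.
raw = pd.DataFrame({
    "index": [1, 2, 3],
    "area": [2104.0, 1600.0, 2400.0],
    "bathrooms": [3.0, 3.0, 3.0],
    "price": [1.0, 2.0, 3.0],
    "sq_price": [0.1, 0.2, 0.3],
})

features = raw.drop(["index", "price", "sq_price"], axis=1)  # keep only the feature columns
features = features[0:2]                                     # positional slice: first two rows
print(features)
```

Slicing a dataframe with `[0:n]` selects rows by position, which is why the notebook's `dataframe[0:30]` keeps 30 rows.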


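The dataframe now only has the features. Let's introduce the labels. In this cell they are drawn at random with `np.random.randint`, so rerunning it will produce different values.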

In [3]:
dataframe.loc[:, ("y1")] = np.random.randint(0,2,30)
dataframe.head()

Unnamed: 0,area,bathrooms,y1
0,2104.0,3.0,1
1,1600.0,3.0,1
2,2400.0,3.0,0
3,1416.0,2.0,1
4,3000.0,4.0,1


In [4]:
# y1 stands in for our friend's list of which houses she liked: 1 = good, 0 = bad
dataframe["y2"] = (dataframe["y1"] == 0).astype(int)      # y2 is the negation of y1, stored as 1/0
# y2 means we don't like a house
# (Yes, it's redundant. But learning to do it this way opens the door to multiclass classification)
dataframe.head() # How is our dataframe looking now?

Unnamed: 0,area,bathrooms,y1,y2
0,2104.0,3.0,1,0
1,1600.0,3.0,1,0
2,2400.0,3.0,0,1
3,1416.0,2.0,1,0
4,3000.0,4.0,1,0
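
The `y1`/`y2` pair is a one-hot encoding of the label. As a rough sketch in plain NumPy (the variable names here are illustrative, not part of the notebook), the same idea looks like this, along with one common way to extend it to more than two classes:

```python
import numpy as np

y1 = np.array([1, 1, 0, 1, 1])          # the first five y1 values shown above
y2 = (y1 == 0).astype(int)              # negation of y1 as 1/0
labels = np.stack([y1, y2], axis=1)     # one row per house, one column per class
print(labels)

# Hypothetical three-class case: row i of the identity matrix is the one-hot vector for class i
class_ids = np.array([0, 2, 1])
three_class = np.eye(3, dtype=int)[class_ids]
print(three_class)
```

Each row contains exactly one 1, which is what lets softmax outputs be compared against it column by column.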


Now that we have all our data in the dataframe, we'll need to shape it into matrices to feed to TensorFlow.

In [8]:
inputX = dataframe.loc[:, ['area', 'bathrooms']].values
inputY = dataframe.loc[:, ["y1", "y2"]].values

So now our input matrix looks like this:

In [9]:
inputX

array([[2.104e+03, 3.000e+00],
       [1.600e+03, 3.000e+00],
       [2.400e+03, 3.000e+00],
       [1.416e+03, 2.000e+00],
       [3.000e+03, 4.000e+00],
       [1.985e+03, 4.000e+00],
       [1.534e+03, 3.000e+00],
       [1.427e+03, 3.000e+00],
       [1.380e+03, 3.000e+00],
       [1.494e+03, 3.000e+00],
       [1.940e+03, 4.000e+00],
       [2.000e+03, 3.000e+00],
       [1.890e+03, 3.000e+00],
       [4.478e+03, 5.000e+00],
       [1.268e+03, 3.000e+00],
       [2.300e+03, 4.000e+00],
       [1.320e+03, 2.000e+00],
       [1.236e+03, 3.000e+00],
       [2.609e+03, 4.000e+00],
       [3.031e+03, 4.000e+00],
       [1.767e+03, 3.000e+00],
       [1.888e+03, 2.000e+00],
       [1.604e+03, 3.000e+00],
       [1.962e+03, 4.000e+00],
       [3.890e+03, 3.000e+00],
       [1.100e+03, 3.000e+00],
       [1.458e+03, 3.000e+00],
       [2.526e+03, 3.000e+00],
       [2.200e+03, 3.000e+00],
       [2.637e+03, 3.000e+00]])

And our labels matrix looks like this:

In [10]:
inputY

array([[1, 0],
       [1, 0],
       [0, 1],
       [1, 0],
       [1, 0],
       [0, 1],
       [0, 1],
       [1, 0],
       [0, 1],
       [0, 1],
       [0, 1],
       [1, 0],
       [0, 1],
       [1, 0],
       [1, 0],
       [1, 0],
       [0, 1],
       [1, 0],
       [1, 0],
       [0, 1],
       [1, 0],
       [1, 0],
       [1, 0],
       [0, 1],
       [0, 1],
       [1, 0],
       [1, 0],
       [1, 0],
       [0, 1],
       [0, 1]])

Let's prepare some parameters for the training process

In [43]:
# Parameters
learning_rate = 0.000001
training_epochs = 1000
display_step = 10
n_samples = inputY.size          # Note: .size counts every element (30 rows x 2 columns = 60), not just the rows

And now to define the TensorFlow operations. Notice that this is a declaration step where we tell TensorFlow how the prediction is calculated. Executing the cell performs no calculation; it only records the operations so that TensorFlow knows how to run them later inside a session.

In [30]:
# Use the TF1 compatibility API so we can declare a graph first and run it in a session later
import tensorflow.compat.v1 as v1
v1.disable_eager_execution()

In [31]:
x = v1.placeholder(v1.float32, [None, 2])   # Okay TensorFlow, we'll feed you an array of examples. Each example will
                                            # be an array of two float values (area, and number of bathrooms).
                                            # "None" means we can feed you any number of examples
                                            # Notice we haven't fed it the values yet
            
W = v1.Variable(tf.zeros([2, 2]))           # Maintain a 2 x 2 float matrix for the weights that we'll keep updating 
                                            # through the training process (make them all zero to begin with)
    
b = v1.Variable(tf.zeros([2]))              # Also maintain two bias values

y_values = v1.add(tf.matmul(x, W), b)       # The first step in calculating the prediction would be to multiply
                                            # the inputs matrix by the weights matrix then add the biases
    
y = v1.nn.softmax(y_values)                 # Then we use softmax as an "activation function" that translates the
                                            # numbers output by the previous layer into probabilities
    
y_ = v1.placeholder(tf.float32, [None,2])   # For training purposes, we'll also feed you a matrix of labels
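
To make the declared computation concrete, here is a small NumPy sketch of the same forward pass (multiply by the weights, add the biases, apply softmax). It is an illustration only, not how TensorFlow executes the graph; the two input rows are taken from the table at the top.

```python
import numpy as np

x = np.array([[2104.0, 3.0],
              [1600.0, 3.0]])     # two examples: (area, bathrooms)
W = np.zeros((2, 2))              # weights start at zero, as in the cell above
b = np.zeros(2)                   # biases start at zero too

logits = x @ W + b                                        # same role as y_values
e = np.exp(logits - logits.max(axis=1, keepdims=True))    # subtract the row max for numerical stability
probs = e / e.sum(axis=1, keepdims=True)                  # softmax: each row sums to 1
print(probs)
```

With all-zero weights and biases every logit is 0, so the untrained model assigns probability 0.5 to each class for every house.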

Let's specify our cost function and use Gradient Descent

In [33]:

# Cost function: Mean squared error
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
# Gradient descent
optimizer = v1.train.GradientDescentOptimizer(learning_rate).minimize(cost)
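
As a quick sanity check on the cost formula, here is a NumPy sketch that evaluates it for an untrained model, assuming predictions of 0.5 for every class (which is what zero weights produce) and using the first five label rows shown earlier.

```python
import numpy as np

labels = np.array([[1, 0], [1, 0], [0, 1], [1, 0], [1, 0]], dtype=float)
preds = np.full_like(labels, 0.5)    # assumed output of the untrained model

n_samples = labels.size              # counts every element, mirroring inputY.size in the notebook
cost = np.sum((labels - preds) ** 2) / (2 * n_samples)
print(cost)
```

Every squared error is 0.25, so the cost comes out to 0.25 / 2 = 0.125 no matter how many rows there are. That is consistent with the training log, where the first reported cost sits just below 0.125 after a single gradient step.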


In [38]:
# Initialize variables and the TensorFlow session
init = v1.global_variables_initializer()  # replaces the deprecated initialize_all_variables()
sess = v1.Session()
sess.run(init)

*Drum roll*

And now for the actual training

In [44]:
for i in range(training_epochs):  
    sess.run(optimizer, feed_dict={x: inputX, y_: inputY}) # Take a gradient descent step using our inputs and labels

    # That's all! The rest of the cell just prints debug messages.
    # Log the cost every display_step steps
    if i % display_step == 0:
        cc = sess.run(cost, feed_dict={x: inputX, y_: inputY})
        print("Training step:", "%04d" % i, "cost=", "{:.9f}".format(cc))

print("Optimization Finished!")
training_cost = sess.run(cost, feed_dict={x: inputX, y_: inputY})
print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')


Training step: 0000 cost= 0.123911470
Training step: 0010 cost= 0.123911470
Training step: 0020 cost= 0.123911455
Training step: 0030 cost= 0.123911448
Training step: 0040 cost= 0.123911448
Training step: 0050 cost= 0.123911448
Training step: 0060 cost= 0.123911440
Training step: 0070 cost= 0.123911433
Training step: 0080 cost= 0.123911433
Training step: 0090 cost= 0.123911433
Training step: 0100 cost= 0.123911418
Training step: 0110 cost= 0.123911418
Training step: 0120 cost= 0.123911418
Training step: 0130 cost= 0.123911418
Training step: 0140 cost= 0.123911403
Training step: 0150 cost= 0.123911403
Training step: 0160 cost= 0.123911403
Training step: 0170 cost= 0.123911403
Training step: 0180 cost= 0.123911388
Training step: 0190 cost= 0.123911388
Training step: 0200 cost= 0.123911388
Training step: 0210 cost= 0.123911388
Training step: 0220 cost= 0.123911381
Training step: 0230 cost= 0.123911373
Training step: 0240 cost= 0.123911373
Training step: 0250 cost= 0.123911373
Training ste
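
For readers who want to see what a gradient descent step amounts to, below is a hand-written NumPy sketch of the same model and cost, trained on the ten houses and labels from the table at the top. The gradient is derived by hand for this illustration; it is an assumption about the math, not a copy of what `GradientDescentOptimizer` does internally.

```python
import numpy as np

# Ten houses from the table at the top: (area in sq ft, bathrooms)
X = np.array([[2104, 3], [1600, 3], [2400, 3], [1416, 2], [3000, 4],
              [1985, 4], [1534, 3], [1427, 3], [1380, 3], [1494, 3]], dtype=float)
good = np.array([1, 1, 1, 0, 0, 1, 0, 1, 1, 1])        # Good = 1, Bad = 0
Y = np.stack([good, 1 - good], axis=1).astype(float)   # one-hot labels

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cost_fn(W, b):
    y = softmax(X @ W + b)
    return np.sum((Y - y) ** 2) / (2 * Y.size)         # same formula as the notebook's cost

W = np.zeros((2, 2))
b = np.zeros(2)
learning_rate = 1e-6
costs = [cost_fn(W, b)]

for step in range(100):
    y = softmax(X @ W + b)
    g = (y - Y) / Y.size                                 # d(cost)/d(y)
    dz = y * (g - (g * y).sum(axis=1, keepdims=True))    # chain rule back through softmax
    W -= learning_rate * (X.T @ dz)                      # gradient step on the weights
    b -= learning_rate * dz.sum(axis=0)                  # gradient step on the biases
    costs.append(cost_fn(W, b))

print(costs[0], costs[-1])
```

The cost starts at 0.125 and goes down from there. The learning rate has to be tiny because the area values are in the thousands, so even small weight changes move the logits a lot.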

Now the training is done. TensorFlow is now holding on to our trained model (which is basically just the defined operations, plus the variables W and b that resulted from the training process).

Is the resulting cost good or bad? I have no idea. At least the log shows it creeping down from the first cost value of 0.123911470. Let's use the model on our dataset to see how it does, though:

In [45]:
sess.run(y, feed_dict={x: inputX })

array([[0.54504037, 0.4549596 ],
       [0.53430134, 0.4656987 ],
       [0.5513286 , 0.44867137],
       [0.53035897, 0.469641  ],
       [0.5640359 , 0.4359641 ],
       [0.5425214 , 0.4574786 ],
       [0.53289247, 0.46710756],
       [0.5306073 , 0.46939278],
       [0.52960306, 0.47039694],
       [0.53203833, 0.46796167],
       [0.5415633 , 0.45843676],
       [0.5428275 , 0.4571725 ],
       [0.5404851 , 0.45951492],
       [0.59493035, 0.4050697 ],
       [0.52720916, 0.47279087],
       [0.54921913, 0.45078084],
       [0.5283075 , 0.4716925 ],
       [0.52652496, 0.47347507],
       [0.555772  , 0.44422793],
       [0.56468964, 0.43531036],
       [0.53786373, 0.46213627],
       [0.54042923, 0.45957077],
       [0.5343867 , 0.46561328],
       [0.54203176, 0.4579683 ],
       [0.5826943 , 0.41730574],
       [0.52361596, 0.476384  ],
       [0.5312695 , 0.46873057],
       [0.5540005 , 0.44599947],
       [0.5470815 , 0.4529185 ],
       [0.5563518 , 0.44364825]], dtype=float32)

So it's guessing they're all good houses, since the first probability is above 0.5 in every row. Against the labels shown above, that makes it 17 of 30 correct. Not terribly impressive. A model with a hidden layer should do better, I guess.
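
Counting correct guesses by eye gets tedious, so here is a short NumPy sketch of how the comparison can be done, using just the first five prediction rows and the first five label rows printed above.

```python
import numpy as np

# First five rows of the prediction output and of inputY
probs = np.array([[0.54504037, 0.4549596],
                  [0.53430134, 0.4656987],
                  [0.5513286, 0.44867137],
                  [0.53035897, 0.469641],
                  [0.5640359, 0.4359641]])
labels = np.array([[1, 0], [1, 0], [0, 1], [1, 0], [1, 0]])

predicted = probs.argmax(axis=1)     # index of the most probable class per row
actual = labels.argmax(axis=1)       # index of the 1 in each one-hot label
accuracy = (predicted == actual).mean()
print(predicted, actual, accuracy)
```

Class index 0 is the "good" column (`y1`), so a row of all zeros in `predicted` means every house was guessed to be good. On these five rows that gives four correct out of five.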

Btw, this is how I calculated the softmax values in the post:

In [46]:
sess.run(tf.nn.softmax([1., 2.]))

array([0.26894143, 0.7310586 ], dtype=float32)
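
The same numbers can be reproduced without TensorFlow. This is a minimal NumPy sketch of softmax for a single vector; the function name is just for illustration.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))    # subtracting the max does not change the result, only avoids overflow
    return e / e.sum()

result = softmax(np.array([1.0, 2.0]))
print(result)
```

For two inputs this is simply 1 / (1 + e) and e / (1 + e), which matches the TensorFlow output above up to float32 rounding.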