Model we will build: classification. Classification assigns each data point to one of a fixed set of classes, usually by estimating a probability for every class and picking the most likely one.

Importing everything we need.

In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals

import pandas as pd

import tensorflow as tf

We will use the Iris flower dataset. Each row has four measurements (sepal length & width and petal length & width), and we have to predict which of three species the flower belongs to.

In [None]:
COLUMN_NAMES = ['SepalLength','SepalWidth','PetalLength','PetalWidth', 'Species']
SPECIES = ["Setosa", "Versicolor", "Virginica"]

train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")

train = pd.read_csv(train_path, names=COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=COLUMN_NAMES, header=0)

y_train = train.pop('Species')
y_test = test.pop('Species')

Preparing data for our model as before, but the input function is different: we convert the data to a tf.data.Dataset, shuffle and repeat it when training, and split it into batches.

In [None]:
def input_fn(features, labels, training=True, batch_size=256):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle and repeat if you are in training mode.
    if training:
        dataset = dataset.shuffle(1000).repeat()

    return dataset.batch(batch_size)
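
To see what the shuffle, repeat and batch steps do to the data, here is a plain-Python imitation. It is only a sketch, not tf.data itself: `batches` is a hypothetical helper, and the row counts (120 for training, 30 for testing) are assumptions you can check with `len(train)` and `len(test)`.

```python
import itertools
import random

def batches(rows, batch_size, training, seed=0):
    """Plain-Python imitation of shuffle().repeat().batch(); not tf.data itself."""
    rng = random.Random(seed)

    def stream():
        if not training:
            yield from rows        # one pass, original order
            return
        while True:                # repeat(): endless passes over the data
            epoch = list(rows)
            rng.shuffle(epoch)     # shuffle(): new order on every pass
            yield from epoch

    it = stream()
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch

rows = list(range(120))  # stand-in for the training rows (assumed size)

# Training mode: the stream never ends, so every batch is full.
first = next(batches(rows, batch_size=256, training=True))
print(len(first))  # 256

# Evaluation mode: one pass, so a small test set fits in a single short batch.
eval_batches = list(batches(rows[:30], batch_size=256, training=False))
print([len(b) for b in eval_batches])  # [30]
```

Because the batch size is larger than the dataset, a single training batch already contains every row more than once.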

Feature columns as before.

In [None]:
feature_columns = []

for feature_name in train.keys():
    feature_columns.append(tf.feature_column.numeric_column(feature_name))

Let's build the model. We will use a deep neural network classifier because a linear classifier only works well when the classes can be separated by a linear function of the features, and that may not hold for our data.

In [None]:
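classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    # Two hidden layers with 30 and 10 nodes respectively.
    hidden_units=[30, 10],
    # The model must choose between 3 classes (species).
    n_classes=3
)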
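
To get a feel for how big this network is, here is a rough sketch that counts its trainable parameters. It assumes plain fully connected layers with one bias per node, 4 input features and 3 output classes. `count_dense_params` is a hypothetical helper, not part of TensorFlow.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per node
    return total

# 4 input features -> hidden layers of 30 and 10 nodes -> 3 classes
layer_sizes = [4, 30, 10, 3]
print(count_dense_params(layer_sizes))  # 493
```

So the model is tiny, which is why it trains quickly even on a CPU.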

Training. The lambda lets us pass input_fn with arguments without the nested function we wrote in LinearRegressionML; it does the same thing in 1 line.

Steps are not epochs. One step trains on one batch (256 rows here), and because the training dataset repeats, we keep cycling through the data until 5000 batches have been processed.

In [None]:
classifier.train(
    input_fn=lambda: input_fn(train, y_train), steps=5000
)
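
A quick bit of arithmetic shows how the step count above relates to passes over the data. The training-set size of 120 rows is an assumption here (check it with `len(train)`); the other two numbers come from the code above.

```python
batch_size = 256   # default in input_fn
steps = 5000       # value passed to classifier.train
n_rows = 120       # assumed training-set size; check with len(train)

rows_seen = steps * batch_size   # rows processed over the whole training run
passes = rows_seen / n_rows      # roughly how many times each row is seen

print(rows_seen)       # 1280000
print(round(passes))   # 10667
```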

Time to evaluate our model.

In [None]:
result = classifier.evaluate(input_fn=lambda: input_fn(test, y_test, training=False))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**result))
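
The accuracy value is just the fraction of test flowers whose predicted species matches the true one. A minimal sketch with made-up labels, which also shows how `format(**result)` pulls the `accuracy` key out of the dictionary. The `accuracy` function and `fake_result` are hypothetical, used for illustration only.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

true_labels = [0, 1, 2, 1, 0]        # made-up species ids
predicted_labels = [0, 1, 1, 1, 0]   # one mistake out of five

# evaluate() returns a dict of metrics; this stand-in only has 'accuracy'.
fake_result = {"accuracy": accuracy(true_labels, predicted_labels)}
line = 'Test set accuracy: {accuracy:0.3f}'.format(**fake_result)
print(line)  # Test set accuracy: 0.800
```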

Little script for predicting 1 flower from user input. It asks for the four measurements, then prints the most likely species and its probability.

In [None]:
def predict_input_fn(features, batch_size=256):
    # Convert the inputs to a Dataset without labels.
    # Named differently from the training input_fn so it does not overwrite it.
    return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)

features = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
predict = {}

print("Please type numeric values as prompted.")
for feature in features:
    # Keep asking until the input parses as a number (decimals like 5.1 are allowed).
    while True:
        val = input(feature + ": ")
        try:
            predict[feature] = [float(val)]
            break
        except ValueError:
            print("Not a number, try again.")

predictions = classifier.predict(input_fn=lambda: predict_input_fn(predict))
for pred_dict in predictions:
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]

    print('Prediction is "{}" ({:.1f}%)'.format(
        SPECIES[class_id], 100 * probability))
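
The predicted class is simply the index of the largest entry in the probabilities list. A small sketch with a made-up prediction shows the lookup (`example_pred` and its numbers are invented, not real model output).

```python
species = ["Setosa", "Versicolor", "Virginica"]

# Made-up probabilities for one flower, one entry per species.
example_pred = {"probabilities": [0.02, 0.91, 0.07]}

probs = example_pred["probabilities"]
# Index of the highest probability, i.e. the predicted class id.
class_id = max(range(len(probs)), key=lambda i: probs[i])

message = 'Prediction is "{}" ({:.1f}%)'.format(
    species[class_id], 100 * probs[class_id])
print(message)  # Prediction is "Versicolor" (91.0%)
```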

That's it for classification. A really useful ML technique in my opinion.