
Classification assigns each data point to one of a fixed set of labeled classes.

**Ref**: https://www.tensorflow.org/tutorials/estimator/premade

In [None]:
# `from __future__` imports must be the first statement in the cell
from __future__ import absolute_import, division, print_function, unicode_literals

import pandas as pd
import tensorflow as tf

Here, we use the *Iris flower dataset*, which separates flowers into 3 species:
- Setosa
- Versicolor
- Virginica

Each flower is described by four features:
- Sepal Length
- Sepal Width
- Petal Length
- Petal Width

In [None]:
column_headers = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
# The order must match the integer labels in the CSV: 0 = Setosa, 1 = Versicolor, 2 = Virginica
species = ['Setosa', 'Versicolor', 'Virginica']

We use `tf.keras.utils.get_file` to download the training and test datasets, then load them with pandas.

In [5]:
train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv"
)

test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv"
)

train = pd.read_csv(train_path, names=column_headers, header=0)
test = pd.read_csv(test_path, names=column_headers, header=0)

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv
2194/2194 ━━━━━━━━━━━━━━━━━━━━ 0s 2us/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv
573/573 ━━━━━━━━━━━━━━━━━━━━ 0s 5us/step


In [6]:
train.head()

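|   | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|-------------|------------|-------------|------------|---------|
| 0 | 6.4 | 2.8 | 5.6 | 2.2 | 2 |
| 1 | 5.0 | 2.3 | 3.3 | 1.0 | 1 |
| 2 | 4.9 | 2.5 | 4.5 | 1.7 | 2 |
| 3 | 4.9 | 3.1 | 1.5 | 0.1 | 0 |
| 4 | 5.7 | 3.8 | 1.7 | 0.3 | 0 |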


Here, we can see that the Species column is already encoded as integers, one per species.
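To read the encoded labels, it can help to map them back to names. The sketch below assumes the mapping used in the referenced TensorFlow tutorial (0 = Setosa, 1 = Versicolor, 2 = Virginica). The names `SPECIES_NAMES` and `decode_labels` are illustrative and not part of the notebook.

```python
# Class names in label order (assumed mapping: 0 = Setosa, 1 = Versicolor, 2 = Virginica)
SPECIES_NAMES = ['Setosa', 'Versicolor', 'Virginica']


def decode_labels(encoded):
    """Map integer species codes to their names."""
    return [SPECIES_NAMES[int(code)] for code in encoded]


# Species codes of the five rows shown by train.head()
head_labels = [2, 1, 2, 0, 0]
decoded_head = decode_labels(head_labels)
print(decoded_head)
```

The same helper could be applied to a pandas column, for example `decode_labels(train["Species"])`, as long as the column has not been popped yet.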

In [7]:
train.shape

(120, 5)

In [None]:
# Separate the `Species` label column from the train and test features
train_y = train.pop("Species")
test_y = test.pop("Species")

In [8]:
# Input Function
def input_fn(features,
             labels,
             training=True,
             batch_size=256):

  # Converting inputs to a Dataset
  dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

  # If in training mode, shuffling and repeating the dataset
  if training:
    dataset = dataset.shuffle(1000).repeat()

  return dataset.batch(batch_size)
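To make the shuffle, repeat, and batch steps concrete, here is a rough plain-Python analogue of the input function. It is not `tf.data` itself, only a sketch of the same control flow: in training mode the data is reshuffled every pass and repeated forever, and in evaluation mode it is read once in order. The function `batches`, its `seed` parameter, and the toy data are hypothetical.

```python
import itertools
import random


def batches(rows, labels, training=True, batch_size=256, seed=0):
    """Yield (rows, labels) batches; a plain-Python analogue, not tf.data."""
    pairs = list(zip(rows, labels))
    if not pairs:
        return

    if training:
        # Endless stream: reshuffle at the start of every pass over the data
        def stream():
            rng = random.Random(seed)
            while True:
                epoch = pairs[:]
                rng.shuffle(epoch)
                yield from epoch
        source = stream()
    else:
        # Single ordered pass for evaluation
        source = iter(pairs)

    while True:
        chunk = list(itertools.islice(source, batch_size))
        if not chunk:
            return
        xs, ys = zip(*chunk)
        yield list(xs), list(ys)


# Toy data (made up): five one-feature rows with integer labels
toy_rows = [[float(i)] for i in range(5)]
toy_labels = [0, 1, 2, 0, 1]

# Evaluation mode stops after one pass; the last batch may be smaller
eval_batches = list(batches(toy_rows, toy_labels, training=False, batch_size=2))

# Training mode never ends, so take only a few batches
train_iter = batches(toy_rows, toy_labels, training=True, batch_size=4)
first_three = [next(train_iter) for _ in range(3)]
```

Because the training stream is infinite, whatever consumes it has to decide when to stop, for example by taking a fixed number of steps.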

In [9]:
# Extracting feature columns
feature_columns = []
for key in train.keys():
  feature_columns.append(tf.feature_column.numeric_column(key=key))
feature_columns

Instructions for updating:
Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model.


[NumericColumn(key='SepalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 NumericColumn(key='SepalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 NumericColumn(key='PetalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 NumericColumn(key='PetalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 NumericColumn(key='Species', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

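Here, each feature column is numeric, and its entry shows the key, shape, default value, and dtype. Note that `Species` appears in this list, which means `train` still contained the label column when this cell ran. The cell that pops `Species` must be run first so that only the four measurements are used as features.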

### Classification Model

For classification tasks, there are a variety of estimators/models to choose from in TensorFlow. Some options include:
- DNNClassifier (Deep Neural Network)
- LinearClassifier

In [19]:
# Building a DNN using the Keras Functional API

# Input layer for features
inputs = {
    col.name: tf.keras.layers.Input(name=col.name, shape=(), dtype=tf.float32)
    for col in feature_columns
}

# Convert feature inputs into a dense concatenated vector
concatenated_inputs = tf.keras.layers.Concatenate()(
    [tf.keras.layers.Reshape((1,))(inputs[col.name]) for col in feature_columns]
)

# Create the model (DNN)
x = tf.keras.layers.Dense(30, activation="relu")(concatenated_inputs)
x = tf.keras.layers.Dense(10, activation="relu")(x)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)

classifier = tf.keras.Model(inputs=inputs, outputs=outputs)
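The last `Dense` layer uses a softmax activation, so the model outputs three probabilities that sum to 1, one per species. A minimal standard-library sketch of that step follows. The logits are made-up numbers, and `CLASS_NAMES` assumes the label order 0 = Setosa, 1 = Versicolor, 2 = Virginica.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed label order: 0 = Setosa, 1 = Versicolor, 2 = Virginica
CLASS_NAMES = ['Setosa', 'Versicolor', 'Virginica']

example_logits = [0.5, 2.0, -1.0]  # hypothetical scores for one flower
probs = softmax(example_logits)

# The predicted class is the one with the highest probability
predicted = CLASS_NAMES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```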

In [20]:
classifier.summary()
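The summary output is not shown here, but the trainable parameter count can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, while the `Input`, `Reshape`, and `Concatenate` layers have none. The sketch below assumes the layer sizes 30, 10, and 3 from the model above. The helper names are illustrative.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(n_features, layer_sizes=(30, 10, 3)):
    """Total trainable parameters of a stack of Dense layers."""
    total = 0
    n_in = n_features
    for n_out in layer_sizes:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total


# Four feature inputs: 150 + 310 + 33
params_four = count_params(4)
# Five inputs, which is what happens if `Species` is left among the features: 180 + 310 + 33
params_five = count_params(5)
print(params_four, params_five)
```

Comparing this hand count with the total reported by `classifier.summary()` is a quick way to confirm how many inputs the model actually received.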