##### Flower project / Classification problem

<img src="https://raw.githubusercontent.com/Genereux-akotenou/DataScience_Projects/0c5f17d4b70a9f763b30b16a3a3a9c85d2242d67/Iris_Project/favicon.jpg" alt="flower image" />

**Imports and Setup**

In [1]:
%tensorflow_version 2.x

Colab only includes TensorFlow 2.x; %tensorflow_version has no effect.


In [2]:
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
import pandas as pd
from IPython.display import clear_output

**Dataset**
<p>This specific dataset seprates flowers into 3 different classes of species.</p>
<ul>
  <li>Setosa</li>
  <li>Versicolor</li>
  <li>Virginica</li>
</ul>
<p>The information about each flower is the following</p>
<ul>
  <li>Sepal length</li>
  <li>Sepal width</li>
  <li>Petal length</li>
  <li>Petal width</li>
</ul>

In [5]:
# Some constants
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

In [9]:
train_path = tf.keras.utils.get_file("iris_training.csv", "https://raw.githubusercontent.com/Genereux-akotenou/DataScience_Projects/0c5f17d4b70a9f763b30b16a3a3a9c85d2242d67/Iris_Project/iris_training.csv")
eval_path  = tf.keras.utils.get_file("iris_test.csv", "https://raw.githubusercontent.com/Genereux-akotenou/DataScience_Projects/main/Iris_Project/iris_test.csv")

df_train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
df_eval  = pd.read_csv(eval_path, names=CSV_COLUMN_NAMES, header=0)


**Data visualization**

In [14]:
df_train.head()

Unnamed: 0,SepalLength,SepalWidth,PetalLength,PetalWidth
0,6.4,2.8,5.6,2.2
1,5.0,2.3,3.3,1.0
2,4.9,2.5,4.5,1.7
3,4.9,3.1,1.5,0.1
4,5.7,3.8,1.7,0.3


In [10]:
df_eval.head()

Unnamed: 0,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
0,5.9,3.0,4.2,1.5,1
1,6.9,3.1,5.4,2.1,2
2,5.1,3.3,1.7,0.5,0
3,6.0,3.4,4.5,1.6,1
4,5.5,2.5,4.0,1.3,1


**Dataset processing**

In [11]:
# Let pop the species column off and use that as our label
y_train = df_train.pop('Species')
y_eval  = df_eval.pop('Species')

# Here we create our features column
features_column = []
for key in df_train.keys():
    features_column.append(tf.feature_column.numeric_column(key=key))

# Let get a look
print(features_column)    

[NumericColumn(key='SepalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='SepalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]


**Input function**

In [15]:
def input_fn(features, labels, training=True, batch_size=256):
    # Let convert input to tensforflow Dataset fromat
    ds = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    
    # Shuffle if in training mode
    if training:
        ds = ds.shuffle(2000).repeat()
        
    return ds.batch(batch_size)

**Building the model**
<p>For classification tasks we have a variaty of model we canpick from. Here we'll just choose the DNN as we may not able to find a linear correspondance in our data</p>

In [17]:
# Build a DNN wiith 2 hidden  layers with 30 and 10 hidden nodes each
classifier = tf.estimator.DNNClassifier(
    feature_columns=features_column, 
    hidden_units=[30, 10], 
    n_classes=3)
clear_output()

**Training the model**

In [19]:
classifier.train(
    input_fn=lambda: input_fn(df_train, y_train, training=True), 
    steps=5000)
clear_output()

In [21]:
# eval
eval_result = classifier.evaluate(
    input_fn=lambda: input_fn(df_eval, y_eval, training=False)
)
print('='*30)
print('\nTest set accuracy : {accuracy:0.3f}\n'.format(**eval_result))
print('='*30)


Test set accuracy : 0.967



**Make prediction**

In [30]:
def prediction():
    def input_fn(features, batch_size=256):
        return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)
    
    features = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
    predict = {}
    
    print("Please type numeric value as prompted:")
    for feature in features:
        valid = True
        while valid:
            val = input(feature + ": ")
            if not val.isdigit(): valid = False
            
        predict[feature] = [float(val)]
        
    predictions = classifier.predict(input_fn=lambda: input_fn(predict))
    for prediction in predictions:
        print(prediction)
        class_id = prediction['class_ids'][0]
        probability = prediction['probabilities'][class_id]
        
        print('{}\nPrediction: These features describe "{} at" ({:.1f}%)\n{}'.format('='*115, SPECIES[class_id], 100 * probability, '='*115))

In [31]:
prediction()

Please type numeric value as prompted:
SepalLength: 5.3
SepalWidth: 5
SepalWidth: 5.6
PetalLength: 5.9
PetalWidth: 7.9
{'logits': array([-5.857622  , -3.6272283 ,  0.50430566], dtype=float32), 'probabilities': array([0.00169588, 0.01577763, 0.98252654], dtype=float32), 'class_ids': array([2]), 'classes': array([b'2'], dtype=object), 'all_class_ids': array([0, 1, 2], dtype=int32), 'all_classes': array([b'0', b'1', b'2'], dtype=object)}
Prediction: These features describe "Virginica at" (98.3%)


In [32]:
prediction()

Please type numeric value as prompted:
SepalLength: 6.9
SepalWidth: 3.1
PetalLength: 5.4
PetalWidth: 2.1
{'logits': array([-5.0024223, -1.8255746, -1.0569427], dtype=float32), 'probabilities': array([0.01304254, 0.3126436 , 0.6743139 ], dtype=float32), 'class_ids': array([2]), 'classes': array([b'2'], dtype=object), 'all_class_ids': array([0, 1, 2], dtype=int32), 'all_classes': array([b'0', b'1', b'2'], dtype=object)}
Prediction: These features describe "Virginica at" (67.4%)


In [None]:
#---------------------------------------------
#here we have an test case with espected result
#| SPECIES     | SL   SW   PL   PW  |
#"Sotosa"     => 5.1  3.3  1.7  0.5 
#"Versicolor" => 5.9  3.0  4.2  1.5
#"Virginica"  => 6.9  3.1  5.4  2.1
#---------------------------------------------