# TensorFlow Abstractions

## Loading the data

In [1]:
from sklearn.datasets import load_wine
wine_data = load_wine()
type(wine_data)

sklearn.utils.Bunch

`load_wine()` returns a `Bunch`, a dictionary-like scikit-learn object whose keys can also be read as attributes.

In [2]:
wine_data.keys()

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])
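The keys listed above can be read either dictionary-style or as attributes. The following is a minimal sketch, assuming scikit-learn and NumPy are installed; the names `features_by_key`, `features_by_attr`, `same_object`, and `class_counts` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_wine

wine_data = load_wine()

# Key access and attribute access return the same underlying array.
features_by_key = wine_data['data']
features_by_attr = wine_data.data
same_object = features_by_key is features_by_attr

# Count how many samples belong to each of the three classes.
class_counts = np.bincount(wine_data.target)

print(features_by_key.shape)
print(same_object)
print(class_counts)
```

Counting the targets this way is a quick check on class balance before any modeling.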

In [3]:
print(wine_data['DESCR'])

.. _wine_dataset:

Wine recognition dataset
------------------------

**Data Set Characteristics:**

    :Number of Instances: 178 (50 in each of three classes)
    :Number of Attributes: 13 numeric, predictive attributes and the class
    :Attribute Information:
 		- Alcohol
 		- Malic acid
 		- Ash
		- Alcalinity of ash  
 		- Magnesium
		- Total phenols
 		- Flavanoids
 		- Nonflavanoid phenols
 		- Proanthocyanins
		- Color intensity
 		- Hue
 		- OD280/OD315 of diluted wines
 		- Proline

    - class:
            - class_0
            - class_1
            - class_2
		
    :Summary Statistics:
    
                                   Min   Max   Mean     SD
    Alcohol:                      11.0  14.8    13.0   0.8
    Malic Acid:                   0.74  5.80    2.34  1.12
    Ash:                          1.36  3.23    2.36  0.27
    Alcalinity of Ash:            10.6  30.0    19.5   3.3
    Magnesium:                    70.0 162.0    99.7  14.3
    Total Phenols:                0

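The dataset has three classes, which are reasonably well balanced: 59, 71, and 48 samples. (The "50 in each" figure in the description header does not match these counts.)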

### Splitting the features and targets

In [4]:
feature_data = wine_data['data']
labels = wine_data['target']

## Preprocessing

### Train Test Split

In [5]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(feature_data, labels, test_size=0.3, random_state=0)
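The split sizes follow from simple arithmetic. This is a small sketch, assuming the usual scikit-learn behavior of rounding the test count up when `test_size` is a fraction; the variable names are illustrative.

```python
import math

n_samples = 178   # total rows in the wine dataset
test_size = 0.3   # fraction passed to train_test_split

# For a fractional test_size, the test count is rounded up and the
# remaining rows go to the training set.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

Because the split above is not stratified, class proportions may differ slightly between the two sets. Passing `stratify=labels` to `train_test_split` is one way to keep them aligned, though the notebook does not do this.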

### Scaling the Data

In [6]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
# Fit the scaler on the training data only, then reuse it on the test data
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test = scaler.transform(X_test)
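To make the fit/transform distinction concrete, here is a rough NumPy sketch of min-max scaling on toy data. The values are hypothetical and not taken from the wine dataset, and this is an illustration of the idea rather than scikit-learn's actual implementation.

```python
import numpy as np

# Toy data (hypothetical values, not from the wine dataset).
train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]])
test = np.array([[2.0, 40.0]])

# "Fit": learn the per-column minimum and range from the training set only.
col_min = train.min(axis=0)
col_range = train.max(axis=0) - col_min

# "Transform": apply the same training statistics to both sets.
scaled_train = (train - col_min) / col_range
scaled_test = (test - col_min) / col_range

print(scaled_train)
print(scaled_test)
```

The scaled training data spans exactly [0, 1], while a test value larger than anything seen in training (40 here) maps above 1. That is expected when the scaler is fit on the training set alone.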

## Modeling using the TensorFlow Estimator API

In [7]:
# This notebook targets the TensorFlow 1.x API (tf.train optimizers, estimator.inputs)
import tensorflow as tf
from tensorflow import estimator

In [8]:
X_train.shape

(124, 13)

### Creating the feature columns

In [9]:
feat_cols = [tf.feature_column.numeric_column('x', shape=[13])]
feat_cols

[NumericColumn(key='x', shape=(13,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

### Creating the model

In [10]:
deep_model = estimator.DNNClassifier(hidden_units=[13,13,13], feature_columns=feat_cols, n_classes=3,
                                     optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.01))

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpvb6so3my', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f87942d3128>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
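For a sense of the model's size, the sketch below counts trainable parameters for the architecture configured above. It assumes plain fully connected layers with one bias per unit and no extra layers such as batch normalization; the variable names are illustrative.

```python
n_features = 13
hidden_units = [13, 13, 13]
n_classes = 3

layer_sizes = [n_features] + hidden_units + [n_classes]

# Each dense layer has (inputs * outputs) weights plus one bias per output.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)
```

Under these assumptions the network has a few hundred parameters, which is small compared with typical deep models but not tiny relative to 124 training rows.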


### Input Function

In [11]:
input_fn = estimator.inputs.numpy_input_fn(x={'x':scaled_X_train}, y=y_train,
                                           shuffle=True,
                                           batch_size=10,
                                           num_epochs=5
                                           )

### Training the Deep Model

In [12]:
deep_model.train(input_fn=input_fn, steps=500)

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpvb6so3my/model.ckpt.
INFO:tensorflow:loss = 10.58872, step = 1
INFO:tensorflow:Saving checkpoints for 62 into /tmp/tmpvb6so3my/model.ckpt.
INFO:tensorflow:Loss for final step: 6.3498974.


<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifier at 0x7f87942c3f28>
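The log shows the final checkpoint at step 62 even though `steps=500` was requested. A short sketch of the arithmetic, assuming training ends as soon as the input function runs out of data:

```python
import math

n_train = 124          # rows in X_train
batch_size = 10
num_epochs = 5
requested_steps = 500

# The input function stops yielding batches once num_epochs passes are done.
available_steps = math.ceil(n_train * num_epochs / batch_size)

# Training ends at whichever limit is reached first.
actual_steps = min(requested_steps, available_steps)

print(actual_steps)
```

To actually train for 500 steps, one option is to let the input function repeat indefinitely (for example `num_epochs=None` in `numpy_input_fn`) so that `steps` becomes the only stopping condition.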

## Evaluating the Model

### Creating the input evaluation function

In [13]:
input_fn_eval = estimator.inputs.numpy_input_fn(x={'x':scaled_X_test}, shuffle=False)

### Predicting the results

In [14]:
preds = deep_model.predict(input_fn=input_fn_eval)
preds = list(preds)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from /tmp/tmpvb6so3my/model.ckpt-62
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


`preds` is now a list of dictionaries, one per test sample, as shown below.

In [15]:
preds[:5]

[{'logits': array([ 2.7056158,  1.8353522, -1.3289821], dtype=float32),
  'probabilities': array([0.69612   , 0.2915637 , 0.01231631], dtype=float32),
  'class_ids': array([0]),
  'classes': array([b'0'], dtype=object)},
 {'logits': array([-1.0873814,  1.0815885,  2.1700175], dtype=float32),
  'probabilities': array([0.0279868 , 0.24486396, 0.72714925], dtype=float32),
  'class_ids': array([2]),
  'classes': array([b'2'], dtype=object)},
 {'logits': array([ 2.0838027,  1.8165513, -1.0939782], dtype=float32),
  'probabilities': array([0.55335486, 0.42358243, 0.02306275], dtype=float32),
  'class_ids': array([0]),
  'classes': array([b'0'], dtype=object)},
 {'logits': array([ 2.6160345,  1.7516572, -1.2186108], dtype=float32),
  'probabilities': array([0.69303775, 0.2919864 , 0.01497585], dtype=float32),
  'class_ids': array([0]),
  'classes': array([b'0'], dtype=object)},
 {'logits': array([ 1.0592661 ,  1.8826504 , -0.05322557], dtype=float32),
  'probabilities': array([0.27724364, 0.6

In [16]:
predictions = [p['class_ids'][0] for p in preds]
predictions[:5]

[0, 2, 0, 0, 1]
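The fields in each prediction dictionary are related: `probabilities` is the softmax of `logits`, and `class_ids` is the index of the largest probability. The sketch below reproduces this for the first prediction using only the standard library; it illustrates the relationship and is not the estimator's internal code.

```python
import math

# Logits of the first prediction, copied from the preds[:5] output.
logits = [2.7056158, 1.8353522, -1.3289821]

# Softmax: exponentiate (shifted by the max for numerical stability), then normalize.
shift = max(logits)
exps = [math.exp(v - shift) for v in logits]
total = sum(exps)
probabilities = [e / total for e in exps]

# The predicted class is the index of the largest probability.
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)

print(probabilities, class_id)
```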

### Confusion Matrix/Classification Report

In [17]:
from sklearn.metrics import confusion_matrix, classification_report

In [18]:
print(classification_report(y_test, predictions))

              precision    recall  f1-score   support

           0       0.82      0.95      0.88        19
           1       0.86      0.82      0.84        22
           2       1.00      0.85      0.92        13

   micro avg       0.87      0.87      0.87        54
   macro avg       0.89      0.87      0.88        54
weighted avg       0.88      0.87      0.87        54



In [19]:
confusion_matrix(y_test, predictions)

array([[18,  1,  0],
       [ 4, 18,  0],
       [ 0,  2, 11]])
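The figures in the classification report can be recovered directly from this confusion matrix, where rows are true classes and columns are predicted classes. A minimal sketch in plain Python, with illustrative variable names:

```python
# Confusion matrix copied from the output above:
# rows are true classes, columns are predicted classes.
cm = [
    [18, 1, 0],
    [4, 18, 0],
    [0, 2, 11],
]
n_classes = len(cm)
total = sum(sum(row) for row in cm)

# Accuracy: correctly classified samples (the diagonal) over all samples.
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

# Recall: diagonal entry over its row sum (all samples truly in that class).
recall = [cm[i][i] / sum(cm[i]) for i in range(n_classes)]

# Precision: diagonal entry over its column sum (all samples predicted as that class).
precision = [
    cm[i][i] / sum(cm[r][i] for r in range(n_classes))
    for i in range(n_classes)
]

print(round(accuracy, 2))
print([round(p, 2) for p in precision])
print([round(r, 2) for r in recall])
```

The matrix also shows where the errors concentrate: four class 1 wines were predicted as class 0, and two class 2 wines were predicted as class 1.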