# Deep Nets with TF Abstractions

Let's explore a few of the various abstractions that TensorFlow offers. You can check out the tf.contrib documentation for more options.

# The Data

To compare these abstractions we'll use a dataset that ships with the SciKit Learn library. The data consists of the results of a chemical analysis of wines grown in the same region of Italy by three different cultivators. Thirteen measurements were taken of different constituents found in the three types of wine. We will use each TF abstraction to classify a wine into one of the 3 possible labels.

First let's show you how to get the data:

In [1]:
from sklearn.datasets import load_wine
wine_data = load_wine()
print(type(wine_data))

<class 'sklearn.utils.Bunch'>


The data is a sklearn.utils.Bunch object, which is very similar to a dictionary.
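
To make "very similar to a dictionary" concrete, here is a minimal sketch of a dictionary that also allows attribute access. `MiniBunch` is a hypothetical stand-in written for illustration; it is not scikit-learn's implementation, but it shows why both `wine_data['data']` and `wine_data.DESCR` work on the real object.

```python
class MiniBunch(dict):
    """Hypothetical stand-in: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


demo = MiniBunch(data=[[13.2, 1.78], [12.4, 2.1]], target=[0, 1])

# Dictionary-style and attribute-style access return the same object.
same_object = demo["target"] is demo.target
key_names = sorted(demo.keys())
```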

In [2]:
wine_data.keys()

dict_keys(['data', 'target_names', 'DESCR', 'target', 'feature_names'])

In [3]:
print(wine_data.DESCR)

Wine Data Database

Notes
-----
Data Set Characteristics:
    :Number of Instances: 178 (50 in each of three classes)
    :Number of Attributes: 13 numeric, predictive attributes and the class
    :Attribute Information:
 		- 1) Alcohol
 		- 2) Malic acid
 		- 3) Ash
		- 4) Alcalinity of ash  
 		- 5) Magnesium
		- 6) Total phenols
 		- 7) Flavanoids
 		- 8) Nonflavanoid phenols
 		- 9) Proanthocyanins
		- 10)Color intensity
 		- 11)Hue
 		- 12)OD280/OD315 of diluted wines
 		- 13)Proline
        	- class:
                - class_0
                - class_1
                - class_2
		
    :Summary Statistics:
    
                                   Min   Max   Mean     SD
    Alcohol:                      11.0  14.8    13.0   0.8
    Malic Acid:                   0.74  5.80    2.34  1.12
    Ash:                          1.36  3.23    2.36  0.27
    Alcalinity of Ash:            10.6  30.0    19.5   3.3
    Magnesium:                    70.0 162.0    99.7  14.3
    Total Phenols:     

You can get a full description with **print(wine_data.DESCR)**. For now, let's go ahead and grab the features and the labels for the data.

In [4]:
feat_data = wine_data['data']
labels = wine_data['target']

### Train Test Split

As with any machine learning model, you should do some sort of train/test split so you can evaluate your model's performance. Because this particular dataset is small, we'll just do a simple 70/30 train/test split and we won't have a separate holdout set.

Again, we'll use SciKit-Learn here for convenience:

In [5]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(feat_data,
                                                    labels,
                                                    test_size=0.3,
                                                    random_state=101)
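
For intuition, a shuffled 70/30 split can be sketched in plain NumPy. This is only an illustration of the idea, not how scikit-learn implements `train_test_split`; `simple_split` and the toy arrays are hypothetical, and the shuffle will not reproduce the rows chosen by `random_state=101` above. The test size is rounded up, which gives the 54 test rows that appear as the support total in the classification reports later on.

```python
import numpy as np


def simple_split(features, targets, test_fraction=0.3, seed=101):
    """Shuffle the row indices, then use the first chunk as the test set."""
    rng = np.random.RandomState(seed)
    n_rows = len(features)
    n_test = int(np.ceil(n_rows * test_fraction))  # round the test size up
    shuffled = rng.permutation(n_rows)
    test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]
    return (features[train_idx], features[test_idx],
            targets[train_idx], targets[test_idx])


# Toy data with the same shape as the wine features: 178 rows, 13 columns.
toy_features = np.arange(178 * 13, dtype=float).reshape(178, 13)
toy_targets = np.arange(178) % 3

split = simple_split(toy_features, toy_targets)
split_shapes = [part.shape for part in split]
```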

### Scale the Data

With neural network models, it's important to scale the data. Again, we can do this easily with SciKit Learn (I promise we'll get to TensorFlow soon!).

In [6]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

Keep in mind that we fit the scaler only to the training data; we don't want to assume knowledge of future test data.

In [7]:
scaled_x_train = scaler.fit_transform(X_train)
scaled_x_test = scaler.transform(X_test)
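
The effect of fitting on the training data only can be seen in a small NumPy sketch of min-max scaling. The toy arrays are made up for illustration. Because the minimum and maximum come from the training rows, a test value outside that range lands outside `[0, 1]`, which is expected.

```python
import numpy as np

toy_train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]])
toy_test = np.array([[6.0, 5.0]])

# "Fit": the column-wise minimum and maximum come from the training rows only.
col_min = toy_train.min(axis=0)
col_max = toy_train.max(axis=0)

# "Transform": apply the same training statistics to both sets.
toy_train_scaled = (toy_train - col_min) / (col_max - col_min)
toy_test_scaled = (toy_test - col_min) / (col_max - col_min)
```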

# Abstractions

With our data set up, it's now time to explore some TensorFlow abstractions! Let's start with the Estimator API, one of the abstractions featured in the official documentation tutorials.

## Estimator API

We start by importing both tensorflow and the estimator API.

In [8]:
#import tensorflow as tf
#from tensorflow import estimator 

The estimator API can perform both Deep Neural Network Classification and Regression, as well as straight Linear Classification and Linear Regression. Here we use its DNNClassifier.

In [9]:
#X_train.shape

In [10]:
#X_train

In [11]:
#feat_cols = [tf.feature_column.numeric_column("x", shape=[13])]

In [12]:
#deep_model = estimator.DNNClassifier(hidden_units=[13,13,13],
#                            feature_columns=feat_cols,
#                            n_classes=3,
#                            optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.01) )

In [None]:
#input_fn = estimator.inputs.numpy_input_fn(x={'x':scaled_x_train},y=y_train,shuffle=True,batch_size=10,num_epochs=5)

In [None]:
#deep_model.train(input_fn=input_fn,steps=500)
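
One thing to note about `steps=500`: training also stops when the input function runs out of data, whichever comes first. The arithmetic below is a rough sketch that assumes 124 training rows (178 minus the 54-row test set) and assumes the input function serves its five epochs as one continuous stream of batches. Under those assumptions the input is exhausted long before 500 steps.

```python
import math

n_train_rows = 124      # assumed: 178 rows minus the 54-row test set
batch_size = 10         # from the numpy_input_fn call above
num_epochs = 5          # from the numpy_input_fn call above
requested_steps = 500   # from deep_model.train

# Total rows served across all epochs, then how many batches that yields.
examples_served = n_train_rows * num_epochs
batches_available = math.ceil(examples_served / batch_size)

# Training ends at whichever limit is reached first.
expected_steps = min(requested_steps, batches_available)
```

If you want the full 500 steps, one option is to raise `num_epochs` (or set it to `None` so the data repeats indefinitely).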

In [None]:
#input_fn_eval = estimator.inputs.numpy_input_fn(x={'x':scaled_x_test},shuffle=False)

In [None]:
#preds = list(deep_model.predict(input_fn=input_fn_eval))

In [None]:
#predictions = [p['class_ids'][0] for p in preds]

In [None]:
#from sklearn.metrics import confusion_matrix,classification_report

In [None]:
#print(classification_report(y_test,predictions))

____________
______________

# TensorFlow Keras

### Create the Model

In [13]:
import tensorflow as tf

In [14]:
from tensorflow.contrib.keras import models

In [15]:
dnn_keras_model = models.Sequential()

### Add Layers to the model

In [16]:
from tensorflow.contrib.keras import layers

In [17]:
dnn_keras_model.add(layers.Dense(units=13,input_dim=13,activation='relu'))

In [18]:
dnn_keras_model.add(layers.Dense(units=13,activation='relu'))
dnn_keras_model.add(layers.Dense(units=13,activation='relu'))

In [19]:
dnn_keras_model.add(layers.Dense(units=3,activation='softmax'))
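
To see what this stack of layers computes, here is a NumPy sketch of a forward pass through the same 13-13-13-13-3 architecture. The weights are random placeholders rather than anything Keras would produce, and `forward` is a hypothetical helper; the point is the shape of the computation (ReLU hidden layers, softmax output) and the parameter count.

```python
import numpy as np

rng = np.random.RandomState(0)
layer_sizes = [13, 13, 13, 13, 3]  # input, three hidden layers, output

# One (weights, bias) pair per Dense layer, randomly initialised for the demo.
params = [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]


def forward(batch):
    """ReLU hidden layers followed by a softmax output layer."""
    activations = batch
    for weights, bias in params[:-1]:
        activations = np.maximum(0.0, activations @ weights + bias)  # ReLU
    weights, bias = params[-1]
    logits = activations @ weights + bias
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)  # each row sums to 1


probabilities = forward(rng.uniform(size=(4, 13)))
n_parameters = sum(w.size + b.size for w, b in params)
```

Each hidden layer contributes 13 × 13 weights plus 13 biases, and the output layer 13 × 3 weights plus 3 biases, for 588 trainable parameters in total.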

### Compile the Model

In [20]:
from tensorflow.contrib.keras import losses,optimizers,metrics

In [21]:
# explore these
# losses.

In [22]:
#optimizers.

In [23]:
losses.sparse_categorical_crossentropy

<function tensorflow.python.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)>

In [24]:
dnn_keras_model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
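
The "sparse" in `sparse_categorical_crossentropy` means the labels stay as integers (0, 1, 2) rather than one-hot vectors. As a rough sketch of the idea, the loss for one example is the negative log of the probability assigned to its true class. `sparse_xent` and the toy arrays below are hypothetical and only illustrate the formula; they are not the Keras implementation.

```python
import numpy as np


def sparse_xent(int_labels, probabilities):
    """Negative log-probability of the true class, one value per example."""
    rows = np.arange(len(int_labels))
    return -np.log(probabilities[rows, int_labels])


toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.3, 0.4]])
toy_labels = np.array([0, 1, 2])  # integer class ids, like y_train

per_example_loss = sparse_xent(toy_labels, toy_probs)
mean_loss = per_example_loss.mean()
```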

### Train Model

In [25]:
dnn_keras_model.fit(scaled_x_train,y_train,epochs=50)

Epoch 1/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x2ad9b5cbb38>

In [29]:
predictions = dnn_keras_model.predict_classes(scaled_x_test)

In [30]:
from sklearn.metrics import classification_report

In [31]:
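# Note: predictions are passed first here, but scikit-learn expects (y_true, y_pred),
# so the precision and recall columns below are swapped and support counts predicted labels.
print(classification_report(predictions,y_test))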

             precision    recall  f1-score   support

          0       0.84      0.76      0.80        21
          1       0.68      0.83      0.75        18
          2       1.00      0.87      0.93        15

avg / total       0.83      0.81      0.82        54
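
When reading a report like this, argument order matters: scikit-learn documents the signature as `classification_report(y_true, y_pred)`. The sketch below uses a hypothetical helper, `precision_recall`, and made-up labels to show that swapping the two arguments swaps precision and recall for a class.

```python
import numpy as np


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, computed from first principles."""
    true_pos = np.sum((y_true == label) & (y_pred == label))
    precision = true_pos / np.sum(y_pred == label)  # of those predicted as label
    recall = true_pos / np.sum(y_true == label)     # of those truly label
    return precision, recall


toy_true = np.array([0, 0, 0, 1, 1, 2])
toy_pred = np.array([0, 0, 1, 1, 1, 2])

correct_order = precision_recall(toy_true, toy_pred, label=1)
swapped_order = precision_recall(toy_pred, toy_true, label=1)
```

Here `correct_order` is (2/3, 1.0) while `swapped_order` is (1.0, 2/3). The f1-score is unaffected because it is symmetric in precision and recall.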



# Layers API

https://www.tensorflow.org/tutorials/layers

In [66]:
import tensorflow as tf

In [67]:
import pandas as pd

In [68]:
onehot_y_train = pd.get_dummies(y_train).values  # .as_matrix() is deprecated in newer pandas

In [69]:
onehot_y_test = pd.get_dummies(y_test).values
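
One-hot encoding turns each integer label into a row with a single 1 in the column for its class. A small NumPy sketch with made-up labels shows the result, and that `argmax` recovers the original integers:

```python
import numpy as np

toy_labels = np.array([0, 2, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
toy_onehot = np.eye(num_classes)[toy_labels]

# argmax along each row maps the one-hot vectors back to integer labels.
recovered = toy_onehot.argmax(axis=1)
```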

In [70]:
num_feat = 13
num_hidden1 = 13
num_hidden2 = 13
num_outputs = 3

In [71]:
learning_rate = 0.01

In [72]:
from tensorflow.contrib.layers import fully_connected

In [73]:
X = tf.placeholder(tf.float32,shape=[None,num_feat])
y_true = tf.placeholder(tf.float32,shape=[None,3])

In [74]:
actf = tf.nn.relu

In [75]:
hidden1 = fully_connected(X,num_hidden1,activation_fn=actf)

In [76]:
hidden2 = fully_connected(hidden1,num_hidden2,activation_fn=actf)

In [77]:
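# activation_fn=None keeps the output as raw logits; fully_connected applies ReLU by default
output = fully_connected(hidden2,num_outputs,activation_fn=None)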

In [78]:
loss = tf.losses.softmax_cross_entropy(onehot_labels=y_true, logits=output)
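
Unlike the Keras section, this loss takes one-hot labels and raw logits (scores before softmax). The NumPy sketch below illustrates the computation under the assumption that the per-example losses are averaged over the batch. `softmax_xent_from_logits` is a hypothetical helper, not TensorFlow's code.

```python
import numpy as np


def softmax_xent_from_logits(onehot, logits):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # The one-hot mask keeps only the log-probability of the true class.
    return -(onehot * log_probs).sum(axis=1).mean()


toy_logits = np.array([[2.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0]])
toy_onehot = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])

toy_loss = softmax_xent_from_logits(toy_onehot, toy_logits)
```

The second row has equal logits, so its loss is log(3); the first row favours the correct class and contributes a smaller loss.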

In [79]:
optimizer = tf.train.AdamOptimizer(learning_rate)
train = optimizer.minimize(loss)

In [80]:
init = tf.global_variables_initializer()

In [84]:
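# Only two full-batch gradient steps are run, which is likely too few for the network to fit the data.
training_steps = 2
with tf.Session() as sess:
    sess.run(init)
    
    # Each step feeds the whole scaled training set at once
    for i in range(training_steps):
        sess.run(train,feed_dict={X:scaled_x_train,y_true:onehot_y_train })
        
    # Get Predictions
    logits = output.eval(feed_dict={X:scaled_x_test})
    
    # The predicted class is the index of the largest logit in each row
    preds = tf.argmax(logits,axis=1)
    
    results = preds.eval()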

In [85]:
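from sklearn.metrics import confusion_matrix,classification_report
# Predictions are passed as the first argument (scikit-learn expects y_true first),
# so precision and recall appear swapped and support counts predicted labels.
print(classification_report(results,y_test))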

             precision    recall  f1-score   support

          0       0.42      0.89      0.57         9
          1       0.91      0.51      0.66        39
          2       0.00      0.00      0.00         6

avg / total       0.73      0.52      0.57        54



## Formatting Data

In [32]:
#import pandas as pd
#from sklearn.datasets import load_wine
#from sklearn.model_selection import train_test_split
#from sklearn.preprocessing import MinMaxScaler

In [None]:
#wine_data = load_wine()
#feat_data = wine_data['data']
#labels = wine_data['target']

In [None]:
#X_train, X_test, y_train, y_test = train_test_split(feat_data,
                                                    #labels,
                                                    #test_size=0.3,
                                                   #random_state=101)