___

<a href='http://www.pieriandata.com'> <img src='../Pierian_Data_Logo.png' /></a>
___

# TensorFlow Project Exercise
Let's wrap up this Deep Learning section by taking a quick look at the effectiveness of Neural Nets!

We'll use the [Banknote Authentication Data Set](https://archive.ics.uci.edu/ml/datasets/banknote+authentication) from the UCI repository.

The data consists of 5 columns:

* variance of Wavelet Transformed image (continuous)
* skewness of Wavelet Transformed image (continuous)
* curtosis of Wavelet Transformed image (continuous)
* entropy of image (continuous)
* class (integer)

Here, class indicates whether or not a bank note was authentic.

This sort of binary classification task on a handful of continuous features is well suited to Neural Networks and Deep Learning! Just follow the instructions below to get started!

## Get the Data

**Use pandas to read in the bank_note_data.csv file.**

In [1]:
import pandas as pd
df = pd.read_csv('bank_note_data.csv')

In [2]:
from sklearn.preprocessing import StandardScaler

**Create a StandardScaler() object called scaler.**

In [3]:
scaler = StandardScaler()

**Fit scaler to the features.**

In [4]:
X = df.drop('Class', axis=1)
y = df['Class']

**Use the .transform() method to transform the features to a scaled version.**

In [5]:
scaler.fit(X)

StandardScaler()

In [6]:
# scaler was already fit on X above, so transform() alone is enough here
features = scaler.transform(X)

In [7]:
df2 = pd.DataFrame(features, columns = df.columns[:-1])

**Convert the scaled features to a dataframe and check the head of this dataframe to make sure the scaling worked.**

In [8]:
df.head()

| | Image.Var | Image.Skew | Image.Curt | Entropy | Class |
|---|---|---|---|---|---|
| 0 | 3.6216 | 8.6661 | -2.8073 | -0.44699 | 0 |
| 1 | 4.5459 | 8.1674 | -2.4586 | -1.4621 | 0 |
| 2 | 3.866 | -2.6383 | 1.9242 | 0.10645 | 0 |
| 3 | 3.4566 | 9.5228 | -4.0112 | -3.5944 | 0 |
| 4 | 0.32924 | -4.4552 | 4.5718 | -0.9888 | 0 |
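Before scaling, it can help to confirm that the file has the expected shape. The sketch below is only an illustration: it rebuilds the five rows shown above from an in-memory CSV (so it runs without bank_note_data.csv), and the names `sample`, `feature_names`, `n_missing` and `class_counts` are made up for this example.

```python
import io
import pandas as pd

# The five rows printed by df.head() above, rebuilt as an in-memory CSV
# so the check runs without the real file.
csv_text = """Image.Var,Image.Skew,Image.Curt,Entropy,Class
3.6216,8.6661,-2.8073,-0.44699,0
4.5459,8.1674,-2.4586,-1.4621,0
3.866,-2.6383,1.9242,0.10645,0
3.4566,9.5228,-4.0112,-3.5944,0
0.32924,-4.4552,4.5718,-0.9888,0
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Everything except the last column is a feature; the last column is the label.
feature_names = list(sample.columns[:-1])

# Count missing values and how many rows fall into each class.
n_missing = int(sample.isnull().sum().sum())
class_counts = sample['Class'].value_counts().to_dict()

print(feature_names)
print(n_missing, class_counts)
```

On the real DataFrame the same three checks would be run on `df` instead of `sample`.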


In [9]:
df2.head()

| | Image.Var | Image.Skew | Image.Curt | Entropy |
|---|---|---|---|---|
| 0 | 1.121806 | 1.149455 | -0.97597 | 0.354561 |
| 1 | 1.447066 | 1.064453 | -0.895036 | -0.128767 |
| 2 | 1.20781 | -0.777352 | 0.122218 | 0.618073 |
| 3 | 1.063742 | 1.295478 | -1.255397 | -1.144029 |
| 4 | -0.036772 | -1.087038 | 0.73673 | 0.096587 |
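Looking at the head only shows that the values changed. A stronger check is that every scaled column has mean 0 and standard deviation 1 over the whole dataset. The sketch below demonstrates this on synthetic data (the `demo` frame and its columns `a` and `b` are invented for the example) and also shows that StandardScaler matches a manual z-score computed with the population standard deviation.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the features: two columns with different
# means and spreads (hypothetical data, not the bank note file).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(loc=[2.0, -5.0], scale=[3.0, 0.5], size=(200, 2)),
    columns=['a', 'b'],
)

scaled = StandardScaler().fit_transform(demo)

# Manual z-score. StandardScaler divides by the population standard
# deviation (ddof=0), whereas pandas' .std() defaults to ddof=1.
manual = ((demo - demo.mean()) / demo.std(ddof=0)).to_numpy()

col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)

print(np.allclose(scaled, manual))
print(col_means.round(6), col_stds.round(6))
```

Applied to this exercise, the equivalent check would be `df2.mean()` and `df2.std(ddof=0)`.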


## Train Test Split

**Create two objects X and y, which are the scaled feature values and labels respectively.**

In [21]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df2, y, test_size=0.3)
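The split above is random, so the exact numbers in the later reports will vary from run to run. As a hedged sketch, the example below shows two optional arguments that make a split repeatable and class-balanced: `random_state` and `stratify`. The data is synthetic, and the row count of 1372 with a 762/610 class split is an assumption based on the published UCI dataset description (it is consistent with the 412-row test set that appears in the reports later in this notebook).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the assumed size of the UCI dataset.
n_class0, n_class1 = 762, 610          # assumption, see lead-in
n_rows = n_class0 + n_class1           # 1372
X_demo = np.arange(n_rows * 4).reshape(n_rows, 4)
y_demo = np.array([0] * n_class0 + [1] * n_class1)

# random_state fixes the shuffle; stratify keeps the class ratio
# (almost) identical in the training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101, stratify=y_demo
)

# Repeating the call with the same random_state gives the same split.
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101, stratify=y_demo
)

print(len(X_tr), len(X_te))
print(round(y_demo.mean(), 3), round(y_te.mean(), 3))
```

The value 101 is arbitrary; any fixed integer gives a repeatable split.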

In [11]:
import tensorflow as tf

**Use scikit-learn to create training and testing sets of the data, as we've done in previous lectures:**

In [12]:
image_var = tf.feature_column.numeric_column('Image.Var')
image_skew = tf.feature_column.numeric_column('Image.Skew')
image_curt = tf.feature_column.numeric_column('Image.Curt')
entropy = tf.feature_column.numeric_column('Entropy')

In [13]:
feat_cols = [image_var, image_skew, image_curt, entropy]

# TensorFlow

**Create a list of feature column objects using tf.feature_column.numeric_column() as we did in the lecture.**

In [14]:
import matplotlib
import tensorflow as tf
import tensorflow_estimator as tfe

In [15]:
classifier = tf.estimator.DNNClassifier(n_classes=2, hidden_units=[10, 20, 10], feature_columns = feat_cols)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\thecl\\AppData\\Local\\Temp\\tmp_y_ue168', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000021C74367C50>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


**Create an object called classifier which is a DNNClassifier from tf.estimator. Set it to have 2 classes and a [10,20,10] hidden unit layer structure:**

In [22]:
estimated = tf.estimator.inputs.pandas_input_fn(x=X_train, y=y_train, batch_size = 20, shuffle = True)

**Now create a tf.estimator.inputs.pandas_input_fn that takes in your X_train, y_train, batch_size and set shuffle=True. You can play around with the batch_size parameter if you want, but let's start by setting it to 20 since our data isn't very big.**

In [23]:
classifier.train(input_fn=estimated, steps=500)

Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use `tf.cast` instead.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Savi

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifier at 0x21c74367a58>

**Now train classifier with the input function. Use steps=500. You can play around with these values if you want!**

*Note: Ignore any warnings you get; they won't affect your output.*

In [24]:
pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test, batch_size=len(X_test), shuffle=False)

## Model Evaluation

**Create another pandas_input_fn that takes in the X_test data for x. Remember this one won't need any y_test info since we will be using this for the network to create its own predictions. Set shuffle=False since we don't need to shuffle for predictions.**

In [25]:
preds = list(classifier.predict(input_fn=pred_fn))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\thecl\AppData\Local\Temp\tmp_y_ue168\model.ckpt-48
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
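The checkpoint name `model.ckpt-48` in the log above hints that training stopped after 48 steps rather than the requested 500. One plausible explanation, assuming the input function was left at `pandas_input_fn`'s default of `num_epochs=1`, is that the data simply ran out after one pass. The arithmetic is sketched below; the 960 training rows are inferred from the 412 test rows at `test_size=0.3`, and the variable names are illustrative.

```python
import math

n_train = 960          # inferred: 412 test rows at test_size=0.3 (assumes 1372 rows total)
batch_size = 20        # as passed to pandas_input_fn above
num_epochs = 1         # assumption: the default when num_epochs is not given
requested_steps = 500  # as passed to classifier.train above

# One step consumes one batch, so one pass over the data yields this many steps.
steps_per_epoch = math.ceil(n_train / batch_size)

# Training ends when either the requested steps are done or the input runs dry.
available_steps = steps_per_epoch * num_epochs
steps_run = min(requested_steps, available_steps)

print(steps_per_epoch, steps_run)
```

If that reading is right, passing a larger `num_epochs` (or `num_epochs=None` to repeat the data indefinitely) to the training input function would let `steps=500` actually be reached.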


**Use the predict method from the classifier model to create predictions from X_test.**

In [26]:
preds[0]

{'logits': array([-8.901192], dtype=float32),
 'logistic': array([0.00013632], dtype=float32),
 'probabilities': array([9.9986374e-01, 1.3620792e-04], dtype=float32),
 'class_ids': array([0], dtype=int64),
 'classes': array([b'0'], dtype=object),
 'all_class_ids': array([0, 1]),
 'all_classes': array([b'0', b'1'], dtype=object)}
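The fields in this dictionary are related: for a two-class model, the class 1 probability is the sigmoid of the logit, and the predicted class is the one with the larger probability. The sketch below recomputes these from the logit printed above using plain NumPy; it is an illustration of the relationship, not the estimator's internal code.

```python
import numpy as np

logit = -8.901192  # value of preds[0]['logits'] shown above

# Sigmoid turns the logit into the probability of class 1.
p_class1 = 1.0 / (1.0 + np.exp(-logit))

# The two class probabilities must sum to 1.
probabilities = np.array([1.0 - p_class1, p_class1])

# The predicted class is the index of the larger probability.
class_id = int(np.argmax(probabilities))

print(p_class1, probabilities, class_id)
```

The result is roughly 1.36e-4 for class 1, in line with the `probabilities` entry above, so this note is confidently predicted as class 0.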

In [27]:
proc_pred = []
for pred in preds:
    proc_pred.append(pred['class_ids'][0])
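The loop above can also be written as a list comprehension, and the same pattern pulls out the class 1 probabilities if they are wanted later. The sketch below uses a small hand-made list, `fake_preds`, shaped like the prediction dictionaries shown earlier; the values in it are invented.

```python
import numpy as np

# Hand-made stand-in for list(classifier.predict(...)); values are made up.
fake_preds = [
    {'class_ids': np.array([0]), 'probabilities': np.array([0.9998, 0.0002])},
    {'class_ids': np.array([1]), 'probabilities': np.array([0.01, 0.99])},
]

# Same result as the append loop: one integer class label per prediction.
proc_pred_demo = [int(p['class_ids'][0]) for p in fake_preds]

# Probability of class 1 for each prediction, as a NumPy array.
proba_class1 = np.array([p['probabilities'][1] for p in fake_preds])

print(proc_pred_demo, proba_class1)
```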

In [30]:
from sklearn.metrics import classification_report as cr, confusion_matrix as cm

print(cm(y_test, proc_pred))

[[230   2]
 [  1 179]]


**Now create a classification report and a confusion matrix. Does anything stand out to you?**

In [31]:
print(cr(y_test, proc_pred))

              precision    recall  f1-score   support

           0       1.00      0.99      0.99       232
           1       0.99      0.99      0.99       180

    accuracy                           0.99       412
   macro avg       0.99      0.99      0.99       412
weighted avg       0.99      0.99      0.99       412
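Every number in this report can be derived from the confusion matrix printed earlier. The sketch below recomputes precision, recall, F1 and accuracy from that matrix with NumPy, assuming scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix from above: rows = true class, columns = predicted class.
cm_dnn = np.array([[230, 2],
                   [1, 179]])

correct_per_class = np.diag(cm_dnn)

# Precision: of everything predicted as class k, how much was truly class k.
precision = correct_per_class / cm_dnn.sum(axis=0)

# Recall: of everything truly class k, how much was predicted as class k.
recall = correct_per_class / cm_dnn.sum(axis=1)

f1 = 2 * precision * recall / (precision + recall)
accuracy = correct_per_class.sum() / cm_dnn.sum()

print(precision.round(2), recall.round(2), f1.round(2), round(accuracy, 2))
```

Rounded to two decimals these match the report: only 3 of the 412 test notes are misclassified.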



## Optional Comparison

**You should have noticed extremely accurate results from the DNN model. Let's compare this to a Random Forest Classifier for a reality check!**

**Use scikit-learn to create a Random Forest Classifier and compare the confusion matrix and classification report to the DNN model.**

In [32]:
from sklearn.ensemble import RandomForestClassifier as RFC

In [33]:
rfc = RFC(n_estimators=200)

In [34]:
rfc.fit(X_train, y_train)

RandomForestClassifier(n_estimators=200)

In [35]:
pred = rfc.predict(X_test)

In [36]:
print(cr(y_test, pred))

              precision    recall  f1-score   support

           0       0.99      0.99      0.99       232
           1       0.98      0.99      0.99       180

    accuracy                           0.99       412
   macro avg       0.99      0.99      0.99       412
weighted avg       0.99      0.99      0.99       412



In [37]:
print(cm(y_test, pred))

[[229   3]
 [  2 178]]
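To put the two confusion matrices side by side, the sketch below counts the errors for each model and attaches a rough binomial standard error to each accuracy. The helper `summarize` is hypothetical, and the standard error is only a back-of-the-envelope guide since both models were scored on the same 412 test rows.

```python
import numpy as np

cm_dnn = np.array([[230, 2], [1, 179]])  # DNN confusion matrix from above
cm_rf = np.array([[229, 3], [2, 178]])   # Random Forest confusion matrix from above

def summarize(cm):
    """Return (errors, accuracy, rough standard error) for a confusion matrix."""
    n = int(cm.sum())
    errors = int(n - np.trace(cm))        # off-diagonal entries are mistakes
    acc = 1 - errors / n
    se = float(np.sqrt(acc * (1 - acc) / n))
    return errors, acc, se

dnn_errors, dnn_acc, dnn_se = summarize(cm_dnn)
rf_errors, rf_acc, rf_se = summarize(cm_rf)
gap = dnn_acc - rf_acc

print(dnn_errors, rf_errors)
print(round(gap, 4), round(dnn_se, 4))
```

The DNN makes 3 errors and the Random Forest 5. The gap of about half a percentage point is close to one standard error, so on this test set the two models are effectively tied.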


**It should also have done very well, possibly even perfectly! Hopefully you have seen the power of DNNs!**

# Great Job!