### Questions
- Tune the hyperparameters using cross-validation and see what precision you can achieve. 
- Now try adding Batch Normalization and compare the learning curves: is it converging faster than before? Does it produce a better model? 
- Is the model overfitting the training set? Try adding dropout to every layer and try again. Does it help?

### Frame
- Create predictor class "DNN_Classifier" using DNN with hyper parameters
    - implement __init__, fit, predict and predict_prob
    - Fit will use class DNN_Helper with methods create_graph(which will create tensor flow graph) and train_dnn (which will train dnn with given hyper-parameters)
    - create_graph will use hyper-parameters like number of neurons, activation function, optimizer class, learning rate etc

### Some Notes About Graph and Session
- Tensorflow computations are represented as graph which indicates operands, operations and their dependencies.
- Graph is run within context of Session which stores current state of the computations
- When creating a tensor or an operation it is automatically added to default graph
- We can also create another graph and make it default using 
       
       ``` python
           with graph.as_default():
               # add tensors and operations
       ```
       
- No computation are run until it is run inside the context of a session
    
    ``` python
        # The session will use current default graph
        with tf.Session() as sess:
            sess.run(op)
        
        # The session will use graph sent as parameter
        with tf.Session(graph=graph) as sess:
            sess.run(op)
        
    ```
    
- ```tf.train.Saver``` can be used to save session and restore it latter when needed
    
    ``` python
    saver.save(session, path)
    saver.restore(session,path)
    ```
    
- Graph must be created before restoring the session. To restore graph also from the checkpoint file use following code which will restore graph as default graph
``` python
    meta_importer = tf.train.import_meta_graph(checkpoint_path+".meta")
    
    # Then restore session
    sess = tf.Session()
    meta_importer.restore(sess, checkpoint_path)
```

- When restoring graph from meta file our tensor and operation variables are not assigned automatically. So we need to find them in graph by name or other way and assign it to variables to easily use them. Some functions are
  - ```graph.get_tensor_by_name("x:0")``` here 0 indicates 1st output of operation x
  - ```graph.get_operation_by_name("is_training")```
  - ```graph.collections``` is list of collection names
  - ```tf.get_collection(collection_name)``` will give all variables in a collection

### Load MINST Data

In [4]:
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype(np.float32).reshape(-1, 28*28)/255.0
x_test = x_test.astype(np.float32).reshape(-1, 28*28)/255.0

train_indxes_0to4 = y_train<5
train_x_0to4 = x_train[train_indxes_0to4]
train_y_0to4 = y_train[train_indxes_0to4]

test_indxes_0to4 = y_test<5
test_x_0to4 = x_test[test_indxes_0to4]
test_y_0to4 = y_test[test_indxes_0to4]



(train_x_0to4, val_x_0to4, train_y_0to4, val_y_0to4) = \
    train_test_split(train_x_0to4, train_y_0to4, test_size=0.1)

### Utility Functions

In [2]:
import tensorflow as tf
from tensorflow.contrib.layers import variance_scaling_initializer as he_initializer
from tensorflow.nn import sparse_softmax_cross_entropy_with_logits as softmax_xentropy
from tensorflow.layers import dense
import numpy as np


def get_leaky_relu(alpha):
    return lambda z, name=None: tf.maximum(alpha*z,z, name=name)
    

def get_connected_layers(x, n_hidden_layers, n_neurons, n_ouputs, activation=tf.nn.elu,
                                   batch_norm_momentum=None, dropout_rate=None, is_training=None):
    

    initializer = he_initializer()
    
    with tf.name_scope("DNN"):
        inputs = x
        for l in range(n_hidden_layers):
            if dropout_rate is not None:
                ## this function will set inputs to zero with dropout rate probability
                ## and divides remaining inputs with dropout rate
                inputs = tf.layers.dropout(inputs, dropout_rate, training=is_training, 
                                  name=("dropout%d"%l))
                
            inputs = tf.layers.dense(inputs, n_neurons, kernel_initializer=initializer,
                           name="hidden%d"%(l+1), activation=activation)
            
            if batch_norm_momentum is not None:
                inputs = tf.layers.batch_normalization(inputs, momentum=batch_norm_momentum,
                                training=is_training)
            
            inputs = activation(inputs, name="hiden%d_out"%(l+1))
            
        output = tf.layers.dense(inputs, n_ouputs, name="output")
        
    return output
        


def get_softmax_xentropy_loss(logits,y):
    with tf.name_scope("loss"):
        xentropy = softmax_xentropy(labels=y, logits=logits)
        return tf.reduce_mean(xentropy, name="mean_loss")

def get_optimizer_op(optimizer, loss, learning_rate=0.01):
    with tf.name_scope("train"):
        optimizer =  optimizer(learning_rate=learning_rate)
        optimizer_op = optimizer.minimize(loss, name="optimizer_op")
    return optimizer_op

def get_validation_score(logits,y):
    with tf.name_scope("validation"):
        preds = tf.nn.in_top_k(logits,y,1)
        return tf.reduce_mean(tf.cast(preds, dtype=np.float32), name="validation_score")
    
def get_batch(x,y,batch_size):
    n_batches = len(y)//batch_size + 1
    for i in range(n_batches):
        indxes = np.random.choice(len(y), size=batch_size, replace=False)
        yield x[indxes], y[indxes]

    

  from ._conv import register_converters as _register_converters


### Custom Classifier Class

In [3]:
# from imp import reload
# import my_libs
# reload(my_libs.tf_graph_saver)

from sklearn.base import BaseEstimator, TransformerMixin
from my_libs.tf_graph_saver import ScalerGraphSaver2
from sklearn.exceptions import NotFittedError



class DNN_Classifier(BaseEstimator, TransformerMixin):
    def __init__(self, n_hidden_layers=None, n_neurons=None, n_outputs=None, 
                 activation=tf.nn.elu, optimizer=tf.train.AdamOptimizer,  learning_rate=0.01, 
                 batch_norm_momentum=None, batch_size=50, dropout_rate=None):
        self.n_hidden_layers = n_hidden_layers
        self.n_neurons = n_neurons
        self.learning_rate = learning_rate
        self.activation = activation
        self.optimizer = optimizer
        self.batch_norm_momentum = batch_norm_momentum
        self.batch_size = batch_size
        self.dropout_rate = dropout_rate
        self.n_outputs = n_outputs
        self._session = None
        
        
    def _create_graph(self):                      
        
        tf.reset_default_graph()
        self._graph = tf.Graph()
        with self._graph.as_default():
        
            self._x = tf.placeholder(shape=(None, 28*28), dtype=np.float32,name="x")
            self._y = tf.placeholder(shape=(None), dtype=np.int32,name="y")

            self._is_training = tf.placeholder_with_default(False,shape=(), name="is_training")


            self._dnn = get_connected_layers(self._x, self.n_hidden_layers, self.n_neurons, 
                                       self.n_outputs, activation=self.activation, 
                                       batch_norm_momentum=self.batch_norm_momentum, 
                                       dropout_rate=self.dropout_rate, is_training=self._is_training)
            self._loss = get_softmax_xentropy_loss(self._dnn, self._y)
            self._optimizer_op = get_optimizer_op(self.optimizer, self._loss, 
                                                  self.learning_rate)
            self._validation_score = get_validation_score(self._dnn, self._y)

            self._y_proba = tf.nn.softmax(self._dnn, name="y_proba")

            self._batch_norm_update_ops = self._graph.get_collection(tf.GraphKeys.UPDATE_OPS)
            self._saver = tf.train.Saver()
            self._init = tf.global_variables_initializer()
            
        
    def _save_params(self):
        with self._graph.as_default():
            global_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
            
        vars_n_values = {global_var.op.name:value for global_var, value in \
                 zip(global_vars,self._session.run(global_vars))}
        self._saved_params =  vars_n_values
        
    
    def _restore_params(self):
        var_names = list(self._saved_params.keys())
        
        ## get assign operations for all variables
        assign_ops = {var_name:self._graph.get_operation_by_name("%s/Assign"%var_name) 
                      for var_name in var_names}
        ## get initialization values of all variables
        init_values = {var_name: assign_op.inputs[1]  for var_name, assign_op 
                       in assign_ops.items()}
        
        ## get feed_dict for all values
        feed_dict = {init_values[var_name]:self._saved_params[var_name] 
                     for var_name in var_names}
        
        
        self._session.run(assign_ops, feed_dict=feed_dict)
        
    
    def fit(self,x,y,x_val,y_val):
        n_epoches = 500
        max_epoches_wo_progress = 30
        
        self._create_graph()
        
        best_score=np.float("inf")
        best_epoch=0
        if self._session: self._session.close()
        with tf.Session(graph=self._graph).as_default() as sess:
            self._session = sess
            sess.run(self._init)
            
            graph_saver = ScalerGraphSaver2("DNN_GridSearch")
            loss_summary = graph_saver.get_summary_op("loss", self._loss)
            score_summary = graph_saver.get_summary_op("accuracy_score", self._validation_score)
            
            with graph_saver:
        
                for epoch in range(n_epoches):
                    for batch_x, batch_y in get_batch(x,y,self.batch_size):
                        ops = [self._loss, loss_summary, self._optimizer_op]
                        if self._batch_norm_update_ops is not None:
                            ops.append(self._batch_norm_update_ops)

                        results = sess.run(ops , feed_dict={self._x:batch_x, self._y:batch_y, 
                                               self._is_training:True})
                        loss = results[0]
                        loss_summary_text = results[1]



                    score, score_summary_text = sess.run([self._validation_score, score_summary], 
                                     feed_dict={self._x:x_val, self._y:y_val})
                    graph_saver.log_summary(loss_summary_text, epoch)
                    graph_saver.log_summary(score_summary_text, epoch)
                    
                   
                    
                    if epoch%20 == 0:
                        print("epoch %d, score %f, loss %f"%(epoch, score, loss))
                    
                    score=loss
                    if score < best_score:
                        best_score = score
                        best_epoch = epoch
                        self._save_params()
                    elif (epoch - best_epoch)>max_epoches_wo_progress:
                        print("No progress for %d epoches."%max_epoches_wo_progress)
                        break
                
            self._restore_params()
            print("Reverting back to epoch %d \
                    with %f score" %(best_epoch, best_score))
            self._score = best_score 
            return self
            
                    
    
    def predict_proba(self,x):
        if self._session is None:
            raise NotFittedError("%s is not fitted yet" \
                                                    %self.__class__.__name__)
        
        return self._session.run(self._y_proba, feed_dict={self._x:x, 
                                                           self._is_training:False})
            
    
    def predict(self,x):
        return np.argmax(self.predict_proba(x), axis=1)
    
    def score(self, x_val=None, y_val=None):
        
        score=self._session.run(self._validation_score, 
                             feed_dict={self._x:x_val, self._y:y_val})
        print("validation score: %f", score)
        return score
    
    def _get_save_path(self, name):
        return "tf_checkpoints/%s"%name
    
    def save(self,name):
        self._saver.save(self._session, self._get_save_path(name))
    
    def restore(self, name):
        imported_meta = tf.train.import_meta_graph("%s.meta"%self._get_save_path(name))
        graph = tf.get_default_graph()
        self._x = graph.get_tensor_by_name("x:0")
        self._y = graph.get_tensor_by_name("y:0")
        self._loss = graph.get_operation_by_name("mean_loss")
        
        self._validation_score = graph.get_tensor_by_name("validation/validation_score:0")
        self._y_proba = graph.get_tensor_by_name("y_proba:0")
        self._is_training = graph.get_tensor_by_name("is_training:0")
        self._session = tf.Session(graph=graph)
        imported_meta.restore(self._session, self._get_save_path(name))

    

### Fit and Predict

In [6]:
classifier = DNN_Classifier(2, 20, 5, activation=get_leaky_relu(0.01))
classifier.fit(train_x_0to4, train_y_0to4, val_x_0to4, val_y_0to4)

W0905 20:32:21.044908 4513576384 deprecation.py:323] From <ipython-input-4-99b76afb2362>:28: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
W0905 20:32:21.751328 4513576384 deprecation.py:506] From /Users/devbhadurkhadka/.pyenv/versions/anaconda3-5.2.0/envs/scikit_practice/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0905 20:32:21.811959 4513576384 deprecation.py:323] From /Users/devbhadurkhadka/.pyenv/versions/anaconda3-5.2.0/envs/scikit_practice/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py:1250: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated

epoch 0, score 0.967647, loss 0.211136
epoch 20, score 0.983987, loss 0.000414
epoch 40, score 0.979085, loss 0.018174
epoch 60, score 0.983987, loss 0.000002
epoch 80, score 0.984314, loss 0.006435
epoch 100, score 0.982026, loss 0.000044
No progress for 30 epoches.
Reverting back to epoch 72                     with 0.000000 score


DNN_Classifier(activation=<function get_leaky_relu.<locals>.<lambda> at 0x1a38098510>,
        batch_norm_momentum=None, batch_size=50, dropout_rate=None,
        learning_rate=0.01, n_hidden_layers=2, n_neurons=20, n_outputs=5,
        optimizer=<class 'tensorflow.python.training.adam.AdamOptimizer'>)

In [8]:
classifier.save("Mnist-0to4")

In [9]:
from sklearn.metrics import accuracy_score

clas_re = DNN_Classifier()
graph = clas_re.restore("Mnist-0to4")
preds = clas_re.predict(test_x_0to4)
accuracy_score(preds, test_y_0to4)

KeyError: "The name 'mean_loss' refers to an Operation not in the graph."

### Learning Rate With and Without Batch Normalization

In [10]:
classifier = DNN_Classifier(2, 20, 5, activation=get_leaky_relu(0.01), batch_norm_momentum=.98)
classifier.fit(train_x_0to4, train_y_0to4, val_x_0to4, val_y_0to4)

W0905 20:35:56.957761 4513576384 deprecation.py:323] From <ipython-input-4-99b76afb2362>:32: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.BatchNormalization instead.  In particular, `tf.control_dependencies(tf.GraphKeys.UPDATE_OPS)` should not be used (consult the `tf.keras.layers.batch_normalization` documentation).


epoch 0, score 0.974837, loss 0.055582
epoch 20, score 0.985948, loss 0.001568
epoch 40, score 0.987908, loss 0.279178
epoch 60, score 0.985294, loss 0.000009
epoch 80, score 0.991176, loss 0.000143
epoch 100, score 0.988562, loss 0.000001
No progress for 30 epoches.
Reverting back to epoch 73                     with 0.000000 score


DNN_Classifier(activation=<function get_leaky_relu.<locals>.<lambda> at 0x1a2790ba60>,
        batch_norm_momentum=0.98, batch_size=50, dropout_rate=None,
        learning_rate=0.01, n_hidden_layers=2, n_neurons=20, n_outputs=5,
        optimizer=<class 'tensorflow.python.training.adam.AdamOptimizer'>)

### Find Best Parameters Using RandomizedSearchCV

In [8]:
from sklearn.model_selection import RandomizedSearchCV
from datetime import datetime
from sklearn.externals import joblib


print(datetime.today())
params_dist = {
    "n_neurons":[100],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [100,300]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search.best_estimator_.save("Mnist-0to4-best1")

2019-09-07 19:43:08.124990
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60> 




epoch 0, score 0.969608, loss 0.209717
epoch 20, score 0.984967, loss 0.033849
epoch 40, score 0.981373, loss 0.004538
epoch 60, score 0.985621, loss 0.011594
epoch 80, score 0.982353, loss 0.019855
epoch 100, score 0.968301, loss 0.060462
No progress for 30 epoches.
Reverting back to epoch 74                     with 0.000000 score
validation score: %f 0.9858372
validation score: %f 0.9993463
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60>, score=0.9858372211456299, total= 1.7min
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:  1.7min remaining:    0.0s


epoch 0, score 0.971569, loss 0.021749
epoch 20, score 0.986601, loss 0.016834
epoch 40, score 0.985948, loss 0.000045
epoch 60, score 0.982680, loss 0.001042
No progress for 30 epoches.
Reverting back to epoch 42                     with 0.000003 score
validation score: %f 0.9868177
validation score: %f 0.9988015
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60>, score=0.9868177175521851, total= 1.2min
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  2.9min remaining:    0.0s


epoch 0, score 0.973529, loss 0.121668
epoch 20, score 0.990196, loss 0.133423
epoch 40, score 0.989869, loss 0.013864
epoch 60, score 0.986928, loss 0.055011
No progress for 30 epoches.
Reverting back to epoch 38                     with 0.000000 score
validation score: %f 0.9881238
validation score: %f 0.9986927
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x11aa4aa60>, score=0.9881237745285034, total= 1.1min
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=300, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c485f6d90> 
epoch 0, score 0.974510, loss 0.048159
epoch 20, score 0.984314, loss 0.011303
epoch 40, score 0.982026, loss 0.006482
epoch 60, score 0.989542, loss 0.000003
epoch 80, score 0.989542, loss 0.000000
epoch 100, score 0.988562, loss 0.000000
No progress for 30 epoches.
Reverting back to epoch 71                     with 0.000000 score
validation score: %f 0.9896503
validation score: %f 0.999

epoch 0, score 0.974183, loss 0.072922
epoch 20, score 0.988235, loss 0.000854
epoch 40, score 0.957516, loss 0.331167
No progress for 30 epoches.
Reverting back to epoch 11                     with 0.000185 score
validation score: %f 0.9834387
validation score: %f 0.9941715
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=100, activation=<function elu at 0x11ab252f0>, score=0.9834386706352234, total=  47.3s


[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 21.4min finished


epoch 0, score 0.978105, loss 0.083818
epoch 20, score 0.982026, loss 0.070610
epoch 40, score 0.988235, loss 0.011768
epoch 60, score 0.988235, loss 0.010102
epoch 80, score 0.990850, loss 0.000005
epoch 100, score 0.991830, loss 0.000000
epoch 120, score 0.991503, loss 0.000000
No progress for 30 epoches.
Reverting back to epoch 94                     with 0.000000 score
2019-09-07 20:06:49.660282


In [9]:
rand_search.best_params_

{'n_outputs': 5,
 'n_neurons': 100,
 'n_hidden_layers': 5,
 'batch_size': 300,
 'activation': <function __main__.get_leaky_relu.<locals>.<lambda>(z, name=None)>}

### Batch Normalization

In [11]:
from sklearn.model_selection import RandomizedSearchCV
from datetime import datetime


print(datetime.today())
params_dist = {
    "n_neurons":[120,150],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "batch_norm_momentum": [.98, .99],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [80,100]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search.best_estimator_.save("Mnist-0to4-best_batch_norm")

preds = rand_search.best_estimator_.predict(test_x_0to4)
"test accuracy: %f"%accuracy_score(preds, test_y_0to4)

2019-09-05 20:41:14.616038
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0> 




epoch 0, score 0.978431, loss 0.051807
epoch 20, score 0.985948, loss 0.004050
epoch 40, score 0.988235, loss 0.000353
epoch 60, score 0.984967, loss 0.000186
No progress for 30 epoches.
Reverting back to epoch 43                     with 0.000006 score
validation score: %f 0.99172026
validation score: %f 0.99983656
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0>, score=0.9917202591896057, total= 2.3min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:  2.3min remaining:    0.0s


epoch 0, score 0.977124, loss 0.091709
epoch 20, score 0.985294, loss 0.085647
epoch 40, score 0.990850, loss 0.000743
epoch 60, score 0.988562, loss 0.000008
epoch 80, score 0.990523, loss 0.000060
epoch 100, score 0.989869, loss 0.000008
epoch 120, score 0.989869, loss 0.000022
epoch 140, score 0.989542, loss 0.000008
epoch 160, score 0.988562, loss 0.000001
epoch 180, score 0.991176, loss 0.000002
No progress for 30 epoches.
Reverting back to epoch 159                     with 0.000000 score
validation score: %f 0.9922649
validation score: %f 0.9999455
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0>, score=0.9922649264335632, total= 5.8min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  8.1min remaining:    0.0s


epoch 0, score 0.974510, loss 0.039386
epoch 20, score 0.987582, loss 0.002258
epoch 40, score 0.988889, loss 0.000175
epoch 60, score 0.985948, loss 0.000104
No progress for 30 epoches.
Reverting back to epoch 31                     with 0.000031 score
validation score: %f 0.9895402
validation score: %f 0.99891055
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function elu at 0x1a21e9a1e0>, score=0.9895402193069458, total= 1.9min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function relu at 0x1a21e48950> 
epoch 0, score 0.972222, loss 0.120286
epoch 20, score 0.989216, loss 0.008831
epoch 40, score 0.979739, loss 0.053936
epoch 60, score 0.991503, loss 0.004413
epoch 80, score 0.990523, loss 0.000083
epoch 100, score 0.989542, loss 0.001401
No progress for 30 epoches.
Reverting back to epoch 71                     with 0.000002 score
validation score: %f 0.9913934
v

epoch 80, score 0.988562, loss 0.000094
epoch 100, score 0.989216, loss 0.000007
epoch 120, score 0.988889, loss 0.000014
epoch 140, score 0.990523, loss 0.000010
No progress for 30 epoches.
Reverting back to epoch 127                     with 0.000001 score
validation score: %f 0.98932344
validation score: %f 0.9997276
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1a2887fae8>, score=0.9893234372138977, total= 4.5min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1a2887fae8> 
epoch 0, score 0.972222, loss 0.056430
epoch 20, score 0.982353, loss 0.075404
epoch 40, score 0.987255, loss 0.002217
epoch 60, score 0.988235, loss 0.000754
epoch 80, score 0.990523, loss 0.000039
epoch 100, score 0.988235, loss 0.000009
epoch 120, score 0.991503, loss 0.001018
No progress for 30 epoch

[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 51.8min finished


epoch 0, score 0.975163, loss 0.038283
epoch 20, score 0.987582, loss 0.000351
epoch 40, score 0.990850, loss 0.015891
epoch 60, score 0.990523, loss 0.000135
epoch 80, score 0.989542, loss 0.000012
epoch 100, score 0.991503, loss 0.000063
epoch 120, score 0.990850, loss 0.000051
epoch 140, score 0.992157, loss 0.000166
epoch 160, score 0.992157, loss 0.000002
epoch 180, score 0.991830, loss 0.000003
epoch 200, score 0.991176, loss 0.000013
epoch 220, score 0.991830, loss 0.000029
No progress for 30 epoches.
Reverting back to epoch 205                     with 0.000000 score
2019-09-05 21:43:21.864585


'test accuracy: 0.992411'

In [9]:
from sklearn.metrics import accuracy_score
"test accuracy: %f"%accuracy_score(preds, test_y_0to4)

'test accuracy: 0.995524'

In [8]:
rand_search.best_params_

{'n_outputs': 5,
 'n_neurons': 120,
 'n_hidden_layers': 5,
 'batch_size': 100,
 'batch_norm_momentum': 0.98,
 'activation': <function __main__.get_leaky_relu.<locals>.<lambda>(z, name=None)>}

### With dropout

In [36]:
print(datetime.today())
params_dist = {
    "n_neurons":[120],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "dropout_rate": [.6, .4, .2],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [50,80,100]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search3 = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search3.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search3.best_estimator_.save("Mnist-0to4-best_dropoutdsf")

preds = rand_search3.best_estimator_.predict(test_x_0to4)
accuracy_score(preds, test_y_0to4)

2019-08-23 21:00:30.181454
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 




epoch 0, score 0.200654, loss 2.511327
epoch 50, score 0.191503, loss 1.673325
epoch 100, score 0.223529, loss 1.784896
No progress for 100 epoches.
Reverting back to epoch 10                     with 0.370261 score
validation score: %f 0.363983
validation score: %f 0.37920138
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.36398300528526306, total= 3.2min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:  3.3min remaining:    0.0s


epoch 0, score 0.191503, loss 1.979267
epoch 50, score 0.191503, loss 1.710331
epoch 100, score 0.191830, loss 1.617060
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.223529 score
validation score: %f 0.22083016
validation score: %f 0.21958926
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.22083015739917755, total= 3.6min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  6.8min remaining:    0.0s


epoch 0, score 0.200327, loss 1.459546
epoch 50, score 0.200327, loss 1.686456
epoch 100, score 0.192810, loss 1.711437
No progress for 100 epoches.
Reverting back to epoch 5                     with 0.223529 score
validation score: %f 0.22368708
validation score: %f 0.21816102
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.223687082529068, total= 3.3min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.2, batch_size=100, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3852d7b8> 
epoch 0, score 0.933333, loss 0.397353
epoch 50, score 0.951307, loss 0.435087
epoch 100, score 0.928105, loss 0.497548
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.972549 score
validation score: %f 0.9633947
validation score: %f 0.97042
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.2, batch_size=100, activation=<function get_leaky_re

[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 47.4min finished


epoch 0, score 0.951961, loss 0.394898
epoch 50, score 0.191830, loss 1.617283
epoch 100, score 0.192810, loss 1.657724
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.976471 score
2019-08-23 21:52:37.437255


0.9778166958552248

In [37]:
rand_search3.best_params_

{'n_outputs': 5,
 'n_neurons': 120,
 'n_hidden_layers': 5,
 'dropout_rate': 0.2,
 'batch_size': 100,
 'activation': <function tensorflow.python.ops.gen_nn_ops.elu(features, name=None)>}