### Questions
- Tune the hyperparameters using cross-validation and see what precision you can achieve. 
- Now try adding Batch Normalization and compare the learning curves: is it converging faster than before? Does it produce a better model? 
- Is the model overfitting the training set? Try adding dropout to every layer and try again. Does it help?

### Frame
- Create predictor class "DNN_Classifier" using DNN with hyper parameters
    - implement __init__, fit, predict and predict_prob
    - Fit will use class DNN_Helper with methods create_graph(which will create tensor flow graph) and train_dnn (which will train dnn with given hyper-parameters)
    - create_graph will use hyper-parameters like number of neurons, activation function, optimizer class, learning rate etc

### Some Notes About Graph and Session
- Tensorflow computations are represented as graph which indicates operands, operations and their dependencies.
- Graph is run within context of Session which stores current state of the computations
- When creating a tensor or an operation it is automatically added to default graph
- We can also create another graph and make it default using 
       
       ``` python
           with graph.as_default():
               # add tensors and operations
       ```
       
- No computation are run until it is run inside the context of a session
    
    ``` python
        # The session will use current default graph
        with tf.Session() as sess:
            sess.run(op)
        
        # The session will use graph sent as parameter
        with tf.Session(graph=graph) as sess:
            sess.run(op)
        
    ```
    
- ```tf.train.Saver``` can be used to save session and restore it latter when needed
    
    ``` python
    saver.save(session, path)
    saver.restore(session,path)
    ```
    
- Graph must be created before restoring the session. To restore graph also from the checkpoint file use following code which will restore graph as default graph
``` python
    meta_importer = tf.train.import_meta_graph(checkpoint_path+".meta")
    
    # Then restore session
    sess = tf.Session()
    meta_importer.restore(sess, checkpoint_path)
```

- When restoring graph from meta file our tensor and operation variables are not assigned automatically. So we need to find them in graph by name or other way and assign it to variables to easily use them. Some functions are
  - ```graph.get_tensor_by_name("x:0")``` here 0 indicates 1st output of operation x
  - ```graph.get_operation_by_name("is_training")```
  - ```graph.collections``` is list of collection names
  - ```tf.get_collection(collection_name)``` will give all variables in a collection

### Load MINST Data

In [1]:
from sklearn.model_selection import train_test_split
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)

train_indxes_0to4 = y_train<5
train_x_0to4 = x_train[train_indxes_0to4]
train_y_0to4 = y_train[train_indxes_0to4]

test_indxes_0to4 = y_test<5
test_x_0to4 = x_test[test_indxes_0to4]
test_y_0to4 = y_test[test_indxes_0to4]



(train_x_0to4, val_x_0to4, train_y_0to4, val_y_0to4) = \
    train_test_split(train_x_0to4, train_y_0to4, test_size=0.1)

  from ._conv import register_converters as _register_converters


### Utility Functions

In [33]:
import tensorflow as tf
from tensorflow.contrib.layers import variance_scaling_initializer as he_initializer
from tensorflow.nn import sparse_softmax_cross_entropy_with_logits as softmax_xentropy
from tensorflow.layers import dense
import numpy as np


def get_leaky_relu(alpha):
    return lambda z, name=None: tf.maximum(alpha*z,z, name=name)
    

def get_connected_layers(x, n_hidden_layers, n_neurons, n_ouputs, activation=tf.nn.elu,
                                   batch_norm_momentum=None, dropout_rate=None, is_training=None):
    

    initializer = he_initializer()
    
    with tf.name_scope("DNN"):
        inputs = x
        for l in range(n_hidden_layers):
            if dropout_rate is not None:
                ## this function will set inputs to zero with dropout rate probability
                ## and divides remaining inputs with dropout rate
                inputs = tf.layers.dropout(inputs, dropout_rate, training=is_training, 
                                  name=("dropout%d"%l))
                
            inputs = tf.layers.dense(inputs, n_neurons, kernel_initializer=initializer,
                           name="hidden%d"%(l+1), activation=activation)
            
            if batch_norm_momentum is not None:
                inputs = tf.layers.batch_normalization(inputs, momentum=batch_norm_momentum,
                                training=is_training)
            
            inputs = activation(inputs, name="hiden%d_out"%(l+1))
            
        output = tf.layers.dense(inputs, n_ouputs, name="output")
        
    return output
        


def get_softmax_xentropy_loss(logits,y):
    with tf.name_scope("loss"):
        xentropy = softmax_xentropy(labels=y, logits=logits)
        return tf.reduce_mean(xentropy, name="mean_loss")

def get_optimizer_op(optimizer, loss, learning_rate=0.01):
    with tf.name_scope("train"):
        optimizer =  optimizer(learning_rate=learning_rate)
        optimizer_op = optimizer.minimize(loss, name="optimizer_op")
    return optimizer_op

def get_validation_score(logits,y):
    with tf.name_scope("validation"):
        preds = tf.nn.in_top_k(logits,y,1)
        return tf.reduce_mean(tf.cast(preds, dtype=np.float32), name="validation_score")
    
def get_batch(x,y,batch_size):
    n_batches = len(y)//batch_size + 1
    for i in range(n_batches):
        indxes = np.random.choice(len(y), size=batch_size, replace=False)
        yield x[indxes], y[indxes]

    

### Custom Classifier Class

In [65]:
# from imp import reload
# import my_libs
# reload(my_libs.tf_graph_saver)

from sklearn.base import BaseEstimator, TransformerMixin
from my_libs.tf_graph_saver import ScalerGraphSaver2
from sklearn.exceptions import NotFittedError



class DNN_Classifier(BaseEstimator, TransformerMixin):
    def __init__(self, n_hidden_layers=None, n_neurons=None, n_outputs=None, 
                 activation=tf.nn.elu, optimizer=tf.train.AdamOptimizer,  learning_rate=0.01, 
                 batch_norm_momentum=None, batch_size=50, dropout_rate=None):
        self.n_hidden_layers = n_hidden_layers
        self.n_neurons = n_neurons
        self.learning_rate = learning_rate
        self.activation = activation
        self.optimizer = optimizer
        self.batch_norm_momentum = batch_norm_momentum
        self.batch_size = batch_size
        self.dropout_rate = dropout_rate
        self.n_outputs = n_outputs
        self._session = None
        
        
    def _create_graph(self):                      
        
        tf.reset_default_graph()
        self._graph = tf.Graph()
        with self._graph.as_default():
        
            self._x = tf.placeholder(shape=(None, 28*28), dtype=np.float32,name="x")
            self._y = tf.placeholder(shape=(None), dtype=np.int32,name="y")

            self._is_training = tf.placeholder_with_default(False,shape=(), name="is_training")


            self._dnn = get_connected_layers(self._x, self.n_hidden_layers, self.n_neurons, 
                                       self.n_outputs, activation=self.activation, 
                                       batch_norm_momentum=self.batch_norm_momentum, 
                                       dropout_rate=self.dropout_rate, is_training=self._is_training)
            self._loss = get_softmax_xentropy_loss(self._dnn, self._y)
            self._optimizer_op = get_optimizer_op(self.optimizer, self._loss, 
                                                  self.learning_rate)
            self._validation_score = get_validation_score(self._dnn, self._y)

            self._y_proba = tf.nn.softmax(self._dnn, name="y_proba")

            self._batch_norm_update_ops = self._graph.get_collection(tf.GraphKeys.UPDATE_OPS)
            self._saver = tf.train.Saver()
            self._init = tf.global_variables_initializer()
            
        
    def _save_params(self):
        with self._graph.as_default():
            global_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
            
        vars_n_values = {global_var.op.name:value for global_var, value in \
                 zip(global_vars,self._session.run(global_vars))}
        self._saved_params =  vars_n_values
        
    
    def _restore_params(self):
        var_names = list(self._saved_params.keys())
        
        ## get assign operations for all variables
        assign_ops = {var_name:self._graph.get_operation_by_name("%s/Assign"%var_name) 
                      for var_name in var_names}
        ## get initialization values of all variables
        init_values = {var_name: assign_op.inputs[1]  for var_name, assign_op 
                       in assign_ops.items()}
        
        ## get feed_dict for all values
        feed_dict = {init_values[var_name]:self._saved_params[var_name] 
                     for var_name in var_names}
        
        
        self._session.run(assign_ops, feed_dict=feed_dict)
        
    
    def fit(self,x,y,x_val,y_val):
        n_epoches = 500
        max_epoches_wo_progress = 100
        
        self._create_graph()
        
        best_score=0
        best_epoch=0
        if self._session: self._session.close()
        with tf.Session(graph=self._graph).as_default() as sess:
            self._session = sess
            sess.run(self._init)
            
            graph_saver = ScalerGraphSaver2("DNN_GridSearch")
            loss_summary = graph_saver.get_summary_op("loss", self._loss)
            score_summary = graph_saver.get_summary_op("accuracy_score", self._validation_score)
            
            with graph_saver:
        
                for epoch in range(n_epoches):
                    for batch_x, batch_y in get_batch(x,y,self.batch_size):
                        ops = [self._loss, loss_summary, self._optimizer_op]
                        if self._batch_norm_update_ops is not None:
                            ops.append(self._batch_norm_update_ops)

                        results = sess.run(ops , feed_dict={self._x:batch_x, self._y:batch_y, 
                                               self._is_training:True})
                        loss = results[0]
                        loss_summary_text = results[1]



                    score, score_summary_text = sess.run([self._validation_score, score_summary], 
                                     feed_dict={self._x:x_val, self._y:y_val})
                    graph_saver.log_summary(loss_summary_text, epoch)
                    graph_saver.log_summary(score_summary_text, epoch)
                    
                    if epoch%50 == 0:
                        print("epoch %d, score %f, loss %f"%(epoch, score, loss))

                    if score > best_score:
                        best_score = score
                        best_epoch = epoch
                        self._save_params()
                    elif (epoch - best_epoch)>max_epoches_wo_progress:
                        print("No progress for %d epoches."%max_epoches_wo_progress)
                        break
                
            self._restore_params()
            print("Reverting back to epoch %d \
                    with %f score" %(best_epoch, best_score))
            self._score = best_score 
            return self
            
                    
    
    def predict_proba(self,x):
        if self._session is None:
            raise NotFittedError("%s is not fitted yet" \
                                                    %self.__class__.__name__)
        
        return self._session.run(self._y_proba, feed_dict={self._x:x, 
                                                           self._is_training:False})
            
    
    def predict(self,x):
        return np.argmax(self.predict_proba(x), axis=1)
    
    def score(self, x_val=None, y_val=None):
        
        score=self._session.run(self._validation_score, 
                             feed_dict={self._x:x_val, self._y:y_val})
        print("validation score: %f", score)
        return score
    
    def _get_save_path(self, name):
        return "tf_checkpoints/%s"%name
    
    def save(self,name):
        self._saver.save(self._session, self._get_save_path(name))
    
    def restore(self, name):
        imported_meta = tf.train.import_meta_graph("%s.meta"%self._get_save_path(name))
        graph = tf.get_default_graph()
        self._x = graph.get_tensor_by_name("x:0")
        self._y = graph.get_tensor_by_name("y:0")
        self._loss = graph.get_operation_by_name("mean_loss")
        
        self._validation_score = graph.get_tensor_by_name("validation/validation_score:0")
        self._y_proba = graph.get_tensor_by_name("y_proba:0")
        self._is_training = graph.get_tensor_by_name("is_training:0")
        self._session = tf.Session(graph=graph)
        imported_meta.restore(self._session, self._get_save_path(name))

    

### Fit and Predict

In [66]:
classifier = DNN_Classifier(2, 20, 5, activation=get_leaky_relu(0.01))
classifier.fit(train_x_0to4, train_y_0to4, val_x_0to4, val_y_0to4)

epoch 0, score 0.927778, loss 0.590107
epoch 50, score 0.958170, loss 0.096847
epoch 100, score 0.979085, loss 0.025066
epoch 150, score 0.977124, loss 0.000035
No progress for 100 epoches.
Reverting back to epoch 67                     with 0.982026 score


DNN_Classifier(activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c39591048>,
        batch_norm_momentum=None, batch_size=50, dropout_rate=None,
        learning_rate=0.01, n_hidden_layers=2, n_neurons=20, n_outputs=5,
        optimizer=<class 'tensorflow.python.training.adam.AdamOptimizer'>)

In [9]:
classifier.save("Mnist-0to4")

In [10]:
from sklearn.metrics import accuracy_score

clas_re = DNN_Classifier()
graph = clas_re.restore("Mnist-0to4")
preds = clas_re.predict(test_x_0to4)
accuracy_score(preds, test_y_0to4)

0.9820976843743919

### Learning Rate With and Without Batch Normalization

In [67]:
classifier = DNN_Classifier(2, 20, 5, activation=get_leaky_relu(0.01), batch_norm_momentum=.98)
classifier.fit(train_x_0to4, train_y_0to4, val_x_0to4, val_y_0to4)

epoch 0, score 0.978431, loss 0.035092
epoch 50, score 0.987582, loss 0.001325
epoch 100, score 0.989542, loss 0.000238
epoch 150, score 0.986274, loss 0.000030
No progress for 100 epoches.
Reverting back to epoch 88                     with 0.991830 score


DNN_Classifier(activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c395919d8>,
        batch_norm_momentum=0.98, batch_size=50, dropout_rate=None,
        learning_rate=0.01, n_hidden_layers=2, n_neurons=20, n_outputs=5,
        optimizer=<class 'tensorflow.python.training.adam.AdamOptimizer'>)

### Find Best Parameters Using RandomizedSearchCV

In [12]:
from sklearn.model_selection import RandomizedSearchCV
from datetime import datetime
from sklearn.externals import joblib


print(datetime.today())
params_dist = {
    "n_neurons":[80,100, 120],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [50,100]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search.best_estimator_.save("Mnist-0to4-best1")

2019-08-23 17:38:25.754383
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268> 




epoch 0, score 0.950327
epoch 50, score 0.861765
epoch 100, score 0.895425
No progress for 100 epoches.
Reverting back to epoch 8                     with 0.981699 score
validation score: %f 0.9753786
validation score: %f 0.98343956
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268>, score=0.9753785729408264, total= 3.3min
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:  3.3min remaining:    0.0s


epoch 0, score 0.965686
epoch 50, score 0.953922
epoch 100, score 0.385948
No progress for 100 epoches.
Reverting back to epoch 5                     with 0.980065 score
validation score: %f 0.9723281
validation score: %f 0.97804654
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268>, score=0.9723281264305115, total= 3.0min
[CV] n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  6.3min remaining:    0.0s


epoch 0, score 0.942810
epoch 50, score 0.594771
epoch 100, score 0.931046
No progress for 100 epoches.
Reverting back to epoch 7                     with 0.980719 score
validation score: %f 0.97494006
validation score: %f 0.9826234
[CV]  n_outputs=5, n_neurons=100, n_hidden_layers=5, batch_size=50, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c360bb268>, score=0.9749400615692139, total= 3.4min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x1a20a79950> 
epoch 0, score 0.966340
epoch 50, score 0.223529
epoch 100, score 0.223529
No progress for 100 epoches.
Reverting back to epoch 26                     with 0.983660 score
validation score: %f 0.98213315
validation score: %f 0.9954241
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, batch_size=100, activation=<function relu at 0x1a20a79950>, score=0.982133150100708, total= 2.1min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, batch_size=100, activation=<function 

[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 39.6min finished


epoch 0, score 0.962092
epoch 50, score 0.223529
epoch 100, score 0.223529
No progress for 100 epoches.
Reverting back to epoch 15                     with 0.983987 score
2019-08-23 18:21:32.915014


In [13]:
rand_search.best_params_

{'n_outputs': 5,
 'n_neurons': 120,
 'n_hidden_layers': 5,
 'batch_size': 100,
 'activation': <function tensorflow.python.ops.gen_nn_ops.relu(features, name=None)>}

### Batch Normalization

In [38]:
print(datetime.today())
params_dist = {
    "n_neurons":[120,150],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "batch_norm_momentum": [.98, .99],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [80,100]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search.best_estimator_.save("Mnist-0to4-best_batch_norm")

preds = rand_search.best_estimator_.predict(test_x_0to4)
"test accuracy: %f"%accuracy_score(preds, test_y_0to4)

2019-08-23 22:01:10.884544
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0> 




epoch 0, score 0.981373, loss 0.014723
epoch 50, score 0.987908, loss 0.070573
epoch 100, score 0.990850, loss 0.000025
epoch 150, score 0.991503, loss 0.000003
epoch 200, score 0.992484, loss 0.000005
epoch 250, score 0.992157, loss 0.000001
epoch 300, score 0.993464, loss 0.000026
epoch 350, score 0.992484, loss 0.000001
No progress for 100 epoches.
Reverting back to epoch 279                     with 0.994444 score
validation score: %f 0.9886698
validation score: %f 0.9997276
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0>, score=0.9886698126792908, total=13.8min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed: 13.8min remaining:    0.0s


epoch 0, score 0.979412, loss 0.154295
epoch 50, score 0.992484, loss 0.000030
epoch 100, score 0.993137, loss 0.000002
epoch 150, score 0.988889, loss 0.022350
epoch 200, score 0.992810, loss 0.000008
epoch 250, score 0.988235, loss 0.005229
No progress for 100 epoches.
Reverting back to epoch 187                     with 0.994118 score
validation score: %f 0.9907397
validation score: %f 0.9994553
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0>, score=0.9907397031784058, total= 9.5min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed: 23.3min remaining:    0.0s


epoch 0, score 0.983660, loss 0.087213
epoch 50, score 0.989869, loss 0.012622
epoch 100, score 0.993791, loss 0.000024
epoch 150, score 0.993464, loss 0.000002
epoch 200, score 0.992484, loss 0.000375
No progress for 100 epoches.
Reverting back to epoch 104                     with 0.994118 score
validation score: %f 0.9928089
validation score: %f 0.9999455
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.98, activation=<function elu at 0x1a20af31e0>, score=0.9928088784217834, total= 6.6min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=80, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3681c488> 
epoch 0, score 0.966340, loss 0.023025
epoch 50, score 0.990523, loss 0.039574
epoch 100, score 0.991830, loss 0.001379
epoch 150, score 0.989542, loss 0.000008
No progress for 100 epoches.
Reverting back to epoch 89                     with 0.994444 score
validation score: %f 0.9904129
validati

validation score: %f 1.0
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3681c488>, score=0.9907397031784058, total= 6.8min
[CV] n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3681c488> 
epoch 0, score 0.982353, loss 0.154885
epoch 50, score 0.989869, loss 0.001396
epoch 100, score 0.992484, loss 0.000057
epoch 150, score 0.993464, loss 0.000037
epoch 200, score 0.992810, loss 0.000011
epoch 250, score 0.992157, loss 0.000003
No progress for 100 epoches.
Reverting back to epoch 191                     with 0.995098 score
validation score: %f 0.9928089
validation score: %f 0.99983656
[CV]  n_outputs=5, n_neurons=150, n_hidden_layers=5, batch_size=100, batch_norm_momentum=0.99, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3681c488>, score=0.9928088784217834, total=

[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 369.5min finished


epoch 0, score 0.973529, loss 0.060896
epoch 50, score 0.992157, loss 0.000057
epoch 100, score 0.992810, loss 0.000002
epoch 150, score 0.993791, loss 0.000002
epoch 200, score 0.992484, loss 0.000000
epoch 250, score 0.993791, loss 0.000000
epoch 300, score 0.992484, loss 0.058179
epoch 350, score 0.994771, loss 0.000000
epoch 400, score 0.994118, loss 0.000000
No progress for 100 epoches.
Reverting back to epoch 308                     with 0.996732 score
2019-08-24 04:34:12.752879


'test accuracy: 0.996108'

In [29]:
rand_search.best_params_

{'n_outputs': 5,
 'n_neurons': 120,
 'n_hidden_layers': 5,
 'batch_size': 100,
 'batch_norm_momentum': 0.99,
 'activation': <function tensorflow.python.ops.gen_nn_ops.relu(features, name=None)>}

In [36]:
print(datetime.today())
params_dist = {
    "n_neurons":[120],
    "n_hidden_layers": [5],
    "n_outputs": [5],
    "dropout_rate": [.6, .4, .2],
    "activation": [tf.nn.relu, get_leaky_relu(0.01), tf.nn.elu],
    "batch_size": [50,80,100]
}

classifier = DNN_Classifier()
fit_params = {"x_val": val_x_0to4, "y_val": val_y_0to4}
rand_search3 = RandomizedSearchCV(classifier, params_dist, n_iter=5, cv=3,
                                 n_jobs=1, fit_params=fit_params, verbose=3)
rand_search3.fit(train_x_0to4, train_y_0to4)

print(datetime.today())

rand_search3.best_estimator_.save("Mnist-0to4-best_dropoutdsf")

preds = rand_search3.best_estimator_.predict(test_x_0to4)
accuracy_score(preds, test_y_0to4)

2019-08-23 21:00:30.181454
Fitting 3 folds for each of 5 candidates, totalling 15 fits
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 




epoch 0, score 0.200654, loss 2.511327
epoch 50, score 0.191503, loss 1.673325
epoch 100, score 0.223529, loss 1.784896
No progress for 100 epoches.
Reverting back to epoch 10                     with 0.370261 score
validation score: %f 0.363983
validation score: %f 0.37920138
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.36398300528526306, total= 3.2min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:  3.3min remaining:    0.0s


epoch 0, score 0.191503, loss 1.979267
epoch 50, score 0.191503, loss 1.710331
epoch 100, score 0.191830, loss 1.617060
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.223529 score
validation score: %f 0.22083016
validation score: %f 0.21958926
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.22083015739917755, total= 3.6min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0> 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:  6.8min remaining:    0.0s


epoch 0, score 0.200327, loss 1.459546
epoch 50, score 0.200327, loss 1.686456
epoch 100, score 0.192810, loss 1.711437
No progress for 100 epoches.
Reverting back to epoch 5                     with 0.223529 score
validation score: %f 0.22368708
validation score: %f 0.21816102
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.6, batch_size=50, activation=<function elu at 0x1a20af31e0>, score=0.223687082529068, total= 3.3min
[CV] n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.2, batch_size=100, activation=<function get_leaky_relu.<locals>.<lambda> at 0x1c3852d7b8> 
epoch 0, score 0.933333, loss 0.397353
epoch 50, score 0.951307, loss 0.435087
epoch 100, score 0.928105, loss 0.497548
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.972549 score
validation score: %f 0.9633947
validation score: %f 0.97042
[CV]  n_outputs=5, n_neurons=120, n_hidden_layers=5, dropout_rate=0.2, batch_size=100, activation=<function get_leaky_re

[Parallel(n_jobs=1)]: Done  15 out of  15 | elapsed: 47.4min finished


epoch 0, score 0.951961, loss 0.394898
epoch 50, score 0.191830, loss 1.617283
epoch 100, score 0.192810, loss 1.657724
No progress for 100 epoches.
Reverting back to epoch 6                     with 0.976471 score
2019-08-23 21:52:37.437255


0.9778166958552248

In [37]:
rand_search3.best_params_

{'n_outputs': 5,
 'n_neurons': 120,
 'n_hidden_layers': 5,
 'dropout_rate': 0.2,
 'batch_size': 100,
 'activation': <function tensorflow.python.ops.gen_nn_ops.elu(features, name=None)>}