<pre>
1. Download the data from <a href='https://drive.google.com/file/d/15dCNcmKskcFVjs7R0ElQkR61Ex53uJpM/view?usp=sharing'>here</a>

2. Code the model to classify data like below image

<img src='https://i.imgur.com/33ptOFy.png'>

3. Write your own callback function, that has to print the micro F1 score and AUC score after each epoch.

4. Save your model at every epoch if your validation accuracy is improved from previous epoch. 

5. you have to decay learning based on below conditions 
        Cond1. If your validation accuracy at that epoch is less than previous epoch accuracy, you have to decrese the
               learning rate by 10%. 
        Cond2. For every 3rd epoch, decay your learning rate by 5%.
        
6. If you are getting any NaN values(either weigths or loss) while training, you have to terminate your training. 

7. You have to stop the training if your validation accuracy is not increased in last 2 epochs.

8. Use tensorboard for every model and analyse your gradients. (you need to upload the screenshots for each model for evaluation)

9. use cross entropy as loss function

10. Try the architecture params as given below. 
</pre>

In [2]:
# importing liberaries
import tensorflow as tf
import keras
from tensorflow.python.keras.callbacks import Callback, ModelCheckpoint, EarlyStopping
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, roc_auc_score


In [3]:
# importing data
data = pd.read_csv('data.csv')
data.label = data.label.apply(lambda x: int(x))
x = data[['f1', 'f2']]
y = data.label
print("Data shape: ", x.shape, y.shape)
data.head()

x_train, x_cv, y_train, y_cv = train_test_split(x, y, test_size=0.3, random_state=42)


print("Train data shape: ", x_train.shape, y_train.shape)
print("CV data shape:    ", x_cv.shape, y_cv.shape)
data.head()

Data shape:  (20000, 2) (20000,)
Train data shape:  (14000, 2) (14000,)
CV data shape:     (6000, 2) (6000,)


Unnamed: 0,f1,f2,label
0,0.450564,1.074305,0
1,0.085632,0.967682,0
2,0.117326,0.971521,1
3,0.982179,-0.380408,0
4,-0.720352,0.95585,0


In [4]:
# checking wether the data is balanced or not
data.label.value_counts()

0    10000
1    10000
Name: label, dtype: int64

In [13]:
def lr_schedule(epoch, lr, pre_vcc, cur_val_acc):
    """Helper function to retrieve the scheduled learning rate based on epoch."""
    if epoch == 0:
        return lr
    else:
        if cur_val_acc < pre_vcc:
            lr = 0.9 * lr
        if (epoch+1) % 3 == 0:
            lr = 0.95 * lr
    return lr

# callback to find metrics on epoch end
class Metrics(Callback):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.history = {'epoch':[], 'learning_rate':[], 'loss': [], 'acc': [], 'val_loss': [], 'val_acc': [], 'auc': [], 'f1_micro': []}

    def on_epoch_end(self, epoch, logs={}):
        y_hat_pred = np.asarray(self.model.predict(self.x))
        y_hat = np.where(y_hat_pred > 0.5, 1, 0)
        
        self.history['epoch'].append(epoch+1)
        
        # Terminating the training if loss is NaN
        if np.isnan(logs.get('loss', np.nan)):
            print('model stoped training, because loss found to be NaN...')
            sef.model.stop_training = True
        else:
            self.history['loss'].append(round(logs.get('loss'), 4))
        
        self.history['acc'].append(round(logs.get('acc'), 4))
        
        if logs.get('val_loss', -1) != -1:
            self.history['val_loss'].append(round(logs.get('val_loss'), 4))
        
        
        if logs.get('val_acc', -1) != -1:
            self.history['val_acc'].append(round(logs.get('val_acc'), 4))
         
        # finding auc and micro f1_score
        auc = round(roc_auc_score(self.y, y_hat_pred), 4)
        f1_micro = round(f1_score(self.y, y_hat, average='micro'), 4)
        self.history['auc'].append(auc)
        self.history['f1_micro'].append(f1_micro)
        print('\nauc: {}    f1_micro: {}'.format(auc, f1_micro))
        
        if not hasattr(self.model.optimizer, "lr"):
            raise ValueError('Optimizer must have a "lr" attribute.')
        
        # Get the current learning rate from model's optimizer.
        lr = float(tf.keras.backend.get_value(self.model.optimizer.learning_rate))
        self.history['learning_rate'].append(lr)
        
        # Call schedule function to get the scheduled learning rate.
        if epoch != 0:
            scheduled_lr = lr_schedule(epoch, lr, self.history['acc'][-2], self.history['val_acc'][-1])
        else:
            scheduled_lr = lr
        
        # Set the value back to the optimizer before this epoch starts
        tf.keras.backend.set_value(self.model.optimizer.lr, scheduled_lr)
        print("\nLearning rate is %6.4f." % (scheduled_lr))
        
        # Terminating the training if any of the weight are NaN
        for weights in self.model.get_weights():
            if np.isnan(np.sum(weights)):
                print("model stoped training, because any of the weight found to be NaN...")
                sef.model.stop_training = True
            
        return

<b>Model-1</b>
<pre>
1. Use tanh as an activation for every layer except output layer.
2. use SGD with momentum as optimizer.
3. use RandomUniform(0,1) as initilizer.
3. Analyze your output and training process. 
</pre>


In [15]:
def create_model():
    return tf.keras.models.Sequential([
        # input layer
        tf.keras.layers.Flatten(input_shape=(2,)),
        # Hidden layars
        tf.keras.layers.Dense(4, activation='tanh', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='tanh', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='tanh', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='tanh', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(2, activation='tanh', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        # output layer
        tf.keras.layers.Dense(1, activation='sigmoid', kernel_initializer=tf.keras.initializers.RandomUniform(0,1))
  ])

binary_model = create_model()
binary_model.compile(optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
                     loss = 'binary_crossentropy',
                     metrics = ['accuracy'])

filepath="model_save/model1/weights-{epoch:02d}-{val_acc:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='auto')

earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=2, verbose=1)

metrics_binary = Metrics(x_cv, y_cv)


log_dir="logs\\model1\\"
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, write_graph=True, write_grads=True)

callbacks_list = [metrics_binary, checkpoint, earlystop]#, tensorboard_callback]

binary_model.fit(x, y, epochs=50, validation_data=(x_cv, y_cv), callbacks=callbacks_list)


pd.DataFrame(metrics_binary.history)

Train on 20000 samples, validate on 6000 samples
Epoch 1/50
auc: 0.5058    f1_micro: 0.5242

Learning rate is 0.1000.

Epoch 00001: val_acc improved from -inf to 0.52417, saving model to model_save/model1/weights-01-0.5242.hdf5
Epoch 2/50
auc: 0.5057    f1_micro: 0.5318

Learning rate is 0.1000.

Epoch 00002: val_acc improved from 0.52417 to 0.53183, saving model to model_save/model1/weights-02-0.5318.hdf5
Epoch 3/50
auc: 0.5071    f1_micro: 0.5405

Learning rate is 0.0950.

Epoch 00003: val_acc improved from 0.53183 to 0.54050, saving model to model_save/model1/weights-03-0.5405.hdf5
Epoch 4/50
auc: 0.5203    f1_micro: 0.5032

Learning rate is 0.0855.

Epoch 00004: val_acc did not improve from 0.54050
Epoch 00004: early stopping


Unnamed: 0,epoch,learning_rate,loss,acc,val_loss,val_acc,auc,f1_micro
0,1,0.1,0.693,0.5209,0.6881,0.5242,0.5058,0.5242
1,2,0.1,0.6877,0.5228,0.6842,0.5318,0.5057,0.5318
2,3,0.1,0.687,0.526,0.691,0.5405,0.5071,0.5405
3,4,0.095,0.6855,0.5349,0.6936,0.5032,0.5203,0.5032


In [16]:
%load_ext tensorboard
%tensorboard --logdir logs/model
# # %tensorboard --logdir logs/model1 --host localhost

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 8592), started 0:41:08 ago. (Use '!kill 8592' to kill it.)

<b>Model-2</b>
<pre>
1. Use relu as an activation for every layer except output layer.
2. use SGD with momentum as optimizer.
3. use RandomUniform(0,1) as initilizer.
3. Anamyze your output and training process. 
</pre>

In [12]:
def create_model():
    return tf.keras.models.Sequential([
        # input layer
        tf.keras.layers.Flatten(input_shape=(2,)),
        # Hidden layars
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        tf.keras.layers.Dense(2, activation='relu', kernel_initializer=tf.keras.initializers.RandomUniform(0,1)),
        # output layer
        tf.keras.layers.Dense(1, activation='sigmoid', kernel_initializer=tf.keras.initializers.RandomUniform(0,1))
  ])

binary_model = create_model()
binary_model.compile(optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
                     loss = 'binary_crossentropy',
                     metrics = ['accuracy'])


filepath="model_save/model2/weights-{epoch:02d}-{val_acc:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='auto')


earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=2, verbose=1)


metrics_binary = Metrics(x_cv, y_cv)


log_dir="logs\\model2\\"
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, write_graph=True, write_grads=True)

callbacks_list = [metrics_binary, checkpoint, earlystop, tensorboard_callback]

binary_model.fit(x, y, epochs=50, validation_data=(x_cv, y_cv), callbacks=callbacks_list)


pd.DataFrame(metrics_binary.history)

Train on 20000 samples, validate on 6000 samples
Epoch 1/50

InvalidArgumentError: You must feed a value for placeholder tensor 'flatten_input' with dtype float and shape [?,2]
	 [[{{node flatten_input}}]]

In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/sgd_moment
# # %tensorboard --logdir logs/sgd_moment --host localhost

<b>Model-3</b>
<pre>
1. Use relu as an activation for every layer except output layer.
2. use SGD with momentum as optimizer.
3. use he_uniform() as initilizer.
3. Analyze your output and training process. 
</pre>

In [16]:
def create_model():
    return tf.keras.models.Sequential([
        # input layer
        tf.keras.layers.Flatten(input_shape=(2,)),
        # Hidden layars
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.he_uniform()),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.he_uniform()),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.he_uniform()),
        tf.keras.layers.Dense(4, activation='relu', kernel_initializer=tf.keras.initializers.he_uniform()),
        tf.keras.layers.Dense(2, activation='relu', kernel_initializer=tf.keras.initializers.he_uniform()),
        # output layer
        tf.keras.layers.Dense(1, activation='sigmoid', kernel_initializer=tf.keras.initializers.he_uniform())
  ])

binary_model = create_model()
binary_model.compile(optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
                     loss = 'binary_crossentropy',
                     metrics = ['accuracy'])

filepath="model_save/model3/weights-{epoch:02d}-{val_acc:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='auto')

earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=2, verbose=1)

metrics_binary = Metrics(x_cv, y_cv)


log_dir="logs\\model3\\"
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, write_graph=True, write_grads=True)

callbacks_list = [metrics_binary, checkpoint, earlystop, tensorboard_callback]

binary_model.fit(x, y, epochs=50, validation_data=(x_cv, y_cv), callbacks=callbacks_list)


pd.DataFrame(metrics_binary.history)

Train on 20000 samples, validate on 6000 samples
Epoch 1/50
auc: 0.5    f1_micro: 0.4968

Learning rate is 0.1000.

Epoch 00001: val_acc improved from -inf to 0.49683, saving model to model_save/weights-01-0.4968.hdf5
Epoch 2/50
auc: 0.5    f1_micro: 0.4968

Learning rate is 0.0900.

Epoch 00002: val_acc did not improve from 0.49683
Epoch 3/50
auc: 0.5    f1_micro: 0.4968

Learning rate is 0.0770.

Epoch 00003: val_acc did not improve from 0.49683
Epoch 4/50
auc: 0.5    f1_micro: 0.5032

Learning rate is 0.0693.

Epoch 00004: val_acc improved from 0.49683 to 0.50317, saving model to model_save/weights-04-0.5032.hdf5
Epoch 5/50
auc: 0.5    f1_micro: 0.5032

Learning rate is 0.0693.

Epoch 00005: val_acc did not improve from 0.50317
Epoch 6/50
auc: 0.5    f1_micro: 0.5032

Learning rate is 0.0658.

Epoch 00006: val_acc did not improve from 0.50317
Epoch 00006: early stopping


Unnamed: 0,learning_rate,loss,acc,val_loss,val_acc,auc,f1_micro
1,0.1,0.6949,0.4983,0.6946,0.4968,0.5,0.4968
2,0.1,0.6955,0.5023,0.6932,0.4968,0.5,0.4968
3,0.09,0.6943,0.5044,0.6961,0.4968,0.5,0.4968
4,0.07695,0.6945,0.5023,0.6931,0.5032,0.5,0.5032
5,0.069255,0.6946,0.4971,0.6953,0.5032,0.5,0.5032
6,0.069255,0.6943,0.4963,0.6934,0.5032,0.5,0.5032


<b>Model-4</b>
<pre>
1. Try with any values to get better accuracy/f1 score.  
</pre>
</pre>