
##### Experimenting with different optimizers

The most popular and best-known optimizer is Stochastic Gradient Descent (SGD).

This technique is widely used in other machine learning models as well.

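SGD is an iterative method for finding a minimum or maximum of a function: at each step it updates the parameters using the gradient estimated from a randomly drawn sample or mini-batch.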
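
To make the iteration concrete, here is a minimal pure-Python sketch of stochastic gradient descent fitting a single weight `w` in `y = w * x`. The toy data, the learning rate, and the function name `sgd_fit` are illustrative assumptions and are not part of the experiment below.

```python
import random

def sgd_fit(xs, ys, learning_rate=0.01, epochs=200, seed=2022):
    """Fit y ~ w * x by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                 # visit the samples in random order
        for i in indices:
            error = w * xs[i] - ys[i]
            grad = 2.0 * error * xs[i]       # d/dw of (w * x - y) ** 2
            w -= learning_rate * grad        # step against the gradient
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]                   # toy data generated with w = 3
w = sgd_fit(xs, ys)
print(round(w, 3))                           # the estimate ends up at the true slope, 3.0
```

Each update uses one sample at a time, which is the "stochastic" part; Keras applies the same idea to mini-batches of size `batch_size`.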

Many popular variants of SGD try to speed up convergence and reduce the amount of tuning required by using an adaptive learning rate.

The following are some of the most commonly used optimizers in deep learning:
SGD, Adagrad, AdaDelta, Adam, RMSprop, Momentum, Nesterov Accelerated Gradient (NAG)
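
As a rough illustration of what an adaptive learning rate buys, the sketch below compares a plain gradient step with an Adagrad-style step on a poorly scaled quadratic loss. The loss function, the step sizes, and the helper names are assumptions chosen for illustration; the Keras optimizers include further details that are not shown here.

```python
import math

def loss(a, b):
    # Poorly scaled quadratic: curvature along `a` is 100x that along `b`.
    return 0.5 * (100.0 * a * a + b * b)

def grad(a, b):
    return 100.0 * a, b

def plain_gd(steps, learning_rate=0.01):
    """One global learning rate, limited by the steepest direction."""
    a, b = 1.0, 1.0
    for _ in range(steps):
        ga, gb = grad(a, b)
        a -= learning_rate * ga
        b -= learning_rate * gb
    return loss(a, b)

def adagrad(steps, learning_rate=0.5, eps=1e-8):
    """Adagrad-style update: each parameter is scaled by its own gradient history."""
    a, b = 1.0, 1.0
    sum_sq_a = sum_sq_b = 0.0                # running sums of squared gradients
    for _ in range(steps):
        ga, gb = grad(a, b)
        sum_sq_a += ga * ga
        sum_sq_b += gb * gb
        a -= learning_rate * ga / (math.sqrt(sum_sq_a) + eps)
        b -= learning_rate * gb / (math.sqrt(sum_sq_b) + eps)
    return loss(a, b)

print("plain GD loss after 100 steps:", plain_gd(100))
print("Adagrad  loss after 100 steps:", adagrad(100))
```

With a single learning rate, the step size has to stay small enough for the steep direction, so the flat direction converges slowly. The per-parameter scaling removes that coupling in this toy case.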

The choice of optimizer is somewhat arbitrary and depends largely on the user's ability to tune it.

No single optimizer performs best on every problem.

With SGD, picking a small learning rate makes it less likely that the updates overshoot a minimum or diverge, but the downside is that training takes significantly longer.
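
The cost of a small learning rate can be sketched on a toy problem. The snippet below counts how many gradient steps it takes to minimise `f(x) = x**2` for several learning rates; the function name `steps_to_converge`, the tolerance, and the chosen rates are assumptions made for this illustration.

```python
def steps_to_converge(learning_rate, start=1.0, tol=1e-6, max_steps=10_000):
    """Gradient descent on f(x) = x**2.

    Returns the number of steps until |x| < tol, or None if the
    iterates blow up or the step budget runs out.
    """
    x = start
    for step in range(1, max_steps + 1):
        x -= learning_rate * 2.0 * x         # gradient of x**2 is 2x
        if abs(x) < tol:
            return step
        if abs(x) > 1e12:
            return None                      # diverged
    return None

for lr in (0.001, 0.01, 0.1, 1.1):
    print(lr, steps_to_converge(lr))
```

Smaller rates need many more steps, while a rate that is too large makes the iterates grow without bound. In a real training run, this kind of divergence typically shows up as a `nan` loss.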

Today we will train our network with different optimizers and compare the results.

In [None]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
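# Import everything from tensorflow.keras so that models, callbacks and optimizers come from the same package
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import SGD, Adadelta, Adam, RMSprop, Adagrad, Nadam, Adamax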

SEED = 2022

In [None]:
# Data can be downloaded at https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv

In [None]:
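# Read the semicolon-separated CSV from the UCI URL given above (a local copy works as well)
data = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv', sep=';')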
y = data['quality']
X = data.drop(['quality'], axis=1)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=SEED)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=SEED)
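
The two successive 80/20 splits leave roughly 64% of the rows for training, 16% for validation, and 20% for testing. The sketch below works out the row counts under two assumptions: that the red-wine CSV has 1599 rows, and that `train_test_split` rounds a fractional `test_size` up to the next whole row. The helper `split_sizes` is hypothetical and only mirrors that arithmetic.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    n_test = math.ceil(test_size * n_samples)    # assumed rounding rule
    return n_samples - n_test, n_test

n_total = 1599                                   # assumed row count of winequality-red.csv
n_train_full, n_test = split_sizes(n_total, 0.2)
n_train, n_val = split_sizes(n_train_full, 0.2)
print(n_train, n_val, n_test)                    # 1023 256 320
```

With a batch size of 128, 1023 training rows give 8 batches per epoch, which is consistent with the `1/8` progress counter in the training logs further down.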

In [None]:
SEED

2022

In [None]:
print(np.any(np.isnan(X_test)))
print(np.any(np.isinf(X_test)))

False
False


In [None]:
print(np.any(np.isnan(X_train)))
print(np.any(np.isinf(X_train)))

False
False


In [None]:
print(np.any(np.isnan(y_test)))
print(np.any(np.isinf(y_test)))

False
False


In [None]:
print(np.any(np.isnan(y_train)))
print(np.any(np.isinf(y_train)))

False
False


In [None]:
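def create_model(opt):
    # `opt` is not used here; the optimizer is attached later in model.compile()
    model = Sequential()
    model.add(Dense(100, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(25, activation='relu'))
    model.add(Dense(10, activation='relu'))
    # Single linear output: the wine quality is predicted as a continuous value
    model.add(Dense(1, activation='linear'))
    return model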

Patience is an important parameter of the EarlyStopping callback.
If patience is set to X epochs, training terminates only when the monitored performance measure has not improved for X epochs in a row.

Early stopping is used to keep the model from overfitting.
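
The patience rule can be sketched in a few lines of plain Python. This is a simplified model of the behaviour described above, not the Keras implementation (which has further options such as `min_delta`); the function name `early_stopping_epoch` and the sample losses are made up for illustration.

```python
def early_stopping_epoch(scores, patience, higher_is_better=True):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = None
    epochs_without_improvement = 0
    for epoch, score in enumerate(scores, start=1):
        improved = (
            best is None
            or (score > best if higher_is_better else score < best)
        )
        if improved:
            best = score
            epochs_without_improvement = 0       # reset the counter
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Illustrative validation losses: the best value is reached at epoch 6
val_loss = [0.90, 0.70, 0.65, 0.66, 0.67, 0.64, 0.65, 0.66, 0.68]
print(early_stopping_epoch(val_loss, patience=3, higher_is_better=False))   # 9
print(early_stopping_epoch(val_loss, patience=2, higher_is_better=False))   # 5
```

A larger patience tolerates temporary plateaus (here the improvement at epoch 6 is only seen with `patience=3`), at the cost of training for longer before giving up.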

In [None]:
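def create_callbacks(opt):
    callbacks = [
        # Stop when the validation accuracy has not improved for 50 epochs in a row
        EarlyStopping(monitor='val_accuracy', patience=50, verbose=2),
        # Keep only the weights from the epoch with the best validation accuracy
        ModelCheckpoint('checkpoints/optimizers_best_' + opt + '.h5', monitor='val_accuracy',
                        save_best_only=True, verbose=1)
    ]
    return callbacks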

In [None]:
opts = {
    'sgd': SGD(),
    'sgd-0001': SGD(learning_rate=0.0001, decay=0.00001),
    'adam': Adam(),
    'adadelta': Adadelta(),
    'rmsprop': RMSprop(),
    'rmsprop-0001': RMSprop(learning_rate=0.0001),
    'nadam': Nadam(),
    'adamax': Adamax()
}
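
The `decay` argument on `'sgd-0001'` shrinks the learning rate as training progresses. Assuming the legacy Keras behaviour, where the rate at iteration `t` (counted in batches) is `lr / (1 + decay * t)`, the effect can be sketched as follows; the helper name `decayed_learning_rate` is hypothetical.

```python
def decayed_learning_rate(initial_lr, decay, iteration):
    """Time-based decay (assumed legacy Keras rule): lr_t = lr_0 / (1 + decay * t)."""
    return initial_lr / (1.0 + decay * iteration)

initial_lr, decay = 0.0001, 0.00001          # values used for 'sgd-0001'
for iteration in (0, 1_000, 8_000, 100_000):
    print(iteration, decayed_learning_rate(initial_lr, decay, iteration))
```

At 8 batches per epoch and at most 1000 epochs, the run stays below 8000 iterations, so under this rule the learning rate would drop by at most about 7% over the whole run.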

In [None]:
X_train.values

array([[10.6  ,  0.42 ,  0.48 , ...,  3.21 ,  0.87 , 11.3  ],
       [10.   ,  0.49 ,  0.2  , ...,  3.16 ,  0.69 ,  9.2  ],
       [ 9.5  ,  0.56 ,  0.33 , ...,  3.28 ,  0.73 , 11.8  ],
       ...,
       [11.8  ,  0.38 ,  0.55 , ...,  3.11 ,  0.62 , 10.8  ],
       [ 7.4  ,  0.785,  0.19 , ...,  3.16 ,  0.52 ,  9.6  ],
       [11.6  ,  0.23 ,  0.57 , ...,  3.14 ,  0.7  ,  9.9  ]])

In [None]:
batch_size = 128
n_epochs = 1000

results = []
# Loop through the optimizers
for opt in opts:
    model = create_model(opt)
    callbacks = create_callbacks(opt)
    model.compile(loss='mse', optimizer=opts[opt], metrics=['accuracy'])
    hist = model.fit(X_train.values, y_train, batch_size=batch_size, epochs=n_epochs,
                     validation_data=(X_val.values, y_val), verbose=1,
                     callbacks=callbacks)
    print(hist.history)
    # Index (0-based) of the epoch with the highest validation accuracy
    best_epoch = np.argmax(hist.history['val_accuracy'])
    print(best_epoch)
    best_acc = hist.history['val_accuracy'][best_epoch]
    print(best_acc)
    best_model = create_model(opt)
    best_model.summary()
    # Load the checkpointed weights saved by ModelCheckpoint
    best_model.load_weights('checkpoints/optimizers_best_' + opt + '.h5')
    best_model.compile(loss='mse', optimizer=opts[opt], metrics=['accuracy'])
    # evaluate() returns [loss, accuracy]; keep the test accuracy
    score = best_model.evaluate(X_test.values, y_test, verbose=0)
    results.append([opt, best_epoch, best_acc, score[1]])

Epoch 1/1000
1/8 [==>...........................] - ETA: 14s - loss: 58.6803 - accuracy: 0.0000e+00
Epoch 1: accuracy improved from -inf to 0.00000, saving model to checkpoints\optimizers_best_sgd.h5
Epoch 2/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 2: accuracy did not improve from 0.00000
Epoch 3/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 3: accuracy did not improve from 0.00000
Epoch 4/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 4: accuracy did not improve from 0.00000
Epoch 5/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 5: accuracy did not improve from 0.00000
Epoch 6/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 6: accuracy did not improve from 0.00000
Epoch 7/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000

Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 32: accuracy did not improve from 0.00000
Epoch 33/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 33: accuracy did not improve from 0.00000
Epoch 34/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 34: accuracy did not improve from 0.00000
Epoch 35/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 35: accuracy did not improve from 0.00000
Epoch 36/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 36: accuracy did not improve from 0.00000
Epoch 37/1000
1/8 [==>...........................] - ETA: 0s - loss: nan - accuracy: 0.0000e+00
Epoch 37: accuracy did not improve from 

Epoch 2/1000
1/8 [==>...........................] - ETA: 0s - loss: 2.3407 - accuracy: 0.0000e+00
Epoch 2: accuracy did not improve from 0.00000
Epoch 3/1000
1/8 [==>...........................] - ETA: 0s - loss: 2.8173 - accuracy: 0.0000e+00
Epoch 3: accuracy did not improve from 0.00000
Epoch 4/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.9318 - accuracy: 0.0000e+00
Epoch 4: accuracy did not improve from 0.00000
Epoch 5/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.7958 - accuracy: 0.0000e+00
Epoch 5: accuracy did not improve from 0.00000
Epoch 6/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.4958 - accuracy: 0.0000e+00
Epoch 6: accuracy did not improve from 0.00000
Epoch 7/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.2554 - accuracy: 0.0000e+00
Epoch 7: accuracy did not improve from 0.00000
Epoch 8/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.2371 - accuracy: 0.0000e+00
Epoch 8: accuracy did not improv

1/8 [==>...........................] - ETA: 0s - loss: 0.7878 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7712 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improve from 0.00000
Epoch 33/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.8238 - accuracy: 0.0000e+00
Epoch 33: accuracy did not improve from 0.00000
Epoch 34/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.6987 - accuracy: 0.0000e+00
Epoch 34: accuracy did not improve from 0.00000
Epoch 35/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.6164 - accuracy: 0.0000e+00
Epoch 35: accuracy did not improve from 0.00000
Epoch 36/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.6569 - accuracy: 0.0000e+00
Epoch 36: accuracy did not improve from 0.00000
Epoch 37/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.6965 - accuracy: 0.0000e+00
Epoch 37: accuracy did not improv

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_20 (Dense)            (None, 100)               1200      
                                                                 
 dense_21 (Dense)            (None, 50)                5050      
                                                                 
 dense_22 (Dense)            (None, 25)                1275      
                                                                 
 dense_23 (Dense)            (None, 10)                260       
                                                                 
 dense_24 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] - 

1/8 [==>...........................] - ETA: 0s - loss: 0.3775 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4362 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4305 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4282 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5465 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.3988 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5021 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improv

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_30 (Dense)            (None, 100)               1200      
                                                                 
 dense_31 (Dense)            (None, 50)                5050      
                                                                 
 dense_32 (Dense)            (None, 25)                1275      
                                                                 
 dense_33 (Dense)            (None, 10)                260       
                                                                 
 dense_34 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] - 

Epoch 26/1000
1/8 [==>...........................] - ETA: 0s - loss: 144.2351 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 144.5144 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 140.5397 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 158.3648 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 153.6864 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 153.6566 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 140.1255 - accuracy: 0.0000e+00
Epoch

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_40 (Dense)            (None, 100)               1200      
                                                                 
 dense_41 (Dense)            (None, 50)                5050      
                                                                 
 dense_42 (Dense)            (None, 25)                1275      
                                                                 
 dense_43 (Dense)            (None, 10)                260       
                                                                 
 dense_44 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] - ETA: 4s - loss: 20.045

1/8 [==>...........................] - ETA: 0s - loss: 0.4642 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7084 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4832 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4697 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 1.1630 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5795 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5472 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improv

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_50 (Dense)            (None, 100)               1200      
                                                                 
 dense_51 (Dense)            (None, 50)                5050      
                                                                 
 dense_52 (Dense)            (None, 25)                1275      
                                                                 
 dense_53 (Dense)            (None, 10)                260       
                                                                 
 dense_54 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] -

1/8 [==>...........................] - ETA: 0s - loss: 0.5144 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5628 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5867 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7435 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5829 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7252 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7312 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improv

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_60 (Dense)            (None, 100)               1200      
                                                                 
 dense_61 (Dense)            (None, 50)                5050      
                                                                 
 dense_62 (Dense)            (None, 25)                1275      
                                                                 
 dense_63 (Dense)            (None, 10)                260       
                                                                 
 dense_64 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] - ETA: 4s - loss: 42.402

1/8 [==>...........................] - ETA: 0s - loss: 0.4381 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.3659 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4011 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5265 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7251 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.7080 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5258 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improv

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_70 (Dense)            (None, 100)               1200      
                                                                 
 dense_71 (Dense)            (None, 50)                5050      
                                                                 
 dense_72 (Dense)            (None, 25)                1275      
                                                                 
 dense_73 (Dense)            (None, 10)                260       
                                                                 
 dense_74 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1000
1/8 [==>...........................] - ETA: 2s - loss: 28.615

1/8 [==>...........................] - ETA: 0s - loss: 0.4653 - accuracy: 0.0000e+00
Epoch 26: accuracy did not improve from 0.00000
Epoch 27/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5072 - accuracy: 0.0000e+00
Epoch 27: accuracy did not improve from 0.00000
Epoch 28/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.5192 - accuracy: 0.0000e+00
Epoch 28: accuracy did not improve from 0.00000
Epoch 29/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4131 - accuracy: 0.0000e+00
Epoch 29: accuracy did not improve from 0.00000
Epoch 30/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.3424 - accuracy: 0.0000e+00
Epoch 30: accuracy did not improve from 0.00000
Epoch 31/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.4565 - accuracy: 0.0000e+00
Epoch 31: accuracy did not improve from 0.00000
Epoch 32/1000
1/8 [==>...........................] - ETA: 0s - loss: 0.6046 - accuracy: 0.0000e+00
Epoch 32: accuracy did not improv

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_80 (Dense)            (None, 100)               1200      
                                                                 
 dense_81 (Dense)            (None, 50)                5050      
                                                                 
 dense_82 (Dense)            (None, 25)                1275      
                                                                 
 dense_83 (Dense)            (None, 10)                260       
                                                                 
 dense_84 (Dense)            (None, 1)                 11        
                                                                 
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
_________________________________________________________________


In [None]:
res = pd.DataFrame(results)
res

              0  1    2    3
0           sgd  0  0.0  0.0
1      sgd-0001  0  0.0  0.0
2          adam  0  0.0  0.0
3      adadelta  0  0.0  0.0
4       rmsprop  0  0.0  0.0
5  rmsprop-0001  0  0.0  0.0
6         nadam  0  0.0  0.0
7        adamax  0  0.0  0.0
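
Accuracy is a classification metric, whereas the network here is a regressor with a linear output trained on MSE, so the number Keras reports depends on how that version compares continuous predictions with integer labels. That may be why every accuracy above is 0 even though the loss falls. One common way to get an interpretable score for integer targets such as wine quality is to round each prediction to the nearest integer before comparing. The sketch below does this in plain Python; the helper name `rounded_accuracy` and the sample values are illustrative assumptions, not output from this notebook.

```python
def rounded_accuracy(y_true, y_pred):
    """Fraction of predictions whose nearest integer equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if round(p) == t)
    return hits / len(y_true)

y_true = [5, 6, 5, 7]                 # illustrative quality labels
y_pred = [5.2, 5.6, 4.3, 6.9]         # illustrative regression outputs
print(rounded_accuracy(y_true, y_pred))   # 3 of 4 match -> 0.75
```

The same helper could be applied to `y_test` and the flattened output of `best_model.predict(X_test.values)` to score each optimizer independently of the built-in metric.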


In [None]:

res.columns = ['optimizer', 'epochs', 'val_accuracy', 'test_accuracy']
res

      optimizer  epochs  val_accuracy  test_accuracy
0       rmsprop     216      0.574219       0.571875
1        adamax     251      0.585938       0.603125
2      sgd-0001     167      0.562500       0.571875
3         nadam     133      0.582031       0.553125
4          adam     139      0.578125       0.581250
5           sgd       0      0.000000       0.000000
6  rmsprop-0001      62      0.550781       0.565625
7      adadelta     208      0.578125       0.575000
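
To compare the optimizers at a glance, the rows can be ranked by test accuracy. The sketch below copies the values from the table above into a plain Python list; the variable names are illustrative.

```python
# (optimizer, epochs, val_accuracy, test_accuracy), copied from the table above
rows = [
    ("rmsprop", 216, 0.574219, 0.571875),
    ("adamax", 251, 0.585938, 0.603125),
    ("sgd-0001", 167, 0.5625, 0.571875),
    ("nadam", 133, 0.582031, 0.553125),
    ("adam", 139, 0.578125, 0.58125),
    ("sgd", 0, 0.0, 0.0),
    ("rmsprop-0001", 62, 0.550781, 0.565625),
    ("adadelta", 208, 0.578125, 0.575),
]

# Highest test accuracy first
ranked = sorted(rows, key=lambda row: row[3], reverse=True)
for name, epochs, val_acc, test_acc in ranked:
    print(f"{name:<13}{epochs:>4}{val_acc:>10.4f}{test_acc:>10.4f}")
```

In this run `adamax` ranks first and the default `sgd` last. With the DataFrame itself, `res.sort_values('test_accuracy', ascending=False)` gives the same ordering.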
