# Lift Performance with Learning Rate Schedules

- Two types of learning rate schedule are covered here:
  - time-based learning rate schedule
  - drop-based learning rate schedule


- Time-based learning rate schedule:
  - With each epoch, the learning rate is decreased from a higher starting value
  - $ \text{learning rate} = \text{learning rate} \times \frac{1}{1 + \text{decay} \times \text{epoch}} $
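
As a rough illustration of the time-based formula, the sketch below computes the rate for a given epoch. It assumes the rate on the right-hand side is the initial rate; the function name `time_based_lr` and the sample values are hypothetical and not part of the original notebook.

```python
def time_based_lr(initial_rate, decay, epoch):
    """Time-based decay: initial_rate / (1 + decay * epoch)."""
    return initial_rate * 1.0 / (1.0 + decay * epoch)

# Illustrative values (assumed): initial rate 0.1, decay 0.002
schedule = [time_based_lr(0.1, 0.002, epoch) for epoch in range(5)]
for epoch, rate in enumerate(schedule):
    print(f"epoch {epoch}: learning rate {rate:.6f}")
```

With these values the rate shrinks slowly and smoothly; a larger `decay` makes it fall faster.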


- Drop-based learning rate schedule:
  - The learning rate is cut by a fixed factor (for example, halved) every fixed number of epochs
  - $ \text{learning rate} = \text{initial learning rate} \times \text{drop rate}^{\lfloor (1 + \text{epoch}) / \text{epoch drop} \rfloor} $
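
To make the drop-based formula concrete, here is a minimal sketch that evaluates it at a few epochs. The function name `drop_based_lr` and the chosen values (start at 0.1, halve every 10 epochs) are assumptions for illustration. Epochs are counted from 0, so the `1 + epoch` term makes the first drop land on epoch index 9.

```python
import math

def drop_based_lr(initial_rate, drop_rate, epochs_drop, epoch):
    """Step decay: multiply by drop_rate once every epochs_drop epochs."""
    return initial_rate * math.pow(drop_rate, math.floor((1 + epoch) / epochs_drop))

# Assumed values: start at 0.1 and halve every 10 epochs
rates = {epoch: drop_based_lr(0.1, 0.5, 10, epoch) for epoch in (0, 8, 9, 18, 19, 29)}
for epoch, rate in rates.items():
    print(f"epoch {epoch}: learning rate {rate}")
```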


This project classifies 351 radar samples, each described by 34 features, as 'Good' or 'Bad' ionosphere. The problem is used to demonstrate how the two learning rate schedules can lift performance.

Data source:
- https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv or
- https://archive.ics.uci.edu/ml/datasets/Ionosphere

Tips:
- Set the initial learning rate to a relatively high value, since the schedule will decrease it during training.
- Using a larger momentum value will help the optimization algorithm continue to make updates in the right direction when your learning rate shrinks to small values.
- It is not clear in advance which learning rate schedule to use, so try a few with different configuration options and see what works best on your problem. Also try schedules that change exponentially, and even schedules that respond to the accuracy of your model on the training or test datasets.
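
One possible shape for the exponential schedule mentioned in the last tip is sketched below. The function name `exp_decay` and the constants `initial_rate=0.1` and `k=0.1` are assumptions, not values from this project. Because it takes the epoch index as its first argument, a function like this could be handed to the `LearningRateScheduler` callback in the same way as `step_decay` further down. For schedules that react to model performance, Keras also provides a `ReduceLROnPlateau` callback, which is not used in this notebook.

```python
import math

def exp_decay(epoch, initial_rate=0.1, k=0.1):
    """Exponential decay: initial_rate * exp(-k * epoch)."""
    return initial_rate * math.exp(-k * epoch)

# Sample the schedule every 10 epochs over a 50-epoch run
exp_rates = [exp_decay(epoch) for epoch in range(0, 50, 10)]
for epoch, rate in zip(range(0, 50, 10), exp_rates):
    print(f"epoch {epoch}: learning rate {rate:.6f}")
```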

In [1]:
import pandas as pd
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from sklearn.preprocessing import LabelEncoder

In [2]:
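# Load the ionosphere data: 34 numeric features followed by a class label
dataframe = pd.read_csv('Data/ionosphere.csv', header=None)
dataset = dataframe.values

# Split into input features (X) and class labels (y)
X = dataset[:, 0:34].astype(float)
y = dataset[:, 34]

# Encode the string class labels as integers 0 and 1
encoder = LabelEncoder()
encoded_y = encoder.fit_transform(y)

# One hidden layer with 34 units and a sigmoid output for binary classification
model = Sequential()
model.add(Dense(34, input_dim=34, activation='relu'))
model.add(Dense(1, activation='sigmoid'))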

__Baseline accuracy__

In [3]:
epochs = 50
# Plain SGD with default settings and no schedule
sgd = SGD()
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.fit(X, encoded_y, validation_split=0.33, epochs=epochs, batch_size=28, verbose=2)

Train on 235 samples, validate on 116 samples
Epoch 1/50
235/235 - 1s - loss: 0.7222 - accuracy: 0.5489 - val_loss: 0.6154 - val_accuracy: 0.7586
Epoch 2/50
235/235 - 0s - loss: 0.7077 - accuracy: 0.5617 - val_loss: 0.6015 - val_accuracy: 0.7586
Epoch 3/50
235/235 - 0s - loss: 0.6962 - accuracy: 0.5787 - val_loss: 0.5847 - val_accuracy: 0.7586
Epoch 4/50
235/235 - 0s - loss: 0.6857 - accuracy: 0.5872 - val_loss: 0.5705 - val_accuracy: 0.7672
Epoch 5/50
235/235 - 0s - loss: 0.6758 - accuracy: 0.5872 - val_loss: 0.5630 - val_accuracy: 0.7672
Epoch 6/50
235/235 - 0s - loss: 0.6660 - accuracy: 0.5957 - val_loss: 0.5587 - val_accuracy: 0.7672
Epoch 7/50
235/235 - 0s - loss: 0.6575 - accuracy: 0.6043 - val_loss: 0.5523 - val_accuracy: 0.7672
Epoch 8/50
235/235 - 0s - loss: 0.6496 - accuracy: 0.6213 - val_loss: 0.5387 - val_accuracy: 0.7672
Epoch 9/50
235/235 - 0s - loss: 0.6412 - accuracy: 0.6298 - val_loss: 0.5316 - val_accuracy: 0.7672
Epoch 10/50
235/235 - 0s - loss: 0.6336 - accuracy: 0.

<tensorflow.python.keras.callbacks.History at 0x1bab27961c8>
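
The 235/116 split reported in the log follows from `validation_split=0.33`: Keras holds out the last fraction of the rows, before any shuffling. The sketch below reproduces the counts under the assumption that the split index is the truncated product of the sample count and the training fraction, which is consistent with the log above.

```python
n_samples = 351
validation_split = 0.33

# Assumption: the split index is the truncated product, as the log suggests
split_at = int(n_samples * (1.0 - validation_split))
n_train = split_at
n_val = n_samples - split_at
print(n_train, n_val)  # 235 116
```

Since the held-out rows are always the last ones in the file, the validation accuracy depends on how the rows are ordered in the dataset.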

__Time based learning schedule__

In [4]:
epochs = 50
learning_rate = 0.1
decay_rate = learning_rate / epochs
momentum = 0.8

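# Pass the configured optimizer object (not the string 'sgd') so that the
# learning rate, momentum and decay settings actually take effect
sgd = SGD(learning_rate=learning_rate, momentum=momentum, decay=decay_rate, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])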

model.fit(X, encoded_y, validation_split=0.33, epochs=epochs, batch_size=28, verbose=2)

Train on 235 samples, validate on 116 samples
Epoch 1/50
235/235 - 1s - loss: 0.4374 - accuracy: 0.8723 - val_loss: 0.3918 - val_accuracy: 0.9655
Epoch 2/50
235/235 - 0s - loss: 0.4340 - accuracy: 0.8723 - val_loss: 0.3912 - val_accuracy: 0.9655
Epoch 3/50
235/235 - 0s - loss: 0.4308 - accuracy: 0.8766 - val_loss: 0.3857 - val_accuracy: 0.9655
Epoch 4/50
235/235 - 0s - loss: 0.4276 - accuracy: 0.8723 - val_loss: 0.3799 - val_accuracy: 0.9655
Epoch 5/50
235/235 - 0s - loss: 0.4244 - accuracy: 0.8766 - val_loss: 0.3750 - val_accuracy: 0.9655
Epoch 6/50
235/235 - 0s - loss: 0.4219 - accuracy: 0.8723 - val_loss: 0.3746 - val_accuracy: 0.9655
Epoch 7/50
235/235 - 0s - loss: 0.4180 - accuracy: 0.8809 - val_loss: 0.3731 - val_accuracy: 0.9655
Epoch 8/50
235/235 - 0s - loss: 0.4157 - accuracy: 0.8809 - val_loss: 0.3725 - val_accuracy: 0.9655
Epoch 9/50
235/235 - 0s - loss: 0.4123 - accuracy: 0.8851 - val_loss: 0.3687 - val_accuracy: 0.9655
Epoch 10/50
235/235 - 0s - loss: 0.4093 - accuracy: 0.

<tensorflow.python.keras.callbacks.History at 0x1babe1f8208>
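
One detail worth checking: as far as I know, the `decay` argument of the older Keras optimizers is applied per weight update (one per batch) rather than per epoch. The sketch below is plain arithmetic, not Keras code, and estimates the rate at the end of training under that assumption. It uses the settings from the cell above and the 235 training samples reported in the log; the helper `lr_after` is hypothetical.

```python
import math

n_train, batch_size, epochs = 235, 28, 50
initial_rate = 0.1
decay = initial_rate / epochs  # 0.002, as in the cell above

# Number of weight updates per epoch (the last batch is smaller)
steps_per_epoch = math.ceil(n_train / batch_size)

def lr_after(updates):
    """Assumed per-update rule: initial_rate / (1 + decay * updates)."""
    return initial_rate / (1.0 + decay * updates)

final_rate = lr_after(steps_per_epoch * epochs)
print(steps_per_epoch, round(final_rate, 4))
```

Under this assumption the rate roughly halves over the 50 epochs, which is a faster decline than a per-epoch reading of the formula would give.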

__Drop based learning schedule__

In [6]:
from tensorflow.keras.callbacks import LearningRateScheduler
import math

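def step_decay(epoch):
    # Halve the learning rate every 10 epochs, starting from 0.1
    initial_rate = 0.1
    drop = 0.5
    epochs_drop = 10.0
    lrate = initial_rate * math.pow(drop, math.floor((1+epoch)/epochs_drop))

    return lrate

# The learning rate given here is only a placeholder: the callback overrides it at the start of each epoch
sgd = SGD(learning_rate=0.0, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])

# LearningRateScheduler calls step_decay with the epoch index (starting at 0)
lrate = LearningRateScheduler(step_decay)
callbacks_list = [lrate]

model.fit(X, encoded_y, validation_split=0.33, epochs=50, batch_size=28, callbacks=callbacks_list, verbose=2)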

Train on 235 samples, validate on 116 samples
Epoch 1/50
235/235 - 1s - loss: 0.3103 - accuracy: 0.9064 - val_loss: 0.1960 - val_accuracy: 0.9655
Epoch 2/50
235/235 - 0s - loss: 0.2527 - accuracy: 0.9191 - val_loss: 0.1651 - val_accuracy: 0.9741
Epoch 3/50
235/235 - 0s - loss: 0.2040 - accuracy: 0.9319 - val_loss: 0.1555 - val_accuracy: 0.9741
Epoch 4/50
235/235 - 0s - loss: 0.1537 - accuracy: 0.9532 - val_loss: 0.1585 - val_accuracy: 0.9483
Epoch 5/50
235/235 - 0s - loss: 0.1391 - accuracy: 0.9447 - val_loss: 0.1288 - val_accuracy: 0.9569
Epoch 6/50
235/235 - 0s - loss: 0.1281 - accuracy: 0.9574 - val_loss: 0.1218 - val_accuracy: 0.9828
Epoch 7/50
235/235 - 0s - loss: 0.1024 - accuracy: 0.9702 - val_loss: 0.0623 - val_accuracy: 0.9828
Epoch 8/50
235/235 - 0s - loss: 0.1171 - accuracy: 0.9447 - val_loss: 0.0830 - val_accuracy: 0.9828
Epoch 9/50
235/235 - 0s - loss: 0.1187 - accuracy: 0.9617 - val_loss: 0.2544 - val_accuracy: 0.9138
Epoch 10/50
235/235 - 0s - loss: 0.1265 - accuracy: 0.

<tensorflow.python.keras.callbacks.History at 0x1babfab0608>