
# Learning Rate Scheduler

-----

# 0. Preliminary setup

In [None]:
import random
import numpy as np
import os
import tensorflow as tf

def seed_everything(seed: int = 42):
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    tf.random.set_seed(seed)
    
seed_everything()

In [None]:
from tensorflow.keras.datasets import fashion_mnist

(x, y), (x_test, y_test) = fashion_mnist.load_data()


from tensorflow.keras.utils import to_categorical

x = x.astype('float32') 
x_test = x_test.astype('float32')

x /= 255
x_test /= 255

y = to_categorical(y, 10)
y_test = to_categorical(y_test, 10)


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import SGD

-----

In [None]:
lr = 0.01
momentum = 0.9

---

In [None]:
def create_model(learning_rate=lr, momentum=momentum):
    # Small fully connected classifier for 28x28 Fashion-MNIST images
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(units=32, activation='relu'))
    model.add(Dense(units=64, activation='relu'))
    model.add(Dense(units=128, activation='relu'))
    model.add(Dense(units=10, activation='softmax'))

    # SGD with momentum, compiled with a fixed learning rate
    sgd = SGD(learning_rate=learning_rate, momentum=momentum, nesterov=False)
    model.compile(optimizer=sgd,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model

model = create_model()
model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_4 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_16 (Dense)             (None, 32)                25120     
_________________________________________________________________
dense_17 (Dense)             (None, 64)                2112      
_________________________________________________________________
dense_18 (Dense)             (None, 128)               8320      
_________________________________________________________________
dense_19 (Dense)             (None, 10)                1290      
Total params: 36,842
Trainable params: 36,842
Non-trainable params: 0
_________________________________________________________________


----

# 1. Writing a schedule function

Write a function that returns the learning rate to use for a given epoch.

In [None]:
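def my_schedule(epoch, current_lr=lr):
    # Keras passes the optimizer's current learning rate as the second argument;
    # this schedule ignores it and always computes from the base rate `lr`.
    if epoch < 5:
        # Hold the base rate for the first five epochs (epoch is 0-indexed)
        return lr
    else:
        # Then decay exponentially: lr * exp(-0.1 * (epoch - 5))
        return float(lr * tf.math.exp(0.1 * (5 - epoch)))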

In [None]:
from tensorflow.keras.callbacks import LearningRateScheduler

In [None]:
lr_schedule_custom = LearningRateScheduler(my_schedule, verbose=1)

In [None]:
model = create_model()

model.fit(x, y, epochs=20, validation_split=1/6, callbacks=[lr_schedule_custom], batch_size=512, shuffle=False)

Epoch 1/20

Epoch 00001: LearningRateScheduler reducing learning rate to 0.01.
Epoch 2/20

Epoch 00002: LearningRateScheduler reducing learning rate to 0.01.
Epoch 3/20

Epoch 00003: LearningRateScheduler reducing learning rate to 0.01.
Epoch 4/20

Epoch 00004: LearningRateScheduler reducing learning rate to 0.01.
Epoch 5/20

Epoch 00005: LearningRateScheduler reducing learning rate to 0.01.
Epoch 6/20

Epoch 00006: LearningRateScheduler reducing learning rate to 0.009999999776482582.
Epoch 7/20

Epoch 00007: LearningRateScheduler reducing learning rate to 0.009048374369740486.
Epoch 8/20

Epoch 00008: LearningRateScheduler reducing learning rate to 0.008187307976186275.
Epoch 9/20

Epoch 00009: LearningRateScheduler reducing learning rate to 0.007408181671053171.
Epoch 10/20

Epoch 00010: LearningRateScheduler reducing learning rate to 0.006703200284391642.
Epoch 11/20

Epoch 00011: LearningRateScheduler reducing learning rate to 0.006065306719392538.
Epoch 12/20

Epoch 00012: Learnin

<tensorflow.python.keras.callbacks.History at 0x170e32935e0>
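
The learning rates printed in the log above can be reproduced without training. The following is a minimal sketch, assuming the same base rate of 0.01; `preview_schedule` and `BASE_LR` are hypothetical names introduced here, and it uses `math.exp` instead of `tf.math.exp`, so the values can differ from the log in the last float32 digits.

```python
import math

BASE_LR = 0.01  # assumed to match `lr` defined earlier in the notebook


def preview_schedule(epoch, base_lr=BASE_LR):
    """Pure-Python stand-in for my_schedule (hypothetical helper)."""
    if epoch < 5:
        # Constant rate for the first five (0-indexed) epochs
        return base_lr
    # Exponential decay afterwards
    return base_lr * math.exp(0.1 * (5 - epoch))


# Keras passes 0-indexed epochs, so "Epoch 7/20" in the log is epoch=6 here.
rates = [preview_schedule(epoch) for epoch in range(20)]
for epoch, rate in enumerate(rates, start=1):
    print(f"Epoch {epoch:2d}: lr = {rate:.6f}")
```

Previewing the schedule this way is a cheap check that the rate changes at the intended epoch before spending time on a full training run.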

# 2. Using TensorFlow's built-in schedulers

https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/schedules

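**def** decayed_learning_rate(step):

  > return initial_learning_rate * decay_rate ^ (step / decay_steps)

Here `^` denotes exponentiation and `step` counts optimizer updates (batches), not epochs. The learning rate is multiplied by `decay_rate` once every `decay_steps` steps.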
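
To get a feel for how slowly this formula decays, it can be evaluated in plain Python. The sketch below is illustrative only, not the TensorFlow implementation: `exponential_decay` is a hypothetical helper, and the step count assumes 60,000 Fashion-MNIST training images with 1/6 held out for validation, a batch size of 512, and 20 epochs.

```python
import math


def exponential_decay(step, initial_lr=0.01, decay_steps=10000,
                      decay_rate=0.96, staircase=False):
    """Plain-Python version of the decay formula (hypothetical helper)."""
    exponent = step / decay_steps
    if staircase:
        # With staircase=True the exponent is floored, so the rate drops in jumps
        exponent = math.floor(exponent)
    return initial_lr * decay_rate ** exponent


# Assumption: 60,000 training images, 1/6 of them held out for validation
train_samples = 60000 - 60000 // 6
steps_per_epoch = math.ceil(train_samples / 512)  # batch_size=512
total_steps = steps_per_epoch * 20                # epochs=20

print(steps_per_epoch, total_steps)
print(exponential_decay(total_steps))  # rate at the end of training
print(exponential_decay(100000))       # rate after ten full decay periods
```

Under these assumptions training runs for about 1,960 steps, so the rate only falls from 0.01 to roughly 0.0099. A much smaller `decay_steps` would be needed for the decay to matter within 20 epochs.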

In [None]:
from tensorflow.keras.optimizers.schedules import ExponentialDecay

In [None]:
lr_scheduler_exp = ExponentialDecay(lr, decay_steps=10000, decay_rate=0.96, staircase=False, name=None)

In [None]:
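def exp_model():
    # Same architecture as create_model(), but left uncompiled so the optimizer
    # can be built with a schedule object instead of a fixed learning rate
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(units=32, activation='relu'))
    model.add(Dense(units=64, activation='relu'))
    model.add(Dense(units=128, activation='relu'))
    model.add(Dense(units=10, activation='softmax'))

    return model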

In [None]:
model = exp_model()
sgd = SGD(learning_rate=lr_scheduler_exp, momentum=momentum, nesterov=False)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])


model.fit(x, y, epochs=20, validation_split=1/6, batch_size=512)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x170e21976a0>

In [None]:
model.optimizer.lr(100000)  # schedule value at step 100,000: 0.01 * 0.96 ** 10

<tf.Tensor: shape=(), dtype=float32, numpy=0.0066483244>

# 3. ReduceLROnPlateau

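- `monitor` is the metric to watch. With `'val_loss'`, the learning rate is reduced once `val_loss` stops decreasing.

- `factor` sets how much the learning rate is reduced.
The new learning rate is the current learning rate * `factor`.

- `patience` is the number of consecutive epochs without improvement to wait before reducing the learning rate. For example, with `patience=3`, if accuracy reached 99% at epoch 30 and then 98%, 98.5%, and 98% at epochs 31 through 33,
the model has gone three epochs without improvement, so the ReduceLROnPlateau callback reduces the learning rate.

- `verbose=1` prints a message each time the learning rate is reduced.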
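
The interaction between `factor` and `patience` can be traced by hand on a made-up sequence of validation losses. This is a simplified sketch of the idea, not Keras's actual implementation: it ignores options such as `cooldown` and `min_delta`, and `simulate_reduce_on_plateau` and the loss values are hypothetical.

```python
def simulate_reduce_on_plateau(val_losses, initial_lr=0.01, factor=0.1,
                               patience=2, min_lr=0.0):
    """Return the learning rate in effect after each epoch (simplified)."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since the monitored value last improved
    history = []
    for loss in val_losses:
        if loss < best:
            # Improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the rate
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Made-up validation losses: two plateaus of two epochs each
losses = [0.50, 0.45, 0.46, 0.47, 0.44, 0.44, 0.45]
history = simulate_reduce_on_plateau(losses)
for epoch, (loss, rate) in enumerate(zip(losses, history), start=1):
    print(f"Epoch {epoch}: val_loss={loss:.2f}, lr after epoch={rate:g}")
```

In this trace the rate is cut after epochs 4 and 7, each time following two consecutive epochs in which the loss failed to beat the best value seen so far.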



In [None]:
from tensorflow.keras.callbacks import ReduceLROnPlateau

In [None]:
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, verbose=1)

In [None]:
model = create_model()

model.fit(x, y, epochs=30, validation_split=1/6, callbacks=[reduce_lr], batch_size=512, shuffle=False)

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30

Epoch 00029: ReduceLROnPlateau reducing learning rate to 0.0009999999776482583.
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x170e34eb8e0>