## Deep Learning Tutorial 03: Advanced MLP - Learning Rate Scheduling (Ch17)

From *Deep Learning with Python* by Jason Brownlee (2016):
[e-book](https://machinelearningmastery.com/deep-learning-with-python/),
[summary](http://machinelearningmastery.com/introduction-python-deep-learning-library-keras/)

# Chapter 17 Lift Performance With Learning Rate Schedules

## 17.2 Ionosphere Classification Dataset
[Dataset homepage 1](http://www.is.umk.pl/projects/datasets.html#Ionosphere), 
[Dataset homepage 2](https://archive.ics.uci.edu/ml/datasets/Ionosphere), 
[Data file](http://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/ionosphere.data)

Number of features: 34  
Classes: good (`g`) / bad (`b`)

In [1]:
# To download the data, uncomment the line below and run it.
#!curl -o ~/Downloads/ionosphere.csv http://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/ionosphere.data

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 76467  100 76467    0     0  62837      0  0:00:01  0:00:01 --:--:-- 83937


In [2]:
!ls -al ~/Downloads/

total 421264
drwxr-xr-x  2 kikim kikim      4096  7월  7 16:07 .
drwxr-xr-x 33 kikim kikim      4096  7월  6 20:28 ..
-rw-rw-r--  1 kikim kikim 418188731  7월  2 14:29 Anaconda2-4.1.0-Linux-x86_64.sh
-rw-r--r--  1 kikim kikim    876581  7월  5 10:01 county_facts.csv
-rw-rw-r--  1 kikim kikim     49082  7월  7 10:16 housing.csv
-rw-rw-r--  1 kikim kikim     76467  7월  7 16:07 ionosphere.csv
-rw-rw-r--  1 kikim kikim      4551  7월  7 08:57 iris.csv
-rw-rw-r--  1 kikim kikim     23279  7월  6 18:30 pima-indians-diabetes.csv
-rw-rw-r--  1 kikim kikim     87776  7월  7 10:15 sonar.csv
-rw-rw-r--  1 kikim kikim  12027438  7월  1 17:01 tfk-notebooks-master.zip
-rw-rw-r--  1 kikim kikim      1361  7월  5 14:57 uk_rain_2014.csv


In [3]:
import pandas as pd
df = pd.read_csv('~/Downloads/ionosphere.csv', header=None)
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
0,1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1.00000,0.03760,...,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g
1,1,0,1.00000,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1.00000,-0.04549,...,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b
2,1,0,1.00000,-0.03365,1.00000,0.00485,1.00000,-0.12062,0.88965,0.01198,...,-0.40220,0.58984,-0.22145,0.43100,-0.17365,0.60436,-0.24180,0.56045,-0.38238,g
3,1,0,1.00000,-0.45161,1.00000,1.00000,0.71216,-1.00000,0.00000,0.00000,...,0.90695,0.51613,1.00000,1.00000,-0.20099,0.25682,1.00000,-0.32382,1.00000,b
4,1,0,1.00000,-0.02401,0.94140,0.06531,0.92106,-0.23255,0.77152,-0.16399,...,-0.65158,0.13290,-0.53206,0.02431,-0.62197,-0.05707,-0.59573,-0.04608,-0.65697,g
5,1,0,0.02337,-0.00592,-0.09924,-0.11949,-0.00763,-0.11824,0.14706,0.06637,...,-0.01535,-0.03240,0.09223,-0.07859,0.00732,0.00000,0.00000,-0.00039,0.12011,b
6,1,0,0.97588,-0.10602,0.94601,-0.20800,0.92806,-0.28350,0.85996,-0.27342,...,-0.81634,0.13659,-0.82510,0.04606,-0.82395,-0.04262,-0.81318,-0.13832,-0.80975,g
7,0,0,0.00000,0.00000,0.00000,0.00000,1.00000,-1.00000,0.00000,0.00000,...,1.00000,1.00000,1.00000,0.00000,0.00000,1.00000,1.00000,0.00000,0.00000,b
8,1,0,0.96355,-0.07198,1.00000,-0.14333,1.00000,-0.21313,1.00000,-0.36174,...,-0.65440,0.57577,-0.69712,0.25435,-0.63919,0.45114,-0.72779,0.38895,-0.73420,g
9,1,0,-0.01864,-0.08459,0.00000,0.00000,0.00000,0.00000,0.11470,-0.26810,...,-0.01326,0.20645,-0.02294,0.00000,0.00000,0.16595,0.24086,-0.08208,0.38065,b


## 17.3 Time-Based Learning Rate Schedule

In [5]:
# Time Based Learning Rate Decay
import pandas
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from sklearn.preprocessing import LabelEncoder
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("~/Downloads/ionosphere.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:34].astype(float)
Y = dataset[:,34]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
Y = encoder.transform(Y)
# create model
model = Sequential()
model.add(Dense(34, input_dim=34, init='normal', activation='relu'))
model.add(Dense(1, init='normal', activation='sigmoid'))
# Compile model
epochs = 50
learning_rate = 0.1
# decay constant for the time-based schedule: 0.1 / 50 = 0.002
decay_rate = learning_rate / epochs
momentum = 0.8
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Fit the model
model.fit(X, Y, validation_split=0.33, nb_epoch=epochs, batch_size=28, verbose=2)

Train on 235 samples, validate on 116 samples
Epoch 1/50
0s - loss: 0.6756 - acc: 0.7277 - val_loss: 0.6029 - val_acc: 0.8621
Epoch 2/50
0s - loss: 0.6179 - acc: 0.7787 - val_loss: 0.4956 - val_acc: 0.8793
Epoch 3/50
0s - loss: 0.5326 - acc: 0.8170 - val_loss: 0.4504 - val_acc: 0.9483
Epoch 4/50
0s - loss: 0.4405 - acc: 0.8298 - val_loss: 0.4003 - val_acc: 0.9397
Epoch 5/50
0s - loss: 0.3678 - acc: 0.8681 - val_loss: 0.4080 - val_acc: 0.8793
Epoch 6/50
0s - loss: 0.3104 - acc: 0.8979 - val_loss: 0.2980 - val_acc: 0.9397
Epoch 7/50
0s - loss: 0.2732 - acc: 0.9234 - val_loss: 0.1971 - val_acc: 0.9569
Epoch 8/50
0s - loss: 0.2313 - acc: 0.9106 - val_loss: 0.2190 - val_acc: 0.9397
Epoch 9/50
0s - loss: 0.2148 - acc: 0.9191 - val_loss: 0.1976 - val_acc: 0.9483
Epoch 10/50
0s - loss: 0.1930 - acc: 0.9319 - val_loss: 0.2361 - val_acc: 0.9138
Epoch 11/50
0s - loss: 0.2077 - acc: 0.9319 - val_loss: 0.1159 - val_acc: 0.9741
Epoch 12/50
0s - loss: 0.1728 - acc: 0.9447 - val_loss: 0.1710 - val_acc

<keras.callbacks.History at 0x7f698450e410>
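
The effect of the `decay` argument is easier to see when the schedule is computed by hand. The sketch below assumes the rule used by the Keras 1.x `SGD` optimizer, `lr = initial_lr / (1 + decay * iterations)`, where the counter advances once per batch update rather than once per epoch. The helper name `time_based_lr` and the `updates_per_epoch` value (235 training samples in batches of 28) are illustrative assumptions, not part of the original notebook.

```python
import math


def time_based_lr(initial_lr, decay, iteration):
    """Learning rate after `iteration` batch updates (assumed Keras 1.x SGD rule)."""
    return initial_lr / (1.0 + decay * iteration)


initial_lr = 0.1
epochs = 50
decay = initial_lr / epochs  # same decay constant as in the cell above

# Assumption: 235 training samples with batch_size=28 gives 9 updates per epoch.
updates_per_epoch = int(math.ceil(235 / 28.0))

for epoch in (0, 1, 10, 25, 49):
    iteration = epoch * updates_per_epoch
    rate = time_based_lr(initial_lr, decay, iteration)
    print("epoch %2d: lr = %.5f" % (epoch + 1, rate))
```

Under these assumptions the rate falls smoothly and reaches half its initial value after 500 updates, that is, a little past epoch 55, so over 50 epochs it never quite halves.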

## 17.4 Drop-Based Learning Rate Schedule

In [7]:
# Drop-Based Learning Rate Decay
import pandas
import numpy
import math
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from sklearn.preprocessing import LabelEncoder
from keras.callbacks import LearningRateScheduler

# learning rate schedule: halve the rate every 10 epochs (epoch is 0-indexed)
def step_decay(epoch):
    initial_lrate = 0.1
    drop = 0.5
    epochs_drop = 10.0
    lrate = initial_lrate * math.pow(drop, math.floor((1+epoch)/epochs_drop))
    return lrate

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("~/Downloads/ionosphere.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:34].astype(float)
Y = dataset[:,34]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
Y = encoder.transform(Y)
# create model
model = Sequential()
model.add(Dense(34, input_dim=34, init='normal', activation='relu'))
model.add(Dense(1, init='normal', activation='sigmoid'))
# Compile model; lr=0.0 is a placeholder because the LearningRateScheduler callback sets the rate each epoch
sgd = SGD(lr=0.0, momentum=0.9, decay=0.0, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])


# learning schedule callback
lrate = LearningRateScheduler(step_decay)
callbacks_list = [lrate]


# Fit the model
model.fit(X, Y, validation_split=0.33, nb_epoch=50, batch_size=28, callbacks=callbacks_list, verbose=2)

Train on 235 samples, validate on 116 samples
Epoch 1/50
0s - loss: 0.6742 - acc: 0.7277 - val_loss: 0.5805 - val_acc: 0.8707
Epoch 2/50
0s - loss: 0.5959 - acc: 0.7745 - val_loss: 0.4430 - val_acc: 0.8879
Epoch 3/50
0s - loss: 0.4660 - acc: 0.8043 - val_loss: 0.3540 - val_acc: 0.9397
Epoch 4/50
0s - loss: 0.3386 - acc: 0.8766 - val_loss: 0.3263 - val_acc: 0.8879
Epoch 5/50
0s - loss: 0.2724 - acc: 0.8851 - val_loss: 0.3341 - val_acc: 0.8879
Epoch 6/50
0s - loss: 0.2235 - acc: 0.9191 - val_loss: 0.2656 - val_acc: 0.8966
Epoch 7/50
0s - loss: 0.1990 - acc: 0.9362 - val_loss: 0.1792 - val_acc: 0.9741
Epoch 8/50
0s - loss: 0.1568 - acc: 0.9532 - val_loss: 0.1228 - val_acc: 0.9828
Epoch 9/50
0s - loss: 0.1439 - acc: 0.9489 - val_loss: 0.1625 - val_acc: 0.9655
Epoch 10/50
0s - loss: 0.1204 - acc: 0.9617 - val_loss: 0.1006 - val_acc: 0.9828
Epoch 11/50
0s - loss: 0.1344 - acc: 0.9574 - val_loss: 0.0684 - val_acc: 0.9828
Epoch 12/50
0s - loss: 0.1145 - acc: 0.9574 - val_loss: 0.1125 - val_acc

<keras.callbacks.History at 0x7f6973dbda90>
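
To check when the drops actually happen, the schedule function can be evaluated on its own, without Keras. This sketch repeats `step_decay` from the cell above and prints the rate at a few epoch indices; the choice of indices is only for illustration.

```python
import math


def step_decay(epoch):
    """Same schedule as above: multiply the rate by 0.5 every 10 epochs."""
    initial_lrate = 0.1
    drop = 0.5
    epochs_drop = 10.0
    return initial_lrate * math.pow(drop, math.floor((1 + epoch) / epochs_drop))


# `epoch` is 0-indexed, so index 9 is the 10th epoch shown in the training log.
for epoch in (0, 8, 9, 19, 29, 49):
    print("epoch %2d: lr = %.6f" % (epoch + 1, step_decay(epoch)))
```

Because of the `1 + epoch` term, the first drop takes effect at the 10th epoch (index 9) rather than the 11th, and the rate during the 50th epoch is 0.1 × 0.5⁵ = 0.003125.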

## 17.5 Tips for Using Learning Rate Schedules

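- Increase the initial learning rate: the rate only decreases during training, so start with a larger value to make big updates early and fine-tune later.
- Use a large momentum: it helps the optimizer keep moving in a useful direction as the learning rate shrinks.
- Experiment with different schedules and compare the results.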
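
One cheap way to compare schedules before spending time on training is to tabulate the rates they produce side by side. The sketch below does this for the two schedules used in this chapter. The function names, the `tabulate` helper, and the `updates_per_epoch=9` default (235 samples in batches of 28) are hypothetical, and the time-based rule assumes the per-update decay of the Keras 1.x `SGD` optimizer.

```python
import math


def time_based(epoch, initial_lr=0.1, decay=0.002, updates_per_epoch=9):
    """Rate at the start of `epoch` under per-update time-based decay (assumed rule)."""
    return initial_lr / (1.0 + decay * epoch * updates_per_epoch)


def step_based(epoch, initial_lr=0.1, drop=0.5, epochs_drop=10.0):
    """Rate for `epoch` under the drop-based schedule from section 17.4."""
    return initial_lr * math.pow(drop, math.floor((1 + epoch) / epochs_drop))


def tabulate(schedules, epochs):
    """Evaluate every schedule at the given 0-indexed epochs."""
    return {name: [fn(e) for e in epochs] for name, fn in schedules.items()}


epochs_to_show = [0, 9, 19, 49]
table = tabulate({"time": time_based, "step": step_based}, epochs_to_show)
for name in sorted(table):
    print(name, ["%.5f" % rate for rate in table[name]])
```

With these settings the drop-based schedule ends far lower than the time-based one, which is worth keeping in mind when comparing the final accuracies of the two runs.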