CNN1l/Readme.md

Summary

  • Strategy ; achieve higher accuracy and lower loss by increasing the number of parameters of the neural network
  • Based on CNN1k
  • In order to increase the parameters, the 1st Conv2D layer is changed as follows;
    • Before ; model.add(layers.Conv2D(64, (5, 5), activation='relu', input_shape=(28, 28, 1)))
    • After ; model.add(layers.Conv2D(64, (7, 7), activation='relu', padding='same', input_shape=(28, 28, 1)))
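For reference, this change grows the 1st Conv2D's parameter count from 64 × (5·5·1 + 1) = 1,664 to 64 × (7·7·1 + 1) = 3,200 (matching the layer summaries below), and padding='same' keeps the output at 28×28.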

Training conditions and Results of score

| No | batch_size | Lr | BatchNormalization | Dropout | Min of val_loss | Max of val_accuracy | Score |
|----|------------|----|--------------------|---------|-----------------|---------------------|-------|
| 00 | 32 | default | No | No | 0.03399 (epochs=20) | 0.99298 (epochs=34) | 0.99092 (epochs=20) |
| 01 | 32 | reducing | No | No | 0.02485 (epochs=47) | 0.99417 (epochs=47) | 0.99389 (epochs=47) |
| 02 | 32 | reducing (initial=0.004257) | No | No | 0.04623 (epochs=68) | 0.98798 (epochs=68) | |
| 03 | 32 | reducing | No | Yes (0.4) | 0.02212 (epochs=79) | 0.99476 (epochs=75) | 0.99450 (epochs=63) |
| 04 | 32 | reducing | No | Yes (0.4) | 0.02218 (epochs=55) | 0.99452 (epochs=73) | 0.99407 (epochs=55) |
| 05 | 32 | reducing | No | Yes (0.7) | 0.10244 (epochs=62) | 0.99167 (epochs=33) | |
| 06 | 32 | reducing | No | Yes (0.4) | 0.02138 (epochs=65) | 0.99512 (epochs=68) | 0.99507 (epochs=62) |
| 07 | 32 | reducing | No | Yes (0.4) | 0.02393 (epochs=75) | 0.99429 (epochs=44) | 0.99407 (epochs=75) |
| 08 | 32 | reducing | No | Yes (0.4) | 0.02300 (epochs=75) | 0.99452 (epochs=53) | |
| 09 | 32 | reducing | No | Yes (0.4) | 0.02332 (epochs=80) | 0.99452 (epochs=74) | |
| 10 | 32 | reducing | No | Yes (0.4) | — | — | 0.99432 (epochs=73) |

00 ; standard condition

Standard condition of CNN1l.

01 ; Learning Rate reducing

keras.callbacks.ReduceLROnPlateau is used to reduce the learning rate. Its parameters are as follows.

  • monitor='val_loss'
  • factor=0.47
  • patience=5
  • min_lr=0.00001

The initial learning rate of the Adam optimizer is 0.001, so the learning rate will change 0.001 -> 0.00047 -> 0.0002209 -> 0.000103823 -> 0.00004879681 -> 0.0000229345007 -> 0.00001077921532 (further reductions are clamped by min_lr).

The number of epochs is set to 100.
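A minimal sketch of this callback, assuming the TensorFlow Keras API (the fit call is indicative only; model and data names are assumptions):

from tensorflow import keras

# Reduce the learning rate by factor 0.47 when val_loss plateaus for 5 epochs
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                              factor=0.47,
                                              patience=5,
                                              min_lr=0.00001)

# model.fit(x_train, y_train, epochs=100, callbacks=[reduce_lr], ...)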

02 ; Learning Rate reducing (2)

  • Initial learning rate ; 0.004257 (larger than default)

  • Learning rate will change 0.004257 -> 0.0020 -> 0.00094 -> 0.00044 -> 0.00020 -> 0.000098 -> 0.00005 (clamped by min_lr=0.00005)

  • Parameters of keras.callbacks.ReduceLROnPlateau are

    • monitor='val_loss',
    • factor=0.47,
    • patience=5,
    • min_lr=0.00005,
    • verbose=1

03 ; Learning Rate reducing + Dropout

  • Based on 01 (Learning Rate reducing starting lr=0.001 (default))
  • Set Dropout(0.4) after every Conv2D layer (see the sketch below)
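A minimal sketch of the resulting model, reconstructed from the "Before" layer summary under 06; the Dense layers follow the summary under 07, and their activations are assumptions:

from tensorflow.keras import layers, models

model = models.Sequential()
# 1st Conv2D: (7, 7) kernel, padding='same' keeps the 28x28 size
model.add(layers.Conv2D(64, (7, 7), activation='relu', padding='same',
                        input_shape=(28, 28, 1)))
model.add(layers.Dropout(0.4))            # Dropout(0.4) after every Conv2D
model.add(layers.MaxPooling2D((2, 2)))    # 28x28 -> 14x14
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.Dropout(0.4))
model.add(layers.MaxPooling2D((2, 2)))    # 12x12 -> 6x6
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.Dropout(0.4))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))    # activation assumed
model.add(layers.Dense(10, activation='softmax'))  # activation assumed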

04 ; Changing 2nd Conv2D layer

  • Based on 03
  • 2nd Conv2D layer
    • Before ; model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    • After ; model.add(layers.Conv2D(128, (5, 5), activation='relu'))
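For reference: with 64 input channels, this layer's parameter count grows from 128 × (3·3·64 + 1) = 73,856 to 128 × (5·5·64 + 1) = 204,928, and its output shrinks from 12×12 to 10×10.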

05 ; Learning Rate reducing + Dropout

  • Based on 03, set Dropout(0.7) instead of Dropout(0.4)

06 ; Learning Rate reducing + Dropout + double channels of Conv2D (Saved as Version 14 on Kaggle)

  • Based on 03, the channels of the Conv2D layers are doubled

Before

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 28, 28, 64)        3200      
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 64)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 128)       73856     
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 128)       0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 128)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 128)         147584    
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 128)         0         

After

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 28, 28, 128)       6400      
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 128)       0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 256)       295168    
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 256)       0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 256)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 256)         590080    
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 256)         0         
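Note that doubling both input and output channels roughly quadruples a Conv2D layer's parameter count (73,856 → 295,168; 147,584 → 590,080), while the 1st layer only doubles (3,200 → 6,400) because its input stays a single channel.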

07 ; double channels of Conv2D again

  • Based on 06, the channels of the Conv2D layers are doubled again

After

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 28, 28, 256)       12800     
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 256)       0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 256)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 512)       1180160   
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 512)       0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 512)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 512)         2359808   
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0         
_________________________________________________________________
dense (Dense)                (None, 256)               2097408   
_________________________________________________________________
dense_1 (Dense)              (None, 10)                2570      
=================================================================
Total params: 5,652,746
Trainable params: 5,652,746
Non-trainable params: 0
_________________________________________________________________

08 ; change parameters of ImageDataGenerator

  • Based on 07; parameters of ImageDataGenerator are changed

Before

datagen = ImageDataGenerator(rotation_range=30,
                             width_shift_range=0.20,
                             height_shift_range=0.20,
                             shear_range=0.2,
                             zoom_range=0.2,
                             fill_mode='nearest')

After

datagen = ImageDataGenerator(rotation_range=35,
                             width_shift_range=0.25,
                             height_shift_range=0.20,
                             shear_range=2,
                             zoom_range=0.2,
                             fill_mode='nearest')
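For context, a hypothetical sketch of feeding this generator to training (datagen is the "After" generator above; x_train, y_train, x_val, y_val, and model are assumed names, not defined in this README):

# Fit on augmented batches; validation data stays unaugmented
history = model.fit(datagen.flow(x_train, y_train, batch_size=32),
                    epochs=100,
                    validation_data=(x_val, y_val))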

09 ; change the filter of the 1st Conv2D

  • Based on 07, the filter of the 1st Conv2D is changed from (7,7) to (9,9), as sketched below
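A sketch of the changed layer (256 channels, since 09 is based on 07):

model.add(layers.Conv2D(256, (9, 9), activation='relu', padding='same',
                        input_shape=(28, 28, 1)))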

10 ; same as 07, but no validation_data

  • Based on 07, trained with no validation data; all data are used as training data (a hypothetical call is sketched below)
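A hypothetical sketch of such a fit call (x_all and y_all are assumed names for the full training set):

# No validation_data argument: every sample is used for training.
# ReduceLROnPlateau would then need monitor='loss' instead of 'val_loss'.
history = model.fit(datagen.flow(x_all, y_all, batch_size=32),
                    epochs=100)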

Results of score

  • 00, epochs=20 ; 0.99092
  • 01, epochs=47 ; 0.99389
  • 03
    • epochs=79 ; 0.99392
    • epochs=63 ; 0.99450 (316 / 2326 = 0.1358)
    • epochs=50 ; 0.99410
    • epochs=43 ; 0.99392
  • 04
    • epochs=55 ; 0.99407
    • epochs=41 ; 0.99382
  • 06
    • epochs=65 ; 0.99492 (271 / 2053 = 0.1320)
    • epochs=62 ; 0.99507 (260 / 2069 = 0.1257)
  • 07
    • epochs=75 ; 0.99407
    • epochs=43 ; 0.99403
  • 10
    • epochs=73 ; 0.99432


Graphs

00 ; standard

graphs of accuracy and loss

  • On the training data, accuracy is higher and loss is lower than for CNN1h/00.
  • "val_accuracy" and "val_loss" are not stable; their values are similar to those of CNN1h/00.

01 ; Learning Rate reducing

graphs of accuracy and loss

  • It seems a learning rate under 10^-4 does not help. In the beginning, a larger learning rate might be preferable (?)
  • For a stable reduction of loss and val_loss, BatchNormalization may work well (a possible placement is sketched below).
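A hypothetical placement, if BatchNormalization were tried (it is not used in any run here):

# Conv -> BatchNorm -> ReLU is one common ordering
model.add(layers.Conv2D(64, (7, 7), padding='same', input_shape=(28, 28, 1)))
model.add(layers.BatchNormalization())
model.add(layers.Activation('relu'))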

02 ; Learning Rate reducing (2)

graphs of accuracy and loss

  • Bad (accuracy is low and loss is high.)

03 ; Learning Rate reducing + Dropout(0.4)

graphs of accuracy and loss

  • val_loss is smaller than that of 01.

04 ; Changing 2nd Conv2D layer

graphs of accuracy and loss

  • accuracy is getting higher and loss is getting lower, which agrees with the increase in parameters.
  • It might be worth trying a grid search to find appropriate parameters, or trying the same cases with even more parameters.
  • There are more parameters in this case than in the previous one, so the Dropout rate should perhaps be higher (?)

05 ; Learning Rate reducing + Dropout(0.7)

graphs of accuracy and loss

  • Bad (accuracy is low and loss is high.)

06 ; Learning Rate reducing + Dropout + double channels of Conv2D

graphs of accuracy and loss

  • Seems good
  • val_loss is lower and val_accuracy is higher than those of 03.

07 ; double channels of Conv2D again

graphs of accuracy and loss

  • loss and accuracy are better than for 06, but val_loss and val_accuracy seem about the same as those of 06.
  • This CNN can probably be trained further than 06, but cannot predict the test data better because the variation of the training data is not sufficient. So the parameters of ImageDataGenerator should be changed (?)

08 ; change parameters of ImageDataGenerator

graphs of accuracy and loss

  • Compared to 07, accuracy and loss are worse, but val_accuracy and val_loss are about the same

09 ; change the filter of the 1st Conv2D from (7,7) to (9,9)

graphs of accuracy and loss

  • Compared to 07, accuracy and loss are worse, but val_accuracy and val_loss are about the same
  • Seems the same as 08

10 ; same as 07, but no validation_data

graphs of accuracy and loss