- Strategy ; get higher accuracy and lower loss by increasing the number of parameters of the neural network
- Based on CNN1k
- In order to increase the parameters, the 1st Conv2D layer is changed as follows ;
- Before ; model.add(layers.Conv2D(64, (5, 5), activation='relu', input_shape=(28, 28, 1)))
- After ; model.add(layers.Conv2D(64, (7, 7), activation='relu', padding='same', input_shape=(28, 28, 1)))
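For reference, this raises the first layer's parameter count from 5*5*1*64 + 64 = 1,664 to 7*7*1*64 + 64 = 3,200, and padding='same' keeps the output at 28x28 (a (7,7) valid convolution would shrink it to 22x22), which also enlarges all downstream layers.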
No | batch_size | LR | BatchNormalization | Dropout | Min of val_loss | Max of val_accuracy | Score |
---|---|---|---|---|---|---|---|
00 | 32 | default | No | No | 0.03399 (epochs=20) | 0.99298 (epochs=34) | 0.99092 (epochs=20) |
01 | 32 | reducing | No | No | 0.02485 (epochs=47) | 0.99417 (epochs=47) | 0.99389 (epochs=47) |
02 | 32 | reducing (initial=0.004257) | No | No | 0.04623 (epochs=68) | 0.98798 (epochs=68) | |
03 | 32 | reducing | No | Yes (0.4) | 0.02212 (epochs=79) | 0.99476 (epochs=75) | 0.99450 (epochs=63) |
04 | 32 | reducing | No | Yes (0.4) | 0.02218 (epochs=55) | 0.99452 (epochs=73) | 0.99407 (epochs=55) |
05 | 32 | reducing | No | Yes (0.7) | 0.10244 (epochs=62) | 0.99167 (epochs=33) | |
06 | 32 | reducing | No | Yes (0.4) | 0.02138 (epochs=65) | 0.99512 (epochs=68) | 0.99507 (epochs=62) |
07 | 32 | reducing | No | Yes (0.4) | 0.02393 (epochs=75) | 0.99429 (epochs=44) | 0.99407 (epochs=75) |
08 | 32 | reducing | No | Yes (0.4) | 0.02300 (epochs=75) | 0.99452 (epochs=53) | |
09 | 32 | reducing | No | Yes (0.4) | 0.02332 (epochs=80) | 0.99452 (epochs=74) | |
10 | 32 | reducing | No | Yes (0.4) | | | 0.99432 (epochs=73) |
Standard conditions of CNN1l ;
- keras.callbacks.ReduceLROnPlateau is used to reduce the learning rate. Its parameters are as follows ;
- monitor='val_loss'
- factor=0.47
- patience=5
- min_lr=0.00001
The initial learning rate of the Adam optimizer is 0.001, so the learning rate will change as 0.001 -> 0.00047 -> 0.0002209 -> 0.000103823 -> 0.00004879681 -> 0.0000229345007 -> 0.00001077921532 .
epochs is set to 100. A sketch of this setup is shown below.
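A minimal sketch of this standard setup, assuming model is the CNN defined above; the loss function and the variable names x_train, y_train, x_val, y_val are assumptions, while the callback parameters, batch_size, and epochs are taken from the notes.

```python
from tensorflow import keras

# reduce the learning rate by factor 0.47 when val_loss stops improving for 5 epochs
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                              factor=0.47,
                                              patience=5,
                                              min_lr=0.00001)

model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',   # assumption: one-hot labels
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=32,
                    epochs=100,
                    validation_data=(x_val, y_val),
                    callbacks=[reduce_lr])
```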
- Initial learning rate ; 0.004257 (larger than the default)
- The learning rate will change as 0.004257 -> 0.0020 -> 0.00094 -> 0.00044 -> 0.00020 -> 0.000098 -> 0.00005 (clipped by min_lr; see the check below)
- Parameters of keras.callbacks.ReduceLROnPlateau are as follows ;
  - monitor='val_loss'
  - factor=0.47
  - patience=5
  - min_lr=0.00005
  - verbose=1
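A quick check of this schedule, assuming the plateau triggers every time (Keras applies new_lr = max(old_lr * factor, min_lr)):

```python
lr = 0.004257
for step in range(7):
    print(f'step {step}: lr = {lr:.6f}')
    lr = max(lr * 0.47, 0.00005)   # ReduceLROnPlateau update rule
```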
- Based on 01 (learning rate reducing, starting from lr=0.001, the default)
- Dropout(0.4) is set after every Conv2D layer, as in the sketch below
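A sketch of the resulting model; the Flatten/Dense tail and the activations are assumptions inferred from the model summaries shown further below, the rest follows the layer shapes in the first summary.

```python
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(64, (7, 7), activation='relu', padding='same', input_shape=(28, 28, 1)))
model.add(layers.Dropout(0.4))   # Dropout directly after each Conv2D
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.Dropout(0.4))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.Dropout(0.4))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
```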
- Based on 03
- The 2nd Conv2D layer is changed as follows ;
- Before ; model.add(layers.Conv2D(128, (3, 3), activation='relu'))
- After ; model.add(layers.Conv2D(128, (5, 5), activation='relu'))
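This raises the 2nd layer's parameter count from 3*3*64*128 + 128 = 73,856 to 5*5*64*128 + 128 = 204,928, and shrinks its output from 12x12 to 10x10 (valid convolution on the 14x14 pooled feature map).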
- Based on 03, Dropout(0.7) is set instead of Dropout(0.4)
- Based on 03, the channels of the Conv2D layers are doubled
- Before ;
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 28, 28, 64)        3200
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 64)        0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 128)       73856
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 128)       0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 128)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 128)         147584
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 128)         0
- After ;
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 28, 28, 128)       6400
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 128)       0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 128)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 256)       295168
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 256)       0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 256)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 256)         590080
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 256)         0
- Based on 06, the channels of the Conv2D layers are doubled again
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 28, 28, 256)       12800
_________________________________________________________________
dropout (Dropout)            (None, 28, 28, 256)       0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 256)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 12, 12, 512)       1180160
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 512)       0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 512)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 512)         2359808
_________________________________________________________________
dropout_2 (Dropout)          (None, 4, 4, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0
_________________________________________________________________
dense (Dense)                (None, 256)               2097408
_________________________________________________________________
dense_1 (Dense)              (None, 10)                2570
=================================================================
Total params: 5,652,746
Trainable params: 5,652,746
Non-trainable params: 0
_________________________________________________________________
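For reference, doubling the channels roughly quadruples each Conv2D's parameter count (e.g. conv2d_1 grows from 295,168 to 1,180,160), and the bulk of the 5.65M parameters sits in conv2d_2 (3*3*512*512 + 512 = 2,359,808) and dense (8192*256 + 256 = 2,097,408).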
- Based on 07 ; parameters of ImageDataGenerator are changed as follows (a usage sketch follows the After block)
- Before ;
datagen = ImageDataGenerator(rotation_range=30,
                             width_shift_range=0.20,
                             height_shift_range=0.20,
                             shear_range=0.2,
                             zoom_range=0.2,
                             fill_mode='nearest')
- After ;
datagen = ImageDataGenerator(rotation_range=35,
                             width_shift_range=0.25,
                             height_shift_range=0.20,
                             shear_range=2,
                             zoom_range=0.2,
                             fill_mode='nearest')
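A minimal sketch of how such a generator is typically wired into training; x_train, y_train, x_val, y_val, model, and reduce_lr are assumptions carried over from the earlier sketches. Augmentation is applied on the fly per batch, while the validation data stays unaugmented.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=35,
                             width_shift_range=0.25,
                             height_shift_range=0.20,
                             shear_range=2,
                             zoom_range=0.2,
                             fill_mode='nearest')

# x_train has shape (N, 28, 28, 1); each batch is randomly transformed
history = model.fit(datagen.flow(x_train, y_train, batch_size=32),
                    epochs=100,
                    validation_data=(x_val, y_val),
                    callbacks=[reduce_lr])
```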
- Based on 07, the kernel size of the 1st Conv2D is changed from (7,7) to (9,9)
- Based on 07, trained with no validation data. All data are used as training data (see the sketch below).
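A sketch of this variant; datagen and model are as above, and x_all/y_all are hypothetical names for the full labeled set. Note that ReduceLROnPlateau(monitor='val_loss') has nothing to watch without validation data, so the assumption here is that it monitors the training loss instead.

```python
from tensorflow import keras

# all labeled data used for training; no validation split
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss',   # assumption: switched from 'val_loss'
                                              factor=0.47, patience=5, min_lr=0.00001)
history = model.fit(datagen.flow(x_all, y_all, batch_size=32),
                    epochs=100,
                    callbacks=[reduce_lr])
```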
- 00, epochs=20 ; 0.99092
- 01, epochs=47 ; 0.99389
- 03
- epochs=79 ; 0.99392
- epochs=63 ; 0.99450 (316 / 2326 = 0.1358)
- epochs=50 ; 0.99410
- epochs=43 ; 0.99392
- 04
- epochs=55 ; 0.99407
- epochs=41 ; 0.99382
- 06
- epochs=65 ; 0.99492 (271 / 2053 = 0.1320)
- epochs=62 ; 0.99507 (260 / 2069 = 0.1257)
- 07
- epochs=75 ; 0.99407
- epochs=43 ; 0.99403
- 10
- epochs=73 ; 0.99432
- On the training data, accuracy is higher and loss is lower than for CNN1h/00.
- val_accuracy and val_loss are not stable. Their values are similar to those of CNN1h/00.
- It seems that a learning rate under 10^-4 does not work. At the beginning, a larger learning rate might be preferable (?)
- For a stable reduction of loss or val_loss, BatchNormalization may work well (see the sketch below).
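A sketch of where BatchNormalization could go, as a hypothetical variant (the table above shows it was not used in these runs); one common placement is between the convolution and its activation.

```python
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(64, (7, 7), padding='same', input_shape=(28, 28, 1)))
model.add(layers.BatchNormalization())   # normalize pre-activations for a smoother loss curve
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.4))
model.add(layers.MaxPooling2D((2, 2)))
# ... remaining blocks follow the same pattern
```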
- Bad (accuracy is low and loss is high).
- val_loss is smaller than that of 01.
- accuracy is getting higher and loss is getting lower, which agrees with the increase in the number of parameters.
- It might be better to try a "grid search" to find appropriate parameters (see the sketch at the end of this section), or to try the same cases with more parameters.
- There are more parameters in this case than in the previous one, so the Dropout rate should perhaps be higher (?)
- Bad (accuracy is low and loss is high).
- Seems good
- val_loss is lower and val_accuracy is higher than those of 03.
- loss and accuracy are better than those of 06, but val_loss and val_accuracy seem the same as those of 06.
- I think this CNN can be trained further than 06, but it cannot predict the test data better because the variation of the training data is not enough. So the parameters of ImageDataGenerator should be changed (?)
- Compared to 07, accuracy and loss are worse, but val_accuracy and val_loss are the same.
- Compared to 07, accuracy and loss are worse, but val_accuracy and val_loss are the same.
- Seems the same as 08.
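Finally, the grid search suggested above could look like this; build_model, the candidate values, and the data variables are all hypothetical.

```python
import itertools
from tensorflow import keras

results = {}
for rate, factor in itertools.product([0.3, 0.4, 0.5], [0.3, 0.47, 0.6]):
    model = build_model(dropout_rate=rate)          # assumed helper that builds the CNN above
    reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=factor,
                                                  patience=5, min_lr=0.00001)
    history = model.fit(x_train, y_train, batch_size=32, epochs=100,
                        validation_data=(x_val, y_val), callbacks=[reduce_lr],
                        verbose=0)
    results[(rate, factor)] = min(history.history['val_loss'])

best = min(results, key=results.get)
print('best (dropout, factor):', best, 'val_loss:', results[best])
```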