Training details about different sizes #27

Open
LucyLu-LX opened this issue Sep 3, 2018 · 4 comments

LucyLu-LX commented Sep 3, 2018

python preprocess.py resizes images to 256*256, 384*384, 512*512, 640*640, 736*736, and training on each size respectively could speed up the training process.

I am a bit confused about the meaning of "train respectively".
Does this mean training the network in a coarse-to-fine process, which initializes the network at 256x256 and then fine-tunes it on larger sizes?
Does this accelerate the convergence of the network compared to training it on size 736x736 directly?

huoyijie (Owner) commented Sep 4, 2018

1. Does this mean training the network in a coarse-to-fine process, which initializes the network at 256x256 and then fine-tunes it on larger sizes?
Yes.
2. Does this accelerate the convergence of the network compared to training it on size 736x736 directly?
Yes, because training directly at size 736 is very slow.

My own training method:
set cfg.train_task_id = '2T256'
set patience to 5 (anywhere between 2 and 6)
python preprocess.py && python label.py && python advanced_east.py

When training ends, copy the best saved weights file (.h5) to initialize the training at size 384: set cfg.train_task_id = '2T384', cfg.initial_epoch = "the ending epoch", and cfg.load_weights = True, then continue training.

Then train at 512, and so on. You could try this method; maybe there are better ways.
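
For concreteness, here is a minimal Keras sketch of what one warm-started stage of this schedule could look like. It is not the repo's exact code: build_east_network, train_gen and val_gen are hypothetical stand-ins for the network and data pipeline, and the checkpoint name, step counts and epoch numbers are only illustrative.

```python
# Hypothetical sketch of the 384x384 stage, warm-started from the best
# 256x256 checkpoint. The network is fully convolutional, so the same
# weights can be reused at the larger input size.
from keras.callbacks import EarlyStopping, ModelCheckpoint

model = build_east_network()                             # hypothetical model builder
model.load_weights('model/weights_2T256.008-0.427.h5')   # best weights from the 256 run

callbacks = [
    EarlyStopping(monitor='val_loss', patience=5, verbose=1),      # patience in 2~6
    ModelCheckpoint('model/weights_2T384.{epoch:03d}-{val_loss:.3f}.h5',
                    monitor='val_loss', save_best_only=True,
                    save_weights_only=True, verbose=1),
]

model.fit_generator(train_gen(size=384),                 # hypothetical data generator
                    steps_per_epoch=1000,
                    epochs=24,
                    initial_epoch=8,                     # the epoch the 256 run ended at
                    validation_data=val_gen(size=384),
                    validation_steps=100,
                    callbacks=callbacks)
```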

@hcnhatnam

@huoyijie Does the network still remember what it learned at 256 while training at 736?

@globalmaster

Hi,
I downloaded the Tianchi ICPR dataset, set cfg.train_task_id = '3T256', and ran python3 preprocess.py && python3 label.py && python3 advanced_east.py, but I get the error shown in the output below. How can I fix it? Can you help me? @LucyLu-LX @huoyijie @hcnhatnam

Epoch 00008: val_loss improved from 0.43569 to 0.42750, saving model to model/weights_3T256.008-0.427.h5
Epoch 9/24
1125/1125 [==============================] - 157s 139ms/step - loss: 0.2762 - val_loss: 0.4373

Epoch 00009: val_loss did not improve from 0.42750
Epoch 10/24
1125/1125 [==============================] - 156s 139ms/step - loss: 0.2579 - val_loss: 0.4435

Epoch 00010: val_loss did not improve from 0.42750
Epoch 11/24
1125/1125 [==============================] - 156s 139ms/step - loss: 0.2466 - val_loss: 0.4710

Epoch 00011: val_loss did not improve from 0.42750
Epoch 12/24
1125/1125 [==============================] - 156s 139ms/step - loss: 0.2342 - val_loss: 0.4633

Epoch 00012: val_loss did not improve from 0.42750
Epoch 13/24
1125/1125 [==============================] - 156s 139ms/step - loss: 0.2228 - val_loss: 0.4724

Epoch 00013: val_loss did not improve from 0.42750
Epoch 00013: early stopping

hcnhatnam commented Mar 20, 2019

@globalmaster It isn't an error. Training stopped early (early stopping) to avoid overfitting. It looks like the model is not converging, though, and this is still a problem for me as well. @globalmaster, can you share the dataset with me via a Google Drive link?
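
For reference, the log above matches standard Keras early-stopping behaviour (a sketch, assuming the patience value of 5 suggested earlier in this thread): the best val_loss (0.42750) was reached at epoch 8, and epochs 9-13 brought no improvement, so training stopped at epoch 13.

```python
# Minimal sketch of the callbacks that produce messages like
# "Epoch 00013: early stopping" and "val_loss did not improve from 0.42750".
from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Stop once val_loss has not improved for `patience` consecutive epochs.
    EarlyStopping(monitor='val_loss', patience=5, verbose=1),
    # Save only the best weights, e.g. model/weights_3T256.008-0.427.h5.
    ModelCheckpoint('model/weights_3T256.{epoch:03d}-{val_loss:.3f}.h5',
                    monitor='val_loss', save_best_only=True,
                    save_weights_only=True, verbose=1),
]
# passed as model.fit_generator(..., callbacks=callbacks)
```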
