
Worse results on the test set when training with the proposed curriculum learning algorithm #10

Closed
junyi-tiger opened this issue Nov 11, 2021 · 6 comments

Comments

@junyi-tiger

Hi, I evaluated t5-base trained with the proposed curriculum learning algorithm on the ACE05-EN+ dataset and got worse test results:
test_trigger-F1 = 65.9436
test_trigger-P = 61.0442
test_trigger-R = 71.6981
test_role-F1 = 49.6
test_role-P = 45.8693
test_role-R = 53.9913
When evaluating the model trained without the curriculum learning algorithm, I get:
test_trigger-F1 = 68.8863
test_trigger-P = 67.1141
test_trigger-R = 70.7547
test_role-F1 = 49.0647
test_role-P = 48.6448
test_role-R = 49.492

Performance drops when training with the curriculum learning algorithm; in particular, test_trigger-P drops sharply with +CL.
Is there anything wrong?
Here are my training args:
epoch: 5+30, batch_size=32, metric_for_best_model=eval_role_F1, label_smoothing=0.2, model: t5-base, dataset: ACE05-EN+
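
(By 'epoch: 5+30' I mean 5 curriculum epochs followed by 30 regular epochs. A rough sketch of such a two-stage schedule, assuming the curriculum stage trains on simpler substructure targets first; train_epoch and the dataset names are illustrative placeholders, not code from this repo:)

# Illustrative two-stage curriculum schedule, assuming "5+30" means
# 5 epochs on substructure targets, then 30 on full event structures.
def train_epoch(model, dataset):
    pass  # placeholder: one fine-tuning pass over `dataset`

def train_with_curriculum(model, substructure_data, full_data,
                          curriculum_epochs=5, main_epochs=30):
    for _ in range(curriculum_epochs):  # stage 1: easier targets first
        train_epoch(model, substructure_data)
    for _ in range(main_epochs):        # stage 2: full event structures
        train_epoch(model, full_data)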
Looking forward to your reply, thank you!

@luyaojie
Owner

Hi,

Thanks for your interest and for reporting this issue.
All pre-trained models are trained with the curriculum learning algorithm (5 epochs), and you can find the training arguments in the pre-trained models' folders.
The T5-base experiments on ACE05-EN use label_smoothing=0.0 and warmup_steps=2000.
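
(In case it helps with comparing runs: label smoothing spreads a little probability mass off the gold token, so the loss never drives the model to full confidence. A minimal numpy sketch of one common variant, which puts 1-eps on the gold token and eps/(V-1) on each other token; this is illustrative, not the exact loss in this repo:)

import numpy as np

def smoothed_ce(log_probs, gold, eps):
    # log_probs: (V,) log-probabilities at one position; gold: gold token id
    V = log_probs.shape[0]
    target = np.full(V, eps / (V - 1))  # mass spread over non-gold tokens
    target[gold] = 1.0 - eps            # most mass stays on the gold token
    return -(target * log_probs).sum()

# toy example: 4-token vocabulary, gold token id 2
logits = np.array([0.1, 0.2, 2.0, -0.5])
log_probs = logits - np.log(np.exp(logits).sum())
print(smoothed_ce(log_probs, gold=2, eps=0.0))  # plain cross-entropy
print(smoothed_ce(log_probs, gold=2, eps=0.2))  # smoothed target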

@junyi-tiger
Author

Thanks for your reply!
But when I train the 'dyiepp_ace2005_en_t5_large' model using the same training arguments as the pre-trained model, I get
best: checkpoint-26500:
eval_trigger-F1 = 73.979, eval_role-F1 = 52.5855
test_trigger-F1 = 70.9288, test_role-F1 = 50.449
and, with '--wo_constraint_decoding' added:
best: checkpoint-16000:
eval_trigger-F1 = 73.6142, eval_role-F1 = 55.2486
test_trigger-F1 = 69.2049, test_role-F1 = 47.0416
Here's the command used (GPU: NVIDIA GeForce RTX 3090):
bash run_seq2seq_with_pretrain.bash -m t5-large -b 8 --lr 5e-5 --data dyiepp_ace2005_subtype --warmup_steps 2000 --lr_scheduler linear --label_smoothing 0.2 --wo_constraint_decoding
When evaluating your pre-trained model, I get:
eval_trigger-F1 = 71.09 eval_role-F1 = 54.7556
test_trigger-F1 = 72.7273 test_role-F1 = 53.4903
I am quite confused. Is there any special training trick?

@luyaojie
Owner

Hi,

There is no special training trick.
The training commands for the four pre-trained models are as follows:

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-base --label_smoothing 0 -l 1e-4 --lr_scheduler linear --warmup_steps 2000 -b 16 -i dyiepp_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i dyiepp_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i one_ie_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i one_ie_ere_en_subtype

@junyi-tiger
Author

Hi, I trained with exactly the same arguments except for '-k 3'. What does this option do?

@luyaojie
Owner

Sorry about that.

K is the number of runs in my raw code.
I ran each experiment three times and reported the average performance.
run1/run2/run3 in the config.json files of the released models refer to the different run seeds.

K=1 means seed=421
K=2 means seed=421 422
K=3 means seed=421 422 423
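
(So -k 3 trains one model per seed and the reported number is the mean of the three runs, e.g.:)

# averaging a metric over the three seeded runs selected by -k 3
# (the F1 values below are placeholders, not real results)
runs = {421: 53.1, 422: 52.7, 423: 53.6}  # seed -> test_role-F1
print(sum(runs.values()) / len(runs))     # mean over seeds 421/422/423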

@junyi-tiger
Author

Got it. Thank you!
