
Worse results on the test set when training with the proposed curriculum learning algorithm #10

Closed
junyi-tiger opened this issue Nov 11, 2021 · 6 comments

Comments

@junyi-tiger

Hi, I evaluated t5-base trained with the proposed curriculum learning algorithm on the ACE05-EN+ dataset and got worse test results:
test_trigger-F1 = 65.9436
test_trigger-P = 61.0442
test_trigger-R = 71.6981
test_role-F1 = 49.6
test_role-P = 45.8693
test_role-R = 53.9913
When evaluating the model trained without the curriculum learning algorithm, I get:
test_trigger-F1 = 68.8863
test_trigger-P = 67.1141
test_trigger-R = 70.7547
test_role-F1 = 49.0647
test_role-P = 48.6448
test_role-R = 49.492

Performance drops when training with the curriculum learning algorithm; in particular, test_trigger-P drops sharply with +CL.
Is there anything wrong?
Here are my training args:
epoch: 5+30, batch_size=32, metric_for_best_model=eval_role_F1, label_smoothing=0.2, model: t5-base, dataset: ACE05-EN+
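
(By 'epoch: 5+30' I mean 5 curriculum epochs followed by 30 regular epochs. A rough sketch of such a two-stage schedule, assuming the curriculum stage trains on simpler substructure targets first; train_epoch and the dataset names are illustrative placeholders, not code from this repo:)

# Illustrative two-stage curriculum schedule, assuming "5+30" means
# 5 epochs on substructure targets, then 30 on full event structures.
def train_epoch(model, dataset):
    pass  # placeholder: one fine-tuning pass over `dataset`

def train_with_curriculum(model, substructure_data, full_data,
                          curriculum_epochs=5, main_epochs=30):
    for _ in range(curriculum_epochs):  # stage 1: easier targets first
        train_epoch(model, substructure_data)
    for _ in range(main_epochs):        # stage 2: full event structures
        train_epoch(model, full_data)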
Looking forward to your reply, thank you!

@luyaojie
Owner

Hi,

Thanks for your interest and for reporting this issue.
All pre-trained models are trained with the curriculum learning algorithm (5 epochs), and you can find the training arguments in the pre-trained models' folders.
The T5-base experiments on ACE05-EN use label_smoothing=0.0 and warmup_steps=2000.
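
(In case it helps with comparing runs: label smoothing spreads a little probability mass off the gold token, so the loss never drives the model to full confidence. A minimal numpy sketch of one common variant, which puts 1-eps on the gold token and eps/(V-1) on each other token; this is illustrative, not the exact loss in this repo:)

import numpy as np

def smoothed_ce(log_probs, gold, eps):
    # log_probs: (V,) log-probabilities at one position; gold: gold token id
    V = log_probs.shape[0]
    target = np.full(V, eps / (V - 1))  # mass spread over non-gold tokens
    target[gold] = 1.0 - eps            # most mass stays on the gold token
    return -(target * log_probs).sum()

# toy example: 4-token vocabulary, gold token id 2
logits = np.array([0.1, 0.2, 2.0, -0.5])
log_probs = logits - np.log(np.exp(logits).sum())
print(smoothed_ce(log_probs, gold=2, eps=0.0))  # plain cross-entropy
print(smoothed_ce(log_probs, gold=2, eps=0.2))  # smoothed target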

@junyi-tiger
Author

Thanks for your reply!
But when I train the 'dyiepp_ace2005_en_t5_large' model using the same training arguments as the pre-trained model, I get
best: checkpoint-26500:
eval_trigger-F1 = 73.979, eval_role-F1 = 52.5855
test_trigger-F1 = 70.9288, test_role-F1 = 50.449
and, with '--wo_constraint_decoding' added:
best: checkpoint-16000:
eval_trigger-F1 = 73.6142, eval_role-F1 = 55.2486
test_trigger-F1 = 69.2049, test_role-F1 = 47.0416
Here's the command used (GPU: NVIDIA GeForce RTX 3090):
bash run_seq2seq_with_pretrain.bash -m t5-large -b 8 --lr 5e-5 --data dyiepp_ace2005_subtype --warmup_steps 2000 --lr_scheduler linear --label_smoothing 0.2 --wo_constraint_decoding
When evaluating your pre-trained model, I get:
eval_trigger-F1 = 71.09 eval_role-F1 = 54.7556
test_trigger-F1 = 72.7273 test_role-F1 = 53.4903
I am quite confused. Is there any special training trick?

@luyaojie
Owner

Hi,

There is no special training trick.
The training commands for the four pre-trained models are as follows:

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-base --label_smoothing 0 -l 1e-4 --lr_scheduler linear --warmup_steps 2000 -b 16 -i dyiepp_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i dyiepp_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i one_ie_ace2005_subtype

bash run_seq2seq_with_pretrain.bash -d 0 -k 3 -f tree -m t5-large --label_smoothing 0.2 -l 5e-5 --lr_scheduler linear --warmup_steps 2000 -b 8 -i one_ie_ere_en_subtype

@junyi-tiger
Author

Hi, I trained with exactly the same arguments except for '-k 3'. What does this option do?

@luyaojie
Owner

Sorry about that.

K is the number of runs in my raw code.
I ran each experiment three times and reported the average performance.
run1/run2/run3 in the config.json files of the released models refer to the different run seeds.

K=1 means seed=421
K=2 means seed=421 422
K=3 means seed=421 422 423
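
(So -k 3 trains one model per seed and the reported number is the mean of the three runs, e.g.:)

# averaging a metric over the three seeded runs selected by -k 3
# (the F1 values below are placeholders, not real results)
runs = {421: 53.1, 422: 52.7, 423: 53.6}  # seed -> test_role-F1
print(sum(runs.values()) / len(runs))     # mean over seeds 421/422/423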

@junyi-tiger
Author

Got it. Thank you!
