
Ask about the result #9

Closed
Smile0524 opened this issue Dec 2, 2021 · 5 comments

Comments

@Smile0524

Hello!
I read the paper you published and I also cloned the code. I can run it successfully, but the result I get is far from yours:
[wikidev.jsonl, epoch 4] overall:76.1, agg:89.1, sel:96.3, wn:97.1, wc:92.1, op:98.4, val:91.1
I find that the where_col and where_val results are worse than the other subtasks.
What do you think about these issues?

@lyuqin
Owner

lyuqin commented Dec 2, 2021

Could you run the experiment using Docker with the Dockerfile in the repo? The trained model is uploaded in the release section; you can evaluate it to see if the numbers match.

@1456416403

Have you solved your problem after setting a bigger batch size? I have the same problem now: my batch size is 24, but my result is lf(eg)=78, ex(eg)=82. Thank you!!

@lyuqin
Owner

lyuqin commented Mar 27, 2022

> Have you solved your problem after setting a bigger batch size? I have the same problem now: my batch size is 24, but my result is lf(eg)=78, ex(eg)=82. Thank you!!

Hi, could you refer to #4, where a larger batch size should improve the accuracy. In addition, in that post the subtask accuracy at epoch 0 was already good with batch size 32 (except for val acc, which had a bug before). If your subtask accuracy is significantly lower with a similar batch size, please double-check your package versions, especially PyTorch and transformers.
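
As a quick environment sanity check (this snippet is illustrative and not part of the repo), you can print the installed PyTorch and transformers versions and confirm CUDA is visible, then compare them against the versions pinned in the repo's requirements or Dockerfile:

```python
# Minimal environment check (illustrative only, not HydraNet code).
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```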

@lyuqin
Owner

lyuqin commented Mar 28, 2022

Hi @Smile0524 @1456416403, I re-ran a few experiments with smaller batch sizes; the results are listed below. They show that a larger batch size gives better results. I recommend training with batch size > 64, which should give results similar to the best model trained with batch size 256.

batch size 24
[wikidev.jsonl, epoch 0] overall:74.0, agg:89.1, sel:97.2, wn:98.1, wc:94.0, op:98.9, val:89.1
[wikitest.jsonl, epoch 0] overall:73.0, agg:89.5, sel:97.0, wn:97.4, wc:92.7, op:98.7, val:89.0
[wikidev.jsonl, epoch 1] overall:80.8, agg:90.1, sel:97.4, wn:98.1, wc:94.3, op:99.0, val:96.2
[wikitest.jsonl, epoch 1] overall:80.4, agg:90.3, sel:97.2, wn:97.7, wc:93.8, op:98.9, val:96.1
[wikidev.jsonl, epoch 2] overall:81.5, agg:90.0, sel:97.5, wn:98.4, wc:95.0, op:99.1, val:96.8
[wikitest.jsonl, epoch 2] overall:81.4, agg:90.6, sel:97.3, wn:97.9, wc:94.1, op:99.0, val:96.5
[wikidev.jsonl, epoch 3] overall:82.4, agg:90.6, sel:97.6, wn:98.5, wc:95.0, op:99.2, val:96.9
[wikitest.jsonl, epoch 3] overall:82.2, agg:91.0, sel:97.4, wn:97.9, wc:94.2, op:99.1, val:96.9
[wikidev.jsonl, epoch 4] overall:83.0, agg:91.1, sel:97.6, wn:98.5, wc:95.0, op:99.1, val:97.1
[wikitest.jsonl, epoch 4] overall:82.5, agg:91.2, sel:97.4, wn:97.9, wc:94.3, op:99.1, val:96.9

batch size 64
[wikidev.jsonl, epoch 0] overall:78.9, agg:88.0, sel:97.5, wn:98.3, wc:94.8, op:98.9, val:96.2
[wikitest.jsonl, epoch 0] overall:78.4, agg:88.5, sel:97.1, wn:97.8, wc:93.8, op:98.8, val:95.9
[wikidev.jsonl, epoch 1] overall:81.9, agg:90.6, sel:97.6, wn:98.5, wc:95.2, op:98.6, val:96.7
[wikitest.jsonl, epoch 1] overall:81.4, agg:90.6, sel:97.4, wn:98.0, wc:94.4, op:98.5, val:96.8
[wikidev.jsonl, epoch 2] overall:82.9, agg:91.0, sel:97.6, wn:98.5, wc:95.4, op:99.1, val:97.1
[wikitest.jsonl, epoch 2] overall:82.3, agg:90.8, sel:97.5, wn:98.2, wc:94.9, op:99.1, val:96.9
[wikidev.jsonl, epoch 3] overall:83.9, agg:91.3, sel:97.9, wn:98.6, wc:95.6, op:99.0, val:97.5
[wikitest.jsonl, epoch 3] overall:83.4, agg:91.4, sel:97.6, wn:98.2, wc:94.9, op:99.1, val:97.3

batch size 128
[wikidev.jsonl, epoch 0] overall:80.0, agg:89.4, sel:97.4, wn:98.1, wc:94.5, op:99.1, val:96.3
[wikitest.jsonl, epoch 0] overall:79.7, agg:89.7, sel:97.6, wn:97.7, wc:93.9, op:98.9, val:95.7
[wikidev.jsonl, epoch 1] overall:82.6, agg:90.6, sel:97.8, wn:98.4, wc:95.3, op:99.1, val:97.2
[wikitest.jsonl, epoch 1] overall:82.0, agg:90.6, sel:97.5, wn:98.0, wc:94.7, op:99.0, val:96.9
[wikidev.jsonl, epoch 2] overall:83.4, agg:91.1, sel:97.8, wn:98.4, wc:95.4, op:99.2, val:97.3
[wikitest.jsonl, epoch 2] overall:82.9, agg:91.2, sel:97.6, wn:98.0, wc:94.7, op:99.1, val:97.2
[wikidev.jsonl, epoch 3] overall:83.8, agg:91.2, sel:97.7, wn:98.7, wc:95.6, op:99.3, val:97.6
[wikitest.jsonl, epoch 3] overall:83.4, agg:91.3, sel:97.6, wn:98.2, wc:95.0, op:99.2, val:97.3
[wikidev.jsonl, epoch 4] overall:83.7, agg:91.1, sel:97.7, wn:98.6, wc:95.5, op:99.2, val:97.5
[wikitest.jsonl, epoch 4] overall:83.2, agg:91.2, sel:97.6, wn:98.1, wc:94.7, op:99.2, val:97.4
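
If a batch size above 64 does not fit in GPU memory, one common workaround is gradient accumulation: accumulate gradients over several small mini-batches before each optimizer step so the effective batch size matches the recommendation. The sketch below is a generic PyTorch training loop, not the repo's actual code; the model, loader, and loss names are placeholders.

```python
# Generic gradient-accumulation sketch (not HydraNet code).
# Effective batch size = per_step_batch_size * accum_steps (e.g. 16 * 8 = 128).
import torch

def train_one_epoch(model, loader, optimizer, loss_fn, accum_steps=8, device="cuda"):
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        inputs, targets = inputs.to(device), targets.to(device)
        loss = loss_fn(model(inputs), targets)
        # Scale the loss so the accumulated gradient matches one large batch.
        (loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```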

@Smile0524
Author

Thank you very much!
