Inconsistency with BOND paper #83
I ran main.py multiple times with DOMINANT (from https://github.com/pygod-team/pygod/tree/main/benchmark).
I found that although the hyperparameter settings are consistent with the BOND paper (https://arxiv.org/pdf/2206.10071.pdf), the results on inj_cora (AUC: 0.7566±0.0332 (0.7751)) and inj_amazon (AUC: 0.7147±0.0006 (0.7152)) are significantly different from those in Table 3 of the paper, which reports 82.7±5.6 (84.3) on inj_cora and 81.3±1.0 (82.2) on inj_amazon.
Is there any advice you can provide on how to reproduce the results of the BOND paper?
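For context, the "mean±std (best)" notation above reads as the average and standard deviation of the AUC over repeated runs, with the single best run in parentheses. A minimal sketch with placeholder scores (not real results):

```python
import numpy as np

# AUC scores from repeated runs of benchmark/main.py
# (placeholder values for illustration, not real results)
aucs = np.array([0.75, 0.76, 0.74, 0.78, 0.73])

# Report in the paper's "mean±std (best)" format
print(f"{aucs.mean():.4f}±{aucs.std():.4f} ({aucs.max():.4f})")
```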
Thank you for your report. For a short answer: you can reproduce the results by downgrading to the original version used by the benchmark, v0.3.1, via pip install pygod==0.3.1. After our investigation, we found that the difference in performance is mainly caused by the changes to the alpha parameter. Also, the current benchmark script is out of date, ignoring the selection of alpha.
If you have any further questions, please feel free to let us know.
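If it helps, a quick sanity check that the downgrade took effect before rerunning the benchmark (assuming pygod exposes a __version__ attribute, as most packages do):

```python
# Confirm the environment matches the version used by the paper
import pygod

assert pygod.__version__ == "0.3.1", (
    f"expected pygod 0.3.1 for reproduction, got {pygod.__version__}"
)
```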
Hi @kayzliu, I have some questions:
does it mean that, to remove the heuristic selection, I only need to drop the if-else block and use a fixed list of alpha values? Please correct me if I am wrong.
Hi Partha! For removing the heuristic selection, I think you only need to change the following lines (benchmark/main.py, lines 37 to 41 at commit 987776c): remove the if-else and set alpha to [0.8, 0.5, 0.2] without condition. But note that it may not reproduce the results in Table 3. For an exact reproduction of the results, please downgrade to v0.3.1.
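A sketch of the suggested edit; the shape of the original if-else is an assumption, since the actual condition at lines 37 to 41 may differ:

```python
# Before: heuristic selection of alpha, e.g. branching on the dataset
# (illustrative only; the real condition in benchmark/main.py may differ)
# if "inj" in dataset:
#     alpha = [0.8]
# else:
#     alpha = [0.2]

# After: no condition, always search over all three values
alpha = [0.8, 0.5, 0.2]
```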
Hi @kayzliu ,
Yes, I got your point. I will come back after removing the if-else block.
Yes, you may be right; I need to downgrade to v0.3.1. On v1.0.0, I have already started testing all kinds of possible hyperparameter combinations for the DOMINANT model on the inj_cora dataset.
But I did not find any AUC >= 0.80 on inj_cora; the maximum is ~0.7691. You can check my runs. My goal is to find a generalized hyperparameter range for every model on every dataset, which would be version-independent. I am not sure whether that is feasible; it is still quite uncertain to me. What do you think? Will it be effective or not? Should I continue or stop this testing? Sorry for the long comment.
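A sketch of the kind of sweep described above; the grid values and the DOMINANT call are illustrative assumptions, not the exact combinations tested:

```python
from itertools import product

# Hypothetical hyperparameter grid for DOMINANT on inj_cora
# (placeholder values, not the actual grid from the benchmark)
hid_dims = [16, 32, 64]
dropouts = [0.0, 0.1, 0.3]
lrs = [0.1, 0.05, 0.01]
alphas = [0.8, 0.5, 0.2]

for hid_dim, dropout, lr, alpha in product(hid_dims, dropouts, lrs, alphas):
    # Train one DOMINANT configuration here and record its AUC, e.g.
    #   detector = DOMINANT(hid_dim=hid_dim, dropout=dropout, lr=lr, alpha=alpha)
    #   detector.fit(data)
    # (constructor arguments follow pygod v0.3.x naming; an assumption)
    print(hid_dim, dropout, lr, alpha)
```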
Fixed in #86.