train from scratch on molecule datasets #39
Hello, I am trying to use Graphormer on other commonly used datasets from MoleculeNet (https://moleculenet.org/datasets-1), such as BACE and BBBP, to check its performance. I used the default hyperparameters from the molhiv script, but the results are poor...

Thank you.

Comments

Good question. While we have not tested Graphormer on MoleculeNet by training from scratch, the unsatisfactory performance is expected. Graphormer is built on a standard Transformer model, which has very powerful expressiveness. This is valuable on challenging large-scale datasets, but it hurts performance on small benchmarks due to severe overfitting. Just imagine training ViT or Swin on MNIST or CIFAR-10 (even though Transformer-based models have already become the de facto standard in image processing). If someone insists on getting good performance on extremely small datasets such as those in MoleculeNet, e.g., fewer than 100K molecules, here are some tips that may be helpful:

Thank you for the information! That makes sense; we did observe severe overfitting on some datasets, and for others the training was quite unstable. By the way, do you plan to upload the pretrained model used in the paper? Then we could apply it directly and save some computational cost. Thanks!

In our latest plan, all the pre-trained checkpoints will be released together with the new, more efficient Graphormer framework in the next release. Please stay tuned.

Sounds great! Thank you!
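The maintainer's reply points to overfitting as the cause: a full-size Transformer simply has far more parameters than a few-thousand-molecule dataset can support (the concrete tip list was cut off in this thread). As a rough illustration of that capacity argument (not the authors' actual recommendations, and the config values below are assumed, not real Graphormer flags), one can compare a back-of-the-envelope encoder parameter count against the dataset size:

```python
# Rough per-layer parameter count of a Transformer encoder:
# self-attention (Q, K, V, output projections) plus a 2-layer FFN.
# Hypothetical helper for illustration only; not part of Graphormer.
def approx_encoder_params(num_layers: int, d_model: int, d_ffn: int) -> int:
    attention = 4 * d_model * d_model   # Q, K, V, and output projections
    ffn = 2 * d_model * d_ffn           # two linear layers in the FFN
    return num_layers * (attention + ffn)

# A molhiv-scale config (illustrative values) vs. a heavily shrunk one
# for a dataset with only a few thousand molecules (BACE has ~1.5K).
large = approx_encoder_params(num_layers=12, d_model=768, d_ffn=768)
small = approx_encoder_params(num_layers=3, d_model=80, d_ffn=80)

print(large)  # ~42M weights: orders of magnitude more parameters than molecules
print(small)  # ~115K weights: capacity much closer to the dataset size
```

On top of shrinking the model, stronger dropout, weight decay, and early stopping are the usual levers for small-data regimes; none of these substitute for the pre-trained checkpoints the maintainers mention releasing.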