
train from scratch on molecule datasets #39

Open
hehuanma opened this issue Nov 30, 2021 · 4 comments
Labels: good first issue

Comments

@hehuanma

Hello, I am trying to use Graphormer on other commonly used datasets from MoleculeNet (https://moleculenet.org/datasets-1), such as BACE and BBBP, to check its performance. I used the default hyperparameters from the molhiv script, but the results are very poor.

  1. Have you tried your model on these datasets without a pretrained model? And do you have any suggestions on hyperparameters for these datasets when training from scratch? I am trying to find out why the results are so bad.
  2. For molhiv without a pretrained model, I used the provided script in the examples folder, omitted the "checkpoint_path" argument, and trained for 100 epochs. The best validation score is only around 0.763, and the corresponding test score is only 0.636. I don't know what went wrong. Have you tried Graphormer directly on molhiv without a pretrained model? How was the performance?

    Thank you.
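
(For reference, a minimal sketch of how these MoleculeNet-derived tasks can be loaded through OGB, which hosts BACE and BBBP as ogbg-molbace and ogbg-molbbbp with the same scaffold splits and ROC-AUC evaluation as ogbg-molhiv. This assumes the `ogb` package rather than any Graphormer-specific loader.)

```python
# Sketch: loading the MoleculeNet-derived BACE/BBBP tasks via OGB,
# which evaluates them with ROC-AUC like ogbg-molhiv.
# Assumes the `ogb` package (pip install ogb).
from ogb.graphproppred import GraphPropPredDataset, Evaluator

for name in ["ogbg-molbace", "ogbg-molbbbp"]:
    dataset = GraphPropPredDataset(name=name)
    split_idx = dataset.get_idx_split()  # scaffold train/valid/test splits
    print(name, len(split_idx["train"]), len(split_idx["valid"]), len(split_idx["test"]))

evaluator = Evaluator(name="ogbg-molbace")
print(evaluator.expected_input_format)  # evaluator expects binary labels and scores
```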
@zhengsx
Collaborator

zhengsx commented Nov 30, 2021

Good question. While we have not tested Graphormer on MoleculeNet by training from scratch, the unsatisfactory performance is expected. Graphormer is built on a standard Transformer, which is extremely expressive. That expressiveness pays off on challenging large-scale datasets, but it hurts performance on small benchmarks because of severe overfitting. Imagine training ViT or Swin on MNIST or CIFAR-10 (even though Transformer-based models have already become the de facto standard in image processing).

If you insist on getting good performance on extremely small datasets such as those in MoleculeNet (e.g., fewer than 100K molecules), here are some tips that may help; a configuration sketch follows the list:

  1. Reduce Graphormer's parameter count, as we do on ZINC.
  2. Add strong regularization, as we do on molhiv and molpcba.
  3. Use a pretrained model, which is a very effective way to overcome overfitting.
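
(A minimal sketch of tips 1 and 2: a deliberately small, heavily regularized Transformer encoder. The concrete sizes and rates below are illustrative assumptions, not the exact ZINC or molhiv hyperparameters.)

```python
# Minimal sketch of tips 1 and 2: shrink the model and regularize hard.
# All sizes and rates are illustrative assumptions, not the official
# ZINC / molhiv settings.
import torch
import torch.nn as nn

small_config = dict(
    d_model=80,          # much smaller than 768-class hidden sizes
    nhead=8,
    num_layers=6,        # fewer layers than the large model
    dim_feedforward=80,
    dropout=0.3,         # strong dropout as regularization
)

encoder_layer = nn.TransformerEncoderLayer(
    d_model=small_config["d_model"],
    nhead=small_config["nhead"],
    dim_feedforward=small_config["dim_feedforward"],
    dropout=small_config["dropout"],
    batch_first=True,
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=small_config["num_layers"])

# Weight decay is another cheap regularizer for tiny datasets.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-4, weight_decay=0.01)
```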

@hehuanma
Author

Thank you for the information! That makes sense; we did observe severe overfitting on some datasets, and on others training was quite unstable. By the way, do you plan to upload the pretrained model used in the paper? That way we could apply it directly and save some computational cost. Thanks!

@zhengsx
Collaborator

zhengsx commented Nov 30, 2021

In our latest plan, all pre-trained checkpoints will be released together with the new, more efficient Graphormer framework in the next release. Please stay tuned.
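
(Once the checkpoints are released, fine-tuning would look roughly like the sketch below. The file name, the "model" state-dict key, and the stand-in architecture are all assumptions, since the release format is not yet published.)

```python
# Hypothetical sketch of fine-tuning from a released checkpoint.
# The file name and the "model" key are assumptions about the release
# format; the encoder below is a stand-in for the real Graphormer model.
import torch
import torch.nn as nn

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=32, batch_first=True),
    num_layers=12,
)

state = torch.load("graphormer_pretrained.pt", map_location="cpu")  # assumed file name
model_state = state.get("model", state)  # some checkpoints nest weights under "model"
model.load_state_dict(model_state, strict=False)  # strict=False tolerates head mismatches

# Fine-tune with a small learning rate so the pretrained weights are preserved.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
```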

@hehuanma
Copy link
Author

Sounds great! Thank you!
