Predictor setting in inference for ogbl-vessel #359
Hi @skepsun, I am very grateful that you have discovered and shared this with us! Thank you so much! Can you share the exact settings you used (e.g. …)? If you want, you can implement the bugfix in a PR; otherwise I am happy to do it :)
Hi @jqmcginnis,
I also discovered that exchanging positive and negative edges in training (by using …) …
@weihua916 do you see this as a viable option? I am also unsure how to deal with this situation in the best possible way. As you are very experienced and accomplished in this field, I would very much appreciate your thoughts on this. To provide some more background, let us consider the output of the following script:
Using the non-inverted training process we obtain:
and the same script and settings with the inverted loss @skepsun recommended yield the following output:
Interesting. It looks like the model optimization quickly gets stuck in a local minimum. I'd suggest you at least make sure your model overfits (nearly 1.0 ROC-AUC) towards the end of training. Also, I feel the absolute 3D coordinates are not appropriate as input to your model. Using the relative displacements (x1, y1, z1) - (x2, y2, z2) as edge features makes more sense. In any case, this is just a baseline; for now, please make sure that the dataset itself is correct, and the community will figure out the best way to tackle this problem.
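As a small sketch of the suggested edge features (function name, array shapes, and the toy coordinates are my own assumptions for illustration, not code from the repository):

```python
import numpy as np

def edge_displacements(coords, edge_index):
    """Relative 3D displacement (x1,y1,z1) - (x2,y2,z2) per edge.

    coords:     (num_nodes, 3) array of node coordinates.
    edge_index: (2, num_edges) array of source/target node ids.
    """
    src, dst = edge_index
    return coords[src] - coords[dst]

# toy example: 3 nodes, 2 edges
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
edge_index = np.array([[0, 1],
                       [1, 2]])
disp = edge_displacements(coords, edge_index)  # shape (2, 3)
```

Unlike absolute coordinates, these displacements are invariant to translating the whole vessel graph, which is presumably why they make more sense as edge features here.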
I discussed this with other students in my group, and we decided against employing this trick for the leaderboard submissions for the following reasons:
Lastly, we love hearing your ideas and tricks to improve ogbl-vessel and its algorithms, and are happy to discuss any questions you might have. Thank you very much for the feedback! Cheers,
Thank you Julian! It'd be cool to see how SEAL performs on the leaderboard. Also, we should keep in mind that ROC-AUC is often an optimistic measure for link prediction: you can get 99.9% ROC-AUC while achieving only 10% Hits@50. The score really depends on how difficult the negative examples are. Just good to keep this in mind when we assess the ROC-AUC score.
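The gap @weihua916 describes can be sketched with a self-contained example (pure NumPy; ROC-AUC is computed via the usual Mann-Whitney pairwise formulation, Hits@K follows OGB's convention, and the score arrays are made up for illustration). If most negatives are easy but a handful score very high, ROC-AUC stays near 1 while Hits@50 collapses:

```python
import numpy as np

def roc_auc(pos_scores, neg_scores):
    # Mann-Whitney formulation: P(random positive scores above a random
    # negative), counting ties as 0.5.
    pos = np.asarray(pos_scores)[:, None]
    neg = np.asarray(neg_scores)[None, :]
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()

def hits_at_k(pos_scores, neg_scores, k=50):
    # OGB-style Hits@K: fraction of positives scoring strictly above the
    # k-th highest negative score.
    kth_neg = np.sort(neg_scores)[-k]
    return float((np.asarray(pos_scores) > kth_neg).mean())

# 200 positives, 10,000 easy negatives, plus 100 very hard negatives
# that outscore every positive.
pos = np.linspace(0.5, 0.9, 200)
neg = np.concatenate([np.linspace(0.0, 0.1, 10000), np.full(100, 0.99)])

auc = roc_auc(pos, neg)       # ~0.99: almost all pairs are ranked correctly
hits = hits_at_k(pos, neg)    # 0.0: the top-50 negatives beat every positive
```

Only 1% of the negatives are hard, so they barely dent the pairwise ROC-AUC, yet they fill the entire top of the ranking that Hits@50 looks at.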
@weihua916 thank you very much for your comment! I am still waiting for the final SEAL results (with 10 runs); the algorithm is comparatively slow, but we're getting there 🙂 Thank you very much for bringing the choice of ROC-AUC as an evaluation metric to our attention again. We're eager to look into all of these topics for ogbl-vessel, and are curious what the community thinks and implements!
@jqmcginnis Thanks for updating the SEAL results! I have a question about the train scores during training. I tried GCN without any tricks and reached (several times, though not always) 70% val/test ROC-AUC with ~35% train ROC-AUC. I also tried to implement GCN+NeighborSampler with DGL and can stably reach 73% val/test ROC-AUC with ~35% train ROC-AUC. I am very curious whether SEAL reached 80% val/test ROC-AUC with <50% train ROC-AUC.
@skepsun happy to hear you are still working on this! 🙂 Yes, SEAL_OGB is able to perform similarly well on the train set; e.g., this is the report after the first training epoch. Command line input: …
The SEAL version on the SEAL_OGB master branch does not automatically compute the training scores. However, if you would like to run it yourself and also track the training process, feel free to use the custom implementation I've written, i.e. … I've also noticed that the OGB leaderboard has received another submission ("SAGE+JKNet") which seems to achieve similar ROC-AUC scores, so I do think the simplicity of GCN and SAGE might be the problem. Happy to hear your feedback!
ogb/examples/linkproppred/vessel/gnn.py, line 139 (commit f5534d9)
I found that the predictor is not set to `predictor.eval()` in the test function in gnn.py, which may explain the poor performance of GNNs on this dataset. If `predictor.eval()` is added, even with hidden size 3 the test ROC-AUC of GCN may reach 70+%, although sometimes it is stuck at 50%.