Implementation of APPNP departs from original design #1
Comments
Thank you for raising this question. We implement APPNP with 2 weight matrices to make the generalization comparison fair across the different baselines. Please notice that X@W_1@W_2 is equivalent to X@W, since the composition of two linear maps is itself a linear map. In terms of empirical performance, if I recall correctly, X@W_1@W_2 performs better on the challenging datasets (e.g., the OGB datasets), but it sometimes overfits on easy datasets (e.g., Cora).
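The equivalence claimed above (two stacked weight matrices collapse into one) can be checked with a minimal numpy sketch; the shapes here are arbitrary illustrative choices, not taken from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))    # 5 nodes, 8 input features (hypothetical sizes)
W1 = rng.normal(size=(8, 16))  # hidden width 16, chosen for illustration
W2 = rng.normal(size=(16, 3))  # 3 output classes

# Without a non-linearity, the two linear maps compose into one: W = W1 @ W2.
W = W1 @ W2
assert np.allclose(X @ W1 @ W2, X @ W)
```

So in the purely linear case the extra weight matrix changes the parameterization (and hence the optimization dynamics) but not the family of functions that can be represented.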
Thanks for the timely reply! Yet my question is that the original APPNP uses a 2-layer MLP, which has a non-linearity in it (i.e., σ(X@W_1)@W_2 rather than the purely linear X@W_1@W_2), so the two are not equivalent. Thanks,
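The distinction raised here can be verified directly: once a non-linearity (ReLU in this sketch) sits between the two weight matrices, the map no longer collapses to a single linear transform. The shapes are again arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 6))
W1 = rng.normal(size=(6, 12))
W2 = rng.normal(size=(12, 3))

relu = lambda t: np.maximum(t, 0.0)

linear = X @ W1 @ W2           # linear variant: collapses to X @ (W1 @ W2)
nonlinear = relu(X @ W1) @ W2  # 2-layer MLP with a ReLU in between

# With the ReLU, the outputs generically differ from the linear composition.
assert not np.allclose(linear, nonlinear)
```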
Sorry, I misunderstood your last message. I just tested the non-linear version on the Arxiv dataset, and it gives me around Train: 71.96%, Valid: 70.58%, Test: 70.21% with K=10 and alpha=0.1 using PyG, without dropout or regularization (weight decay). The performance is a lot better than the linear version, but it still underfits, as normal training accuracy on the Arxiv dataset is around 85%.
I see. It would be great to add a note about this to your paper/code. I'll probably also check it on the other datasets. Nevertheless, thanks for your reply! I'll close the issue :).
Hi,
I just found that the implementation of APPNP is not the same as the standard design.
https://github.com/CongWeilin/DGCN/blob/master/model.py#L243
In your implementation, the node features first go through a linear transform (self.linear_in), then PPR propagation is applied, and finally another linear transform (self.linear_out) follows.
However, this departs from the original APPNP design.
https://github.com/klicperajo/ppnp/blob/master/ppnp/pytorch/ppnp.py
In the original design, the authors of APPNP first transform the node features into a C-dimensional hidden representation using an MLP (where C is the number of classes), and then propagate based on PPR. We can also clearly see from the illustration figure at https://github.com/klicperajo/ppnp that your implementation is not the same. I wonder how this will affect the results in your paper?
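For reference, the PPR propagation step of the original design described above can be sketched in numpy as follows; the 3-node graph and the identity matrix standing in for MLP(X) are toy assumptions for illustration only:

```python
import numpy as np

def appnp_propagate(A_hat, H, K=10, alpha=0.1):
    # Approximate PPR propagation from the APPNP paper:
    # Z^(0) = H;  Z^(k+1) = (1 - alpha) * A_hat @ Z^(k) + alpha * H
    Z = H
    for _ in range(K):
        Z = (1 - alpha) * (A_hat @ Z) + alpha * H
    return Z

# Toy 3-node path graph: symmetrically normalized adjacency with self-loops,
# A_hat = D^{-1/2} (A + I) D^{-1/2}.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_loop = A + np.eye(3)
d = A_loop.sum(axis=1)
A_hat = A_loop / np.sqrt(np.outer(d, d))

H = np.eye(3)  # stand-in for the C-dimensional MLP output (C = 3 here)
Z = appnp_propagate(A_hat, H, K=10, alpha=0.1)
```

Note that with alpha=1 the propagation returns H unchanged, which matches the teleport interpretation of alpha in the paper.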
Thanks,
Eli