
Difference between TF and Pytorch version code #69

Closed
BonitoW opened this issue Nov 17, 2020 · 5 comments

Comments


BonitoW commented Nov 17, 2020

No description provided.

BonitoW (Author) commented Nov 17, 2020

Hi Thomas,

Thanks for your excellent work. However, I am still confused about the differences between the TF and PyTorch versions of the code. As you have mentioned before, there are two major differences: first, the PyTorch version does not apply dropout to the first layer's input, and second, the PyTorch version normalizes the adjacency matrix differently.

I changed your PyTorch code to the following form:
r_inv = np.power(rowsum, -0.5).flatten()            # D^{-1/2} instead of D^{-1}
mx = mx.dot(r_mat_inv).transpose().dot(r_mat_inv)   # symmetric normalization

and added one dropout layer at the start of the forward function:
x = F.dropout(x, self.dropout, training=self.training)
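
For context, the two changes in full would look roughly like this, assuming the normalize() helper and GCN.forward() from the pygcn repository as the starting point (a sketch, not the exact code):

import numpy as np
import scipy.sparse as sp
import torch.nn.functional as F

def normalize_sym(mx):
    # Symmetric normalization D^{-1/2} M D^{-1/2} of a scipy sparse matrix,
    # replacing pygcn's row normalization D^{-1} M.
    rowsum = np.array(mx.sum(1)).flatten()
    r_inv = np.power(rowsum, -0.5)
    r_inv[np.isinf(r_inv)] = 0.          # guard against isolated nodes (degree 0)
    r_mat_inv = sp.diags(r_inv)
    return mx.dot(r_mat_inv).transpose().dot(r_mat_inv)

# GCN.forward with an extra dropout on the input features, mirroring the TF version:
def forward(self, x, adj):
    x = F.dropout(x, self.dropout, training=self.training)  # added input dropout
    x = F.relu(self.gc1(x, adj))
    x = F.dropout(x, self.dropout, training=self.training)
    x = self.gc2(x, adj)
    return F.log_softmax(x, dim=1)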

However, the experimental results still look quite different. Did I miss any important points?

Thanks for your time!

@BonitoW changed the title from "Difference between" to "Difference between TF and Pytorch version code" on Nov 17, 2020
tkipf (Owner) commented Nov 17, 2020 via email

Maybe you're using different dataset splits? Note that the default dataset loaders are different in both repositories (which, in hindsight, was an unfortunate choice).

BonitoW (Author) commented Nov 18, 2020

Thanks for your reply! In fact, I noticed that your default split functions are different. I use the split reported in the FastGCN paper (https://arxiv.org/abs/1801.10247), which for the Cora dataset takes the first 1208 samples for training and the last 1000 samples for testing. However, the TF version reaches an accuracy of 0.86, while the PyTorch version only reaches 0.82.
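
Concretely, the split I use looks like this (a sketch; the validation range in the middle is my assumption, since only the train and test sizes are fixed above):

import numpy as np

# Cora has 2708 nodes; FastGCN-style split:
idx_train = np.arange(0, 1208)       # first 1208 nodes for training
idx_val   = np.arange(1208, 1708)    # assumed: remaining 500 nodes for validation
idx_test  = np.arange(1708, 2708)    # last 1000 nodes for testing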

BonitoW (Author) commented Nov 18, 2020

Hi! Thanks for your reply. Do you mean the ordering of the two data files is different? I printed out the feature matrices and found that they are indeed different.

BonitoW (Author) commented Nov 18, 2020

Hi Thomas! Thanks for your reply! In fact, I found that the problem was simply that the ordering of the two data files is different. When I use the data files from your TF version, the final result matches the TF version. Besides, the performance is better when I use the adjacency matrix preprocessing method reported in your paper, so the choice of normalization effectively acts as a hyperparameter.
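
For reference, the preprocessing from the paper is the renormalization trick, A_hat = D~^{-1/2} (A + I) D~^{-1/2}, where D~ is the degree matrix of A + I. A minimal sketch of my own, assuming a scipy sparse adjacency matrix:

import numpy as np
import scipy.sparse as sp

def preprocess_adj(adj):
    # Renormalization trick from the GCN paper:
    # add self-loops, then symmetrically normalize with the new degrees.
    adj = adj + sp.eye(adj.shape[0])
    rowsum = np.array(adj.sum(1)).flatten()
    d_inv_sqrt = np.power(rowsum, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.   # harmless guard; degrees >= 1 after self-loops
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    return d_mat_inv_sqrt.dot(adj).dot(d_mat_inv_sqrt)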

@BonitoW closed this as completed Nov 18, 2020