Scaling to larger datasets #8

Closed
sunisfighting opened this issue Jun 22, 2022 · 3 comments
Comments

sunisfighting commented Jun 22, 2022

Thanks for your awesome work! I am trying to apply GRACE to larger datasets, but according to your code, training is conducted in a full-batch way, which hinders scalability. Your paper mentions that EIGHT GPUs are used; could you please share how you implemented that? As far as I know, PyG only supports multi-graph distributed computation. I would also be very grateful for any other suggestions! Looking forward to your reply!!
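
For context, here is a minimal sketch of what mini-batch training could look like with PyG's `NeighborLoader`. This is not part of the GRACE repository; `encoder`, `augment`, and `contrastive_loss` are hypothetical placeholders for the GRACE components:

```python
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader

dataset = Planetoid(root='data/Cora', name='Cora')
data = dataset[0]

# Sample a fixed fan-out per layer instead of training on the full graph at once.
loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],  # fan-out for a 2-layer GNN
    batch_size=256,
    shuffle=True,
)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
for batch in loader:
    batch = batch.to(device)
    # z1 = encoder(*augment(batch))      # view 1 (placeholder for the GRACE augmentation + encoder)
    # z2 = encoder(*augment(batch))      # view 2 (placeholder)
    # loss = contrastive_loss(z1, z2)    # GRACE-style InfoNCE objective (placeholder)
    pass
```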

SXKDZ commented Jun 22, 2022

Thanks for your interest in our work! We do use 8 GPUs in parallel, but in the sense of one GPU per dataset. For multi-GPU support on a single dataset, I think it is fairly easy to adopt existing libraries 😄

sunisfighting commented Jun 24, 2022

Thanks for your kind reply. One more question:
In GRACE/model, you first double the hidden dimension (e.g., 256) in the intermediate GCNConv layers and then project back to the original dimension (e.g., 128) in the last layer. I also noticed that in your PyGCL library, the hidden dimensions are kept unchanged (i.e., 128 for all hidden layers). When running GRACE with the hidden dimension kept at 128, accuracy drops by 1–2 percentage points. Which setting is more standard and fairer for comparison with competitors?
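
For reference, a minimal sketch of the two encoder configurations described above (illustrative only; the layer and argument names are not taken from the GRACE or PyGCL code):

```python
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class Encoder(nn.Module):
    def __init__(self, in_dim, out_dim, widen_hidden=True):
        super().__init__()
        # widen_hidden=True  ~ the GRACE repo setting: hidden = 2 * out_dim (e.g. 256 -> 128)
        # widen_hidden=False ~ the PyGCL setting:      hidden = out_dim     (e.g. 128 -> 128)
        hidden_dim = 2 * out_dim if widen_hidden else out_dim
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)
```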

SXKDZ commented Nov 10, 2022

That's a good question. Theoretically, every hidden dimension could be a tunable hyperparameter, so doubling the size of the hidden vectors is acceptable in my opinion. For a fair comparison with other models, I think you need to make sure the encoder architecture is the same. As for the performance drop you mentioned, it may be attributed to the relatively small size of the dataset.

SXKDZ closed this as completed Nov 10, 2022