Saving LapPE instead of precomputing every run #7

Closed
rish-16 opened this issue Jul 28, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@rish-16

rish-16 commented Jul 28, 2022

Hello Ladislav!

You may remember me from the LoGaG talk! Thanks for the session :D

I have been playing around with the GraphGPS configs and realised the LapPE precomputing process takes place from scratch every time I run the pcqm4m-GPS.yaml config with main.py. Would it be possible to add a patch that saves this pre-computed information locally so I can quickly access it without having to run the same operation again?

I'm benchmarking such models for my research and will be launching training runs regularly, so I was hoping to find a way to avoid repeating the precomputation every time.

Appreciate your consideration, enjoyed reading the paper!

@rampasek rampasek added the enhancement New feature or request label Jul 29, 2022
@rampasek
Owner

Hi Rishabh!

Yes, that is possible, but it would need a little bit of refactoring:

  1. PyG Datasets have a pre_transform hook (see the docs), which applies a transformation function to each graph in the dataset and saves the result to disk. That version stays cached, so it is computed only once: the first time the dataset is downloaded and processed.
  2. My PE/SE precomputation is already implemented as a transformation function applied to each dataset, but I apply it after loading the "vanilla" dataset from disk, so it has to be recomputed each time, as you mention. It could instead be hooked up to pre_transform and cached.
  3. To load the correct dataset with the desired precomputed PE/SE, you need to rename the dataset accordingly so that it is saved under a unique name, and make sure that the PE/SE config matches what was used to precompute the cached version.

So that is roughly what needs to be done. I haven't implemented it yet, as it was impractical for me to keep many different cached versions of the datasets (I'm a bit storage limited). I don't plan to do it right away, but hopefully later on.
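
For illustration, the three steps above could be sketched roughly as follows. This is a library-independent sketch using only the Python standard library. It is not the GraphGPS code and not PyG's API. The names `pe_cache_dir` and `load_with_cached_pe`, the `load_raw` and `pre_transform` callables, the file names, and the shape of `pe_cfg` are all hypothetical. The idea is only to show the cached copy being saved under a name derived from the PE/SE config and reused when that config matches.

```python
import hashlib
import json
import pickle
from pathlib import Path


def pe_cache_dir(root, dataset_name, pe_cfg):
    """Return a directory name unique to this dataset and PE/SE config (hypothetical helper)."""
    # Canonical JSON (sorted keys) so that key order does not change the hash.
    blob = json.dumps(pe_cfg, sort_keys=True).encode("utf-8")
    tag = hashlib.sha1(blob).hexdigest()[:10]
    return Path(root) / f"{dataset_name}-pe-{tag}"


def load_with_cached_pe(root, dataset_name, pe_cfg, load_raw, pre_transform):
    """Apply pre_transform once per (dataset, PE/SE config) and reuse the saved result."""
    cache_dir = pe_cache_dir(root, dataset_name, pe_cfg)
    data_file = cache_dir / "processed.pkl"
    cfg_file = cache_dir / "pe_cfg.json"
    cfg_text = json.dumps(pe_cfg, sort_keys=True)

    # Reuse the cache only if the stored config matches the requested one.
    if data_file.exists() and cfg_file.exists() and cfg_file.read_text() == cfg_text:
        with data_file.open("rb") as f:
            return pickle.load(f)

    # Cache miss: run the expensive per-graph precomputation and save it.
    graphs = [pre_transform(g, pe_cfg) for g in load_raw()]
    cache_dir.mkdir(parents=True, exist_ok=True)
    with data_file.open("wb") as f:
        pickle.dump(graphs, f)
    cfg_file.write_text(cfg_text)
    return graphs
```

With a layout like this, each distinct PE/SE config gets its own directory on disk, which is also why the storage cost grows with the number of configs being benchmarked.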

@rish-16
Author

rish-16 commented Jul 30, 2022

That's a great suggestion, thanks! I'll look into it :D

Appreciate the help!

@rish-16 rish-16 closed this as completed Jul 30, 2022