How can one add a new algorithm and benchmark it with GOOD? #4
Hi Andrew, your suggestions are very helpful! We noticed your GOOD algorithm, too 😄. Recently, we added DIR, EERM, and SRGNN to our baselines.
Let's take GOOD-Motif as an example. The first thing I would do is copy this ERM config file as my new algorithm's config (e.g., `my_algorithm.yaml`):

```yaml
includes:
  - base.yaml
model:
  model_name: GIN
ood:
  ood_alg: my_algorithm  # was: ERM
  ood_param: 100  # if I only have one OOD-specific parameter
  extra_param:  # more OOD-specific parameters can be included in this list
    - 1e-3
    - 0.5
  DIR_ratio: 0.6  # to access parameters as attributes instead of elements of extra_param, I recommend creating a new named parameter like this
train:
  max_epoch: 200
  lr: 1e-3
  mile_stones: [150]
```

To use these new parameters in the code (for example, to add an IRM-style penalty), we can access them like this:

```python
loss = loss + config.ood.ood_param * IRM_loss
p = config.train.epoch * config.ood.extra_param[0]
num_subgraph_node = num_node * config.ood.DIR_ratio
```
When it is time to run the new algorithm, the command is:

```bash
goodtg --config_path GOOD_configs/GOODMotif/basis/covariate/my_algorithm.yaml
```
The new algorithm will be evaluated automatically as long as it inherits from `BaseOODAlg`. IRM is a good example if one wants to modify the loss; Mixup is a good example if one is developing a data augmentation method. In your case, I think DIR may be the best reference, and you may also be interested in how to modify the model structure; here is DIR's example. A rough sketch of such a subclass is shown below.
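To make this concrete, here is a minimal sketch of a loss-modifying algorithm in the style of GOOD's IRM implementation. The import paths, the `ood_alg_register` decorator, and the `loss_postprocess` signature follow my reading of the codebase and may differ between versions, and `my_penalty` is a placeholder, so treat this as an outline rather than a drop-in implementation.

```python
import torch
from GOOD import register
from GOOD.ood_algorithms.algorithms.BaseOOD import BaseOODAlg


@register.ood_alg_register
class MyAlgorithm(BaseOODAlg):
    r"""Sketch of a new OOD algorithm that only modifies the loss."""

    def __init__(self, config):
        super(MyAlgorithm, self).__init__(config)

    def loss_postprocess(self, loss, data, mask, config, **kwargs):
        # Placeholder for the algorithm-specific penalty term.
        my_penalty = torch.zeros(1, device=config.device)
        # config.ood.ood_param is the weight we defined in the YAML config above.
        self.spec_loss = config.ood.ood_param * my_penalty
        self.mean_loss = loss.sum() / mask.sum()
        return self.mean_loss + self.spec_loss
```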
Actually, we are trading off between POP (procedure-oriented programming) and OOP (object-oriented programming), and we will consider incorporating your suggestion. Currently, the overall structure is a main pipeline (training/validation/test) embedded with three types of modules (datasets/networks/ood_algorithms). Therefore, apart from adding new arguments, the major changes should be located in a new network file with a registered GNN class and a new ood_algorithm file with a registered algorithm class, as sketched below.
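On the network side, registration looks similar. The sketch below mirrors GOOD's existing model files; `GNNBasic`, `arguments_read`, and the config fields (`dim_node`, `dim_hidden`, `num_classes`) are taken from my reading of the codebase, so double-check them against an existing model such as the GIN implementation before relying on this.

```python
import torch
from torch_geometric.nn import global_mean_pool

from GOOD import register
from GOOD.networks.models.BaseGNN import GNNBasic


@register.model_register
class MyGNN(GNNBasic):
    r"""Sketch of a registered network, selected via ``model_name: MyGNN``."""

    def __init__(self, config):
        super(MyGNN, self).__init__(config)
        # Placeholder encoder; a real model builds its message passing here.
        self.encoder = torch.nn.Linear(config.dataset.dim_node, config.model.dim_hidden)
        self.classifier = torch.nn.Linear(config.model.dim_hidden, config.dataset.num_classes)

    def forward(self, *args, **kwargs):
        # arguments_read unpacks either a PyG data object or raw tensors.
        x, edge_index, batch = self.arguments_read(*args, **kwargs)
        h = self.encoder(x)
        h = global_mean_pool(h, batch)  # graph-level readout
        return self.classifier(h)
```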
We have some small scripts to do that, and we will organize them as a part of GOOD soon; in the meantime, a simple launcher like the one below works.
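For example, a minimal sweep script only needs the `goodtg` command shown above; the config directory and file names here are hypothetical, so adjust them to your own layout.

```python
import subprocess
from pathlib import Path

# Hypothetical layout: one YAML per algorithm under the GOOD-Motif covariate split.
config_root = Path("GOOD_configs/GOODMotif/basis/covariate")
algorithms = ["ERM", "IRM", "my_algorithm"]

for alg in algorithms:
    config_path = config_root / f"{alg}.yaml"
    # goodtg reads everything else (dataset, model, OOD parameters) from the YAML.
    subprocess.run(["goodtg", "--config_path", str(config_path)], check=True)
```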
First, the algorithm should do proper variable control by itself. For example, if the novelty of the algorithm does not lie in a new GNN encoder, the algorithm should use the same GNN encoder as the other baselines; this can be done simply by keeping the same model configuration (e.g., `model_name: GIN`). Thank you again for your valuable suggestions! We would like to share our roadmap/TODO list in the README. Please keep an eye on our work for updates! 😊
Hi GOOD team, thanks for your detailed answers! I have a similar question to the one in #5: it seems that in the current version we have to sweep the hyperparameters and collect the results of different runs manually. Maybe I missed something; is there a convenient way to do so?
Hi Andrew, as I mentioned in #5, the best way to do it in the current version is to manage the log files. We assign one log file to each run, so knowing how the log files are saved is helpful: we wrote a script that reads log files in the same way that we save them. A more convenient approach is to rerun the same config file with a different task (note the `read_log` branch below):

```python
def load_task(task: str, model: torch.nn.Module, loader: DataLoader, ood_algorithm: BaseOODAlg,
              config: Union[CommonArgs, Munch]):
    r"""
    Launch a training or a test. (Project use only)
    """
    if task == 'train':
        train(model, loader, ood_algorithm, config)
    elif task == 'test':
        # Configure the model and output the best checkpoint info.
        print('#D#Config model and output the best checkpoint info...')
        test_score, test_loss = config_model(model, 'test', config=config)
    elif task == 'read_log':
        # Read back the last line of the log file saved for this run.
        with open(config.log_path, 'r') as f:
            return f.readlines()[-1]
```
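Building on that, collecting the final log line of every finished run can be as simple as the following sketch; the log directory and file pattern are hypothetical and depend on how `log_path` was configured for each run.

```python
from pathlib import Path

# Hypothetical log layout: one .log file per run under a common directory.
log_dir = Path("log/round1")
results = {}

for log_file in sorted(log_dir.glob("*.log")):
    with open(log_file, 'r') as f:
        # Mirror the 'read_log' branch above: take the last saved line.
        results[log_file.stem] = f.readlines()[-1].strip()

for run_name, last_line in results.items():
    print(f"{run_name}: {last_line}")
```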
Hi Andrew, we have updated this project to version 1. You can now launch multiple jobs and collect their results easily; please refer to the new README. Please let me know if you have any questions.
Hi Shurui, thanks very much for the update! This will help us a lot in incorporating our algorithm into the GOOD benchmark with the updated code. Besides the running code, I am wondering whether it would be possible for you to maintain an online "leaderboard", which could help followers identify the state of the art in OOD generalization on graphs : ) BTW, congratulations on your NeurIPS acceptance; we believe GOOD will facilitate the development of better OOD algorithms for graph data!
Hi Andrew, you are very welcome! And thank you for your advice and congratulations! For the online leaderboard: yes, I plan to maintain a leaderboard later. It will take some time to finish the full documentation update and build the online leaderboard, but I will try my best to work on them as early as possible. : )
Hi Andrew, the documentation for "how to add a new algorithm" is now complete, and the leaderboard construction is on our roadmap. I'll close this issue since the main question has been resolved. Please start a new issue or a discussion if you have any questions. 😄
Hi GOOD authors,
Thanks for your impressive amount of work on developing the GOOD benchmark. As someone who also works on OOD algorithms for graph data, I believe this benchmark can provide valuable insights for future developments in this field. Recently, I have been trying to add my own graph OOD algorithm (which happens to have the same name as GOOD 🤣) to the GOOD benchmark and evaluate it there.
However, when reading through the code and the documentation, I found no explicit description of the pipeline for adding a new algorithm and benchmarking it with GOOD.
It would facilitate follow-up developments with the GOOD benchmark if the authors could provide a convenient way, along with an explicit description, for adding and benchmarking new algorithms. Going through the OOD literature, I find that DomainBed is a good example that provides both rigorous evaluation and a convenient pipeline for integrating new algorithms. I believe the GOOD benchmark would have a much greater impact on the community if you could add the corresponding features to the existing code : )
Best Regards,
Andrew