Variance of the training/testing results #5
Hi Qi, thanks for reaching out. Although we did not encounter such a large variance for our trained models, the random seeds used for training and sampling can affect the results, since both the training and the sampling of score-based models depend heavily on the sampled noise. This effect can be amplified for larger graphs such as Grid. Furthermore, Section D.1 of our paper reports the generation performance when using 1024 generated samples; it is similar to the performance with a smaller number of samples. Thus, evaluating with a small number of samples (which is in fact the same number of graphs as in the test set) should not account for the large variance.
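The point about MMD estimates fluctuating more when fewer samples are used can be illustrated with a small NumPy sketch. This uses synthetic degree-histogram-like vectors and a plain RBF-kernel MMD, not the repository's actual evaluation code (which uses EMD-based kernels over graph statistics), so treat it purely as an illustration of the sample-size effect:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Biased squared-MMD estimate with an RBF kernel between
    # per-graph histogram vectors (rows of X and Y).
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
# Stand-in "test set": 200 random 8-bin histograms.
ref = rng.dirichlet(np.ones(8), size=200)
for n in (20, 200):  # few vs. many "generated" samples
    ests = [rbf_mmd2(ref, rng.dirichlet(np.ones(8), size=n))
            for _ in range(30)]
    print(n, round(float(np.std(ests)), 5))
```

Since both sets come from the same distribution, the true MMD is near zero; the printed standard deviation of the 30 estimates shrinks as the number of generated samples grows, mirroring why evaluating on only a test-set-sized sample adds variance.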
Hi Jaehyeong, thanks for your explanation of the randomness. By the way, your paper presents an interesting variant, Many thanks, |
Hi Qi, thanks for your interest. You could modify For the |
Hi Jaehyeong, thanks for your helpful comments! Have a wonderful day. Best, |
Hi there,
Thanks for sharing the code for your wonderful project. I have a question about the variance of the sampling results. I ran the training on the grid dataset using the default config file (with an arbitrary seed).
The testing-time performance metrics I got are:
MMD_full {'degree': 0.460601, 'cluster': 0.008495, 'orbit': 0.126024, 'spectral': 0.681714}
On the other hand, the MMD results reported in the paper are:
deg: 0.111, clus: 0.005, orbit: 0.070 for GDSS, and
deg: 0.171, clus: 0.011, orbit: 0.223 for GDSS-seq.
The MMD results of the samples generated by the provided checkpoint model are:
MMD_full {'degree': 0.093013, 'cluster': 0.00718, 'orbit': 0.101709, 'spectral': 0.793645}
I understand that the random seed can affect the sampling results, but this variance seems rather large from my perspective (especially for the network I trained myself). Do you have any insights about this? The previous EDP-GNN baseline seems to have a large variance when the number of generated samples is small. Do you think it could be attributed to intrinsic properties of score-based models?
Best,
Qi
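Given the seed sensitivity discussed above, one common mitigation (a general sketch, not something the GDSS repo is confirmed to do; `evaluate_over_seeds` and the toy sampler/metric are hypothetical stand-ins) is to repeat sampling under several seeds and report the mean and standard deviation of each metric:

```python
import numpy as np

def evaluate_over_seeds(sample_fn, metric_fn, seeds=(0, 1, 2, 3, 4)):
    # Run the sampler once per seed and aggregate each metric as (mean, std).
    runs = []
    for s in seeds:
        rng = np.random.default_rng(s)
        runs.append(metric_fn(sample_fn(rng)))
    return {k: (float(np.mean([r[k] for r in runs])),
                float(np.std([r[k] for r in runs])))
            for k in runs[0]}

# Toy stand-ins: a real use would plug in the model's sampler and the
# MMD metrics (degree, cluster, orbit, spectral) in place of these lambdas.
stats = evaluate_over_seeds(
    sample_fn=lambda rng: rng.normal(size=100),
    metric_fn=lambda x: {"degree": float(np.abs(x.mean()))},
)
print(stats)  # e.g. {'degree': (mean, std)}
```

Reporting mean ± std over seeds makes single-seed outliers like the run above easier to spot.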