About what cSBM parameters to use #4
Comments
If I used the parameters listed in the paper and set the average degree to 5, I got the following edge homophily table
which is much more homophilic than the table in the paper. Can you tell me the exact parameters used for generating cSBM synthetic datasets?
Hi Xiuyu, Thank you for your interest in our work. For cSBM datasets you should use what we have stated in the supplement, i.e. n = 5000 and f = 2000. As for the average degree, the default of 5 should be fine. For the homophily table, can you elaborate a bit more on how you calculated the values, or paste the function you used here? Also, what value of epsilon did you use? It should be 3.25 rather than the default of 0.1, which is too small.
Hi Jianhao, Thank you for the quick reply. I used epsilon=3.25 in the experiments, and used the following code to generate the table:

    def node_homophily(edge_idx, labels, num_nodes):
        edge_index = remove_self_loops(edge_idx)[0]
        hs = torch.zeros(num_nodes)
        degs = torch.bincount(edge_index[0,:]).float()
        matches = (labels[edge_index[0,:]] == labels[edge_index[1,:]]).float()
        hs = hs.scatter_add(0, edge_index[0,:], matches) / degs
        return hs[degs != 0].mean()

which should be consistent with how H(G) was defined.
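For anyone who wants to cross-check a homophily value without PyTorch, here is a minimal dependency-free sketch of the same quantity: the mean, over nodes with at least one neighbor, of the fraction of neighbors that share the node's label. The edge-list input format and the toy graph are assumptions made for illustration; they are not taken from the repository.

```python
from collections import defaultdict


def node_homophily(edges, labels):
    """Mean fraction of same-label neighbors over nodes with degree > 0.

    `edges` is a list of directed (source, target) pairs; an undirected
    edge is expected to appear once in each direction.
    """
    degree = defaultdict(int)
    matches = defaultdict(int)
    for src, dst in edges:
        if src == dst:
            # Skip self-loops, mirroring remove_self_loops in the posted code.
            continue
        degree[src] += 1
        if labels[src] == labels[dst]:
            matches[src] += 1
    # True division: each ratio is a float in [0, 1].
    ratios = [matches[v] / degree[v] for v in degree]
    if not ratios:
        return float("nan")  # no node has a neighbor
    return sum(ratios) / len(ratios)


# Toy graph (illustrative): path 0-1-2-3, plus an isolated node 4.
undirected = [(0, 1), (1, 2), (2, 3)]
edges = undirected + [(v, u) for u, v in undirected]
labels = [0, 0, 1, 1, 0]

score = node_homophily(edges, labels)
print(score)  # per-node ratios are 1, 0.5, 0.5, 1, so the mean is 0.75
```

One caveat when comparing against the tensor version: `torch.bincount` without `minlength` returns a tensor only as long as the largest source index plus one, so passing `minlength=num_nodes` may be needed if the highest-numbered nodes have no edges.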
Can you show the code for |
Sure. It is just the |
Hi Xiuyu, Thank you so much for pointing out this issue! I have tested both your function and our previous function on the cSBM dataset and on other simple graphs, and it turns out you're correct about the homophily scores. There was a small bug in our code when computing the homophily scores (doing division with torch integers), which caused the numbers to be smaller. I have fixed it and now get results similar to yours. We will update the values in our paper accordingly. Thanks again for letting us know!
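The thread does not show the maintainers' original function, so the following is only a sketch of how integer division can deflate a homophily score. It assumes the per-node ratio of matching neighbors to degree was computed with truncating integer division; the helper name `mean_ratio` and the toy counts are hypothetical.

```python
def mean_ratio(matches, degrees, integer_division):
    """Average matches/degree over nodes with degree > 0."""
    ratios = []
    for m, d in zip(matches, degrees):
        if d == 0:
            continue  # isolated nodes are excluded from the mean
        # Floor division maps every ratio below 1 to 0.
        ratios.append(m // d if integer_division else m / d)
    return sum(ratios) / len(ratios)


# Toy counts (illustrative): four nodes, each with one same-label neighbor.
matches = [1, 1, 1, 1]
degrees = [1, 2, 2, 1]

correct = mean_ratio(matches, degrees, integer_division=False)
deflated = mean_ratio(matches, degrees, integer_division=True)
print(correct, deflated)  # 0.75 versus 0.5
```

Under this assumption, only nodes whose neighbors all share their label contribute a nonzero term, which would be consistent with the reported scores coming out too low.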
In Appendix A.5 of the paper, it is stated that n=5000 and f=2000. However, create_cSBM_dataset.sh sets n=800 and f=1000. Which set of parameters should I use? Also, I was not able to find what average degree was used in the paper. Should I just set it to 5 as in create_cSBM_dataset.sh? Thanks.