-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
calculate graph similarity by using GraKel #59
Comments
Hi @tiedaoxiaotubie , Assuming that the nodes of your graphs are not annotated with attributes, you can transform them into grakel graphs as follows:
where Then, you can compute the n x n kernel matrix (where n is the number of graphs). You can think of the normalized kernel values as the similarities between graphs. Thus, the (i,j)-th element of the kernel matrix corresponds to the similarity between the i-th graph and the j-th graph. To produce the kernel matrix you can use one of the kernels that is implemented in grakel. For instance, you can compute the shortest path kernel as follows:
Then, the (i,j)-th element of |
Thank you very much for your reply! the nodes of my graphs are neither annotated with attributes nor with labels, I tried with your code, but seems not work for me. I have to admit that my graph is pretty large, in which the edge number is more than 6,000, maybe the GraKel cannot handle such a big graph in one time? I tried to use ShortestPath first, it raised an error immediately:
I searched the previous issue that similar to my question, I noticed there were people tried to use RandomWalk to calculate the similarity:#8, so I also gave it a try, my code are as follows:
at this time, my error message was:
|
It seems that not only the graphs are large, but also the dataset itself (35,259,844 graphs in total?). Comparing every pair of graphs is not feasible since it requires generating a 35,259,844 x 35,259,844 matrix which does not fit in memory. Do you need to compare each one of the 35,259,844 graphs against every other of the dataset? |
OK, I changed the number of graphs to be calculated, only select 4 of the original graphs, but there are still more than 6,000 edges and 5,000 nodes for each of the graphs, at this time, the RandomWalk worked, while the ShortestPath still failed. The error message of ShortestPath are as follows:
It seems trying to get labels from the graph, but the graphs of mine are neither annotated with attributes nor with labels, which makes me confused. The output of RandomWalk are as follows:
For the output of RandomWalk, I have 2 questions:
Hope I could get your answer, thank you very much :) |
With regards to the two questions:
With regards to the shortest path kernel, it expects node-labeled graphs. You can set the Perhaps the most efficient kernel is the WL subtree. But this kernel expects the graphs to contain node labels. To run the kernel, you can assign the same label to all nodes of all graphs (for instance, the label 'a'). To do this, have a look at this example: https://ysig.github.io/GraKeL/0.1a8/auto_examples/nx_to_grakel.html#sphx-glr-auto-examples-nx-to-grakel-py |
Perfect answer, awesome! I got better result after using a smaller lamda in RandomWalk. the error of shortest path kernel also was handled after adding |
您的邮件我已经收到了,我会尽快处理然后给你回复。谢谢!
|
Hi, I wonder if we can use GraKel to calculate graph similarity only? I don't need to classify the graphs, I only want to get the similarity among them, I have generated graphs with networkX, since multiple strategies of graph kernel have been implemented in GraKel, I think it should have the ability to export the graph similarity, but sadly I failed to find such an API.
The text was updated successfully, but these errors were encountered: