Parameters for Les Miserables dataset #2

mewwts · 2016-10-13T06:36:16Z

Hi,

Thanks for node2vec - such an interesting idea.

Could I ask you to specify some additional parameters for case study 4.1 in your paper so that I can reproduce the community result?

For the top example you set p=1, q=0.5, but I'm wondering what values you specified for num_walks and walk_length in the random walk generation, as well as size, window, min_count, sg and iter for Word2Vec.

Hope this isn't too cumbersome to reply to. Thanks again!

aditya-grover · 2016-10-13T22:26:16Z

Please direct all questions regarding the paper to adityag@cs.stanford.edu. Feel free to open an issue if there is any clarification specific to the node2vec implementation provided in this repository.

mewwts · 2016-10-14T08:08:53Z

Sure ¯_(ツ)_/¯

Tixierae · 2018-01-22T09:25:16Z

@mewwts did you get the answer?

mewwts · 2018-01-23T06:34:56Z

Hey @Tixierae - I got some parameters from @aditya-grover back then. For word2vec
size=8, window=2, sg=1, iter=1. I was however not able to replicate the results.

Tixierae · 2018-01-23T09:54:10Z

@mewwts many thanks for the quick reply! So they did use a non-default window size (the default is 10). It seems indeed to be a critical tuning parameter that really depends on the graph (e.g. see Figure 2 of Watch your step: Learning graph embeddings through attention - from Google).
My guess is that the window size should be to some extent proportional to the size of the graph and to its diameter. It may be harmful to use a window of size 10 if the shortest path between any two nodes in the graph is, say, 3.
Do you know by any chance what values of num_walks and walk_length they used?
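
To make the diameter argument concrete, here is a minimal sketch that computes the diameter of a small graph with breadth-first search so it can be compared against a candidate window size. It assumes an unweighted, undirected, connected graph stored as an adjacency dict; the helper names `bfs_eccentricity` and `diameter` and the toy graph are hypothetical and not part of node2vec.

```python
from collections import deque


def bfs_eccentricity(adj, source):
    """Longest shortest-path distance (in hops) from source to any reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all nodes (assumes a connected graph)."""
    return max(bfs_eccentricity(adj, s) for s in adj)


# Toy path graph 0 - 1 - 2 - 3 (illustrative only).
toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
graph_diameter = diameter(toy)

# A window larger than the diameter pairs nodes that are as far apart
# as the graph allows, which may blur local structure.
window = 10
window_exceeds_diameter = window > graph_diameter
```

This costs one BFS per node, which should be cheap for graphs of a few hundred nodes.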

mewwts · 2018-01-23T12:40:56Z

Exactly, @Tixierae! Thanks for linking to that paper, looks like a good read. Printed it now.

I was not able to find the values of those parameters sadly. The email I got from @aditya-grover said the random-walk parameters were set to "very low values" due to network size being small.

Tixierae · 2018-01-23T13:15:12Z

thanks @mewwts !
@aditya-grover What would you recommend for num_walks, walk_length and window when the graph is small/very dense? Any rule of thumb to set window size based on graph density/diameter?
PS: I know it may not be the best place to ask, but some quick feedback would be very much welcome and would benefit more people than private messaging would. Thanks much in advance!

mewwts · 2018-01-23T13:22:54Z

@Tixierae I think the best thing you can do for now is to grid search these parameters. The network is quite small, right?

Tixierae · 2018-01-23T13:33:18Z

@mewwts yes, each network is small, but I have thousands of them, for several datasets. The final task is graph classification, for which I am 10-fold cross validating a 2D CNN, with many epochs for each fold (I'm using this approach). So, I can do a coarse grid search, but each combination of parameters is quite costly to test. Hence, getting good priors would help a lot.
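
For what it's worth, a coarse grid like the one discussed here can be enumerated in a few lines. This is only a sketch: the candidate values below are hypothetical placeholders, not recommendations from the paper or its authors, and the `iter_configs` helper is made up for illustration.

```python
from itertools import product

# Hypothetical coarse grid; the values are placeholders, not tuned settings.
grid = {
    "num_walks": [10, 100],
    "walk_length": [8, 40],
    "window": [2, 5],
}


def iter_configs(grid):
    """Yield one dict per combination of the grid's values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Skip combinations where the window is longer than the walk itself,
# since such a window cannot be filled.
configs = [c for c in iter_configs(grid) if c["window"] <= c["walk_length"]]
```

Each entry of `configs` can then be passed to whatever training and evaluation routine is in use.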

Tixierae · 2018-02-07T09:44:01Z

@mewwts section 8 of this paper: http://projekter.aau.dk/projekter/files/259997796/mi109f17___Vertex_Similarity.pdf

mewwts · 2018-02-08T15:49:33Z

Thanks @Tixierae - interesting!

annaguldberg · 2020-05-02T21:21:08Z

Hi, I have a network of 311 nodes. It is quite dense, with an average shortest path length of 2. I have used p=1, q=2 and kept the window size and walk length very small, but I am not getting great results. Does anyone have any suggestions as to what could be wrong?

bianxintong · 2020-12-28T10:58:55Z

@mewwts section 8 of this paper: http://projekter.aau.dk/projekter/files/259997796/mi109f17___Vertex_Similarity.pdf

I was having a hard time replicating the homophily result (structural equivalence was somehow easier to replicate, idk why). Thanks to this study, I was finally able to go from this:

to:

If I resize the nodes by node degree, I obtain the best approximation so far of the image in the paper:

I guess when the graph is so small, we need to repeat the walk many times to make word2vec actually learn something; and since the window size is so small, we need to walk a long way to get the surrounding community structure. And the window size is definitely important.
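
To show where num_walks and walk_length enter, here is a minimal pure-Python sketch of the second-order biased walk described in the node2vec paper: a step back to the previous node is weighted 1/p, a step to a common neighbor of the previous node is weighted 1, and any other step is weighted 1/q. It assumes an unweighted, undirected graph stored as a dict of neighbor sets. The function names and the toy graph are hypothetical, and this is not the reference implementation, which precomputes alias tables for speed.

```python
import random


def node2vec_walk(adj, start, walk_length, p=1.0, q=1.0, rng=random):
    """One biased walk containing up to walk_length nodes, starting at start."""
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move outward, away from prev
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(adj, num_walks, walk_length, p=1.0, q=1.0, seed=0):
    """num_walks passes over all nodes, one walk per node per pass."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(node2vec_walk(adj, node, walk_length, p, q, rng))
    return walks


# Toy graph: two triangles joined by the edge 2 - 3 (illustrative only).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
walks = generate_walks(adj, num_walks=3, walk_length=8, p=1.0, q=0.5, seed=42)
```

The corpus size grows as num_walks times the number of nodes, which is why a tiny graph needs many repetitions before word2vec sees enough training pairs.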

sarmad-MOAHAMMED · 2021-01-20T12:41:18Z

@mewwts section 8 of this paper: http://projekter.aau.dk/projekter/files/259997796/mi109f17___Vertex_Similarity.pdf

Hi,
Could you share the code for this project ?

Thanks.

bianxintong · 2021-02-27T10:10:40Z

@mewwts section 8 of this paper: http://projekter.aau.dk/projekter/files/259997796/mi109f17___Vertex_Similarity.pdf

Hi,
Could you share the code for this project ?

Thanks.

Edited on 24-03-2021:
first I compiled the node2vec bin, then did:
!./node2vec -i:lesmisDir.edgelist -o:lesmisDir.emb -d:16 -l:8 -r:100 -k:2 -p:1 -q:0.5 -e:1
then I ran k-means clustering with 5 clusters,
then exported the result to Gephi for graphing.
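
For the step between the embedding and the clustering, a minimal sketch of reading the .emb file back into Python may help. It assumes the output follows the word2vec-style text layout (a header line with the node count and dimension, then one line per node with its id followed by the vector); the `read_embeddings` helper and the sample data are hypothetical.

```python
import io


def read_embeddings(handle):
    """Parse a word2vec-style text embedding into {node_id: [floats]}."""
    header = handle.readline().split()
    num_nodes, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        node_id = parts[0]
        values = [float(x) for x in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"node {node_id}: expected {dim} values, got {len(values)}")
        vectors[node_id] = values
    if len(vectors) != num_nodes:
        raise ValueError(f"expected {num_nodes} nodes, got {len(vectors)}")
    return vectors


# Made-up sample with 2 nodes and 3 dimensions, standing in for lesmisDir.emb.
sample = io.StringIO("2 3\n0 0.1 0.2 0.3\n1 -0.4 0.5 0.6\n")
emb = read_embeddings(sample)
```

With a real file, `open("lesmisDir.emb")` would replace the `io.StringIO` sample, and the resulting vectors could be fed to any k-means implementation.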

I found the node2vec bin worked better than the open-source implementation (stellargraph in this case).

I stumbled upon my notes on replicating the results today, so I modified this comment. To be honest, I was frustrated by the amount of effort it took to replicate the result, which is why I didn't document my process well. But I think that's more a problem of node2vec itself: the hyperparameters are really sensitive and really depend on your graph.

aditya-grover closed this as completed Oct 13, 2016

Tixierae mentioned this issue May 22, 2020

Parameters to run node2vec on custom dataet Tixierae/graph_2D_CNN#2

Open

sarmad-MOAHAMMED mentioned this issue Jan 24, 2021

Help bianxintong/example#1

Open
