Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get.adjlist very very slow, and potential fix #194

Closed
cfhammill opened this issue May 16, 2017 · 5 comments
Closed

get.adjlist very very slow, and potential fix #194

cfhammill opened this issue May 16, 2017 · 5 comments

Comments

@cfhammill
Copy link
Contributor

For large graphs get.adjlist is very slow, for a graph with 125000 vertices, and 735000 edges

system.time(adj1 <- igraph::get.adjlist(graph, "out"))
   user  system elapsed 
665.323   1.508 667.865 

The C-code to extract the neighbours isn't the culprit

system.time(.Call("R_igraph_get_adjlist", graph, 1, package = "igraph"))
   user  system elapsed 
  0.011   0.000   0.012 

The slowness can be traced to:

res <- lapply(res, function(x) V(graph)[x + 1])

I suspect the proper vertex list indexing is overkill here and causing an extreme slowdown. Since the return-type from c-code should be guaranteed to be a list of integer vectors, and the doc promises to return a list of integer vectors could we get away with:

vvec <- unclass(V(graph))
res <- lapply(res, function(x) vvec[x + 1])

Or potentially even:

res <- lapply(res, `+`, 1)

Benchmarks:

system.time(na1 <- new_adj_list1(graph, mode = "out"))
   user  system elapsed 
  0.187   0.000   0.188

system.time(na2 <- new_adj_list2(graph, mode = "out"))
   user  system elapsed 
  0.069   0.000   0.069

all.equal(na1, na2)
[1] TRUE

all.equal(lapply(adj1, function(x) as.numeric(unclass(x)))
        , na1)
[1] TRUE
@gaborcsardi
Copy link
Contributor

Thanks! Would you like to submit a pull request?

@cfhammill
Copy link
Contributor Author

Sure thing. Do you prefer the first or second version, the second has a precedent in ego

res <- lapply(res, function(x) x+1)

@gaborcsardi
Copy link
Contributor

gaborcsardi commented May 25, 2017 via email

@cfhammill
Copy link
Contributor Author

Hmm, looking at the ego example, it seems like this problem was found before. The c code returns a 0-indexed integer vector list, increments by one, and then checks igraph_opt("return.vs.es"), if it's true, the integer vectors are cast as vertex lists with create_vs .

Setting that option to TRUE, ego on one of my example graphs takes 450s, set to FALSE it takes 61ms. For now I think I will use that same solution in as_adj_list, but I wonder how to make that option obvious so that people who run into this will know how to correct it. I could add it to the docs, but if others are like me, they might not see it there. Could it be added as an argument, defaulting to the option?

@ntamas
Copy link
Member

ntamas commented Aug 3, 2022

I think this is now basically fixed as we use the new unsafe_create_vs() function in as_adj_list, which now constructs V(graph) only once instead of once for every row of the adjacency list.

@ntamas ntamas closed this as completed Aug 3, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Mar 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants