New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvement of GNP generators for graphs and digraphs #12362
Comments
comment:1
The patchs are not working yet :( Here the modifications are not taken into account by sage -b. Any help, advise,... is more than welcome ! D. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
comment:4
Another proposal for RandomDirectedGNP. The function is generic and also works for RandomGNP, but I haven't modified the graph_generators.py file because further benchmark are required: networkx implementation versus this implementation. How can I remove useless attachments of this ticket ?? Thanks, D. |
comment:5
I have changed the status from new to need_review. Without this patch
With this patch
This patch could be used for undirected graphs, but the networkx implementation is competitive. Waiting for your feedback. D. |
comment:6
Helloooooo !!! Nice speedup During the Sage week we also found a similar speedup in a Poset code, by replacing a About the length of the code : it can be divided by two at no cost by removing the "else" on the "loops" argument. If you try to build an edge between i and i in a graph that that has been built with loops = False the edge is silently not created. And there should be no noticeable difference in the runtime if the code with loops is applied to the case where loops are forbidden. It is also possible to write the two codes
and
as
Same thing here, a "if" in a loop (with branch prediction !) should not weigh much Nathann |
comment:7
Thank you Nathann for the good advices. I have changed accordingly the standard GNP algorithm. Some running time. The fast algorithm is better for large sparse graphs. Otherwise, the standard GNP algorithm is now fast enough.
D. PS: concerning m[i][j] versus m[i,j], is it for python or cython ? which kind of data structure ? PS2: in general in C, m[i][j] is faster than m[iN+j], so if you have a NxN matrix encoded in a 1D array A, it is interesting to create a **B array of size N and to assign B[i]=A+iN. Then, we have B[i,j]==A[i*N+j]. OK, I had real C course. |
comment:9
Hellooooooo !!!
Nice ! Could you advertise it in the doc ? Something like "the fast algorithm is only faster for LARGE instances (try it to know whether it is useful for you)" ? I tried to improve it a bit because I still find weird that networkx could beat us at this game, but found nothing yet... I'll try again later This being said, I was thinking of two things. The first of the two you reported yourself some time ago : the means seem to be below what is expected...
And there is also this "detail"
Could you also add the file to the documentation (the files in devel/sage-yourbranch/doc/en/reference/) and make the method available (through an optional argument) in graphs.RandomGNP ? Even though it can not be the default method, as networkx is faster (why the hell should it be ? And about the matrix : it was not about C arrays but about Sage matrices, and this is why the speedup was so huge (#12476). It is because a field was accessed in Python as Nathann |
comment:10
I have addressed all your remarks. The "detail" was because I have stupidly copy/paste your For graphs, I have decided to let the networkx method as default.
Concerning the average density, it is seems correct with larger sample.
A possible improvement for RandomDirectedGNP is to return a complete digraph (with or without loops) when p=1. However, no such function exists and it is not relevant in this patch. I let it for another patch. D. |
comment:11
I have added some However, I don't know how to set the seed inside a cython function. This could be a nice improvement to this patch. I have checked the D. |
Attachment: trac_12362_random_gnp.patch.gz |
comment:12
Hellooooooo David ! Well, I don't think not being able to define the seed as a parameter of the function is so bad. As it works fine when you set it globally for the doctests anyway.... Now, here is a small patch with some modifications. It adds a few tests, check that the given method is "networkx" or "Sage" but nothing else (that's how we do it usually). I also added a "(Cython)" to the title of the new file you create. It makes more sense this way when you look at the documentation. Otherwise, you would have:
This being said, I added one test which does not pass... Here it is :
It looks like there is some deviation with the "fast" algorithm, which does not occur with the usual one :
Well, that's already 1% of edges too much. I don't think the deviation with the first algorithm is very bad (though for some reason it is always positive on my tests, and I do not like that), but I'm no big fan of the deviation for the second implementation Oh, and I also noticed that the graphs returned did not necessarily have n vertices when p was too low. I fixed that in the patch too Nathann |
comment:13
Attachment: trac_12362-rev.patch.gz Hello Nathann, The We should also remove the tests with cputime. They are quite old and will differ from one computer to another. If you agree with these proposition, I will prepare a rev-rev patch ;-) D. |
comment:27
That's much better ! For me the patch is ready to do :)) D. |
comment:28
On hawk, this yields
This is probably a bug uncovered by this patch, rather than caused by this patch. But anyway, it should be looked at. |
comment:29
Updated ! It must be because of some numerical noise on that machine. Actually, this doctest breaks when the CBC package is installed too, for the same reason. Let it be fixed with the updated version of Nathann |
Attachment: trac_12362-edge_disjoint_spanning_tree.patch.gz |
comment:30
The patch works perfectly with sage-5.0.beta7 on my computer. Tests are also OK.
The long tests are also OK for all .py and .pyx files of Thanks Nathann for last update. D. |
comment:31
Long tests are also OK on a fresh new install of sage-5.0.beta8.
|
comment:32
Nathann, I don't know if it is a requirement, but there is no ticket number inside files trac_12362-rev.patch and trac_12362-edge_disjoint_spanning_tree.patch
Best, |
comment:33
That's a very controversial point in Sage development If I did not miss the latest updates, the moral is : it used to be a requirement, and it is now done automatically with scripts. If you put it anyway Jeroen will remove it himself, but you spare him that if you do not add it. But these things changes faster than lightning By the way, have you ever played the role of mentor in a "Google Summer of Code" project ? Looks like two people asked to participate in one for Sage : the first about Graphs, the other with LP... I should really work on my application right now, but I still must think f that too O_o;;; Nathann |
comment:34
Replying to @nathanncohen:
Exactly.
That's false. If you put it anyway, you end up with a commit message like
|
comment:35
OK, so let me know if I have something to do. Best, David. |
Merged: sage-5.0.beta9 |
comment:37
This patch is bad. Before:
After:
|
comment:38
OOch David ? You know where that comes from ? Nathann |
comment:39
Well, actually our own code is still faster....
But did something change with networkx since ? |
comment:40
Hello, the default method is networkx (as before this patch) and so when used with default parameters this patch has no impact on the running time. When I wrote this patch, we decided to let networkx as default algorithm because the running time was faster with new method only for large graphs. Something has changed with networkx. It is now surprisingly slow...
If we are unable to find why networkx is now slow, we can open a ticket to put Sage as default algorithm. For directed graphs, our implementation was always faster than networkx, and so we removed the networkx one. |
comment:41
Yeah, something must have changed in networkx, but the last update seems to be 1.2 a veeeeeeery long time ago. Where the hell does this come from ? |
comment:42
Here's the code of RandomGNP in beta8
and in beta14
|
comment:43
GOT IT !!!! The flag "fast" used to be "True", it is now "False". We set it to False because our "fast" algorithm is slower than the usual one, so for the Sage algorithm it is better to set it to False. Nathann |
comment:44
Bravo !
I will open a new ticket to correct this. Sorry. A remaining question is which should be the default algorithm. Apparently, the Sage implementation is often faster than the networkx one, except for large sparse graphs
|
The RandomDirectedGNP generator is quit slow because it is written in python. Also I propose to rewrite it in Cython.
With this ticket, I propose to create a file graph_generators_pyx.pyx containing graphs and digraphs generators in cython and to provide a faster implementation of the GNP generator for both graphs and digraphs.
APPLY:
CC: @nathanncohen
Component: graph theory
Keywords: directed graphs, generators
Author: David Coudert
Reviewer: Nathann Cohen
Merged: sage-5.0.beta9
Issue created by migration from https://trac.sagemath.org/ticket/12362
The text was updated successfully, but these errors were encountered: