-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: sparse.csgraph: a faster DFS implementation: (i) linear (ii) releases gil #4166
base: main
Are you sure you want to change the base?
Conversation
…ptions in the original implementation (like using -9999 for tree root) are supported but can be optionally removed (a cleaner way of designating tree top is to add a self loop)
Looks like you should be using Also, more of a general question: the other functions in this module use the older Cython array types: |
Actually, I can't use the Note also that key to the implementation is the use of memory views which are more efficient than using the numpy arrays and allows the implementation to release the gil. Indeed I think that this is prefered to using numpy arrays. Just for the record, I am testing on a 64 bit system and the testsuite runs with no errors. |
New code needs to use the same approach to types as the rest of the |
cdef int order_end = 0 | ||
|
||
cdef int* status = <int*> stdlib.malloc(n*sizeof(int)) | ||
cdef DfsStackEntry* stack = <DfsStackEntry*>stdlib.malloc(n * sizeof(DfsStackEntry)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unchecked mallocs! Why not use two NumPy arrays of the appropriate type and get automatic memory allocation for free?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can use a numpy array for the status buffer. However the stack needs to save both node and neighbor index.
As a side note, the allocation of numpys array is way slower and could be avoided for internal buffers. I used it only for return values.
Finally, I've added the tests for the memory allocation in a separate commit.
I must say I find this a lot more readable than the old code; csgraph algorithms often don't look like graph algorithms the way I know them. |
I totally agree with pv. I've added a commit to respect both indexing schemes by using the fused type from parameters.px. |
scipy/sparse/csgraph/_traversal.pyx
Outdated
cdef int* status = <int*> stdlib.malloc(n*sizeof(int)) | ||
if not status: | ||
with gil: | ||
raise MemoryError() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't actually work the way you think it does (as I've found out the hard way). This raise
will abort the function, but because it's marked nogil
, it has no way of propagating the exception. Instead it will log the exception to stderr
and return.
In these situations, I usually return an error code, then check for that in a function that is not nogil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I'm guessing that the try-finally block is as useless as raising an exception. I've changed the code to reflect that.
…ng to support both 32bit and 64bit indexing schemes as fused types are not allowed in structs
Thank you for the valuable advice on raising exceptions from |
scipy/sparse/csgraph/_traversal.pyx
Outdated
@@ -399,9 +399,76 @@ cdef unsigned int _breadth_first_undirected( | |||
|
|||
return i_nl | |||
|
|||
cdef struct DfsStackEntry: | |||
long node | |||
long index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
long
is not reliable. It's 32 bits wide on 64-bit Windows. Consider using Py_ssize_t
.
Looks like the return of symengine/symengine#308. Does scipy need a PR analogous to numpy/numpy#5271? |
… across the module
Any advice on how to proceed? It does not seem to be related to the commit. |
Travis CI failure is unrelated. I've restarted the build, let's see how it fares this time |
Naive question here, but does this actually release the gil? http://docs.cython.org/src/userguide/external_C_code.html#declaring-a-function-as-callable-without-the-gil says that nogil only declares it as not requiring the gil. You must still release it |
I've checked the resulting c code and it seems to be the case. I'll submit a fix for that. |
@atiasnir are you still planning to make the changes to release the GIL discussed above? This PR seems close to ready. |
I've re-implemented
depth_first_order
.The new implementation offers the following advantages:
As a result the new implementation is much faster.
In addition, this new implementation offers a deviation from the original functionality by representing the root of the tree using a self loop (which, to the best of my knowledge, is more standard). The original implementation uses "-9999" which introduces indices outside the domain of the nodes. Currently this behavior is only available through a deisgnated keyword.