-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: wrap sparsetools manually instead via SWIG #3440
Conversation
Awesome work! Works for me on 32-bit linux, now testing the release builds. |
Other observations besides compile time and memory usage:
|
Doesn't work on OS X yet:
|
The OSX issue seemed to be some header ordering problem (Python.h must be included before C++ stdlib headers, for whatever reason). Fixed. The |
Builds on both OS X versions now, so good to go and backport to 0.14.x I think. Also tested with scikit-learn, works fine. |
Fixes gh-2944 by removing all the warnings generated by Clang due to SWIG. |
And will stop code analysis tools/sites from reporting that Scipy is a C++ project:) |
Minor issue, this doesn't seem to work on Windows with MinGW: Many hundreds of these:
Quite a few of these:
And this one just once:
|
Maybe you have some left-over files from previous build? "Possible C/C++ prototypes are:" is a message from SWIG. |
That's possible, I didn't manually clean out site-packages. I'll check why they are picked up, probably that can be prevented by some renaming. Otherwise a lot of people could run into this on upgrade. |
The |
The issue was that there's now a |
Installing over a previous Scipy installation causes problems, as sparsetools is not a package any more. Therefore, change the name. At the same time, mark the module for internal use only.
done, and rebased |
Not sure what's up with the noncanonical astype failure. Does not occur on 32-bit linux. |
Shall we merge this and figure that last one out after the beta? Would be good to get that out the door finally. |
Would be better to resolve it first, as it probably indicates something is wrong. |
OK. Probably won't be able to look at that before Saturday though. |
It looks like the test is wrong. I have a bit of a hard time figuring out why the noncanonical tests are written the way they are, but what's going on is that this code:
returns complex arrays with negative numbers for the failing test and those turn into large positions ones after If I insert the following debug prints the output should show what I described above:
Output for one failing test (the COO one) on Linux:
And under Wine:
|
Ok, in that case it's probably safe to just merge, and mark the test as knownfailure, similarly as several other noncanonical tests. coo and csr/csc matrices can in principle be allowed to contain duplicate entries, as this is convenient for construction. |
OK I'll do that and add some comment about the failure pointing to this discussion. Thanks. |
The COO matrix documentation makes explicit that duplicates are permitted, with the claim that "Duplicate (i,j) entries are summed when converting to CSR or CSC." Clearly |
Ah, missed that in the docs. It's a bit incomplete, they should always be summed (like in Regarding the failures: I don't see what summing duplicates has to do with |
MAINT: wrap sparsetools manually instead via SWIG
I haven't understood where |
Issue was that negative numbers were produced that were then going through astype(). Discussion on gh-3440.
Failures fixed by avoiding usage of negative numbers in 727fa40. |
Thanks Pauli, this PR was massively helpful! |
@jnothman but it was failing on |
the test loops simply over |
Issue was that negative numbers were produced that were then going through astype(). Discussion on gh-3440.
backported to 0.14.x, last commit is 90fbdd2. |
Hmm... but the unsigned int case should be handled (it tests |
No, all that stuff is done in the Anyway, I fixed it already to not use negative numbers - those aren't necessary at all. |
Well, your solution could still produce negative numbers if there are On 16 March 2014 14:44, Ralf Gommers notifications@github.com wrote:
|
Yes, I did look at the test matrices to make sure that there were no negative numbers created for the choice I made. It's not the most robust fix if new tests are going to be added. My purpose was to get this PR in and get 0.14.0b1 out the door (building the binaries now). If you have a more future-proof solution then it's very welcome of course:) |
This PR changes sparsetools to be wrapped via custom code generation rather than via SWIG.
This reduces memory and time requirements for compiling sparsetools. The approach is scalable: the set of routines can be easily split into smaller compilation units if necessary. Code generation is done at build time, so there are also no huge autogenerated files checked in the VCS.
Reduction in memory requirements likely originates from tighter, manually written, Python argument parsing code, which is shared between all routines.
A convenience drawback is that the argument specs for each function needs to be typed in manually, rather than parsed directly from the C++ code. However, this is implemented so that errors in the argument list specs result to compilation errors rather than crashes at runtime.
I also took a look at Cython: fused types don't seem to play together with C++ templates at the moment. A similar code generation framework and
call_thunk
would be then needed also with Cython, and I think the present approach is less magical. (The other suggestion of replacing this with templated C also requires the type dispatch mechanism to be written, and would probably cause pain when rewriting the boolean and complex ops.)Some statistics: