Add Cython as a codegen target #27
A nice overview of the current solutions to interface with C code (excluding weave): http://scipy-lectures.github.com/advanced/interfacing_with_c/interfacing_with_c.html I wonder whether it makes sense to maybe use Cython also for the spikequeue (instead of swig)? |
Nice, I'll take a look at that on my flight back to Boston. :) Yeah, you might be right about Cython for spikequeue - the advantage would be that it would probably automatically handle all the datatypes stuff easier than writing it with templates in C++ and then trying to make that work with SWIG. It does mean rewriting it from scratch though. |
Just a small thing: I guess if we add Cython as a codegen target we should also rename "cpp" to weave (and maybe have cpp as a common superclass for all targets that deal with cpp code -- I guess they'll have a lot in common and only the interface to Python is different?) |
I'm wondering whether we should actually bother with Cython or rather go with numba. We would basically have to write a new Python target (one that doesn't use vectorization but instead loops, so rather similar to the current C code) and then wrap the function in But this is not a high-priority item for now, anyway. |
I think for runtime code generation rather than standalone, it's not much work to implement a new language in any case, so we can just see how it goes later on. |
Material for a very recent Cython tutorial can be found here: https://public.enthought.com/~ksmith/scipy2013_cython/scipy-2013-cython-tutorial.zip |
OK I started work on this in the |
What did you use for testing with Cython? I used a "pure state update" example like this:

    G = NeuronGroup(100, 'dv/dt=-v / (10*ms) : 1')
    G.v = np.linspace(0, 1, 100)

And it is indeed very slow, much slower than pure Python. In this case, the reason seems to be mostly the |
I looked into |
Why? Also, have you looked at this?
I think it is also a good idea. In my experience numba is still young and a bit buggy, but if the generated code is simple enough, then it might work very well. |
I think it's a bad idea for us to start rolling our own version of
The trouble with numba is that it's pretty reliant on the continuum Python distribution. I think we should implement a numba target as well as weave and cython, since it's not actually that much effort to write new runtime targets. @rossant - what's the idea of that link? |
When I said we can't use it, I didn't mean that we can't use Cython at all. But the way we use |
This sounds like a reasonable approach. With our codegen infrastructure, the knowledge about types etc., this shouldn't be too difficult to implement, I guess. |
@mstimberg Yes, I guess you'd need to build the generated code once, and then call the "compiled" function at every time step. @thesamovar This extension lets you compile Cython code at runtime, then call this function as you want. Cython is needed for compilation, not at every function call. It seems to be close to what you want to achieve, doesn't it? (now I don't know the code generation stuff much so I may be completely wrong!) |
@rossant Yep, that's exactly what we want. How stable would you say their code is? I just have in mind the horrible situation with sparse matrices in Brian 1, and don't ever want to have to deal with that again. My feeling is that generally IPython is not stable, they tend to change things quite a lot. |
@thesamovar I was thinking about taking this code and adapting the relevant portion to your needs. Take it as a "how-to-compile-cython-code-on-the-fly" example. |
Right, it's not too complicated (same with the |
Which APIs are internal and undocumented? I think these features are as stable as we can get right now in the Python ecosystem, but I may be wrong. Might be worth getting in touch with the Cython devs... |
Interesting, OK - that might work then. @mstimberg you were worried about it being a lot of effort, but actually looking at the code in that link, which is very similar to |
@thesamovar: you said it would be a bad idea :) But yes, the code doesn't look too difficult (even though it would be nicer if one would refactor |
OK I've been playing around with Cython a bit more and managed to make a bit of headway. So I wrote a quickly modified version of Another problem is that there is no support for being intelligent about functions like So I read that by default cython will still do bounds checking and allow for wraparound indices, which slows it down. I disable that. It's still incredibly slow. See my example Despite all these optimisations, here are some timings on the state update of the jeffress example:
In other words, Cython is still slower than just using numpy, and weave is an order of magnitude faster. Maybe I missed something but I think I did all the Cython optimisations I know how to do. Do you guys want to give it a try and see if I missed something? |
Optimizing Cython code in the dark is indeed really painful. Annotations are helpful: they show you the unoptimized lines. You can also use annotations in the IPython notebook ( |
Also: do you have the possibility to test pure Cython code (for instance in the notebook) without using the codegen machinery? In other words, generating the whole Cython script and benchmarking that in the notebook. It would make it easier to test things and to debug. What is the type of |
Just as a little addition to what Cyrille said: I think the current code has a lot of overhead since it does a large amount of work dealing with the parameters passed to the function. Since these do not actually vary (except for maybe A minor comment on optimisation flags: setting |
OK, well, I made some progress. Long story short, I now have this (on a 10x larger problem than before):
So now Cython is almost as fast as weave. I think almost none of the difference is overhead because the timing is based on just 100 function calls, and setting N=1 gives times of 0.01 for each of them. Also, tests on the earlier, slower version, showed that the time was scaling with N, so overhead is definitely not part of the explanation. The big change was... It's unexpected... Setting the My guess is the default implementations of This raises some questions:
Incidentally, I also tried the following optimisation:
Then use the For the final version, if we decide to continue with it, I like Marcel's idea of wrapping everything up in a class. |
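The idea of wrapping everything up in a class could look roughly like the sketch below: the arguments that never change are resolved once at construction, and each call passes only the values that do change. `BoundKernel` and the example kernel are hypothetical names for illustration, not anything from the Brian code base.

```python
class BoundKernel:
    """Bind the constant part of a kernel's arguments once, at construction."""

    def __init__(self, func, namespace, arg_names):
        self._func = func
        # One-off work: look up and order the arguments that never change.
        self._constant_args = tuple(namespace[name] for name in arg_names)

    def __call__(self, *varying_args):
        # Per-call work is one tuple concatenation plus the call itself.
        return self._func(*(self._constant_args + varying_args))


def state_update(v, tau, dt, t):
    # Illustrative loop kernel; `t` is the only argument that changes per step.
    for i in range(len(v)):
        v[i] += dt * (-v[i] / tau)
    return t + dt


namespace = {"v": [1.0, 0.5], "tau": 0.01, "dt": 0.0001, "unused": None}
step = BoundKernel(state_update, namespace, ["v", "tau", "dt"])

t = 0.0
for _ in range(3):
    t = step(t)
```

Because `v` is bound by reference, in-place updates made by the kernel remain visible through `namespace["v"]`.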
Ok, that's great news, so we all have to apologise to Cython (especially you for your commit comments ;) ). I'm also quite surprised by the magnitude of the change with Anyway, about the
So that seems to be pretty close to weave now. |
OK, on fixing it to use the On the accuracy point, my question is more: does the loss in accuracy matter for us? Maybe we should try this out in some examples. Basically, apart from knowing that |
My impression is that we don't have to worry too much. AFAICT there are two kinds of optimizations that PS: Bertrand says hi! |
Just a minor remark about performance: I tested with the Intel compiler/MKL libraries and did not see any change in performance for weave in the |
Actually there was something else going on here, see #173 for explanation. |
OK so for the moment to make Cython run fast we have to use our own modified Cython inline function. However, what this function does is actually pretty minimal and maybe we can make it work with normal Cython. I think the key thing is just whether or not we can force Cython to use the |
What do you mean by "work with normal Cython"? I don't think there's a way to use |
Yeah our own version is not too bad, it's just nicer not to have code like that if we can avoid it. Maybe there is a way to include the c flags that I didn't see? Or maybe the Cython people could be encouraged to include it for a future release? But you're right, it's not terrible to have our version. |
Looking at the code, there does not seem to be a way. But it would be an easy patch to add a new argument to allow for it and I'm sure they'd be happy to include it ;) I don't know how long their development cycle is, though. |
I started work on this in a new branch @mstimberg, the big thing that is missing is support for function implementations. You're more familiar with that code, so if you have some time it would be great if you could have a look at it. At the moment, it runs very slowly because functions are calling back to Python, but it is running at least. If you don't have time or want to focus on other stuff, I'll try to work it out. |
Cool, I'll try to have a look at the function thing soon. |
I pushed a commit to the branch, making functions mostly work. The way the
What "code" means for the third option is language-specific: for numpy, we provide a Python function, for C++, we provide code as a string. For Cython we should probably support both (this is what I added), either code as a string or a Python function (as a fallback for user-defined functions). For demonstration purposes, Two major things are still missing:
|
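To make the "code as a string or a Python function" distinction concrete, here is a minimal sketch of a per-target lookup with a Python fallback. Everything in it is an assumption made for illustration: the `IMPLEMENTATIONS` registry, the `resolve` helper, and the choice to fall back to the numpy implementation are hypothetical, and the Cython source string is never compiled here.

```python
# Hypothetical registry: function name -> {target name -> implementation}.
# An implementation is either source code (str) or a Python callable.
IMPLEMENTATIONS = {
    "clip": {
        "numpy": lambda x, lo, hi: max(lo, min(x, hi)),
        "cython": (
            "cdef double clip(double x, double lo, double hi):\n"
            "    return lo if x < lo else (hi if x > hi else x)\n"
        ),
    },
    # A user-defined function that only comes with a Python implementation.
    "my_func": {"numpy": lambda x: 2 * x},
}


def resolve(name, target):
    """Return (kind, implementation) where kind is 'source' or 'callable'."""
    impls = IMPLEMENTATIONS[name]
    if target in impls:
        impl = impls[target]
    elif target == "cython" and "numpy" in impls:
        # Assumed fallback: call back into Python (slow, but it runs).
        impl = impls["numpy"]
    else:
        raise KeyError("no %s implementation for %r" % (target, name))
    return ("callable" if callable(impl) else "source"), impl


kind_clip, _ = resolve("clip", "cython")
kind_user, user_impl = resolve("my_func", "cython")
```

A code generator could then paste a `'source'` result into the generated module and pass a `'callable'` result in through the namespace.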
For |
Yes, exactly. I moved it above the main function, don't know whether this matters in Cython, though. |
We should probably not use |
Ugh, trying to implement Cython support is really like bashing your head against a wall. It seems you can't create buffers with bool dtype so you have to work around it by using uint8. And so on and so forth. I wonder if it's really worth the effort and how the community settled on this as better than weave. |
OK so I made some progress but there's still lots of things to fix and it feels like I'm basically not using Cython but rather working around it trying to coax it into producing the C++ code that I want. I'm wondering if there are other options given that weave is not being ported to Python 3. For example, we could continue to use Cython but only use it to wrap into a separately compiled C++ module. This gives us the advantage that we can basically just use our weave code but add in an extra Cython wrapping stage. Or, we could try to ditch Cython and weave and find our own way to do something effectively equivalent to weave but slightly more restricted in scope to what we want. Or we could continue trying to make Cython work. Any thoughts? @mstimberg @rossant |
I haven't followed the discussion in detail, but I would say that using On Wednesday 9 July 2014, Dan Goodman notifications@github.com wrote:
|
OK I'm making some progress on Cython finally. It's still a bit hacky, particularly choosing the right names for dtypes and handling all the different types of The major thing still remaining to do is to implement all the different templates using Cython. This can be quite an exercise in frustration, but with the existing working examples as a reference it shouldn't take too long now. Then we need to test for efficiency and correctness, and it can relatively soon be merged. Testing for efficiency is usually quite easy: either it runs almost as fast as weave or hugely slower if you've forgotten to handle type definitions correctly in some part of the code. |
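A correctness-then-speed check of the kind described could be set up along these lines. The names (`check_and_time`, `update_loop`, `update_comprehension`) are hypothetical, and two pure-Python functions stand in for the weave and Cython targets, so the timings it produces say nothing about either.

```python
import math
import timeit


def check_and_time(reference, candidate, make_args, number=100, repeat=3):
    """Check that both functions agree, then return the best time for each."""
    ref_out = reference(*make_args())
    cand_out = candidate(*make_args())
    if len(ref_out) != len(cand_out):
        raise AssertionError("output lengths differ")
    for r, c in zip(ref_out, cand_out):
        if not math.isclose(r, c, rel_tol=1e-12, abs_tol=1e-15):
            raise AssertionError("outputs differ: %r vs %r" % (r, c))
    args = make_args()
    timings = {}
    for func in (reference, candidate):
        # Best of `repeat` runs, each run making `number` calls.
        timings[func.__name__] = min(
            timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
        )
    return timings


def update_loop(v, dt, tau):
    # Explicit loop, returning a new list so repeated calls see the same input.
    out = [0.0] * len(v)
    for i in range(len(v)):
        out[i] = v[i] + dt * (-v[i] / tau)
    return out


def update_comprehension(v, dt, tau):
    return [x + dt * (-x / tau) for x in v]


def make_args():
    return [i / 999.0 for i in range(1000)], 0.0001, 0.01


timings = check_and_time(update_loop, update_comprehension, make_args)
```

Running the agreement check before timing avoids benchmarking a target that is fast only because it computes the wrong thing.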
Cool, I'll try to have a look at the |
OK I made a pull request for this, let's continue there. |
This will be necessary for Python 3 because weave is not being ported to Python 3 (very sad, I know).