enhancements numericsplan

DagSverreSeljebotn edited this page Jun 9, 2009 · 7 revisions

Overall plan for numerics in Cython

(Author: Dag Sverre Seljebotn)

Different goals?

A user put it nicely in a thread related to numerics: "I wouldn't start a project with Cython. If I know from the beginning it is too slow in Python, then I do that directly in C/C++."

It is certainly legitimate for Cython to stay within its niche. But for the record, the above is not how I feel -- I would like Cython to be usable as a "primary tool" for numeric computation, without having to step into C++ or Fortran for performance and convenience reasons alone.

Compared with Fortran, though, Cython is lacking a lot. There is plenty of time to fix that -- Fortran has been around for 40 years and will likely be around for some more -- but the question is whether we should. Is it worth it if it turns Cython into a Fortran clone? On the other hand, going only halfway, as we do today, will create a continuous demand for more until firm decisions are made about the issue.

Earlier design decisions

It was decided not to make Cython know anything directly about NumPy, but to rely on PEP 3118 (the revised buffer protocol) instead.
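As a rough illustration of what relying on PEP 3118 means, the sketch below inspects a buffer through Python's built-in `memoryview`, which consumes the same protocol. The helper name `describe_buffer` is made up for this example; the point is only that the consumer sees format, shape and strides without knowing the exporter's type.

```python
from array import array


def describe_buffer(obj):
    """Return the PEP 3118 metadata a consumer sees for obj.

    Hypothetical helper: it never inspects type(obj), only the buffer.
    """
    with memoryview(obj) as view:
        return {
            "format": view.format,      # struct-style element type code
            "itemsize": view.itemsize,  # bytes per element
            "ndim": view.ndim,
            "shape": view.shape,
            "strides": view.strides,    # byte step per dimension
            "readonly": view.readonly,
        }


# Works for any exporter: array.array, bytes, bytearray, NumPy arrays, ...
info = describe_buffer(array("d", [1.0, 2.0, 3.0]))
print(info)
```

Nothing in the returned metadata says how to build a new object of the exporter's type, which is the gap the rest of this page is concerned with.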

The immediate problem

Once #277 and #301 are fixed, we will have pretty much reached the limit of what we can do with buffers in Cython without a design overhaul.

The main question is: are we better off stopping there? (After all, one should probably not add to a language everything that users want...)

Some examples of features going beyond this:
  • To call external libraries more transparently than is possible today, automatic copies must be made of non-contiguous buffers.
  • Efficient slices are high on the wish list.
  • Efficient expression computation -- turn "a = b + c + d" for three arrays b, c, and d into a loop adding them together. (There's a lot to say here which I'm leaving out for now, but this is a complicated thing to wish for.)

The problem here is that in each case, new objects must be created. Performance is one issue, but more importantly: how does one figure out what object to even create?
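The three features above can be imitated in plain Python with `memoryview` and `array.array`, which makes the "what object do we create?" question concrete. This is only a sketch under stated assumptions: `contiguous_copy` and `fused_add` are hypothetical names, the code handles 1-D buffers only, and it assumes the buffer's format string is also a valid `array` type code.

```python
from array import array


def contiguous_copy(buf):
    """Copy a possibly strided 1-D buffer into fresh contiguous storage.

    The result type (array.array) is an arbitrary choice made here; the
    buffer itself gives no hint about what type the caller would want.
    """
    view = memoryview(buf)
    if view.ndim != 1:
        raise ValueError("sketch handles 1-D buffers only")
    return array(view.format, view.tolist())


def fused_add(b, c, d):
    """Compute b + c + d in one loop, without a temporary for b + c."""
    n = len(b)
    if len(c) != n or len(d) != n:
        raise ValueError("length mismatch")
    out = array("d", [0.0]) * n  # again: why array.array and not something else?
    for i in range(n):
        out[i] = b[i] + c[i] + d[i]
    return out


data = array("d", range(8))
every_other = memoryview(data)[::2]    # strided slice: a view, no copy
print(every_other.contiguous)          # False, so a C library could not use it as-is
packed = contiguous_copy(every_other)  # new object, contiguous
total = fused_add(packed, packed, packed)
```

Note that the slice comes back as a `memoryview` and the copies as `array.array`, regardless of what the caller originally passed in. A real implementation would need a rule for picking those result types.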

Solutions I can think of

  • (A) What I have been pushing for -- a direct primitive "int[:]" type which is not linked to an underlying object, but is seen only as a convenient way of doing PEP 3118-related data processing. Cython would then be free to define its own API for doing any of the above; it would probably follow NumPy to some extent, but not always (e.g. integer division would not follow NumPy).
  • (B) A custom, Cython-specific extension of PEP 3118. When acquiring a buffer, the object can also return a table containing functions for constructing new slices, etc. This has some issues, though, which I can say more about.
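To make the difference between (A) and (B) more tangible, here is a small Python model of both. Everything in it is hypothetical: `IntView`, `acquire` and `acquire_with_table` are names invented for this sketch and say nothing about how Cython would actually spell or implement either option.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class IntView:
    """Model of option (A): a bare 1-D strided view over a PEP 3118 buffer.

    It keeps only the buffer plus start/stride/length and forgets the type
    of the object that exported the buffer.
    """
    buf: memoryview  # flat contiguous buffer acquired from the exporter
    start: int       # index in buf of the view's first element
    stride: int      # step between elements, in items (may be negative)
    length: int      # number of elements in the view

    @classmethod
    def acquire(cls, obj):
        buf = memoryview(obj)
        if buf.ndim != 1 or not buf.contiguous:
            raise ValueError("sketch handles 1-D contiguous exporters only")
        return cls(buf, 0, 1, len(buf))

    def __len__(self):
        return self.length

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing is pure index arithmetic; no exporter code is called.
            r = range(self.length)[key]
            return IntView(self.buf, self.start + r.start * self.stride,
                           self.stride * r.step, len(r))
        if not -self.length <= key < self.length:
            raise IndexError(key)
        return self.buf[self.start + (key % self.length) * self.stride]

    def tolist(self):
        return [self[i] for i in range(self.length)]


def acquire_with_table(obj):
    """Model of option (B): the buffer plus exporter-supplied constructors."""
    table = {"new_slice": lambda key: obj[key]}  # stands in for a function table
    return memoryview(obj), table


base = array("i", range(10))
odd = IntView.acquire(base)[1::2]  # (A): the result is always an IntView
buf, table = acquire_with_table(base)
odd_b = table["new_slice"](slice(1, None, 2))  # (B): the result is the exporter's own type
```

Under (A) the answer to "what object do we create?" is always the same primitive view type. Under (B) the exporter decides, at the cost of every exporter having to supply such a table.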


Finally, there's always the option of giving up and letting Cython focus more on its original intentions. I'd have to learn Fortran better then, though.