
Use Cython, Numba, or C/C++ for algorithmic code #126

Open
mrocklin opened this issue Mar 4, 2018 · 93 comments

@mrocklin
Collaborator

mrocklin commented Mar 4, 2018

It seems likely that we'll want the ability to write fast numeric code in a low-level-ish language. There are a few options:

  1. Cython
  2. Numba
  3. C/C++, wrapped with a variety of other options

We should make a decision here before investing serious time in one path or the other.

There are, I think, a few categories of concerns:

  1. Speed
  2. Community adoptability (as a dependency)
  3. Barriers to attracting new developers
  4. Barriers to maintenance
@mrocklin
Collaborator Author

mrocklin commented Mar 4, 2018

My personal thoughts

The default today seems to be Cython. This would be a fine choice. It seems to be fine for speed and community adoptability. I do think that it imposes a serious barrier to attracting new developers (I'd say 80% of the candidate pool will avoid Cython), and it will force a more complex build-and-deploy process. For example, we can no longer just point people to try things on master to get quick feedback. We'll get questions from people on how to install from source on Windows, etc.

C/C++ is, I think, a little cleaner than Cython in terms of attracting new developers. I think that C tends to be more often within people's comfort zone.

Numba is great for new developers and maintenance, but has issues in community adoptability (folks aren't accustomed to depending on it, thoughts here). Numba also has issues if we're going to be doing a lot with dynamic data structures and want to use std::map, std::vector and friends.
Having some familiarity with both Numba and Cython, I personally also find the development cycle to be much faster with Numba than with Cython.

Suggestion

If I were doing this work I would stick to Numba until numba became an issue (either due to a need for dynamic data structures or downstream libraries being unwilling to depend on it) and then I would switch to something else. The cost of starting with Numba to me seems to be zero. We don't have to update our release mechanism, we don't have to have special documentation or field extra compilation help requests. The cost to switch to Cython if necessary is likely very small. We'll likely be writing exactly the same code we would write for Cython, just with less stuff around it.

Numba is also likely to be faster without us thinking as much, which I think is useful when experimenting with new algorithms, like in #125

However

However, I'm unlikely to do any of this work in the near future, and the preferences of those doing the work should have precedence.

@mrocklin
Collaborator Author

mrocklin commented Mar 4, 2018

cc @rgommers @ogrisel @stefanv @shoyer for thoughts and hopefully critical feedback. In what system would you like to see low-level code in the pydata/sparse library written?

@hameerabbasi
Collaborator

hameerabbasi commented Mar 4, 2018

In favor of Cython

  1. Speed is near-native C++ with the benefits of Python.
  2. Community adoptability: We will only need it for development, not as a hard dependency, so it has zero impact.
  3. Barriers to attracting new developers: Only a few algorithms need to be in Cython/wrapped C++; the rest will be pure Python. Developers will have the choice of either C++ wrapped with Cython or pure Cython.
  4. Wrapping logic is really trivial. BLAS and friends have wrappers in SciPy, we could use those.
  5. Making DOK faster is a priority for me for batch assignments (i.e. ndarray and SparseArray assignments), this needs std::unordered_map.
  6. Switching can be a somewhat expensive process down the line.
  7. @shoyer's algorithm for elemwise discussed in Support Everything that XArray Expects #1 will come in useful for CSD. Since the passes in that algorithm are likely to be slow, Numba could be almost twice as slow as std::vector. Other uses are very likely to pop up sooner or later.
  8. Debugging is better. This can be critical with complex logic.
  9. Interleaving code is relatively painless.
  10. We could re-use some things such as for radix sort.

In favor of Numba

  1. Pure Python. This could be useful for attracting devs.
  2. Faster performance overall.
  3. Iteration over N-dimensional arrays is possible without complex logic. This will be useful for advanced indexing.
  4. GPU support (?)

My Concerns

I guess it comes down to exactly two questions for me: Will we have to switch at some point (which I would like to avoid outright), or will Numba devs be willing to add support for accelerated dict and list operations, and interleaving? If the answers to both these questions are in the favor of Numba, then it's a no-brainer for me.

If, on the other hand, the answer to even one of these questions is "no"... then my vote is for a mixture of Cython and Cython-wrapped C++.

I guess my main worries with Numba come down to the following:

  • Not everything is possible in it. I can think of a few occasions I ran into a brick wall with Numba.
  • We might have to switch, which could be more expensive than going with Cython in the first place.
  • Community adoptability.

@mrocklin
Collaborator Author

mrocklin commented Mar 4, 2018

Some thoughts:

  1. I would not expect accelerated list/dict implementations from Numba. This is probably out of scope for them.
  2. Where possible I would like to see us avoid dynamic data structures, for performance reasons.
  3. I would not expect std::map to be significantly faster than Python's dict. At best I would hope for 2-3x improvements, not 10x.
  4. I find debugging to be much nicer in numba, I just remove numba.jit. I'd be curious to learn your debugging technique for Cython
  5. Can you expand what you mean by interleaving?
  6. I think that the presence of Cython in the codebase is itself a barrier to adoption. People won't be able to run from master without passing through a compilation step.
  7. I'm surprised to hear that switching from numba to cython seems expensive to you. To me it seems like the only cost is the work saved in the first place. In my experiences going the opposite direction (Cython -> Numba) code typically works after I strip out all of the annotations, rename the file from pyx to py, and put numba.jit on the function. For numeric for-loopy code they seem to be more or less equivalent to me.
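For concreteness, here is the kind of "for-loopy" numeric kernel being compared, written so the same source runs as plain Python or JIT-compiled (a sketch, not code from this repo; the fallback decorator just makes the example work without Numba installed):

```python
import numpy as np

try:
    from numba import njit   # use the real JIT if Numba is installed
except ImportError:           # otherwise the kernel is still valid Python
    def njit(f):
        return f

# The body is plain Python: strip the decorator and it runs as-is, or
# add Cython type annotations and it becomes a .pyx kernel instead.
@njit
def count_nonzero_per_row(a):
    counts = np.zeros(a.shape[0], dtype=np.int64)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            if a[i, j] != 0:
                counts[i] += 1
    return counts

counts = count_nonzero_per_row(np.array([[0, 1, 2], [0, 0, 3]]))
```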

However, mostly this is me trying to convince you not to adopt Cython prematurely. At this point you're doing most of the work and should probably decide.

As a co-maintainer though I would appreciate it if, prior to actually writing algorithmic code in Cython, you first set up the build process, packaging support (we should consider conda-forge), and documentation. I don't expect this to be particularly difficult, and I don't think we'll need any external libraries to speak of, but it's best to understand the costs ahead of time and build in solutions in case you leave this project in the future.

@mrocklin
Collaborator Author

mrocklin commented Mar 4, 2018

FWIW while I'm probably biased towards numba due to professional association I also have a lot of respect for Cython. As someone who has used both in different projects though (numba for personal things, cython for geopandas) and as someone who maintains a number of projects, the added cost of maintaining a Cython project in terms of person-hours is high, and that cost tends to be focused on fewer maintainers. I think I tend to optimize these days more to reduce and diffuse maintenance costs than most other things.

I personally would probably stop spending as much personal time fielding maintenance questions or build/release issues if we go for Cython. This is something that the community would need to pick up long term.

@hameerabbasi
Collaborator

hameerabbasi commented Mar 4, 2018

I would not expect std::map to be significantly faster than Python's dict. At best I would hope for 2-3x improvements, not 10x.

There is also std::unordered_map, which is 2x-3x faster than std::map, and is the best equivalent to dict. std::map is more of a binary search tree.

I'd be curious to learn your debugging technique for Cython

GDB supports Python natively, and Cython has extensions for GDB. It's on the command line, though (I've tried and failed to find a good IDE solution), so it is a bit of a pain compared to your solution. Also, it freezes your interpreter in, so a conda env is a no-no.

Can you expand what you mean by interleaving?

Interleaving pure Python and optimized code. Another problem would be calling unoptimized code from optimized code.

People won't be able to run from master without passing through a compilation step

  • We can have a unified setup.py.
  • All developers would need to do for initial setup from master is pip install -e . (with or without Cython installed)
    • If this is to work without Cython:
      • we need a few more tweaks to setup.py to either pip install Cython,
      • or check in the generated C++ (preferred)
        • This would make releases easier as well.
  • If they frequently work with C/C++/Cython, Cython also has JIT support.

In my experiences going the opposite direction (Cython -> Numba) code typically works after I strip out all of the annotations, rename the file from pyx to py, and put numba.jit on the function.

There are also costs in terms of alienating developers we may have gained who are used to Numba.

However, mostly this is me trying to convince you not to adopt Cython prematurely.

Of course, I understand the costs, and if the consensus is on Cython, then I would build these solutions in, along with docs, right at the very first check-in. I believe properly > quickly.

After your convincing and experience, I'm not too opposed to Numba, but would certainly like to wait for thoughts of others (particularly Scipy devs and whether they would consider it to be a blocker) before committing to it.

At this point you're doing most of the work and should probably decide.

My vote is still slightly in favor of Cython, but I'm open to Numba as an option. I don't believe I should make the decision, I would really wait for community consensus.

@albop

albop commented Mar 4, 2018

I have started watching this project only recently, and although I was super happy to finally see a sparse multidimensional tensor lib that I could contribute to, I'm just an observer here.
Funnily enough, I spent some time yesterday trying to implement specialized tensor products for DOK with Numba, and found out that dicts are not a recognized type yet. I used ordered lists of tuples instead of dicts everywhere, which seems to be supported well, but @mrocklin must know the details better than I do.
Here is one argument in favour of Numba, which I haven't seen listed above. I have found very useful in other projects the ability to generate code, based on the number of dimensions for instance, and jit-compile it to a fast, memory-savvy version. There are several instances where I found this approach to be much faster than using numpy's generic functions, and I wouldn't know how to achieve the same with Cython or C/C++ for that matter, as dimension combinations are not known at compile time.
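The generate-then-jit pattern described above can be sketched like this (hypothetical `make_nd_sum` helper; it degrades to plain Python when Numba is absent, and the loop nest is built as source because `ndim` is only known at run time):

```python
import numpy as np

try:
    from numba import njit  # JIT-compile when Numba is available
except ImportError:          # otherwise run the generated code as-is
    def njit(f):
        return f

def make_nd_sum(ndim):
    """Generate a summation kernel specialized to `ndim` nested loops."""
    lines = ["def kernel(a):", "    total = 0.0"]
    for d in range(ndim):
        lines.append("    " * (d + 1) + "for i%d in range(a.shape[%d]):" % (d, d))
    index = ", ".join("i%d" % d for d in range(ndim))
    lines.append("    " * (ndim + 1) + "total += a[%s]" % index)
    lines.append("    return total")
    namespace = {}
    exec("\n".join(lines), namespace)   # materialize the specialized loop nest
    return njit(namespace["kernel"])

sum2 = make_nd_sum(2)                   # a kernel with two nested loops
result = sum2(np.arange(6.0).reshape(2, 3))
```

In practice one would cache the compiled kernel per `ndim` so the code generation and compilation cost is paid once per dimensionality.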

@hameerabbasi
Collaborator

hameerabbasi commented Mar 4, 2018

At this point, if Scipy devs came out and said they would be okay to depend on this project if Numba was used, I'd go for Numba. I just had a look at the docs and they're significantly better than Cython's (which alone makes them worth considering).

It'd be nice to have another contributor, @albop. :-)

@albop

albop commented Mar 4, 2018

@hameerabbasi : I was considering saying something about me contributing to the project, but wouldn't like to disappoint if it takes me a bit too long to start. I'll try to create some issues first ;-) (with the usecase I have in mind...)

@hameerabbasi
Collaborator

@albop If you have any questions about where something is handled, or how to implement something, don't hesitate to raise an issue and cc me on it. :-)

@shoyer
Member

shoyer commented Mar 4, 2018

Numba actually does currently accelerate list and set. I don't know if accelerating dict is on their roadmap or not.

@ogrisel
Member

ogrisel commented Mar 4, 2018

C++ is fine as long as the wrapper tool is modern enough to make it easy to package as a wheel (e.g. Cython or pybind11, no boost python please). I am fine with numba as a dependency as well. I can't speak for scipy developers.

@hameerabbasi
Collaborator

hameerabbasi commented Mar 4, 2018

They do, it's even been labelled a high priority. xref numba/numba#2096

Numba actually does currently accelerate list and set. I don't know if accelerating dict is on their roadmap or not.

This makes me much more comfortable using Numba.

@rgommers

rgommers commented Mar 4, 2018

I can't speak for all SciPy developers, but if Numba continues to improve and be maintained (not sure of its status now with all the changes at Anaconda?) then at some point we'll likely adopt it. The concerns regarding build dependencies etc. are minor compared to all the C/C++/Fortran we have already. If you want me to start that discussion now on scipy-dev, I'd be happy to do so.

Re debugging: it depends on what you're debugging. If it's an issue with the logic of your code, then Numba is much better (just remove @jit). If it's a problem with Numba or Cython behavior though, then Cython's debugging support is far better than Numba's. The latter case is probably less common, but not that uncommon - both due to changes in new releases and things that don't work identically to plain Python code in the first place. I'd say that as a contributor I would prefer Numba, and as a library maintainer I'd prefer Cython.

We recently got a similar question from Pythran, and I came up with a list of factors for the decision at https://mail.python.org/pipermail/scipy-dev/2018-January/022334.html. The ones I'm not sure about for Numba are:

  • portability (exotic stuff like AIX, Raspberry Pi, etc.)
  • maintenance status
  • templating support for multiple dtypes

@datnamer

datnamer commented Mar 4, 2018

How will the sparse array object be represented in Numba? The JIT class capability is currently very limited.

@hameerabbasi
Collaborator

@datnamer This is a discussion on how we will adopt Numba for our internal algorithms, to make our own library faster for users. It isn't much of a discussion about how to make our classes JIT-compatible.

@hameerabbasi
Collaborator

If you want me to start that discussion now on scipy-dev, I'd be happy to do so.

That'd be really appreciated. 😃

@hameerabbasi
Collaborator

hameerabbasi commented Mar 4, 2018

cc @woodmd, since you're planning to contribute, you should get a say. Also, @nils-werner.

@stefanv

stefanv commented Mar 4, 2018

As long as things go well, numba has an edge above Cython. But once they start to misbehave, I'm not sure a) how to identify that it is misbehaving in numba or b) how to debug numba and steer it in the right direction. We have good answers to these questions for Cython because we've been using it for so long all over the scientific Python ecosystem, but if you can find similar answers for numba it would be tempting to use that instead (the code will almost certainly be simpler to implement and debug).

@hameerabbasi
Collaborator

hameerabbasi commented Mar 5, 2018

One question at large regarding Cython, which will be very useful later: How hard is it to make it work for general N-dimensional arrays? From what I can tell, only arrays of a specific dimension can be an input to a function.

I personally would probably stop spending as much personal time fielding maintenance questions or build/release issues if we go for Cython. This is something that the community would need to pick up long term.

Would you be willing to contribute in code (for the pure Python parts) and do code reviews (for both parts)? FWIW, I only intend to move the bottlenecks (currently only indexing, sorting, matching) to Cython/Numba/C++, keeping all else as-is. I don't intend to write the entire thing in another language, Numpy already does a lot of that for us. I also care a lot about maintainability and attracting devs, I'm sorry if it came across otherwise. Maintainability and community > A few ms shaved off an operation.

If building wheels is an issue, I have access to all three major OSs on my personal machine, and can easily build wheels for all of them (although I believe CircleCI/AppVeyor/Travis do this already).

You're a big part of this project at this stage, losing that could be a fatal blow to it.

@nils-werner
Contributor

Re nD data structures: All internal data of COO arrays are 1D or 2D, are they not?
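A quick sketch of that layout (a simplified reconstruction, not the library's actual code): however many dimensions the array has, COO's bookkeeping is a 2D `coords` array of shape `(ndim, nnz)` plus a 1D `data` array.

```python
import numpy as np

# A 3D array with two nonzero entries:
x = np.zeros((2, 3, 4))
x[0, 1, 2] = 5.0
x[1, 2, 3] = 7.0

coords = np.array(np.nonzero(x))  # shape (ndim, nnz) = (3, 2)
data = x[tuple(coords)]           # shape (nnz,) = (2,): the stored values
```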

@hameerabbasi
Collaborator

There are a few use cases for N-D iteration in our code that I can think of at this point:

  1. Batch DOK assignment.
  2. Advanced indexing.

@hameerabbasi
Collaborator

portability (exotic stuff like AIX, Raspberry Pi, etc.)

It depends on LLVM via llvmlite, which is quite portable, I think.

maintenance status

Last release was 25 days ago, code frequency seems to be good. Not sure about the future, though. @mrocklin might have insights on that.

@mrocklin
Collaborator Author

mrocklin commented Mar 5, 2018

I've raised an issue highlighting debugging issues in Numba here: numba/numba#2788

Engagement there would be welcome.

@mrocklin
Collaborator Author

mrocklin commented Mar 5, 2018

I do not have any major concerns about the longevity of Numba. Folks probably shouldn't rely on my opinion here due to the conflict of interest. I'd be happy to chat in more depth with folks if desired.

@hameerabbasi
Collaborator

this optimization will allow to compile inner loops in nopython mode regardless of what code surrounds those inner loops.

This was what I was talking about when I spoke of interleaving, so that concern can be put aside. Not too long ago, this wasn't supported IIRC. It also means templates and custom dtypes are supported.

@serge-sans-paille

@hameerabbasi serge-sans-paille/pythran#866 implements the missing features to support these two functions in Pythran, and adds them to the test cases. If you can provide me with more similar kernels to port, that's great :-)

@mrocklin
Collaborator Author

Most scipy style code in the past has gone without nested dynamic data structures, mostly for performance reasons. I recommend that you convey more information about what you're trying to achieve and why you're trying to use these data structures and perhaps others in the community will be able to suggest alternatives.

@hameerabbasi
Collaborator

@serge-sans-paille Unfortunately, these already work in Numba. I haven't written the actual kernels needed.
@mrocklin A radix argsort, for one, needs list[list[int]] AFAICT to be reasonable; maybe I'm wrong, though.

Also, dict support is absolutely essential to speeding up DOK.

But the main thing holding me back (feature-wise instead of speed-wise) is list[list[int]]. All three issues I referenced above need this in one form or another.

For advanced indexing, I need list[np.ndarray[int, 1D]] to pass the flattened-broadcasted versions of the input arrays. I realize I could stack and send them in, but that's extra conversion overhead. I also need zip(*x) where x has the type I mentioned before, this can't be done as the tuple length can't be inferred at compile time.

For CSD/ndarray in elemwise, I'll need to bring them into "common compressed dimensions" (easy) but then I have to loop over them in a fashion that's hard to describe without putting into code but will need list[list[int]].
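For what it's worth, the stack-instead-of-list alternative mentioned above can be sketched in plain NumPy (illustrative arrays only): a list of equal-length 1D index arrays becomes a single 2D array, and the `zip(*x)` pattern becomes a loop over axis 1, neither of which needs typed nested lists.

```python
import numpy as np

# Hypothetical flattened-broadcasted index arrays for two dimensions:
flattened = [np.array([0, 1, 2]), np.array([3, 4, 5])]
stacked = np.stack(flattened)       # shape (n_arrays, n)

# zip(*x)-style iteration over coordinate tuples, one column at a time:
columns = [tuple(stacked[:, j]) for j in range(stacked.shape[1])]
```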

@mrocklin
Collaborator Author

I realize I could stack and send them in, but that's extra conversion overhead

Right, I recommend verifying that this overhead is meaningful before trying to avoid it.

but then I have to loop over them in a fashion that's hard to describe without putting into code but will need list[list[int]].

I recommend providing more detail about exactly what algorithms or code or outcomes you're trying to accomplish and perhaps someone can provide alternatives with statically allocated data structures. You've done this with radix sort, which is a nice example. If there are others then I recommend providing pseudocode, tests that you think properly scope the problem, or links to algorithms or papers that you're trying to implement so that others can understand what you're trying to accomplish.

My guess is that it's very natural to use nested lists of lists for some of these operations, but that these are neither strictly necessary nor always optimal. There are often many ways to accomplish operations like these. CSR and CSC were designed to not need dynamic data structures. My guess is that it is possible to accomplish the operations that relevant libraries need without dynamic data structures. It may be that you're trying to implement everything all at once, which may not be the optimal use of time.
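As one illustration of the statically allocated style, a counting argsort over small integer keys can be written with nothing but flat arrays and cumsum'd bucket offsets, CSR-style (a sketch; `counting_argsort` is a hypothetical helper, not code from this repo):

```python
import numpy as np

def counting_argsort(keys, n_buckets):
    """Stable argsort for non-negative integer keys < n_buckets,
    using only flat, statically allocated arrays."""
    counts = np.zeros(n_buckets + 1, dtype=np.int64)
    for k in keys:                      # histogram of key frequencies
        counts[k + 1] += 1
    offsets = np.cumsum(counts)         # start position of each bucket
    order = np.empty(len(keys), dtype=np.int64)
    for i in range(len(keys)):          # scatter indices into buckets
        k = keys[i]
        order[offsets[k]] = i
        offsets[k] += 1                 # advance within the bucket
    return order

keys = np.array([2, 0, 1, 0])
order = counting_argsort(keys, 3)       # stable: ties keep input order
```

A radix argsort is then repeated passes of this kernel over digit groups, still without any list[list[int]].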

However I've said this before, and you seem to disagree. That's fine, I suspect that it is because you know more about the problem. What I'm asking here is that you share more of your knowledge so that others start to understand why these data structures are necessary. Then maybe people (myself or others) can help work around the problem. As things stand currently it's hard for others (or at least myself) to know what you have in mind.

At the same time I recommend that you push upstream and mention your concerns on the relevant Numba issue. I see that you've mentioned your desire on numba/numba#2560 . I recommend that you ping that issue again, saying again why you're interested and maybe asking how you could contribute to resolve the problem.

@mrocklin
Collaborator Author

Also, dict support is absolutely essential to speeding up DOK.

FWIW I personally don't care about DOK at all. None of the applications that I know of (xarray, scikit-learn, tensor factorization algorithms) are likely to care much either. If I were in your position and wanted to make useful software I would query downstream users to hear what operations they need and focus on those first.

The two main communities that I would query are XArray and Scikit-Learn. I think that this would help to prioritize work.

@ogrisel
Member

ogrisel commented Apr 26, 2018

I confirm that I don't know any algorithm in scikit-learn that would be optimally written using the DOK datastructure. We mostly use CSR / CSC at consuming time (to run linear algebra operations such as matrix vector products row wise or column wise) or row-wise and column-wise reductions (e.g. to scale the data). We also use COO for producing sparse matrices from another kind of data (e.g. a text vectorizer outputting a sparse matrix for bag of words counts) prior to conversion to CSR or CSC.

@mrocklin
Collaborator Author

mrocklin commented Apr 26, 2018 via email

@ogrisel
Member

ogrisel commented Apr 26, 2018

Actually, many scikit-learn algorithms have Cython code that directly accesses the three CSR/CSC component arrays (indices, data and indptr) and then does pointer arithmetic inside those data structure components.

One could expect that some of those algorithms could be refactored to work on row-wise-chunked CSR or column-wise or row-wise chunked CSC (depending on the algorithm and how they can be parallelized).

Example Cython code on CSC matrices:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/cd_fast.pyx#L320

Some other algorithms (also in Cython) use a low overhead dataset abstraction to wrap dense 2D numpy arrays and CSR sparse matrices:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/seq_dataset.pyx

It is often possible to imagine that such algorithms could be extended to have a pure Python outer loop that would work on a list of row-wise CSR or CSC chunks, in which a variant of the existing Cython code would run either sequentially (for out-of-core computation, with a small model moving from one node to the next to update its parameters when accessing the chunks of data on that node) or sometimes in parallel across chunks.

Other algorithms written in Python use dot products with a numpy array (and maybe sometimes another sparse matrix) and column-wise or row-wise aggregations (mean, sum) possibly on slices.
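A plain-Python sketch of the indptr pointer arithmetic described above, on a hand-built 3x3 CSR matrix (the linked Cython versions add types around essentially the same loops):

```python
import numpy as np

# CSR representation of:
#   [[1, 0, 2],
#    [0, 0, 0],
#    [0, 3, 0]]
data = np.array([1.0, 2.0, 3.0])   # stored values, row-major
indices = np.array([0, 2, 1])      # column index of each stored value
indptr = np.array([0, 2, 2, 3])    # row i owns data[indptr[i]:indptr[i+1]]

# Row-wise reduction (here: row sums) by walking indptr:
row_sums = np.zeros(3)
for i in range(3):
    for j in range(indptr[i], indptr[i + 1]):
        row_sums[i] += data[j]
```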

@seibert

seibert commented Apr 26, 2018

Some notes on issues raised above:

  • Being able to nest containers (like list[list[int]]) should be available in Numba 0.39 (scheduled for late June). However, we suspect that will reveal a new bottleneck for some use cases, where the unboxing cost of nested Python lists (when transitioning from Python to nopython mode) becomes prohibitive. For temporary lists, this won't matter.

  • The architecture support limitations should soon be resolved. By next release, ARMv7 will be working (except some issues that are also present in SciPy linalg functions), ARMv8 is working except for a few NaT conversion issues when working with datetime dtypes, and POWER8 will be solved with the future LLVM 6.0.1 release, which contains some required PPC64 bug fixes.

  • We've had a request from HPAT for a typed dictionary that can be used as a temporary data structure inside algorithms. This is something we want to do for several reasons, but we're currently juggling development priorities between these sort of container features, refactoring our ufunc/gufunc compiler, and supporting the ability to jump back and forth between python and nopython mode in the same function. As a result I don't have an ETA, but it is a request we are aware of.

@hameerabbasi
Collaborator

@seibert Seems like you're well ahead of schedule. 😃 Thanks a lot. I'll try to juggle around things so I do things that don't need list[list[int]] first. However, sorting (we need a counting/radix argsort) is the biggest bottleneck right now that needs it, the rest I think I can make work without this.

@shoyer
Member

shoyer commented Apr 26, 2018

I'm not entirely sure who would use sparse arrays in xarray, but my guess is that all-purpose flexibility of COO would suffice for most analytics purposes.

That said, for data analysis, I think the most useful capability would be support for fill-values other than zero (namely, NaN).

One intriguing usecase for sparse arrays in xarray is to reproduce the capabilities of the "multi-dimensional" databases used in business intelligence (something like Online analytical processing).

@mrocklin
Collaborator Author

Hrm, that's an interesting thought. Do we know people who care about OLAP cubes? Perhaps @Stiivi ? It would be useful to get an introduction to someone that might need an in-memory sparse multi-dimensional data structure and would be willing to try things out and provide feedback from concrete applications.

@hameerabbasi
Collaborator

Fill values are easy to build and I have them planned soon-ish. Most operations can be made to work with it, the only one I can think of that won't is dot/tensordot.
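A minimal sketch of how a fill value could thread through a unary elemwise operation (all names hypothetical, not this library's API): the op is applied to the stored values and to the fill value, and with a NaN fill, op(nan) usually stays NaN, which is what makes NaN fills workable.

```python
import numpy as np

def elemwise(op, coords, data, fill_value):
    """Apply a unary op to a (coords, data, fill_value) sparse triple."""
    return coords, op(data), op(fill_value)

coords = np.array([[0, 2]])            # nonzeros of a 1D array
data = np.array([1.0, 2.0])
c, d, fv = elemwise(lambda v: v + 1.0, coords, data, 0.0)
# The "background" is now 1.0 without densifying; dot/tensordot are the
# hard cases because their result's fill is not simply op(fill).
```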

@hameerabbasi
Collaborator

I'm still waiting on numba/numba#2560, it's a blocker for a lot of features. Failing that in 0.39, I'd like to discuss Cython and/or C/C++ wrapping alternatives.

@mrocklin
Collaborator Author

mrocklin commented Jun 1, 2018 via email

@JoElfner

JoElfner commented Jun 7, 2018

@hameerabbasi and @mrocklin It looks like the implementation will be coming in 0.39. It is labeled as ready for review: here

@aldanor

aldanor commented Sep 18, 2018

Refcounted lists seem to have landed in numba indeed. Does anyone know what's the story with dict?

@hameerabbasi
Collaborator

Refcounted lists seem to have landed in numba indeed. Does anyone know what's the story with dict?

See numba/numba#2096.

@JoElfner

AFAIK the Numba 1.0 release is scheduled for December 2018. Dict support is marked as high priority for 1.0.

@kiwi0fruit

In light of the question "Why Julia? Will Python/Numba and Python/Cython lose to Julia?", I suggest going with Numba and actively contributing to it if needed.

@rth

rth commented Mar 10, 2020

Looks like the decision was to go with Numba, and given the amount of work invested there, this seems unlikely to change in the future? Should this issue be closed, then?

Performance would be a significant argument for or against adoption (#331). For users who don't need much extra functionality outside of scipy.sparse (but just want a sparse COO/CSD 2D ndarray class), if this library can be made faster than scipy.sparse, then accepting 2 extra dependencies (numba, llvmlite) is likely acceptable. If the performance is similar or worse, that's going to be a harder sell (putting aside whether numba is generally faster than cython or not).

Wrapping low level libraries is probably not doable now with numba, but it still might be worth following the work done in https://github.com/vbarrielle/sprs in Rust. So far it doesn't have the ND generalized data structures that pydata/sparse has, but some work has been done lately to benchmark against scipy.sparse in sparsemat/sprs#184 (comment) and to improve results.

@hameerabbasi
Collaborator

Wrapping low level libraries is probably not doable now with numba, but it still might be worth following the work done in https://github.com/vbarrielle/sprs in Rust. So far it doesn't have the ND generalized data structures that pydata/sparse has, but some work has been done lately to benchmark against scipy.sparse in vbarrielle/sprs#184 (comment) and to improve results.

While I'm somewhat of a fan of Rust myself, switching seems hard, due to the amount of work invested, as you say. Not to mention, some upcoming features absolutely need a JIT, due to an "explosion of types".

If Rust or C++ JIT is a stable thing (better than Numba) I'd like to hear about it.

@stuartarchibald

<snip> Wrapping low level libraries is probably not doable now with numba, but it still might be worth following the work done in https://github.com/vbarrielle/sprs in Rust. So far it doesn't have the ND generalized data structures that pydata/sparse has, but some work has been done lately to benchmark against scipy.sparse in vbarrielle/sprs#184 (comment) and to improve results.

Numba has ctypes and cffi support, it's possible to get a long way with these (void *!) and the extension API.
