Feedback on Ndarray/Ndarray-linalg (from Reddit) #649

LukeMathWalker · 2019-06-23T13:25:31Z

Some interesting feedback on using ndarray and ndarray-linalg to port a numerical routine from Python to Rust: Reddit thread

Actionable highlights:

NumPy to ndarray documentation

The ndarray for numpy users document was a tremendous help. I really appreciate it and all the work the developers have put into making it.
I'd love to see the numpy-to-ndarray docs expanded. I know that the developers of ndarray are trying to make most function calls as general as possible, but I might request that some more of these general functions get added to the ndarray for numpy users document (either how to implement them or where to go for the implementation). For instance: the max/min and argmax/argmin functions.

Crate organization

On the topic of max/argmax functions, I implemented my own but I just saw that ndarray-stats has them. I am a bit confused why/how the ndarray and the semi-associated subcrates are being partitioned. For instance, the other stats functions in ndarray-stats seem pretty high level and make sense. However, the "max" function seems like something that should either be implemented in the ndarray crate itself (like the "sum" function/method has been) or "max" should be part of another "ndarray-lowlevelmath" crate.

Array indexing

Not being able to slice/index ndarrays with other ndarrays was a bit of a pain point. It's mentioned that you can't do that with ndarray on the ndarray for numpy users document. Are there plans to implement indexing by array?

Best practice wrt views/references/etc.

I'm not sure how / when to use views of arrays and when to use references. This is likely my lack of experience with Rust, but I don't know what the pros and cons of each are or when I should write a function that takes an array view versus a reference to an array. I haven't seen any tutorial information on this (so if anyone could point me to a document I'd be super appreciative).

Sweet closure 😛

My project that uses Python will really be improved greatly by switching to Rust and I'm thrilled that so much work has been done on the ndarray and associated crates in the last year that is enabling my work!

The text was updated successfully, but these errors were encountered:

LukeMathWalker · 2019-06-23T13:30:35Z

I have been reviewing our Ndarray for NumPy user doc and I think we could make it a little bit more comprehensive now that ndarray-linalg is part of the organization and more functionality gaps have been covered by ndarray-stats.
This should also improve ease of discovery (is there a max function? where?) - we could as well add some references to ndarray-linalg, ndarray-stats and ndarray-rand (and the respective scopes) in the homepage of ndarray docs, to make them more visible.

Happy to work on this and open a PR if it's desirable.

I was planning to address best practices when it comes to views and references in the series I have been putting together, but my free time has been quite limited recently. I should be able to get back on it in 2 or 3 weeks.

SuperFluffy · 2019-06-23T13:58:58Z

Before we advertise ndarray-linalg more prominently, I think we should discuss its design. I have been looking into using it for my work, but have decided to use my own implementations. While ndarray-linalg works great for one-off calculations, at the moment it's not very useful for tight/hot loops. I will use its qr code here as an example:

The first thing to note is its repeated use of ArrayBase::uninitialized. This was probably done for performance, but it should probably be benchmarked if in the real world this has any real effects on performance. People repeatedly expressed on reddit that unsafe code makes them question the maturity of libraries. Fortunately, this is an easy fix.

The main issue with the qr code is the large amount of allocations. A typical call does the following:

QR::qr allocates via ArrayBase::to_owned;
QRInto::qr_into allocates twice via take_slice and take_slice_upper;
QR_::qr allocates once via a Vec inside it's body;
QR_::householder allocates once via Vec.

Given that a typical use case in scientific calculations is to repeatedly perform QR factorization on a matrix of the same dimensions, it would be great to have a struct QR that contains all pre-allocated data structures and comes equipped with a method compute taking a matrix A and storing the result inside it. The factors Q and R can then be accessed by reference and cloned, if needed.

One possible design is what I did with my library for Gram Schmidt QR factorization. As an example, struct Classical implements the GramSchmidt trait. After construction using the trait methods from_matrix or from_shape, an array of the same dimensions can be repeatedly factorized using compute.

For one-off calculations, one can have convenience wrappers like this one here.

I did the same thing when implementing Al-Mohy and Higham's matrix exponentiation algorithm here: https://github.com/SuperFluffy/rust-expm/blob/master/src/lib.rs#L332

LukeMathWalker · 2019-06-30T07:07:45Z

I haven't had a chance yet to take a deeper look at the internal of ndarray-linalg, but, as a matter of principle, I don't think we should "hide" it until it's perfect and polished.
On the contrary, we should make it more visible while providing an easy reference to all its shortcomings (at least the ones we know of), thus making it easier for people to contribute and improve on what it's currently available.

Could we get something similar to #597 for ndarray-linalg @SuperFluffy @termoshtt?

termoshtt · 2019-07-02T09:09:51Z

The main issue with the qr code is the large amount of allocations

I agree that ndarray-linalg should have an interface which minimize memory allocation, but it can coexist with current luxury interface as LUFactorize does.

Could we get something similar to #597 for ndarray-linalg

This will be great, but I can get little time for working with it. I hope another main maintainer handles this if possible...

Crate organization

I've created ndarray-linalg crate around the comment #178 (comment) basically for my studies. ndarray crate itself is still not enough for general usage, and we need associated sub-crate. We have two options:

Merge these sub-crates into ndarray itself
- simplify the development
Re-split existing functionalities into sub-crates
- e.g. ndarray-rand and ndarray_linalg::generate::random has similar features

Monolithic ndarray crate is useful for users, and it will be also useful for developers because they have no need to consider which crate should implement some function. However, it will be harder for maintainer.

LukeMathWalker · 2019-07-04T21:44:37Z

Could we get something similar to #597 for ndarray-linalg

This will be great, but I can get little time for working with it. I hope another main maintainer handles this if possible...

I am not super familiar with the ndarray-linalg yet, so it would be beneficial if you could do a "brain dump" of current issues, future directions, etc.
I am happy to benchmark ndarray-linalg against NumPy to see what we are missing from a functionality point of view.

This was referenced Jul 21, 2019

Benchmarking rust-ndarray/ndarray-linalg#167

Closed

API for pre-allocated calculation for better performance rust-ndarray/ndarray-linalg#168

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feedback on Ndarray/Ndarray-linalg (from Reddit) #649

Feedback on Ndarray/Ndarray-linalg (from Reddit) #649

LukeMathWalker commented Jun 23, 2019

LukeMathWalker commented Jun 23, 2019

SuperFluffy commented Jun 23, 2019 •

edited

LukeMathWalker commented Jun 30, 2019

termoshtt commented Jul 2, 2019

LukeMathWalker commented Jul 4, 2019 •

edited

Feedback on Ndarray/Ndarray-linalg (from Reddit) #649

Feedback on Ndarray/Ndarray-linalg (from Reddit) #649

Comments

LukeMathWalker commented Jun 23, 2019

LukeMathWalker commented Jun 23, 2019

SuperFluffy commented Jun 23, 2019 • edited

LukeMathWalker commented Jun 30, 2019

termoshtt commented Jul 2, 2019

LukeMathWalker commented Jul 4, 2019 • edited

SuperFluffy commented Jun 23, 2019 •

edited

LukeMathWalker commented Jul 4, 2019 •

edited