increased performance of k-diagonal extraction in da.diag() and da.diagonal()#8689
|
Can one of the admins verify this patch? |
|
add to allowlist |
|
FYI, I have extended this PR with @TAdeJong's simple padding solution for |
4e429e3 to 9d46f4a
|
Other than thinking about not duplicating code from |
|
Good question. You're right about code-duplication issues in that code-section. I would do so but ... I feel it would be less performant to call On the other hand, performance here is up for discussion as very large I don't feel strongly though about performance in this particular case, so if you have other arguments in favor of avoiding code duplication, feel free to share! |
|
Btw, many thanks for your constructive critique and support of this PR, @TAdeJong. |
diag() now calls diagonal()
|
@TAdeJong , @pavithraes , @ian-r-rose RE: code duplication Upon further reflection, I agree with @TAdeJong's suggestion to reduce code-duplication. To do this, I generalized this k-diagonal extraction algorithm to higher dimensions and transferred the code into See the opening comment above of this PR for a note on algorithmic complexity/performance. |
0b09ffc to 12786c9
12786c9 to ab940bb
jcrist
left a comment
Thanks @ParticularMiner, overall this looks good to me; I left one comment on the tests. While you're looking at this code, can I ask that you also look through the tests for da.diagonal to see whether they offer full coverage of the parameters we'd care about here (dimensions of the input array, chunking variation, axis arguments, etc.)? It'd be good to ensure we have full coverage while you're still thinking about this code.
09029ef to f93e75c
b3b1076 to a5b2b37
a5b2b37 to 8b66f9f
|
Thanks for the update @ParticularMiner! @jcrist Did the updates address your comments? |
|
Thanks all! 😄 |
…agonal() (dask#8689)
* added support for extracting k-diagonals from a 2d-array
* included heterogeneous chunks in test_diag()
* fixed linting errors in test_diag()
* improved efficiency of diagonal extractor a bit
* stole @TAdeJong's simple padding solution for diag(v, k) when v is 1d
* reduced complexity of `diagonal()` from O(N**2) to O(N)
  diag() now calls diagonal()
* fixed linting errors in diagonal()
* reorganized tests and ensured coverage of diag() & diagonal() as per @jcrist's advice
* catered for cupy type input arrays to diagonal()
`pre-commit run --all-files`

Perhaps you might find this useful. It mirrors the `dask-grblas` implementation, which supports rectangular chunks.

It follows the straight path of the k-diagonal through the rectangular chunks of the input matrix, constructing the dask graph along the way. So chunks untouched by the diagonal end up not being part of the final dask graph, reducing the algorithmic complexity of `diagonal(A, offset, axis1, axis2)` from $O(M \, N_{\mathrm{axis1}} \, N_{\mathrm{axis2}})$ to $O(M \max(1, N_k))$, where $N_{\mathrm{axis1}}$ ($N_{\mathrm{axis2}}$) is the number of chunks along `axis1` (`axis2`) of array `A`; $M$ is the total number of chunks of `A` after `axis1` and `axis2` have been removed; while $N_k$ is the number of chunks touched by the k-diagonal after all axes, except `axis1` and `axis2`, have been removed.

Relevant test-units have been modified to reflect this change.
Critique is welcome.
NB: It might be worth comparing this with #5683 (which I only discovered after pushing this commit).
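
For readers who want a feel for the chunk walk described above, here is a minimal pure-Python sketch for the 2-D case only. It is not the code in this PR: the function name `chunks_on_diagonal` and its return format are hypothetical, chosen purely for illustration. Given the chunk sizes along the rows and columns and an offset `k`, it follows the k-diagonal and records only the chunks it passes through, which is why the work scales with the number of touched chunks rather than with the full chunk grid.

```python
from itertools import accumulate


def chunks_on_diagonal(row_chunks, col_chunks, k=0):
    """Return one (row_block, col_block, row_start, col_start, length)
    tuple per chunk crossed by the k-diagonal of a 2-D chunked array.

    row_start/col_start are offsets *within* the chunk at which the
    diagonal segment begins. Hypothetical helper, for illustration only.
    """
    row_ends = list(accumulate(row_chunks))  # exclusive end row of each chunk
    col_ends = list(accumulate(col_chunks))  # exclusive end column of each chunk
    nrows = row_ends[-1] if row_ends else 0
    ncols = col_ends[-1] if col_ends else 0

    # First element of the k-diagonal: (0, k) if k >= 0, else (-k, 0).
    i, j = max(0, -k), max(0, k)
    ri = ci = 0
    segments = []
    while i < nrows and j < ncols:
        # Advance to the chunk that contains row i and column j
        # (this also skips any zero-sized chunks).
        while row_ends[ri] <= i:
            ri += 1
        while col_ends[ci] <= j:
            ci += 1
        # The diagonal leaves this chunk at whichever edge it reaches first.
        length = min(row_ends[ri] - i, col_ends[ci] - j)
        row_start = row_ends[ri] - row_chunks[ri]
        col_start = col_ends[ci] - col_chunks[ci]
        segments.append((ri, ci, i - row_start, j - col_start, length))
        i += length
        j += length
    return segments


# A 9x9 array with row chunks (3, 2, 4) and column chunks (4, 5), offset k=1:
for seg in chunks_on_diagonal((3, 2, 4), (4, 5), k=1):
    print(seg)
```

In this example only three of the six chunks are visited, and the segment lengths add up to 8, the length of the first superdiagonal of a 9x9 matrix. In a real graph construction each tuple would presumably become one task that extracts a small diagonal from a single chunk.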