Avoid unnecessary __getitem__ in block() when chunks have correct dimensionality #5884

Merged
mrocklin merged 1 commit into dask:master from astrofrog:avoid-getitem-block on Feb 12, 2020

@astrofrog
Contributor

This improves performance in block() by almost a factor of 2 when chunks have the same dimensionality as the final array. Here is an example:

import numpy as np
import dask.array as da


class ArrayLikeObject:

    def __init__(self):
        self._array = np.ones((1, 1, 20, 30), dtype=float)
        self.shape = self._array.shape
        self.ndim = self._array.ndim
        self.dtype = self._array.dtype

    def __getitem__(self, item):
        return self._array[item]


def test_perf():
    meta = np.zeros((0,), dtype=float)
    chunks = [[[[da.from_array(ArrayLikeObject(), meta=meta)] * 269] * 6] * 4]
    da.block(chunks)

Before this patch:

In [2]: %timeit test_perf()
1.09 s ± 152 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After this patch:

In [2]: %timeit test_perf()
580 ms ± 13.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I haven't added any specific tests since both cases (same and different dimensionality) are likely already covered by the test suite, and this is just a performance improvement.
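To make the idea concrete, here is a minimal NumPy-only sketch of the short-circuit described above. It is an illustration, not a copy of the Dask source: the helper name `atleast_nd` and its exact body are assumptions. The point is that when a chunk already has the target number of dimensions, it can be returned as-is rather than indexed with a no-op `x[...]`. For a Dask array, every indexing call builds a new array object, which is where the avoidable cost would come from.

```python
import numpy as np


def atleast_nd(x, ndim):
    """Illustrative helper (name and body are assumptions):
    pad x with leading length-1 axes until it has ndim dimensions."""
    x = np.asanyarray(x)
    diff = max(ndim - x.ndim, 0)
    if diff == 0:
        # Dimensionality already matches: skip __getitem__ entirely.
        return x
    # Otherwise prepend `diff` new axes via indexing with None.
    return x[(None,) * diff + (Ellipsis,)]


full = np.ones((1, 1, 20, 30))  # already 4-D, like the chunks in test_perf()
low = np.ones((20, 30))         # 2-D, needs two leading axes

same = atleast_nd(full, 4)      # fast path: returned untouched
padded = atleast_nd(low, 4)     # slow path: indexed to add axes
```

In this sketch `same` is the very object that was passed in, while `padded` is a new view with shape `(1, 1, 20, 30)`.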

  • Tests added / passed
  • Passes black dask / flake8 dask

@mrocklin
Member

This looks great to me. Thanks @astrofrog ! Merging.

Also, I notice that this is your first code contribution to this repository. Welcome! I'm also personally excited to see you using Dask enough to find bugs. I hope that the project is proving useful to you all in astronomy-land.

@mrocklin merged commit 5f61f7f into dask:master on Feb 12, 2020