Optimize out first where in label (#102)
* Store index and slice iterator before `for`-loop

* Handle `input` block selection in iterator

* Optimize out first `where` in `label`

* Move input selection within `zip`

As `starmap` may be unfamiliar to some readers, move the selection
operation inside `zip` using `map`. This should also make it clearer what
the selection is used for and what is being selected from.

* Use Python 2/3 compatibility functions

Make use of the iterator-friendly Python 2/3 `imap` and `izip`
compatibility functions when generating `block_iter`. This should ensure
that functions like `next` behave correctly (particularly on Python 2,
where the built-in `map` and `zip` return lists rather than iterators).

* Replace `lambda` with `operator.getitem`

Use `operator.getitem` instead of the `lambda` to make it clearer to the
reader what this map operation does.
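
The iterator pattern described in the notes above can be sketched in isolation. This is a minimal illustration, not the project's code: it assumes Python 3 (where the built-in `zip` and `map` are already lazy), and `data`, `slices`, and `indices` are hypothetical stand-ins for the Dask array, `slices_from_chunks`, and `numpy.ndindex`.

```python
import functools
import itertools
import operator

# Hypothetical stand-ins: a flat list instead of a Dask array, hand-written
# slices instead of slices_from_chunks, and itertools.product instead of
# numpy.ndindex (both yield index tuples).
data = list(range(10))
slices = [slice(0, 4), slice(4, 8), slice(8, 10)]
indices = itertools.product(range(len(slices)))

# partial(operator.getitem, data)(s) is equivalent to data[s].
select = functools.partial(operator.getitem, data)

# Lazily pair each block index with the block selected by its slice.
block_iter = zip(indices, map(select, slices))

# Because block_iter is a true iterator, the first pair can be peeled off
# with next() and the remaining pairs consumed by an ordinary for-loop.
first_index, first_block = next(block_iter)
rest = list(block_iter)
```

On Python 2 the built-in `zip` returns a list, so `next` would raise a `TypeError`; that is presumably why the commit goes through compatibility wrappers instead.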
jakirkham committed Feb 13, 2019
1 parent 638cf24 commit 1a78285
Showing 1 changed file with 10 additions and 4 deletions.
14 changes: 10 additions & 4 deletions dask_image/ndmeasure/__init__.py
@@ -6,6 +6,7 @@

 import collections
 import functools
+import operator

 import numpy

@@ -210,14 +211,19 @@ def label(input, structure=None):
     input = dask.array.asarray(input)

     labeled_blocks = numpy.empty(input.numblocks, dtype=object)
-    total = _label.LABEL_DTYPE.type(0)

     # First, label each block independently, incrementing the labels in that
     # block by the total number of labels from previous blocks. This way, each
     # block's labels are globally unique.
-    for index, cslice in zip(numpy.ndindex(*input.numblocks),
-                             dask.array.core.slices_from_chunks(input.chunks)):
-        input_block = input[cslice]
+    block_iter = _pycompat.izip(
+        numpy.ndindex(*input.numblocks),
+        _pycompat.imap(functools.partial(operator.getitem, input),
+                       dask.array.core.slices_from_chunks(input.chunks))
+    )
+    index, input_block = next(block_iter)
+    labeled_blocks[index], total = _label.block_ndi_label_delayed(input_block,
+                                                                  structure)
+    for index, input_block in block_iter:
         labeled_block, n = _label.block_ndi_label_delayed(input_block,
                                                           structure)
         block_label_offset = dask.array.where(labeled_block > 0,
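
The diff is truncated here, so the rest of the loop body is not shown. As a rough sketch of why the first `where` can be skipped, assuming the offset applied to each block is the running label total described in the code comment: the first block's offset would be zero, so masking and shifting it is a no-op. The helpers `offset_labels` and `relabel` below are hypothetical and use plain lists rather than Dask arrays.

```python
def offset_labels(block, offset):
    """Shift nonzero labels by `offset`, leaving background (0) untouched."""
    return [label + offset if label > 0 else 0 for label in block]


def relabel(blocks_with_counts):
    """Make per-block labels globally unique.

    `blocks_with_counts` yields (labeled_block, n_labels) pairs, a
    hypothetical stand-in for the per-block labeling results.
    """
    block_iter = iter(blocks_with_counts)

    # The first block needs no offset: its labels already start at 1, so it
    # is stored unchanged and only seeds the running total.
    first_block, total = next(block_iter)
    result = [first_block]

    # Every later block is shifted by the number of labels seen so far.
    for block, n in block_iter:
        result.append(offset_labels(block, total))
        total += n
    return result, total


blocks = [([0, 1, 2], 2), ([1, 0, 1], 1), ([0, 2, 1], 2)]
relabeled, total = relabel(blocks)
```

Applying `offset_labels` with an offset of zero returns the block unchanged, which is the operation the commit avoids scheduling for the first block.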
