Resampling of nD arrays #511

Merged Jul 5, 2013 (14 commits)

Conversation

@ankit-maverick (Contributor)

Current API:

In [1]: import numpy as np
In [2]: from skimage.transform._warps import downscale_local_means
In [3]: a = np.arange(16).reshape(4, 4)
In [4]: a
Out[4]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
In [5]: downscale_local_means(a, (2, 2))
Out[5]: 
array([[  2.5,   4.5],
       [ 10.5,  12.5]])
In [6]: downscale_local_means(a, (2, 3))
Out[6]: 
array([[  3.        ,   1.66666667],
       [ 11.        ,   4.33333333]])
In [7]: from skimage.measure._sum_blocks import sum_blocks

In [8]: sum_blocks(a, (2, 2))
Out[8]: 
array([[10, 18],
       [42, 50]])


import numpy as np

def downsample(image, factors, how = 'sum'):
Contributor commented:

PEP8: "Don't use spaces around the = sign when used to indicate a keyword argument or a default parameter value."

Also, I'm unsure if we have an official standard API for this, but I believe a better kwarg would be mode or method.
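
For reference, a signature following both suggestions (and matching the form this PR later adopts):

def downsample(image, factors, method='sum'):
    ...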

@JDWarner (Contributor)

Good beginnings! I left some comments in the code.

That Travis failure has nothing to do with this PR.

Thought to consider: could this easily be generalized to 3-D or n-D? I believe so...

@ankit-maverick (Contributor, author)

@JDWarner : Sure, that should be possible. Though I would have to think a bit about how to generalize it to n dimensions.

@cdeil (Contributor) commented Apr 21, 2013

@ankit-maverick Thanks for implementing this!

Here are my comments:

  • +1 to @JDWarner's suggestions to make it n-D. The first argument should then be renamed array instead of image.
  • At the moment the looping is done in Python ... can you make this fast by using clever numpy trickery or Cython?
  • Intuitively it's not clear to me what "divide" and "uniform" mean for up-sampling. I don't have a better suggestion though ... maybe someone else has an idea?
  • The two cases "sum" and "mean" differ only by a constant factor and shouldn't be coded twice. @JDWarner suggested applying the factor after the loops for the "mean" case. Another option could be to always multiply by a factor, but have that factor be one for the "sum" case (see the sketch after this list). I think doing it this way might be faster because there is only one pass over the data instead of two. But I'm not sure ... probably better to first think about how to do the n-D case and avoid looping in Python as much as possible.
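
A minimal sketch of that single-factor idea (hypothetical helper name; the per-block sums could come from any strategy, including the vectorized one discussed below):

import numpy as np

def _scale_block_sums(block_sums, factors, method='sum'):
    # One multiplicative factor covers both cases in a single pass:
    # 1.0 for 'sum', 1 / (elements per block) for 'mean'.
    factor = 1.0 if method == 'sum' else 1.0 / np.prod(factors)
    return block_sums * factor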

@ankit-maverick (Contributor, author)

@cdeil : Thanks a lot for the feedback. This seems like a good amount of refactoring to do. I am currently busy with my end-term exams, but will push as soon as possible.

@JDWarner (Contributor)

@cdeil 👍

  • Agreed. I think 3D would be a good base goal, with hopes for n-D if possible.
  • Now that I think about it, skimage.util.view_as_blocks could provide a basis for fast vectorized numpy execution for the integer case. Might need to pad with zeros first. For the non-integer case, Cython may be needed to achieve similar speed.
  • Question: Is up-sampling meaningful in this context? I understand what's going on when downsampling (adding up local samples in a deterministic way), but upsampling (smearing out local values over a larger region) seems like very poor extrapolation. It's possible I'm just missing the point, being in a different field.
  • Agreed!

Two additional thoughts:

  • Is this truly a transform, or is it more of a measurement? Thinking about whether it belongs in skimage.transform vs skimage.measure.
  • dtype concerns: the sum method can overflow the original dtype (related to the previous point). If it's a measurement, we want to be able to quantify the combination, potentially even if it would have overflowed the original dtype. If it's a transform, I would generally expect the result to maintain dtype. In the second case, using "mean" exclusively would solve the problem - but I'm not sure which behavior is desired. Just bringing it up (see the snippet below).
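
A quick illustration of the overflow concern (hypothetical values, not from this PR):

import numpy as np

a = np.full((2, 2), 200, dtype=np.uint8)
print(a.sum(dtype=np.uint8))  # 32 -- the uint8 accumulator wraps around
print(a.sum())                # 800 -- numpy upcasts the accumulator by default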

@stefanv (Member) commented Apr 22, 2013

@JDWarner Perhaps we can start with the case where only shapes that are exact multiples of the factors are supported. I agree that upsampling this way is a bit odd.

@ankit-maverick (Contributor, author)

@cdeil @JDWarner : I still haven't found any alternative to Python looping in numpy. If you know of a better approach, please point me to that resource.

Also, because of the above, the only way I can currently think of implementing these functions for 3D is to first test the dimension of the input ndarray and add three nested for loops, which seems inefficient. Any ideas?

@ahojnnes (Member)

I have something in mind that could work for arbitrary dimensions... Not sure if it works, but I can show you when I get home.


@JDWarner (Contributor)

@ankit-maverick For vectorized looping over non-overlapping patches, check out skimage.util.view_as_blocks. This really neat function provides a view into the array using striding tricks. Here's a quick example in 2D (view_as_blocks supports arbitrary dimensionality, as long as arr.ndim matches the length of the patch shape):

>>> import numpy as np
>>> from skimage.util import view_as_blocks
>>> arr = np.arange(64).reshape((8, 8))
>>> view_as_blocks(arr, (2, 2)).shape
(4, 4, 2, 2)
>>> view_as_blocks(arr, (4, 4)).shape
(2, 2, 4, 4)

Each (point, point, :, :) in this example represents a patch. Then, to handle arbitrary dimensionality, just flatten the patches and sum along the final axis. There's a trick in np.reshape that helps here: if you pass a partial desired shape and end with -1, all remaining axes are flattened into that axis.
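
Continuing the example above, the flatten-and-sum step looks like this (a sketch; blocks is just a local name):

>>> blocks = view_as_blocks(arr, (2, 2))
>>> blocks.reshape(blocks.shape[:2] + (-1,)).sum(axis=-1)
array([[ 18,  26,  34,  42],
       [ 82,  90,  98, 106],
       [146, 154, 162, 170],
       [210, 218, 226, 234]])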

The only gotcha here is that view_as_blocks only supports patch shapes which divide cleanly into the original array shape. If an array with a prime-length dimension gets passed, this would be trouble. So, to be truly general, first check that the shapes are compatible, and if not, pad asymmetrically (with zeros, I would think).

I'm sure there are other ways to do this, too.

@ankit-maverick (Contributor, author)

@JDWarner @ahojnnes : Thanks. Anyway, I found something much cleaner and better. A good commit is coming!

@ankit-maverick (Contributor, author)

Credit to @seberg for suggesting this function to me on IRC.

@ankit-maverick (Contributor, author)

TO DECIDE:

  • Best location for these functions in the codebase: the transform module
  • Correct way of implementing upsampling


def downsample(image, factors, method='sum'):

# works only if image.shape is perfectly divisible by factors
Contributor commented:

I'd like to relax this limitation and allow sums over arbitrary integer patch sizes (obviously <= the original array size). The method would be to zero-pad each dimension that fails a residual check, by the minimum amount necessary for the patch to fit.

Contributor commented:

To support arbitrary patch sizes with zero padding but retain n-D generality, this utility function may be of use:

import numpy as np

def _pad_asymmetric_zeros(arr, pad_amt, axis=-1):
    """Pads `arr` by `pad_amt` zeros along specified `axis`."""
    if axis == -1:
        axis = arr.ndim - 1

    # Shape of the zero block: same as `arr`, except `pad_amt` along `axis`.
    zeroshape = tuple([x if i != axis else pad_amt
                       for (i, x) in enumerate(arr.shape)])

    return np.concatenate((arr, np.zeros(zeroshape, dtype=arr.dtype)),
                          axis=axis)

Check each axis, and if the patch size does not divide evenly, apply this with the appropriate pad_amt.
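
A hypothetical usage sketch, assuming the helper above: pad each axis whose length is not a multiple of its factor by the minimum amount needed.

import numpy as np

arr = np.arange(15).reshape(3, 5)
factors = (2, 2)
for axis, (dim, f) in enumerate(zip(arr.shape, factors)):
    remainder = dim % f
    if remainder:
        arr = _pad_asymmetric_zeros(arr, f - remainder, axis=axis)
print(arr.shape)  # (4, 6) -- every axis now divides evenly by its factor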

Contributor (author) commented:

Yes. Zero padding will be better than cropping because of the loss of boundary information in the latter, but then the boundary values may seem inconsistent with the neighbouring blocks. I am wondering if something like reflective padding would be better here? @JDWarner : Your thoughts?

Contributor commented:

Entirely depends on the goal. From what I understand (per @cdeil), the sum method is aimed at being quantitative. In that case you definitely want zero padding, or you change the total sum of the image! That would be really bad for scientific applications. If this ends up in skimage.measure, or gets a wrapper for that method, I would definitely say zeros is the way to go.

On the other hand, if you're going for looks rather than quantitative science, "downsampling" an image with local means, then reflective padding would be appropriate; otherwise you'd get dark edges.
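
A quick numeric illustration of that difference (a sketch using np.pad; values are arbitrary):

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

zero = np.pad(a, ((0, 0), (0, 1)), mode='constant')  # one column of zeros
refl = np.pad(a, ((0, 0), (0, 1)), mode='reflect')   # one reflected column

print(a.sum(), zero.sum(), refl.sum())  # 21 21 28 -- reflection inflates the sum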

@JDWarner (Contributor) commented May 1, 2013

Thanks for the update, achieving n-D!

In my opinion, the method='sum' variant belongs more in skimage.measure with a name like sum_local_patches or local_patch_sums (which would pair more naturally with a parallel local_patch_means). The goal here seems to be quantitative. I'd expect any true resampling method to preserve the input dtype, so I'm hesitant to let this hit a release in skimage.transform and surprise users.

The 'mean' variant makes more sense in a resampling context (and could/should preserve dtype if used this way). As far as naming goes, my objections to using derivatives of the term "sampling" in this context are on record, but perhaps I'm being overly pedantic. resample_local_means, perhaps?

@stefanv @cdeil thoughts?

@stefanv (Member) commented May 1, 2013

Let's keep up/downsampling in the transform module for now; I think that's where most people will look for it by default.

@ankit-maverick (Contributor, author)

@JDWarner @stefanv @cdeil @ahojnnes : On further thought, I feel the current implementation of upsample is redundant and not meaningful, since it does not increase the resolution of the image in the true sense. For example, upsampling by a factor of (2, 2) with the current implementation returns an image 4x the size, but since each 2x2 pixel block holds the same value, it adds no information and only wastes memory.

Instead, we could interpolate while upsampling, as in the image resize example at http://www.cambridgeincolour.com/tutorials/image-interpolation.htm.

Also, see some interpolation results in the scipy.interpolate docs: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

@all: your thoughts?

@ahojnnes (Member) commented May 2, 2013

@ankit-maverick I have implemented all of these interpolation methods. See skimage.transform.warp.

I think the best place to put your downsample function is skimage.transform._warps.py, because we already have rescale, resize, etc. there, and in my opinion they are all related to this problem.

@ankit-maverick (Contributor, author)

> @ankit-maverick I have implemented all of these interpolation methods. See skimage.transform.warp.

@ahojnnes : Thanks for informing me.

> I think the best place to put your downsample function is skimage.transform._warps.py, because there we already have rescale, resize etc. and they are all related to this problem in my opinion.

I will update this in the next commit with tests and documentation probably tomorrow once I am done with my work on the SoC proposal.


def downsample(array, factors, mode='sum'):
    """Performs downsampling with integer factors.

Member commented:

We need an in-depth description of the algorithm here. Especially highlight the difference from resize / rescale; maybe also cross-reference downsample in those functions.

@ahojnnes (Member) commented May 7, 2013

@ankit-maverick Your implementation is very elegant now, I really like it! I made some minor comments… let me know what you think.

@ahojnnes closed this Jul 4, 2013
@ahojnnes reopened this Jul 4, 2013
@ankit-maverick (Contributor, author)

@ahojnnes Can you check the Travis log and see why I am getting the ImportError? It works fine locally.

@sciunto (Member) commented Jul 4, 2013

@ankit-maverick circular import?

@ankit-maverick (Contributor, author)

@sciunto Thanks for pointing that out. But I still don't understand why the import chain doesn't stop at this line: https://travis-ci.org/scikit-image/scikit-image/jobs/8729027#L2554

@ankit-maverick (Contributor, author)

@stefanv @ahojnnes Anyone there to help with the above issue? I would like to close this up before I go to sleep.

@JDWarner (Contributor) commented Jul 4, 2013

Even though you specifically told it to import from ..transform._warps, the __init__.py file in skimage.transform still gets parsed. Then hough_transform gets imported, which itself attempts to import measure. Round and round we go...

Sometimes this can be fixed by moving an import inside a function. Only the function hough_line_peaks uses measure, so you could try this... though I'm not sure it will work in this case.
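
A sketch of that deferred-import pattern (signature abbreviated; only the import placement matters here):

def hough_line_peaks(hspace, angles, dists, **kwargs):
    # Importing at call time instead of at module level breaks the cycle:
    # skimage.measure is only touched when the function actually runs.
    from .. import measure
    ...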

@ankit-maverick (Contributor, author)

Thanks a lot @JDWarner, your suggestion worked!!

@cdeil @ahojnnes This can be merged now.

@ahojnnes (Member) commented Jul 5, 2013

I have some more minor concerns, but I'll address them shortly in another PR.

ahojnnes added a commit that referenced this pull request Jul 5, 2013
@ahojnnes merged commit 24c0c40 into scikit-image:master Jul 5, 2013


def resize(image, output_shape, order=1, mode='constant', cval=0.):
    """Resize image to match a certain size.

    Resize performs interpolation to upsample or downsample 2D arrays. For
    downsampling any n-dimensional array by performing arithmetic sum or
    arithmetic mean, see measure._sum_blocks.sum_blocks and
Member commented:

Why are we making public references to private functions here?

@ahojnnes (Member) commented Jul 5, 2013

@stefanv There is already a new PR to fix the issues.

@coveralls

Coverage Status

Changes Unknown when pulling 971e7be on ankit-maverick:resample into * on scikit-image:master*.

@forman commented Jul 26, 2016

Just stumbled over #492, where it seems I had very similar requirements to @cdeil, and developed https://github.com/CAB-LAB/gridtools, which is now in operational use for the production of a climate data cube. I use Numba for performance reasons. I'm happy to contribute one or another of its methods.

@jni (Member) commented Jul 26, 2016

@forman Sorry, what do you mean exactly? From a quick reading, it seems @cdeil's requirements were met by this PR?

@forman commented Jul 28, 2016

@jni oh, sorry, I wasn't clear at all. I just wanted to inform anyone interested about some additional downsampling methods I've implemented in gridtools, which may extend scikit-image's capabilities.

@cdeil (Contributor) commented Jul 28, 2016

@forman - Thanks for pointing out https://github.com/CAB-LAB/gridtools

I see you've released it under the GPL license.

In case you're not aware: this means that many numerical Python people will not read, use, or contribute to it. Numpy, Scipy, Numba, scikit-image ... most numerical Python packages are under a liberal license like BSD or MIT.
I don't want to argue about licensing, but just in case you aren't opposed: please re-license to improve interop and re-use with similar numerical Python packages.

@forman commented Jul 28, 2016

@cdeil you are right, I'll put it under a more liberal license for the next release. Thanks!

@cdeil (Contributor) commented Jul 28, 2016

@forman - Thank you!

In this case ... I'll install it and check it out now.

😄

@forman commented Jul 28, 2016

Done.
