Added datashader regridding operation #1773

philippjfr · 2017-08-01T11:14:34Z

As suggested and outlined in #1552, this PR adds a regrid operation that allows dynamically downsampling and upsampling HoloViews Image, RGB and HSV types to a specified x_range, y_range, width and height. It closely mirrors the aggregate operation but operates on Images instead. I introduced a common baseclass for the two operations and have started using it in various projects. It will require some further cleanup, decisions on naming, and documentation, but is now fully functional.

philippjfr · 2017-08-04T12:39:20Z

@jlstevens @jbednar I've now finished the code changes to this PR leaving decisions on naming and docstrings. I've included various changes to the aggregate implementation for compatibility with the most recent changes in datashader. All the changes should be backward compatible. I've also added unit tests for regridding.

The main naming decisions to make are about the aggregator and interpolation parameters which correspond to the downsample and upsample methods in gridtools respectively.

philippjfr · 2017-08-04T13:39:31Z

@jlstevens Any idea how I can make the raster regridding unit tests dependent on a particular version of datashader? They're failing now because it's still pulling datashader 0.5.

jlstevens · 2017-08-04T14:20:39Z

You could try doing a version check in the test class setUp method and raising SkipTest if it isn't a version it should be testing against. Not entirely sure that would work but seems worth trying...
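
The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the code that ended up in the PR: the `version_tuple` helper, the `MIN_DATASHADER` threshold and the `RegridVersionGuardTest` class are hypothetical names, and the minimum version of 0.6 is an assumption based on the later review comments.

```python
import unittest
from unittest import SkipTest

# Assumed minimum datashader version that provides regridding support.
MIN_DATASHADER = (0, 6)


def version_tuple(version):
    """Parse the leading numeric components of a version string.

    For example '0.6.0dev1' becomes (0, 6, 0).
    """
    parts = []
    for piece in version.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


class RegridVersionGuardTest(unittest.TestCase):

    def setUp(self):
        # Skip every test in this class unless a suitable datashader exists.
        try:
            import datashader
        except ImportError:
            raise SkipTest("datashader is not installed")
        if version_tuple(datashader.__version__) < MIN_DATASHADER:
            raise SkipTest("datashader is too old for the regrid tests")

    def test_placeholder(self):
        # Stand-in for the real regridding tests.
        self.assertTrue(True)
```

Raising `SkipTest` from `setUp` marks each test in the class as skipped rather than failed, which is the behavior the failing CI run needs.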

jbednar

Looks good; mainly commenting on docstrings.

jbednar · 2017-08-04T15:57:55Z

holoviews/operation/datashader.py


    dynamic = param.Boolean(default=True, doc="""
       Enables dynamic processing by default.""")

+    expand = param.Boolean(default=True, doc="""
+       Whether the x_range and y_range should be allowed to expand
+       beyond the extent of the data.""")


Maybe this docstring could be, ahem, expanded, with some of the pros and cons. My guess is:

Whether the x_range and y_range should be allowed to expand beyond the extent of the data. Setting this value to True is useful for the case where you want to ensure a certain size of output grid, e.g. if you are doing masking or other arithmetic on the grids. A value of False ensures that the grid is only just as large as it needs to be to contain the data, which will be faster and use less memory if the resulting aggregate is being overlaid on a much larger background.""")

If that description is accurate, maybe it should be False by default?

Yes, that is accurate. I agree that making it False might be useful, as that's usually what you want for visualization purposes; True is mostly needed when you want to match grid sizes for computation.

Ok, False then; HV should be optimized for viz, I think.

Actually there's a good reason not to enable it for aggregate. Computing the actual range is expensive when you have to iterate over all the datapoints and it doesn't actually gain you anything except making sure that pixels aren't wasted on regions where there is no data. I could set it to True for regrid only, but it might be better to be consistent.

I think it should be True for regrid only; gridded data has very clear bounds and it's probably more confusing if we don't respect those, and definitely slower.

"and definitely slower."

Not necessarily true, it just changes whether the pixels are crammed into the bounds of the image or whether they're wasted on the area beyond the original image bounds. If you're using first/last aggregators that's true though.
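
To make the two settings discussed in this thread concrete, here is a minimal sketch of what `expand` controls. The `resolve_range` function is a hypothetical stand-in written for illustration; it is not taken from the PR and only assumes that `expand=False` clips the requested range to the extent of the data.

```python
def resolve_range(requested, data_extent, expand=True):
    """Return the range that will actually be sampled.

    With expand=True the requested range is used unchanged, so the output
    grid may cover regions with no data. With expand=False the requested
    range is clipped to the extent of the data.
    """
    if expand:
        return requested
    (r0, r1), (d0, d1) = requested, data_extent
    return (max(r0, d0), min(r1, d1))
```

For gridded data the extent is known from the coordinates, so the clipping is cheap. For point data the extent has to be computed by scanning every point, which is the cost raised above for aggregate.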

jbednar · 2017-08-04T16:20:17Z

holoviews/operation/datashader.py

+         By default, the link_inputs parameter is set to True so that
+         when applying shade, backends that support linked streams
+         update RangeXY streams on the inputs of the shade operation.""")
+


Maybe expand to say why one might want to set it to False? E.g. if you reuse objects in different cells of a notebook and find them inappropriately linked?

Agree with Jim's suggestion here. Doesn't look like it has been updated yet...

I think we settled for ResamplingOperation.

Looks like it's been updated below; maybe it was supposed to be done here?

Yes, meant to add it everywhere.

jbednar · 2017-08-04T16:21:48Z

holoviews/operation/datashader.py

+    the x_range and y_range. If x_sampling or y_sampling are supplied
+    the operation will ensure that a bin is no smaller than the minimum
+    sampling distance by reducing the width and height when the zoomed
+    in beyond the minimum sampling distance.


when zoomed

jbednar · 2017-08-04T16:22:55Z

holoviews/operation/datashader.py

        else:
            layers = {}
            for c in agg.coords[column].data:
                cagg = agg.sel(**{column: c})
-                layers[c] = self.p.element_type((xs, ys, cagg.data), **params)
+                eldata = cagg if ds_version > '0.5.0' else (xs, ys, cagg.data)
+                layers[c] = self.p.element_type(eldata, **params)
            return NdOverlay(layers, kdims=[data.get_dimension(column)])


Hopefully once there is a new ds release we can remove this bit and simply require ds>=0.6.0.
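
A side note on the version check in this diff: the snippet does not show how `ds_version` is defined. If it were a plain string (an assumption, it may well be a version object), the comparison would be lexicographic and would misorder multi-digit components. The hypothetical `parse_version` helper below shows the difference.

```python
def parse_version(version):
    """Convert a purely numeric dotted version such as '0.10.0' to (0, 10, 0)."""
    return tuple(int(part) for part in version.split('.'))


# Plain string comparison is lexicographic: '1' sorts before '5',
# so '0.10.0' is treated as older than '0.5.0'.
string_says_newer = '0.10.0' > '0.5.0'

# Comparing parsed tuples orders the components numerically.
tuple_says_newer = parse_version('0.10.0') > parse_version('0.5.0')
```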

jbednar · 2017-08-04T16:26:03Z

holoviews/operation/datashader.py

+
+    upsample = param.Boolean(default=True, doc="""
+        Whether to allow upsampling if the source array is smaller than
+        the requested array.""")


Should this be False by default since interpolation is nearest by default, and thus the result will be approximately the same whether or not upsampling is done? With linear interpolation the results will be different, but one could say in the docstring for the interpolation parameter that one might want upsample=False for interpolation methods that smooth the data, unlike nearest.

Yes, I'd agree with that, upsampling is rarely needed I think, again only when you're trying to match a higher resolution target gridding (in which case you should probably downsample the higher resolution grid rather than upsampling the lower resolution one).

jbednar · 2017-08-04T16:32:37Z

"The main naming decisions to make are about the aggregator and interpolation parameters which correspond to the downsample and upsample methods in gridtools respectively."

I'm fine with those names.

jbednar · 2017-08-04T18:24:09Z

holoviews/operation/datashader.py

@@ -133,7 +139,7 @@ class aggregate(resample_operation):
    the x_range and y_range. If x_sampling or y_sampling are supplied
    the operation will ensure that a bin is no smaller than the minimum
    sampling distance by reducing the width and height when the zoomed
-    in beyond the minimum sampling distance.
+    beyond the minimum sampling distance.


when zoomed in beyond

jbednar · 2017-08-04T18:26:32Z

Ready to merge? Looks good to me.

jlstevens · 2017-08-04T18:27:50Z

I'll be able to review this PR shortly. Don't merge till then!

jlstevens · 2017-08-04T18:41:23Z

holoviews/operation/datashader.py

-
-    aggregator = param.ClassSelector(class_=ds.reductions.Reduction,
-                                     default=ds.count())
+class resample_operation(Operation):


Why is it called resample_operation and not just resample?

Is this an abstract base class or a usable one?

Abstract, doesn't have a _process

My issue here is that the name does not sound like an abstract class. I think it can have a simple name and should simply be hidden from users when importing.

What's your suggestion?

ResampleOperation?
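
The naming convention under discussion (a CamelCase abstract base, lowercase user-facing operations) can be sketched in plain Python as below. This is an illustration only: it does not use param or the HoloViews `Operation` class, and unlike the base class in the PR, which reportedly defines no `_process` at all, this hypothetical version defines one that raises.

```python
class ResamplingOperation(object):
    """Hypothetical abstract base holding settings shared by subclasses."""

    def __init__(self, width=400, height=400):
        self.width = width
        self.height = height

    def _process(self, element):
        # The base class is not usable on its own.
        raise NotImplementedError("Subclasses must implement _process")

    def __call__(self, element):
        return self._process(element)


class regrid(ResamplingOperation):
    """Concrete operation; the lowercase name marks it as user-facing."""

    def _process(self, element):
        # Stand-in for real regridding: report the input and target shape.
        return (element, self.width, self.height)
```

Keeping the base name out of the module's public imports, as suggested above, would then leave only the lowercase operations visible to users.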

jlstevens · 2017-08-04T18:44:16Z

holoviews/operation/datashader.py

+        if self.p.x_sampling:
+            width = int(min([(xspan/self.p.x_sampling), width]))
+        if self.p.y_sampling:
+            height = int(min([(yspan/self.p.y_sampling), height]))


I think you need to be careful here - I don't see why xspan and self.p.x_sampling can't both be integers on Python 2 causing potential integer division issues. Same when computing height.

Sure will add the __future__ import.
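
The clamp in this diff can be read as the following standalone sketch. The `limit_size` name is hypothetical, and the example is written for Python 3, where `/` is always true division; on Python 2 the same expression relies on `from __future__ import division`, as agreed above.

```python
def limit_size(span, sampling, size):
    """Reduce `size` so that no bin is narrower than `sampling`.

    `span` is the extent of the requested range along one axis and `size`
    is the requested number of bins (width or height). If `sampling` is
    unset (None or 0), `size` is returned unchanged.
    """
    if sampling:
        return int(min(span / sampling, size))
    return size
```

For example, a span of 10 with a minimum sampling distance of 0.5 can hold at most 20 bins, so a requested width of 400 is reduced to 20, while a span of 1000 leaves the width at 400.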

jlstevens · 2017-08-04T18:49:31Z

holoviews/operation/datashader.py

+        # Disable upsampling if requested
+        (xstart, xend), (ystart, yend) = (x_range, y_range)
+        xspan, yspan = (xend-xstart), (yend-ystart)
+        if not self.p.upsample and self.p.target is None:


Shouldn't there at least be a warning to say upsampling has been disabled?

Huh why? It's a parameter.

Seems like the parameter should be respected as set, with no warning.

In that case the docstring needs improvement:

"Whether to allow upsampling if the source array is smaller than the requested array."

What is the behavior if upsample=False? Padding?

Ok, Philipp's now explained it to me. The behavior makes sense though I think there could be a bit more clarification in the docstrings.
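
The explanation itself is not recorded in this thread. One plausible reading, offered here as an assumption rather than as the PR's implementation, is that `upsample=False` caps the output size at the number of source samples covered by the requested range, so the result is never finer than the source. The `cap_to_source_resolution` function below is a hypothetical sketch of that reading.

```python
def cap_to_source_resolution(width, span, source_span, source_samples):
    """Limit the output width to the source samples that fall within `span`.

    `span` is the requested extent, `source_span` the full extent of the
    source array along the same axis, and `source_samples` its length.
    """
    if source_span <= 0:
        return width
    available = int((span / source_span) * source_samples)
    # Never request more bins than source samples, and never fewer than one.
    return max(min(available, width), 1)
```

Under this reading no padding is involved: zooming into half of a 100-sample axis with a requested width of 400 would simply return 50 columns.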

jlstevens · 2017-08-04T18:52:55Z

Looks good other than a few minor comments I made. Happy to see this PR merged once those things are addressed.

jlstevens · 2017-08-04T19:14:37Z

holoviews/operation/datashader.py

+       just as large as it needs to be to contain the data, which will
+       be faster and use less memory if the resulting aggregate is
+       being overlaid on a much larger background.""")
+
    height = param.Integer(default=400, doc="""
       The height of the aggregated image in pixels.""")


Isn't this height parameter used by regrid? I'm not sure it is clear to talk about an 'aggregated' image in that case. Maybe just talk about the 'output' image?

True, although aggregation is still the correct term, at least when downsampling. Will fix it anyway.

jlstevens · 2017-08-04T19:58:05Z

Looks good and the unit tests have passed. Merging.

jakirkham · 2017-11-02T19:38:58Z

So GitHub makes it look like this is in v1.8.3 and v1.8.4, but went to import it and it wasn't. Then I realized it had been removed from those releases ( #1884 ). Maybe GitHub is confused. Anyways when is this planned to land in a release?

philippjfr · 2017-11-02T19:40:02Z

"Anyways when is this planned to land in a release?"

Today :-)

jakirkham · 2017-11-02T19:42:51Z

Well isn't it my lucky day. 😄

philippjfr added 2 commits August 1, 2017 12:11

Added datashader regridding operation

1f0cfec

Added expand parameter for datashader resampling operations

bb9cbb0

jlstevens added the status: WIP label Aug 1, 2017

philippjfr added 5 commits August 3, 2017 13:00

Allow disabling of upsampling on regridding operation

7530ad4

Add min/max aggregators for regrid

e4790ad

Handle recent change for datashader regridding

6c7e6da

Improved BoundingBox comparison mismatch message

42f1b5c

Compatability with recent datashader changes

b30768e

philippjfr added the type: feature label Aug 4, 2017

philippjfr added 2 commits August 4, 2017 13:30

Added unit tests for datashader regridding operation

31d40dd

Updated BoundingBox comparison unit test

9ae86a4

philippjfr removed the status: WIP label Aug 4, 2017

Raise error in regrid if datashader version is insufficient

0a9a3c1

jbednar reviewed Aug 4, 2017

Changed parameter defaults and updated docstrings on regrid

9529cc9

jbednar reviewed Aug 4, 2017

Fixed typo in docstring

b1fb551

jlstevens reviewed Aug 4, 2017

Small improvements for datashader ResamplingOperation

261de69

jlstevens reviewed Aug 4, 2017

Various fixes for datashader resampling docstrings

212d4c4

jlstevens merged commit 95691a4 into master Aug 4, 2017

ea42gh pushed a commit to ea42gh/holoviews that referenced this pull request Aug 12, 2017

Added datashader regridding operation (holoviz#1773)

dd575fb

jlstevens added this to the 1.8.3 milestone Aug 21, 2017

jlstevens deleted the datashader_regridding branch August 21, 2017 20:58

zbarry mentioned this pull request Sep 7, 2017

datashader -> regrid problem with RGB images #1857

Closed

pyup-bot mentioned this pull request Nov 3, 2017

Update holoviews to 1.9.0 zenotech/zCFDSuperBuild#104

Closed

pyup-bot mentioned this pull request Nov 13, 2017

Update holoviews to 1.9.1 zenotech/zCFDSuperBuild#120

Closed

pyup-bot mentioned this pull request Dec 12, 2017

Update holoviews to 1.9.2 zenotech/zCFDSuperBuild#139

Merged
