Implement Image interface #994

Merged
merged 70 commits into from Apr 8, 2017
Commits
1de6cb2
Added basics of an Image interface
Nov 23, 2016
8859ff4
Fixes for data interface select returning kwargs
Jan 21, 2017
661fa97
Added out utility to compute BoundingBox ranges from coords
Jan 21, 2017
8af6152
Various fixes and improvements for image interface
Jan 21, 2017
32e9e2b
Separated column and grid based interface tests
Jan 21, 2017
34a2062
Fixes for ImageInterface and associated unit tests
Jan 22, 2017
3c441bd
Removed custom Surface.range implementation
Jan 22, 2017
677b77a
Reimplemented Image sample using interface methods
Jan 22, 2017
417ac99
Improved bound_range utility function
Jan 22, 2017
0ac0bb4
Fixed Image operations and plots to use interface methods
Jan 22, 2017
6c334ba
Implemented Image interface groupby
Jan 22, 2017
0e0a5c3
Defined Image.groupby
Jan 22, 2017
0b123b3
Reorganized inheritance of Raster types
Jan 22, 2017
7cd3f1b
Handle Raster/Image axis inversion correctly
Jan 22, 2017
f2e999f
Bokeh RasterPlot fix
Jan 22, 2017
1c02357
Implemented sampling and closest for Raster types
Jan 22, 2017
dca9d6c
Various fixes for Raster types
Jan 22, 2017
a4ceefc
Fixes after merge with master
Mar 2, 2017
730fdcc
Small fixes for Dateset.closest method
Mar 4, 2017
aa140a4
Small fixes to Raster Elements
Mar 4, 2017
c41da34
Fixed Dataset aggregate element types
Mar 4, 2017
c2c7810
Fixed orientation of Raster type
Mar 6, 2017
113f0f1
Added support for slicing vdim on RGB and HSV types
Mar 6, 2017
fd3960d
Stop returning kwargs from Interface.select
Mar 26, 2017
e0f9811
Fixed RGB constructor
Mar 26, 2017
484263a
Reverted various changes to Raster and interfaces
Mar 27, 2017
7a7dd5f
Made Dataset.sample support Image.sample signature
Mar 27, 2017
822215d
Small sampling fixes
Mar 27, 2017
72e9282
Improved casting between incompatible datatypes
Mar 27, 2017
49dd02b
Fixed gridded sampling
Mar 27, 2017
b26d61d
Fixed floating point bug in ImageInterface.values
Mar 28, 2017
86b20eb
Fixed bug in Raster display
Mar 28, 2017
940e1ed
Fixes for grid interface indexing
Mar 28, 2017
0ccf07c
Fixes for Image indexing
Mar 28, 2017
4337ab0
Added initial ImageInterface tests
Mar 28, 2017
89d2441
Further fixes to grid interface indexing
Mar 29, 2017
69f213c
Fixes for Image/Grid interfaces
Mar 29, 2017
bb78a49
Updated Image Element comparisons
Mar 29, 2017
4dd9652
Added further Image interface tests
Mar 29, 2017
752f711
Updated Image based operations
Mar 29, 2017
33751b1
Added unpickling support for Image
Mar 29, 2017
a7c9342
Small fix for Image comparisons
Mar 29, 2017
7587e50
Override Image.range to return bounds
Mar 29, 2017
b59d612
Made ImageInterface shape and length consistent with Dataset
Mar 29, 2017
23b6898
Made Image subclass of Raster again
Mar 29, 2017
1536d1a
Split Image and RGB Interface tests
Mar 29, 2017
07d6630
Removed stray print
Mar 29, 2017
09872b6
Small fixes for sampling
Mar 29, 2017
1b3a9d7
Removed array output from Continuous_Coordinates
Mar 29, 2017
49d1fab
Added comparison support for OrderedDict
Mar 30, 2017
bf017bd
Implemented list comparison
Mar 30, 2017
52bd235
Fixed NdMapping unit test
Mar 30, 2017
7748cb6
Handle tuple comparisons explicitly
Mar 30, 2017
7effaea
Fixed bug in indexed condition for multiple vdims
Mar 30, 2017
7cfb650
Added RGB interface tests
Mar 30, 2017
98cf70a
Simplified ImageInterface.select
Mar 30, 2017
1a9b840
Cleanup on grid interfaces
Apr 8, 2017
1641960
Added gridded option to GridInterface.shape
Apr 8, 2017
70543a2
Cleaned up Image Element
Apr 8, 2017
c824933
Cleaned up plotting code
Apr 8, 2017
a7a6b23
Removed strict enforcement of regular sampling on Image
Apr 8, 2017
b4fa77c
Readded closest argument to sample method
Apr 8, 2017
3be08ed
Updated dimension comparison unit tests
Apr 8, 2017
7a4becd
Round image densities to machine precision
Apr 8, 2017
070c290
Fixed tuple/list comparison error messages
Apr 8, 2017
8e31fff
Implemented gridded Dataset reindexing
Apr 8, 2017
29a1e17
Cleanup, comments and docstrings for interfaces
Apr 8, 2017
fa57964
Simplified bound_range utility
Apr 8, 2017
105be67
Improved validation in data conversions
Apr 8, 2017
ba70f8a
Cleaned up bound_range utility docstring
Apr 8, 2017
+1,437 −676
@@ -378,7 +378,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Similarly, if we ask for the value of a given *y* location in continuous space, we will get a ``Curve`` with the array row closest to that *y* value in the ``Image`` 2D array returned as an array of $x$ values and the corresponding *z* value from the image:"
"Similarly, if we ask for the value of a given *y* location in continuous space, we will get a ``Curve`` with the array row closest to that *y* value in the ``Image`` 2D array returned as arrays of $x$ values and the corresponding *z* value from the image:"
]
},
{
@@ -394,19 +394,9 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"r10.data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same sampling syntax can be used on HoloViews objects with any number of continuous-coordinate dimensions, in each case returning a HoloViews object of the correct dimensionality. This support for working in continuous spaces makes it much more natural to work with HoloViews objects than directly with the underlying raw Numpy arrays, but the raw data always remains available when needed."
]
Copy path View file
@@ -14,6 +14,7 @@
from .array import ArrayInterface
from .dictionary import DictInterface
from .grid import GridInterface
from .image import ImageInterface
from .ndelement import NdElementInterface

datatypes = ['array', 'dictionary', 'grid', 'ndelement']
@@ -54,6 +55,7 @@

from ..dimension import Dimension
from ..element import Element
from ..ndmapping import OrderedDict
from ..spaces import HoloMap, DynamicMap
from .. import util

@@ -97,13 +99,27 @@ def __call__(self, new_type, kdims=None, vdims=None, groupby=None,
if vdims is None:
vdims = self._element.vdims
if vdims and not isinstance(vdims, list): vdims = [vdims]

# Checks Element type supports dimensionality
type_name = new_type.__name__
for dim_type, dims in (('kdims', kdims), ('vdims', vdims)):
min_d, max_d = new_type.params(dim_type).bounds
if ((min_d is not None and len(dims) < min_d) or
(max_d is not None and len(dims) > max_d)):
raise ValueError("%s %s must be between length %s and %s." %
(type_name, dim_type, min_d, max_d))

if groupby is None:
groupby = [d for d in self._element.kdims if d not in kdims+vdims]
elif groupby and not isinstance(groupby, list):
groupby = [groupby]

if self._element.interface.gridded:
selected = self._element
dropped_kdims = [kd for kd in self._element.kdims if kd not in groupby+kdims]
if dropped_kdims:
selected = self._element.reindex(groupby+kdims, vdims)
else:
selected = self._element
else:
selected = self._element.reindex(groupby+kdims, vdims)
params = {'kdims': [selected.get_dimension(kd, strict=True) for kd in kdims],
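The dimensionality check added to the conversion `__call__` above can be read in isolation. The following is a minimal sketch, assuming the bounds arrive as a `(min, max)` tuple in which either end may be `None`; `check_bounds` and the `(2, 2)` bounds used below are illustrative stand-ins, not part of the HoloViews API.

```python
def check_bounds(type_name, dim_type, dims, bounds):
    """Raise ValueError if len(dims) falls outside (min_d, max_d).

    Either bound may be None, meaning that end is unconstrained.
    """
    min_d, max_d = bounds
    if ((min_d is not None and len(dims) < min_d) or
            (max_d is not None and len(dims) > max_d)):
        raise ValueError("%s %s must be between length %s and %s."
                         % (type_name, dim_type, min_d, max_d))


# Hypothetical bounds: an element type that needs exactly two key dimensions.
check_bounds("Image", "kdims", ["x", "y"], (2, 2))   # passes silently

try:
    check_bounds("Image", "kdims", ["x"], (2, 2))
except ValueError as err:
    print(err)
```

Failing early here means an unsupported conversion reports a readable error instead of failing later inside the element constructor.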
@@ -112,7 +128,7 @@ def __call__(self, new_type, kdims=None, vdims=None, groupby=None,
if selected.group != selected.params()['group'].default:
params['group'] = selected.group
params.update(kwargs)
if len(kdims) == selected.ndims:
if len(kdims) == selected.ndims or not groupby:
element = new_type(selected, **params)
return element.sort() if sort else element
group = selected.groupby(groupby, container_type=HoloMap,
@@ -149,6 +165,9 @@ class Dataset(Element):
# Define a class used to transform Datasets into other Element types
_conversion_interface = DataConversion

_vdim_reductions = {}
_kdim_reductions = {}

def __init__(self, data, **kwargs):
if isinstance(data, Element):
pvals = util.get_param_values(data)
@@ -166,7 +185,7 @@ def __init__(self, data, **kwargs):
initialized = Interface.initialize(type(self), data, kdims, vdims,
datatype=kwargs.get('datatype'))
(data, self.interface, dims, extra_kws) = initialized
super(Dataset, self).__init__(data, **dict(extra_kws, **dict(kwargs, **dims)))
super(Dataset, self).__init__(data, **dict(kwargs, **dict(dims, **extra_kws)))

jlstevens Apr 5, 2017

Contributor

I assume this is an unrelated bug-fix regarding argument precedence.

philippjfr Apr 5, 2017

Author Contributor

I'll have to double check this, might be a leftover from something I tried early on.

philippjfr Apr 7, 2017

Author Contributor

Just checked this is crucial for Image to work correctly, 32 test failures without it.
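The precedence change discussed in this thread comes down to how nested `dict(a, **b)` calls resolve duplicate keys: the right-most mapping wins. A minimal sketch with made-up values (the keys and contents below are assumptions for illustration, not what `Interface.initialize` actually returns):

```python
# Hypothetical stand-ins for the three sources merged in Dataset.__init__.
kwargs = {"kdims": ["x"], "label": "user"}
dims = {"kdims": ["x", "y"], "vdims": ["z"]}
extra_kws = {"bounds": (0, 0, 1, 1), "label": "interface"}

# Old form: extra_kws is the base, kwargs overrides it, dims overrides both.
old = dict(extra_kws, **dict(kwargs, **dims))

# New form: kwargs is the base, dims overrides it, extra_kws overrides both.
new = dict(kwargs, **dict(dims, **extra_kws))

print(old["label"])  # user
print(new["label"])  # interface
print(old["kdims"] == new["kdims"])  # True
```

In both forms `dims` beats `kwargs`; what changes is that values supplied by the interface in `extra_kws` now take priority over user keyword arguments rather than the other way round.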

self.interface.validate(self)


@@ -188,19 +207,33 @@ def __setstate__(self, state):

super(Dataset, self).__setstate__(state)

def closest(self, coords):

def closest(self, coords=[], **kwargs):
"""
Given single or multiple samples along the first key dimension
will return the closest actual sample coordinates.
Given a single coordinate or multiple coordinates as
a tuple or list of tuples or keyword arguments matching
the dimension closest will find the closest actual x/y
coordinates. Different Element types should implement this
appropriately depending on the space they represent, if the
Element does not support snapping raise NotImplementedError.
"""
if self.ndims > 1:
NotImplementedError("Closest method currently only "
"implemented for 1D Elements")
raise NotImplementedError("Closest method currently only "


jlstevens Apr 5, 2017

Contributor

Oops! :-)

"implemented for 1D Elements")

if kwargs:
dim = self.get_dimension(list(kwargs.keys())[0], strict=True)
if len(kwargs) > 1:
raise NotImplementedError("Closest method currently only "
"supports 1D indexes")
samples = list(kwargs.values())[0]
coords = samples if isinstance(samples, list) else [samples]


jlstevens Apr 5, 2017

Contributor

In this case is coords a tuple? How often is it used without being a list?

Might be worth always requiring a list of samples for consistency - it all depends on how often a single sample is needed.

philippjfr Apr 5, 2017

Author Contributor

> Might be worth always requiring a list of samples for consistency - it all depends on how often a single sample is needed.

I'll make a comment, this is for img.sample(x=0), which wasn't supported on Datasets before.

philippjfr Apr 5, 2017

Author Contributor

Wait no, I was confused, I think it is possible to get rid of this. We'll see.

philippjfr Apr 7, 2017

Author Contributor

Seemingly used a few times at least on Image, so I think making the API consistent is reasonable.

jlstevens Apr 7, 2017

Contributor

> ... so I think making the API consistent is reasonable.

To be clear, that would mean requiring the list format?

philippjfr Apr 7, 2017

Author Contributor

No, it would mean to continue supporting both, the idea being that it matches .sample, .reduce, .select signatures.


if not isinstance(coords, list): coords = [coords]
xs = self.dimension_values(0)
if xs.dtype.kind in 'SO':
raise NotImplementedError("Closest only supported for numeric types")
idxs = [np.argmin(np.abs(xs-coord)) for coord in coords]
return [xs[idx] for idx in idxs] if len(coords) > 1 else xs[idxs[0]]
return [xs[idx] for idx in idxs]

jlstevens Apr 5, 2017

Contributor

Looks like the return format has changed given only a single sample. The new version does seem more consistent.
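The return-format change noted here can be illustrated with a small stand-alone sketch. It uses plain Python `min` in place of the `np.argmin(np.abs(xs-coord))` call in the diff, and `closest` below is a free function written for illustration, not the `Dataset.closest` method itself:

```python
def closest(xs, coords):
    """Snap each requested coordinate to the nearest value in xs.

    A bare coordinate is wrapped in a list, and the result is always
    a list, mirroring the new behaviour described in the diff.
    """
    if not isinstance(coords, list):
        coords = [coords]
    return [min(xs, key=lambda x: abs(x - c)) for c in coords]


xs = [0.0, 0.5, 1.0, 1.5]
print(closest(xs, 0.6))         # [0.5]
print(closest(xs, [0.1, 1.3]))  # [0.0, 1.5]
```

Under the old code a single coordinate would have come back as a bare value (`0.5` here), so callers had to branch on the input shape; always returning a list removes that special case.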



def sort(self, by=[]):
@@ -285,6 +318,7 @@ def select(self, selection_specs=None, **selection):
return self

data = self.interface.select(self, **selection)

if np.isscalar(data):
return data
else:
@@ -301,13 +335,16 @@ def reindex(self, kdims=None, vdims=None):
else:
key_dims = [self.get_dimension(k, strict=True) for k in kdims]

new_type = None
if vdims is None:
val_dims = [d for d in self.vdims if not kdims or d not in kdims]
else:
val_dims = [self.get_dimension(v, strict=True) for v in vdims]
new_type = self._vdim_reductions.get(len(val_dims), type(self))

data = self.interface.reindex(self, key_dims, val_dims)
return self.clone(data, kdims=key_dims, vdims=val_dims)
return self.clone(data, kdims=key_dims, vdims=val_dims,
new_type=new_type)

jlstevens Apr 5, 2017

Contributor
return self.clone(data, kdims=key_dims, vdims=val_dims,
           new_type=None if vdims is None else self._vdim_reductions.get(len(val_dims), type(self)))

Or is that too long?

jlstevens Apr 5, 2017

Contributor

Or better:

return self.clone(data, kdims=key_dims, vdims=val_dims,
           new_type=self._vdim_reductions.get(len(val_dims), type(self) if vdims else None))


def __getitem__(self, slices):
@@ -330,7 +367,7 @@ def __getitem__(self, slices):
if isinstance(slices, np.ndarray) and slices.dtype.kind == 'b':
if not len(slices) == len(self):
raise IndexError("Boolean index must match length of sliced object")
return self.clone(self.interface.select(self, selection_mask=slices))
return self.clone(self.select(selection_mask=slices))

jlstevens Apr 5, 2017

Contributor

I think the old version returned data whereas self.select should return a Dataset. The former seems more intuitive to me but if they both work it doesn't matter much. Is the new version better somehow?

philippjfr Apr 5, 2017

Author Contributor

I'll double check.

elif slices in [(), Ellipsis]:
return self
if not isinstance(slices, tuple): slices = (slices,)
@@ -342,7 +379,7 @@ def __getitem__(self, slices):
value_select = slices[self.ndims]
elif len(slices) == self.ndims+1 and isinstance(slices[self.ndims],
(Dimension,str)):
raise Exception("%r is not an available value dimension'" % slices[self.ndims])
raise Exception("%r is not an available value dimension" % slices[self.ndims])
else:
selection = dict(zip(self.dimensions(label=True), slices))
data = self.select(**selection)
@@ -354,13 +391,58 @@ def __getitem__(self, slices):
return data


def sample(self, samples=[]):
def sample(self, samples=[], closest=True, **kwargs):
"""
Allows sampling of Dataset as an iterator of coordinates
matching the key dimensions, returning a new object containing
just the selected samples.
"""
return self.clone(self.interface.sample(self, samples))
just the selected samples. Alternatively may supply kwargs
to sample a coordinate on an object. By default it will attempt
to snap to the nearest coordinate if the Element supports it,
snapping may be disabled with the closest argument.
"""
if kwargs and samples:
raise Exception('Supply explicit list of samples or kwargs, not both.')
elif kwargs:
sample = [slice(None) for _ in range(self.ndims)]
for dim, val in kwargs.items():
sample[self.get_dimension_index(dim)] = val
samples = [tuple(sample)]

# Note: Special handling sampling of gridded 2D data as Curve
# may be replaced with more general handling
# see https://github.com/ioam/holoviews/issues/1173
from ...element import Table, Curve
if len(samples) == 1:
sel = {kd.name: s for kd, s in zip(self.kdims, samples[0])}
dims = [kd for kd, v in sel.items() if not np.isscalar(v)]
selection = self.select(**sel)

# If a 1D cross-section of 2D space return Curve
if self.interface.gridded and self.ndims == 2 and len(dims) == 1:
new_type = Curve
kdims = [self.get_dimension(kd) for kd in dims]
else:
new_type = Table
kdims = self.kdims

if np.isscalar(selection):
selection = [samples[0]+(selection,)]
else:
selection = tuple(selection.columns(kdims+self.vdims).values())

return self.clone(selection, kdims=kdims, new_type=new_type)

lens = set(len(util.wrap_tuple(s)) for s in samples)
if len(lens) > 1:
raise IndexError('Sample coordinates must all be of the same length.')

if closest:
try:
samples = self.closest(samples)
except NotImplementedError:
pass
samples = [util.wrap_tuple(s) for s in samples]
return self.clone(self.interface.sample(self, samples), new_type=Table)


def reduce(self, dimensions=[], function=None, spreadfn=None, **reduce_map):
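The keyword form of `sample` in the hunk above turns something like `sample(y=0.25)` into a single coordinate tuple with `slice(None)` in every unspecified position. A minimal sketch of that conversion follows; `kwargs_to_sample` is a hypothetical helper, a plain list of names stands in for the element's key dimensions, and an `isinstance(..., slice)` test stands in for the `np.isscalar` check used in the diff:

```python
def kwargs_to_sample(kdim_names, **kwargs):
    """Build a one-element sample list from keyword coordinates.

    Unspecified key dimensions are left as slice(None), i.e. "take all".
    """
    sample = [slice(None)] * len(kdim_names)
    for dim, val in kwargs.items():
        sample[kdim_names.index(dim)] = val
    return [tuple(sample)]


kdims = ["x", "y"]
samples = kwargs_to_sample(kdims, y=0.25)
print(samples)  # [(slice(None, None, None), 0.25)]

# Dimensions still covered by a slice are the ones that vary in the result.
varying = [name for name, s in zip(kdims, samples[0]) if isinstance(s, slice)]
print(varying)  # ['x']
```

With two key dimensions and exactly one of them left varying, the diff returns a `Curve` over that dimension for gridded data; other cases fall back to a `Table`.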
@@ -390,24 +472,34 @@ def aggregate(self, dimensions=None, function=None, spreadfn=None, **kwargs):
aggregated = self.interface.aggregate(self, kdims, function, **kwargs)
aggregated = self.interface.unpack_scalar(self, aggregated)

ndims = len(dimensions)
min_d, max_d = self.params('kdims').bounds
generic_type = (min_d is not None and ndims < min_d) or (max_d is not None and ndims > max_d)

vdims = self.vdims
if spreadfn:
error = self.interface.aggregate(self, dimensions, spreadfn)
spread_name = spreadfn.__name__
ndims = len(vdims)
error = self.clone(error, kdims=kdims)
combined = self.clone(aggregated, kdims=kdims)
error = self.clone(error, kdims=kdims, new_type=Dataset)
combined = self.clone(aggregated, kdims=kdims, new_type=Dataset)
for i, d in enumerate(vdims):
dim = d('_'.join([d.name, spread_name]))
dvals = error.dimension_values(d, False, False)
combined = combined.add_dimension(dim, ndims+i, dvals, True)
return combined
return combined.clone(new_type=Dataset if generic_type else type(self))

if np.isscalar(aggregated):
return aggregated
else:
return self.clone(aggregated, kdims=kdims, vdims=vdims)

try:
return self.clone(aggregated, kdims=kdims, vdims=vdims,
new_type=new_type)
except:
datatype = self.params('datatype').default
return self.clone(aggregated, kdims=kdims, vdims=vdims,
new_type=Dataset if generic_type else None,
datatype=datatype)


def groupby(self, dimensions=[], container_type=HoloMap, group_type=None,
@@ -532,7 +624,7 @@ def columns(self, dimensions=None):
dimensions = self.dimensions()
else:
dimensions = [self.get_dimension(d, strict=True) for d in dimensions]
return {d.name: self.dimension_values(d) for d in dimensions}
return OrderedDict([(d.name, self.dimension_values(d)) for d in dimensions])


@property
Copy path View file
@@ -183,7 +183,7 @@ def select(cls, dataset, selection_mask=None, **selection):
selection_mask = cls.select_mask(dataset, selection)
indexed = cls.indexed(dataset, selection)
data = np.atleast_2d(dataset.data[selection_mask, :])
if len(data) == 1 and indexed:
if len(data) == 1 and indexed and len(dataset.vdims) == 1:
data = data[0, dataset.ndims]
return data

Copy path View file
@@ -124,7 +124,7 @@ def select(cls, columns, selection_mask=None, **selection):
selection_mask = cls.select_mask(columns, selection)
indexed = cls.indexed(columns, selection)
df = df if selection_mask is None else df[selection_mask]
if indexed and len(df) == 1:
if indexed and len(df) == 1 and len(columns.vdims) == 1:
return df[columns.vdims[0].name].compute().iloc[0]
return df

@@ -219,7 +219,7 @@ def select(cls, dataset, selection_mask=None, **selection):
indexed = cls.indexed(dataset, selection)
data = OrderedDict((k, list(compress(v, selection_mask)))
for k, v in dataset.data.items())
if indexed and len(list(data.values())[0]) == 1:
if indexed and len(list(data.values())[0]) == 1 and len(dataset.vdims) == 1:
return data[dataset.vdims[0].name][0]
return data
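The same guard, `len(vdims) == 1`, is added to `select` in each of the three interfaces above. A minimal sketch of the pattern, using a list of dicts as a stand-in for the interface-specific data (`unpack_selection` and the sample rows are illustrative, not HoloViews code):

```python
def unpack_selection(rows, vdim_names, indexed):
    """Return a bare scalar only for a fully indexed, single-vdim result."""
    if indexed and len(rows) == 1 and len(vdim_names) == 1:
        return rows[0][vdim_names[0]]
    return rows


gray = [{"x": 0, "y": 0, "z": 0.5}]
rgb = [{"x": 0, "y": 0, "R": 0.1, "G": 0.2, "B": 0.3}]

print(unpack_selection(gray, ["z"], indexed=True))           # 0.5
print(unpack_selection(rgb, ["R", "G", "B"], indexed=True))  # the row list, unchanged
```

Without the extra condition, indexing a single location in a multi-vdim element would return only the first value dimension and silently drop the rest.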
