Implement lazy evaluation for DynamicMap methods #422

philippjfr · 2016-01-27T19:01:57Z

After a lengthy discussion on how to work with very large datasets with @jlstevens, we realized that DynamicMap callbacks and the lazy evaluation they allow gets us most of the way there but there's some crucial things missing for usability. In particular the methods on DynamicMap are all inherited from HoloMap, which means they work on the assumption that all the data is already loaded. This means that currently when using a method on a DynamicMap it will apply the operation only to the elements that's already loaded, which will in the best case do nothing but will usually result in exceptions when the next Element is loaded.

What I propose is to subclass the methods on DynamicMap to generate closures, which will apply the requested operation when a new key is requested. This will allow slicing, sampling, reducing, aggregating and overlaying operations to be applied at runtime. Even operations could trivially be modified to simply define a closure when passed a DynamicMap.

This will be a huge improvement for usability and make exploration and analysis of large datasets incredibly easy. The user could specify the data transformations to be applied directly on the object rather than having to write complex callbacks.

Here's a small toy example:

Say we have an array of temperatures on disk which is indexed by Date x Atmospheric Height x Latitude x Longitude.

Currently required implementation:

def get_contours(key):
    array, metadata = load_data_from disk(key)
    img = hv.Image(array, **metadata).select(Latitude=(0, 30), Longitude=(50, 80))
    return hv.contours(img)

def get_curve(key):
    array, metadata = load_data_from disk(key)
    return hv.Image(array, **metadata).reduce(Latitude=np.mean)

kdims = [hv.Dimension('Date', ...), hv.Dimension('Atmospheric Height', ...)]
contours = hv.DynamicMap(get_contours, kdims=kdims)
curve = hv.DynamicMap(get_curve, kdims=kdims)
contours + curve

With lazy evaluation:

def get_data(key):
    array, metadata = load_data_from disk(key)
    return hv.Image(array, **metadata)

dmap = hv.DynamicMap(get_data, kdims=[hv.Dimension('Date', ...),
                     hv.Dimension('Atmospheric Height', ...)])

hv.contours(dmap.select(Latitude=(0, 30), Longitude=(50, 80))) +\
dmap.reduce(Latitude=np.mean)

The lazy evaluation example really demonstrates the strengths of this approach, you can define the data loading operation once and then interactively explore the whole dataset without having to define more and more functions. I hope this gets the idea across, I think this is relatively little work for a pretty huge payoff, this would open up a lot of the functionality in HoloViews up to much, much larger datasets.

The text was updated successfully, but these errors were encountered:

jbednar · 2016-01-27T19:07:12Z

Sounds like an excellent idea to me. How will DynamicMap know that get_data returns an Image? Or does it not need to know that, as it will simply apply whatever operations are requested onto whatever it returns? And if so, does that mean that get_data can dynamically return different Element types?

philippjfr · 2016-01-27T19:14:03Z

How will DynamicMap know that get_data returns an Image? Or does it not need to know that, as it will simply apply whatever operations are requested onto whatever it returns?

Correct, it simply wouldn't care, if the operation you applied doesn't work for the return type it will complain at evaluation.

And if so, does that mean that get_data can dynamically return different Element types?

No, get_data has to consistently return the same Element type because the plotting code needs to initialize the appropriate artist/glyph. But you wouldn't want to have it return different types because then you can't apply any operations to it. The proposal does however allow for lazy overlaying so you can load the data into two different dynamicmaps and overlay them.

philippjfr · 2016-01-27T20:42:00Z

Having outlined the idea there are various methods that are problematic because they can't easily be lazily evaluated:

range could be made to work by working in chunks and with some caching, which would allow for our regular normalization options to be applied.
dimension_values for the value dimension would mean returning a 1D array of literally all the datapoints, which isn't really desirable.
groupby is a tough one but could be made to work if we had proper support for constant dimensions, which would also allow you to explode a DynamicMap into a GridSpace or NdLayout of DynamicMaps using .layout and .grid.
collapse is also tough because it would require more chunking so that only a certain amount of the data is in memory at the same time.

Everything else operates on individual elements and should thus be relatively easy to support.

philippjfr · 2016-02-10T19:00:39Z

Just a quick note that .relabel with depth > 1 should also lazily evaluate on generated Elements.

philippjfr · 2016-05-09T19:29:17Z

jlstevens · 2017-01-30T23:28:49Z

I would prefer it if these methods that cannot be made lazy simply didn't exist on DynamicMap. Even if they are made to always raise NotImplementedError, the fact they are defined and tab-complete but don't actually work is rather annoying.

Sadly, I don't know a good way of 'undefining' inherited methods (see this stackoverflow post) so the only suggestion is that these methods are introduced as a mixin class for HoloMap only, with a new intermediate class shared by HoloMap and DynamicMap.

The prospect of more classes isn't one I am particularly pleased with but I do think we should try to solve the issue of these dud methods somehow.

Edit: I know that IPython annoyingly ignores the order of __dir__ but maybe it doesn't ignore the set of items returned by it. In that case we could use this along with the trick proposed in the stackoverflow post of declaring these dud methods as properties that raise AttributeError...

jbednar · 2017-01-31T00:43:00Z

A shared base class seems like the clean way to do this. Undefining those methods will completely violate the meaning of class inheritance, which is really not a good idea---the subclass then no longer satisfies "is a", making it very hard to reason about.

Coming up with a name for the shared base class seems like the trickiest part, as I don't know what defines the set of things that make sense for both HoloMaps and DynamicMaps. But having such a name will be useful, because there are already a variety of cases where we need to refer to both of them without making a distinction. "Map" is the clearly shared bit, but doesn't mean much (which makes both of those names slightly dubious already).

philippjfr · 2017-01-31T00:44:01Z

A shared base class seems like the clean way to do this.

Strongly agree, we decided on ViewableMap as the baseclass.

jbednar · 2017-01-31T00:45:26Z

That name sounds reasonable, though of course a Curve is also a viewable map (from x to y), as is every other HoloViews Element with key and value dimensions. Can't think of anything better, though.

jlstevens · 2017-01-31T00:50:22Z

We already have a ViewableElement base class of Element which is why I suggested ViewableMap: this scheme seems fairly consistent to me.

jbednar · 2017-01-31T01:05:30Z

Sure.

Fundamentally, what all three of these classes share is that they (a) contain at least one additional value dimension beyond those mapped on to the screen at any one instant, and (b) specifically allow those other dimension(s) to be selected or iterated over or animated to choose the one specific value to be displayed at that instant. I.e., for dimensions n>s, dimensions 1 to s are mapped onto the screen in some way, and dimensions s+1 to n are selected in some way, whether by widgets or iteration.

So ViewableMap is a bit backwards, since compared to a regular Element, what's different about these classes is that they have mappings that are not viewable at any one time, only if iterated over. So IterableMap or SelectableMap or PlayableMap would also be true. Anyway, I don't much like any of these names; nothing is really all that accurate or compelling, mainly because the Map bit is not actually the thing that should be shared between all three, as that's not the distinguishing bit compared to Elements or other Containers.

jbednar · 2017-01-31T01:07:06Z

But of course, it's too late to change HoloMap and DynamicMap...

philippjfr · 2017-01-31T01:07:31Z

ViewableMap should be read as "Map of Viewables" not as "Map that is Viewable".

jbednar · 2017-01-31T01:08:23Z

That would work if Elements were called Viewables, but they aren't.

philippjfr · 2017-01-31T01:08:36Z

They are Viewable types along with Overlays.

jlstevens · 2017-01-31T01:09:05Z

Twice I was about to echo Philipp's reply but he beat me to it ;-p

jbednar · 2017-01-31T01:12:50Z

I don't know what you mean; there doesn't seem to be anything called Viewable in the hv source code, apart from one comment:

~/holoviews/holoviews> \grep Viewable *.py */*.py | \grep -v ViewableElement
plotting/renderer.py:        Given a HoloViews Viewable return a corresponding plot instance.

philippjfr · 2017-01-31T01:15:57Z

Right, the actual name is ViewableElement.

jbednar · 2017-01-31T01:17:38Z

So we are meant to read ViewableMap as "a map of ViewableElements"? Then it all makes sense, but I have to say that it's quite a stretch! You didn't like ElementMap, in that case? "Element" is normally a noun, while "Viewable" is normally an adjective, so ElementMap naturally leads to the parsing you had in mind.

philippjfr · 2017-01-31T01:20:11Z

So we are meant to read ViewableMap as "a map of ViewableElements"? Then it all makes sense, but I have to say that it's quite a stretch! You didn't like ElementMap, in that case?

Right, ViewableElement could really be renamed to Viewable. Basically Overlay types are also ViewableElement types, when I'd argue they should just be Viewable types since they really are not Elements. That's also why ElementMap isn't right.

jbednar · 2017-01-31T01:27:27Z

Well, you don't often use ViewableElement anywhere in the documentation, so I would think you could change it to Viewable now. And then one could replace many of the occurrences of Element in the tutorials to say the base class Viewable instead, i.e. anything that could either be an Element or an Overlay. At that point, ViewableMap would make sense (though still awkward due to the grammatically surprising use of Viewable as a noun (a la Smalltalk)).

That said, seems to me that a Layout is also viewable, but is not a ViewableElement, which argues for keeping the current name ViewableElement. The parent class of HoloMap and DynamicMap could then even be ViewableElementMap; it's long and ugly, but is not a name we need all that often, and it's more obvious how to parse that.

philippjfr · 2017-01-31T01:29:36Z

Sure, for me the more descriptive the better, this class should definitely not make it beyond the namespace of the core.spaces module.

jbednar · 2017-01-31T01:41:52Z

For code purposes, that's true, but for the documentation, we do need a term that unifies both HoloMap and DynamicMap, and ViewableElementMap would then be that term. Still, even though it's clunky, it probably helps explanations be clearer, because when we first introduce it we can explain why it's called that, and that there are two subtypes, which should actually help people understand HoloMap and DynamicMap better.

jlstevens · 2017-01-31T01:46:22Z

I agree ViewableElementMap is ugly and descriptive and fine for coding purposes (and I agree it should not escape core.spaces) but I'm not sure I would be happy seeing that name leak into our user docs.

The term I would be happy to use to describe both HoloMap and DynamicMap is NdMapping without mentioning ViewableElementMap. We can still use the term 'viewable' if we wish without mentioning the class names of things that users really don't need to know about...

jbednar · 2017-01-31T02:54:37Z

But NdMapping doesn't convey anything about what is mapped. It's fine to mention that these are both NdMappings, but it doesn't help explain how they relate to Elements or to other containers, it just explains how they do mapping.

ViewableElementMap does explain what is mapped. There are ViewableElements, which are things that can be viewed in a single screen-based coordinate system at a single time, of which there are many types (Curve, Image, etc., and various kinds of Overlays thereof). Then there are ViewableElementMaps, which are collections of ViewableElements that select one coordinate from these mapped extra dimensions to display at any one time. ViewableElements and ViewableElementMaps share much in common -- both have one screen-based coordinate system, both take up the same space on the screen, and both can be used interchangeably in a Layout, but one is fully rendered on screen, and what's rendered onscreen for the other is just one particular coordinate's data at any one time, from some larger mapping.

To me this makes much more sense than our current documentation, where it's nearly impossible to see or remember how HoloMaps relate to Elements. Not being as deeply involved in the code and principles as you two, I think I'm actually more trustworthy in this particular case than you two -- I know very clearly what's confusing, whereas you got over your confusion years ago. :-)

philippjfr · 2017-03-04T13:09:00Z

So I had a go implementing the final outstanding dynamic method, map and ran into trouble, if I allow a map that applies both to the DynamicMap and its items I ended up with infinitely nested DynamicMaps (well up to the max_recursion limit), here's my current (slightly incorrect) implementation to avoid this:

    def map(self, map_fn, specs=None, clone=True):
        """
        Recursively replaces elements using a map function when the
        specification applies.
        """
        if specs and not isinstance(specs, list): specs = [specs]
        applies = specs is None or any(self.matches(spec) for spec in specs)

        deep_mapped = self.clone() if clone else self
        if applies:
            return map_fn(deep_mapped)
        for k, v in self.items():
            deep_mapped[k] = v.map(map_fn, specs, clone)

        from ..util import Dynamic
        def dynamic_map(obj):
            return obj.map(map_fn, specs, clone)
        return Dynamic(deep_mapped, shared_data=True, operation=dynamic_map)

Will need to think about it more.

philippjfr · 2017-04-10T14:30:09Z

I've just done a final review of all the DynamicMap methods, and I think we're very close. Once #1240 is merged I believe all methods that can be made dynamic have been implemented that way. I also don't mind retaining the list of non-dynamic methods that apply to the cache. However there are three methods which absolutely cannot be supported, which are the top three in the list below. My suggestion would simply raise NotImplementedErrors for those cases.

# Should be disabled
add_dimension
drop_dimension
reindex

# General methods
dimensions
get
get_dimension
get_dimension_index
get_dimension_type
get_param_values
matches
pprint
params
warning

# Applies to cache (i.e. not dynamic)
collapse
dframe
dimension_values
items
info
keys
reduce
sample
split_overlays
table
traverse
update
values

# Dynamically implemented
__call__
__mul__
__getitem__
collate
grid
groupby
hist
layout
map
overlay
redim
relabel
select

# DynamicMap method
event
reset

One final thing I need to do is ensure that the __call__ is working correctly as I've encountered some weird interactions with it.

jbednar · 2017-04-10T14:31:29Z

Did there not end up being a shared base class so that these methods could truly just not be defined?

philippjfr · 2017-04-10T14:32:44Z

Did there not end up being a shared base class so that these methods could truly just not be defined?

These methods are defined all the way down at the MultiDimensionalMapping level, so I don't think there's a clean way not to define them.

jbednar · 2017-04-10T14:33:28Z

Ok, then...

philippjfr · 2017-04-10T22:01:13Z

Since merging #1262 this is now finally complete.

philippjfr added the type: feature A major new feature label Feb 4, 2016

philippjfr added this to the v1.5.0 milestone Feb 4, 2016

philippjfr modified the milestones: v1.6.0, v1.5.0 Apr 20, 2016

philippjfr modified the milestones: v1.6.2, v1.7.0 Jul 29, 2016

jlstevens modified the milestones: v1.7.0, v1.6.2 Aug 25, 2016

philippjfr mentioned this issue Dec 27, 2016

Implement dynamic relabel, redim and select methods #1029

Merged

philippjfr self-assigned this Mar 15, 2017

philippjfr mentioned this issue Mar 24, 2017

Implemented a dynamic hist method #1227

Merged

philippjfr mentioned this issue Apr 3, 2017

Implemented DynamicMap.map method #1240

Merged

philippjfr closed this as completed Apr 10, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement lazy evaluation for DynamicMap methods #422

Implement lazy evaluation for DynamicMap methods #422

philippjfr commented Jan 27, 2016

jbednar commented Jan 27, 2016

philippjfr commented Jan 27, 2016

philippjfr commented Jan 27, 2016

philippjfr commented Feb 10, 2016

philippjfr commented May 9, 2016 •

edited

jlstevens commented Jan 30, 2017 •

edited

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017 •

edited

jlstevens commented Jan 31, 2017 •

edited

jbednar commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017 •

edited

jlstevens commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017 •

edited

philippjfr commented Jan 31, 2017 •

edited

jbednar commented Jan 31, 2017 •

edited

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017

jlstevens commented Jan 31, 2017 •

edited

jbednar commented Jan 31, 2017

philippjfr commented Mar 4, 2017 •

edited

philippjfr commented Apr 10, 2017 •

edited

jbednar commented Apr 10, 2017

philippjfr commented Apr 10, 2017

jbednar commented Apr 10, 2017

philippjfr commented Apr 10, 2017

Implement lazy evaluation for DynamicMap methods #422

Implement lazy evaluation for DynamicMap methods #422

Comments

philippjfr commented Jan 27, 2016

jbednar commented Jan 27, 2016

philippjfr commented Jan 27, 2016

philippjfr commented Jan 27, 2016

philippjfr commented Feb 10, 2016

philippjfr commented May 9, 2016 • edited

jlstevens commented Jan 30, 2017 • edited

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017 • edited

jlstevens commented Jan 31, 2017 • edited

jbednar commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017 • edited

jlstevens commented Jan 31, 2017

jbednar commented Jan 31, 2017

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017 • edited

philippjfr commented Jan 31, 2017 • edited

jbednar commented Jan 31, 2017 • edited

philippjfr commented Jan 31, 2017

jbednar commented Jan 31, 2017

jlstevens commented Jan 31, 2017 • edited

jbednar commented Jan 31, 2017

philippjfr commented Mar 4, 2017 • edited

philippjfr commented Apr 10, 2017 • edited

jbednar commented Apr 10, 2017

philippjfr commented Apr 10, 2017

jbednar commented Apr 10, 2017

philippjfr commented Apr 10, 2017

philippjfr commented May 9, 2016 •

edited

jlstevens commented Jan 30, 2017 •

edited

jbednar commented Jan 31, 2017 •

edited

jlstevens commented Jan 31, 2017 •

edited

philippjfr commented Jan 31, 2017 •

edited

jbednar commented Jan 31, 2017 •

edited

philippjfr commented Jan 31, 2017 •

edited

jbednar commented Jan 31, 2017 •

edited

jlstevens commented Jan 31, 2017 •

edited

philippjfr commented Mar 4, 2017 •

edited

philippjfr commented Apr 10, 2017 •

edited