New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connect streams with bokeh callbacks #889

Merged
merged 21 commits into from Sep 30, 2016

Conversation

Projects
None yet
3 participants
@philippjfr
Member

philippjfr commented Sep 28, 2016

This finally allows connecting Streams with interactions on a bokeh plot, which will allow for all kinds of interactive features becoming trivial to implement. The PR is currently a WIP and only works for a DynamicMap returning a single element.

The PR works by defining Callback classes which define the attributes that should be returned from the bokeh plot and the handles the callback will be attached to. The Callback then auto-generates the code to look up those attributes and sends them back via a JupyterCommJS.

The next part I will work on is how the source of a stream is looked up to make sure it also works with overlays and layouts.

Example

Having laid out what still needs work I will outline a concrete example of what does work. The PR registers the following callback:

class PositionXYCallback(Callback):

    attributes = {'x': 'cb_data.geometry.x', 'y': 'cb_data.geometry.y'}

    handles = ['hover']


callbacks = Stream._callbacks['bokeh']
callbacks[PositionXY] = PositionXYCallback

When creating a PositionXY stream on a DynamicMap it will now create a hovertool on the source of the stream, which sends its x and y position back to the Stream.

dmap = hv.DynamicMap(lambda x, y: hv.Points([(x**i, y**i) for i in np.linspace(0, 3)],
                                            extents=(0, 0, 10, 10)),
                     kdims=[], streams=[PositionXY()])
dmap

xy

@philippjfr philippjfr added this to the v1.7.0 milestone Sep 28, 2016

@jlstevens

This comment has been minimized.

Member

jlstevens commented Sep 28, 2016

Fantastic!

It is probably worth noting that in the final version, users won't even need to define PositionXYCallback as we will have all those defined and registered already. This means, all that will need to be done is to import PositionXY and define the DynamicMap.

Nonetheless, it is great to see how little code is involved which suggests custom functionality will be very easy to add!

I haven't reviewed the PR yet but I am assuming the mechanism that will register these callbacks for each backend (in such a way the user doesn't need to worry about it) is something we will still need to discuss.

@philippjfr

This comment has been minimized.

Member

philippjfr commented Sep 28, 2016

It is probably worth noting that in the final version, users won't even need to define PositionXYCallback as we will have all those defined and registered already. This means, all that will need to be done is to import PositionXY and define the DynamicMap.

That's already the case, I just pulled it out to demonstrate how it works and how little code is involved. Currently there is a registry on Stream._callbacks, indexed by backend and Stream type.

@jlstevens

This comment has been minimized.

Member

jlstevens commented Sep 28, 2016

Great!

I'll review the PR when you feel it is ready and have hopefully added a few more examples.

@jbednar

This comment has been minimized.

Member

jbednar commented Sep 29, 2016

Excellent! Looks really clean. And somehow you did it by removing 300 lines of code.

@philippjfr

This comment has been minimized.

Member

philippjfr commented Sep 29, 2016

I've replaced the old callback system with this, which means some of the old downsampling callbacks should become operations (probably in another PR). There is some worry about backward compatibility, but I'd prefer not to have two separate systems around.

@jlstevens

This comment has been minimized.

Member

jlstevens commented Sep 29, 2016

There is some worry about backward compatibility, but I'd prefer not to have two separate systems around.

I agree. I don't think we need to worry too much as the old plotting callback system wasn't properly documented so I think we were probably the only people using it.

@@ -49,25 +49,6 @@ class PolygonPlot(ColorbarPlot, PathPlot):
style_opts = ['color', 'cmap', 'palette'] + line_properties + fill_properties
_plot_methods = dict(single='patches', batched='patches')
def _init_tools(self, element):

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Good to get rid of this! I guess it now inherits it from the superclass.

@@ -119,7 +119,7 @@ def send(self, data):
class JupyterCommJS(Comm):
class JupyterCommJS(JupyterComm):
"""

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

The new version makes more sense...was this a bug?

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Yep, but the class wasn't used.

index = param.List(default=[])

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

All these new stream classes look good - they will need docstrings on their parameters though....

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Agreed, will add them shortly.

def __init__(self, preprocessors=[], source=None, subscribers=[], **params):
super(PositionX, self).__init__(preprocessors=preprocessors, source=source,
subscribers=subscribers, **params)

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Thanks for getting rid of these unnecessary constructors!

if self._source:
raise Exception('source has already been defined on stream.')
self._source = source
self.registry[source].append(self)

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Perhaps this should be id(source)?

@@ -121,14 +115,26 @@ def __init__(self, preprocessors=[], source=None, subscribers=[], **params):
datastructure that the stream receives events from, as supported
by the plotting backend.
"""
self.source = source
self._source = source
self.subscribers = subscribers
self.preprocessors = preprocessors
self._hidden_subscribers = []
self.uuid = uuid.uuid4().hex

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Looks like the uuid attribute might no longer b necessary...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Will remove it.

registry = OrderedDict()
registry = defaultdict(list)
_callbacks = defaultdict(dict)

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Even though this has an underscore, it is worth having a comment explaining what this is, as it isn't used anywhere else in this file (that I can see).

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Yea definitely need to add docstrings in this file still. Basically it defines the registry of callbacks for each backend which supply the stream information.

def replace_models(obj):
"""
Processes references replacing Model and HasProps objects.
"""

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Maybe 'Recursively processes references, replacing Models with there .ref values and HasProps objects with their property values'. At least that is how I understand the code!

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Yep, will update it.

callbacks = param.ClassSelector(class_=Callbacks, doc="""
Callbacks object defining any javascript callbacks applied
to the plot.""")

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Great to get rid of this! Though we mustn't forget to mention its removal in the CHANGELOG.

Convert the document to a dictionary of references. Avoids
converting document to json and the performance penalty that
involves.
"""

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Perhaps 'Avoids unnecessary JSON serialization/deserialization within Python and the corresponding performance penalty'. If you don't want to use that, just make it clear that this is all within Python.

@@ -168,18 +166,56 @@ def __init__(self, element, plot=None, **params):
self.current_ranges = None
super(ElementPlot, self).__init__(element, **params)
self.handles = {} if plot is None else self.handles['plot']
element_ids = self.hmap.traverse(lambda x: id(x), [Element])

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Was element_ids not needed?

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

I realized that even if a HoloMap of Overlays has only a single item it's still static.

else:
source = self.hmap.last
streams = Stream.registry.get(source, [])
registry = Stream._callbacks['bokeh']

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Ah, so _callbacks contains the backend specific code? Worth clarifying where _callbacks is defined on Stream (see comment below).

if type(stream) in registry and streams}
cbs = []
for cb, group in groupby(sorted(callbacks), lambda x: x[0]):
cb_streams = [s for _, s in group]

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

cb_streams doesn't seem to be used...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Thanks that's a bug.

cb_tools = []
for cb in callbacks:
for handle in cb.handles:
if handle and handle in known_tools:

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

If I am following this right, this could be simplified if there was a single (custom) tool for capturing events? I believe this code is about finding the right tool that the callback needs...

cb_tools.append(tool)
self.handles[handle] = tool
tools = cb_tools + self.default_tools + self.tools

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Which also means that tools may appear because a callback requests it...which could confuse users who see tools they didn't (explicitly) request. Another reason to consider a (hidden) custom tool that all our callbacks can use...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

In some cases an actual tool is necessary e.g. to get the geometry of a selection, but I agree a hidden tool would be preferable for hover and clicking callbacks.

@@ -512,7 +552,7 @@ def initialize_plot(self, ranges=None, plot=None, plots=None, source=None):
self.handles['plot'] = plot
# Get data and initialize data source
empty = self.callbacks and self.callbacks.downsample
empty = False
if self.batched:

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Is this temporary? If not, there is no reason for empty to appear as I don't see it being set a different value anywhere...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Yes, I'm going to end up getting rid of empty everywhere I think. But probably in a separate PR.

el = element.get(key)
if el is not None:
el_tools = subplot._init_tools(el, self.callbacks)
el_tools = [t for t in el_tools

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

I guess OverlayPlot is merging the set of tools? Did it always have to do this or is this just for callbacks? If it is the latter, then I think this is another reason to have our own custom tool. :-)

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

It's always been merging them.

modified bokeh plot objects.
Generates JS code to look up attributes on JS objects from
an attributes specification dictionary.
"""

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Could the docstring have a simple example of input and the corresponding JS output? It is hard to tell just looking at the code...

* handles : The handles define which object the callback will be
attached on.

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

What kind of object? HoloViews objects? If so, it should say so. Also, 'definition' is misspelt above...

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

I now think they are in fact 'plotting handles'.

location of the attribute separated by periods.
All plotting handles such as tools, the x_range,
y_range and (data)source can be addressed in this
way.

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

After reading this description, I think I understand but I think a short example would really help here....

The callback can also define a _process_msg method, which can
modify the data sent by the callback before it is passed to the
streams.
"""

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Having described these three things (handles, attributes and code) is there a performance reason that only code is a parameter and the other two are class attributes? I'm also a little confused as to why they are class attributes and not instance attributes...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

The user can supply some custom code to the instance to be executed alongside the callback but each Callback really only applies to certain handles (e.g. RangeX and x_range) and specific attributes on those handles (e.g. x_range.attributes.x). I have no objection making them all parameters though, you originally suggested making them class attributes.

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

In fact code seems to be a class parameter as you can't set it through the constructor! Nothing wrong with that although I think my question above still holds...

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Actually, code seems to be the only reason this class is parameterized at all...

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

The user can supply some custom code to the instance to be executed alongside the callback

Sure! In that case you'll need to call super :-)

In the end, I do think class attributes are fine now I see how they are defined on each concrete Callback...

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Though as the plotting code creates these things, I don't see how code could be customized per instance...

self.callbacks.update(self)
for cb in self.callbacks:
cb.initialize()
if not self.overlaid:

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

It is a little confusing that there is a _init_callbacks method which actually constructs the callback instances (based on my understanding of it) and then an initialize method as well. Maybe the former should be called _construct_callbacks instead...

This comment has been minimized.

@philippjfr

philippjfr Sep 29, 2016

Member

Agreed.

self.source = source
def initialize(self):

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

If I understand this right, the docstring could say something like 'Iterates through the plot (and any subplots), registering the necessary JS code across all relevant plotting handles'. Again, that is what I expect it to be doing from the code!

def set_customjs(self, cb_obj):

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

Should cb_obj simply be called handle for consistency?

class Callback(param.ParameterizedFunction):
def attributes_js(attributes):

This comment has been minimized.

@jlstevens

jlstevens Sep 29, 2016

Member

In the end, I am not convinced that this class needs to be parameterized.

Lastly, it might be nice to define Callback to be backend agnostic. These methods look fairly semantically independent from bokeh to me:

def __init__(self, plot, streams, source, **params):

def initialize(self):

def on_msg(self, msg):

def _process_msg(self, msg):

No need to abstract it out in this PR though...

@jlstevens

This comment has been minimized.

Member

jlstevens commented Sep 30, 2016

I understand this is now ready to merge!

Overall, I'm really pleased with how the streams system is turning out. It was simple to implement streams in core/plotting and this PR shows how you can get access to events from Bokeh in a nice clean way.

There are still some outstanding things to address in future PRs:

  • I think some of the implementation here could be simplified using a custom Bokeh tool to captures some of the events.
  • Throttling needs to be implemented on the JS side with the intent of eventually making it backend agnostic.
  • Overlays need to be handled properly.
  • We need to replace the plotting callback system e.g re-introduce downsampling as an operation.

Together with some nice documentiation and examples on mybinder, I think this will warrant the 1.7 release!

@jlstevens jlstevens merged commit 9d80880 into master Sep 30, 2016

4 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
coverage/coveralls Coverage increased (+1.5%) to 73.405%
Details
s3-reference-data-cache Test data is cached.
Details

@philippjfr philippjfr deleted the connect_streams branch Oct 14, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment