Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream events on overlaid DynamicMaps should selectively update layers that reference the stream #1049

Closed
jordansamuels opened this Issue Jan 10, 2017 · 6 comments

Comments

Projects
None yet
3 participants
@jordansamuels
Copy link
Contributor

jordansamuels commented Jan 10, 2017

Based on the recommendation from @philippjfr , I am using a holoviews.stream.PositionXY to overlay hv.Text onto a DynamicMap. This works well in theory; but with wide plots with several layers, the feedback is very slow. Here is an example:

import holoviews as hv
import numpy as np
import pandas as pd
from holoviews.operation.datashader import datashade, aggregate, shade
from holoviews.streams import PositionXY

hv.notebook_extension('bokeh')

%opts DynamicMap [width=1800, height=450, tools=['xwheel_zoom']] Layout [shared_datasource=False]

n=100000
def f(n):
    return np.cumsum(np.random.randn(n))
df = pd.DataFrame({'t': range(n), 'y1': f(n), 'y2': f(n), 'y3' : .10 * f(n)} )

def p(y):
    return hv.Curve(df, kdims=['t'], vdims=[y])

ts = datashade(p('y1'))
ch = hv.DynamicMap(lambda x, y: hv.HLine(y) * hv.VLine(x) * hv.Text(x+0.15, y, '%.3f' % x + ", %.3f" % y),
              kdims=[], streams=[PositionXY()], source=ts)
%opts RGB [width=1800, height=450]
(ts * datashade(p('y2')) * ch + ts).cols(1)

I'm using bokeh 0.12.3 and holoviews latest code as of 2425fad .

@philippjfr philippjfr changed the title Streams overlay is slow for large timeseries Stream events on overlaid DynamicMaps should selectively update layers that reference the stream Jan 11, 2017

@philippjfr

This comment has been minimized.

Copy link
Contributor

philippjfr commented Jan 11, 2017

Thanks @jordansamuels, this highlights an important issue with our current implementation. What is happening here is that ts * datashade(p('y2')) * ch is combined into one DynamicMap, which calls each of the individual layers. This means the datashade DynamicMap gets called every time a PositionXY event is sent. This will massively slow down what should be a very quick plot update. What should happen instead is that the outer DynamicMap only calls the individual layers if they are also attached to that PositionXY event. I'll have to think about how best to implement this, but it's an important issue that needs to be addressed before release. @jlstevens Any thoughts?

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Jan 11, 2017

I agree this raises an important issue. I'll have to think a bit more about before suggesting anything though...

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Jan 15, 2017

I've thought a bit about this a little and I think there are the following approaches:

  1. Try to automatically figure out which layers are affected by streams. This in itself doesn't sound too bad but then we need to figure out how to link this information to what gets refreshed.
  2. Using * with DynamicMaps doesn't return a normal Overlay but another DynamicMap (an instance of Dynamic if I remember rightly). We could have a special DynamicMul class just for * with additional methods such a refresh method to specify with layers ought to be updated.
  3. You could reach into the callback and explicitly specify what should be refreshed there. Not sure if this should be by layer number or group.label (or maybe the refresh method supports both?).

I'm not a fan of option 2. as I would rather not introduce a new class. I think options 1. and 3. might be related though if we can start with option 3 and then move to option 1 later. The idea is that we offer a way for users to explicitly state which layers should be updated by specifying it on the DynamicMap callback (all of them by default).

This would offer a way for the user to improve performance right away by specifying what shouldn't be refreshed explicitly. Later on, we might be able to set this mechanism automatically, by figuring out which layers the streams affect. This doesn't sound too bad but it does need to work for arbitrary deep overlays.

@philippjfr

This comment has been minimized.

Copy link
Contributor

philippjfr commented Jan 15, 2017

I've thought about this also, indeed I lost an issue I was about to open describing a proposal. I think any approach we implement must satisfy 1. otherwise it is useless. We definitely do not add a system that the user has to know about to figure this stuff out, it should just happen. I also wouldn't be happy with 2., definitely don't want a new class. In the end it depends a lot on what you meant by:

You could reach into the callback and explicitly specify what should be refreshed there. Not sure if this should be by layer number or group.label (or maybe the refresh method supports both?).

The way I see it the only solution will be to have an events system that allows registering, among other things, which streams just received an event. These would be set either globally or somewhere else, probably with a context manager. Then it would be on the dynamic overlay functions to use the get_nested_streams to figure out whether to update a particular layer, and using the dimensionless_cache context manager to control whether to use the cache or generate a new frame.

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Jan 15, 2017

After discussing things with @jbednar and @philippjfr, we came to the conclusion that we might be over-complicating things.

I now believe we just need to auto-memoise the user provided callbacks. If the arguments to the callback doesn't change, we assume the output doesn't change so we just return the last value. Users could set this up for themselves now but it is probably worth adding this optimization automatically, especially for things such as datashader output.

@philippjfr

This comment has been minimized.

Copy link
Contributor

philippjfr commented Jan 16, 2017

This has now been addressed through auto-memoization. @jordansamuels Please let me know how you get on with this. I'd also strongly recommend you update to bokeh 0.12.4, which should speed up datashader plots by a lot.

@philippjfr philippjfr closed this Jan 16, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.