Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add memoization to dynamic Callable class #1063

Merged
merged 4 commits into from Jan 16, 2017

Conversation

Projects
None yet
3 participants
@philippjfr
Copy link
Contributor

philippjfr commented Jan 15, 2017

In order to address #1049 the Callable class now memoizes the last returned value from a DynamicMap, based on the args, kwargs and nested stream values.

Needs docstrings and tests. @jordansamuels this seems to address your issue. Small example for testing:

import holoviews as hv
import numpy as np
import pandas as pd
from holoviews.operation.datashader import datashade
from holoviews.streams import PositionXY

hv.notebook_extension('bokeh')

%opts RGB [width=800, height=450] VLine HLine (line_color='black' line_width=0.5)

n=100000
def f(n):
    return np.cumsum(np.random.randn(n))
df = pd.DataFrame({'t': range(n), 'y1': f(n), 'y2': f(n), 'y3' : .10 * f(n)} )

def p(y):
    return hv.Curve(df, kdims=['t'], vdims=[y])

ts = datashade(p('y1'))
ch = hv.DynamicMap(lambda x, y: hv.HLine(y) * hv.VLine(x) * hv.Text(x+0.15, y, '%.3f' % x + ", %.3f" % y),
              kdims=[], streams=[PositionXY()])
(ts * datashade(p('y2')) * ch + ts).cols(1)

@philippjfr philippjfr added the WIP label Jan 15, 2017

@jbednar

This comment has been minimized.

Copy link
Contributor

jbednar commented Jan 16, 2017

Wow! And the total amount of code even went down. Fabulous!

@philippjfr philippjfr removed the WIP label Jan 16, 2017

@philippjfr philippjfr requested a review from jlstevens Jan 16, 2017

@philippjfr

This comment has been minimized.

Copy link
Contributor Author

philippjfr commented Jan 16, 2017

@jlstevens Requested review, the code is fairly straightforward and I've added at least one test to ensure the memoization behaves correctly. There are already a number of tests to check memoization works during plotting, stopping the callback from being called repeatedly.

to a DynamicMap.
to a DynamicMap. Additionally a Callable will memoize the last
returned value based on the arguments to the function and the
state of all streams on its inputs, avoiding calling the function

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

I think either 'avoiding calls to the function' or 'to avoid calling the function ...' would be read better.

to a DynamicMap. Additionally a Callable will memoize the last
returned value based on the arguments to the function and the
state of all streams on its inputs, avoiding calling the function
unncessarily.
"""

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

Typo.



class dimensionless_cache(object):
"""

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

Glad to get rid of this!

@@ -500,6 +517,8 @@ class DynamicMap(HoloMap):
""")

def __init__(self, callback, initial_items=None, **params):
if not isinstance(callback, Callable) and not isinstance(callback, types.GeneratorType):
callback = Callable(callable_function=callback)

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

This means all callbacks will now be wrapped in Callable. This is a change from the previous behavior, and probably a good one (more consistent, at least with __mull__) though we should now document this. It would also be good to have an example of where a user might want to supply Callable themselves....

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

For instance, we might want an example of the user declaring a Callable with their own set of inputs.

def __call__(self, *args, **kwargs):
return self.callable_function(*args, **kwargs)
inputs = [inp for inp in self.inputs if isinstance(inp, DynamicMap)]

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

Very minor point - I would call the loop variable i given that inp is a bit weird and you probably don't want to clash with the input built-in. Alternatively, this is one instance of a pure filter so you could consider using that (it would be shorter).

return self.callable_function(*args, **kwargs)
inputs = [inp for inp in self.inputs if isinstance(inp, DynamicMap)]
streams = [s for inp in inputs for s in get_nested_streams(inp)]
values = tuple(tuple(sorted(s.contents.items())) for s in streams)

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

This means it is memoizing on the stream parameters in ascending alphanumeric order. I think that is fine (and makes sense!) but it is worth noting that this is one particular policy for how streams parameters are ordered into a tuple key. This is important to stay consistent if we ever need it somewhere else and I am now wondering if it might be worth having a utility to codify the idea....up to you.

This comment has been minimized.

@philippjfr

philippjfr Jan 16, 2017

Author Contributor

It really doesn't matter much, it just needs to be consistently ordered. If we're using it somewhere else it can use the same scheme or another scheme without having any effect here.

This comment has been minimized.

@jlstevens

jlstevens Jan 16, 2017

Contributor

True - my point is that we should try to be consistent anyway!

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Jan 16, 2017

Looks good!

I'm very happy this turned out to be far more straightforward than I anticipated. Overall only seven lines were added and the new approach is far more intuitive to me than dimensionless_cache ever was.

I think my only worry is that is that (from what I understand) Callback will memoise at every level so dmap1 * dmap2 * dmap3 is memoising 3 + 2 + 1 i.e 6 times (3 leaves, two binary muls, one triple mul). The main worry then is unnecessary/excessive memory usage.

@philippjfr

This comment has been minimized.

Copy link
Contributor Author

philippjfr commented Jan 16, 2017

I think my only worry is that is that, from what I understand, Callback will memoise at every level so dmap1 * dmap2 * dmap3 is memoising 3 + 2 + 1 i.e 6 times (3 leaves, two binary muls, one triple mul). The main worry then is unnecessary/excessive memory usage.

That's accurate although each level already caches these values in the DynamicMap cache so I don't think there will be much being cached additionally. It's also worth noting that each level simply wraps the previous level, so it's not holding copies of the data, so the overhead should simply be in the parameterized objects that are created, which should be fairly cheap.

Philipp Rudiger Philipp Rudiger

@philippjfr philippjfr force-pushed the callable_cache branch from e40e0fc to 8a701c8 Jan 16, 2017

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Jan 16, 2017

Tests are passing so I will merge now.

@jlstevens jlstevens merged commit 4deedc8 into master Jan 16, 2017

4 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.06%) to 76.907%
Details
s3-reference-data-cache Test data is cached.
Details

@philippjfr philippjfr deleted the callable_cache branch Jan 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.