Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add memoization to dynamic Callable class #1063

Merged
merged 4 commits into from Jan 16, 2017
Merged

Add memoization to dynamic Callable class #1063

merged 4 commits into from Jan 16, 2017

Conversation

philippjfr
Copy link
Member

@philippjfr philippjfr commented Jan 15, 2017

In order to address #1049 the Callable class now memoizes the last returned value from a DynamicMap, based on the args, kwargs and nested stream values.

Needs docstrings and tests. @jordansamuels this seems to address your issue. Small example for testing:

import holoviews as hv
import numpy as np
import pandas as pd
from holoviews.operation.datashader import datashade
from holoviews.streams import PositionXY

hv.notebook_extension('bokeh')

%opts RGB [width=800, height=450] VLine HLine (line_color='black' line_width=0.5)

n=100000
def f(n):
    return np.cumsum(np.random.randn(n))
df = pd.DataFrame({'t': range(n), 'y1': f(n), 'y2': f(n), 'y3' : .10 * f(n)} )

def p(y):
    return hv.Curve(df, kdims=['t'], vdims=[y])

ts = datashade(p('y1'))
ch = hv.DynamicMap(lambda x, y: hv.HLine(y) * hv.VLine(x) * hv.Text(x+0.15, y, '%.3f' % x + ", %.3f" % y),
              kdims=[], streams=[PositionXY()])
(ts * datashade(p('y2')) * ch + ts).cols(1)

@jbednar
Copy link
Member

jbednar commented Jan 16, 2017

Wow! And the total amount of code even went down. Fabulous!

@philippjfr
Copy link
Member Author

philippjfr commented Jan 16, 2017

@jlstevens Requested review, the code is fairly straightforward and I've added at least one test to ensure the memoization behaves correctly. There are already a number of tests to check memoization works during plotting, stopping the callback from being called repeatedly.

to a DynamicMap.
to a DynamicMap. Additionally a Callable will memoize the last
returned value based on the arguments to the function and the
state of all streams on its inputs, avoiding calling the function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either 'avoiding calls to the function' or 'to avoid calling the function ...' would be read better.

to a DynamicMap. Additionally a Callable will memoize the last
returned value based on the arguments to the function and the
state of all streams on its inputs, avoiding calling the function
unncessarily.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo.



class dimensionless_cache(object):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to get rid of this!

@@ -500,6 +517,8 @@ class DynamicMap(HoloMap):
""")

def __init__(self, callback, initial_items=None, **params):
if not isinstance(callback, Callable) and not isinstance(callback, types.GeneratorType):
callback = Callable(callable_function=callback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means all callbacks will now be wrapped in Callable. This is a change from the previous behavior, and probably a good one (more consistent, at least with __mull__) though we should now document this. It would also be good to have an example of where a user might want to supply Callable themselves....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, we might want an example of the user declaring a Callable with their own set of inputs.

def __call__(self, *args, **kwargs):
return self.callable_function(*args, **kwargs)
inputs = [inp for inp in self.inputs if isinstance(inp, DynamicMap)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor point - I would call the loop variable i given that inp is a bit weird and you probably don't want to clash with the input built-in. Alternatively, this is one instance of a pure filter so you could consider using that (it would be shorter).

return self.callable_function(*args, **kwargs)
inputs = [inp for inp in self.inputs if isinstance(inp, DynamicMap)]
streams = [s for inp in inputs for s in get_nested_streams(inp)]
values = tuple(tuple(sorted(s.contents.items())) for s in streams)
Copy link
Contributor

@jlstevens jlstevens Jan 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means it is memoizing on the stream parameters in ascending alphanumeric order. I think that is fine (and makes sense!) but it is worth noting that this is one particular policy for how streams parameters are ordered into a tuple key. This is important to stay consistent if we ever need it somewhere else and I am now wondering if it might be worth having a utility to codify the idea....up to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really doesn't matter much, it just needs to be consistently ordered. If we're using it somewhere else it can use the same scheme or another scheme without having any effect here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True - my point is that we should try to be consistent anyway!

@jlstevens
Copy link
Contributor

jlstevens commented Jan 16, 2017

Looks good!

I'm very happy this turned out to be far more straightforward than I anticipated. Overall only seven lines were added and the new approach is far more intuitive to me than dimensionless_cache ever was.

I think my only worry is that is that (from what I understand) Callback will memoise at every level so dmap1 * dmap2 * dmap3 is memoising 3 + 2 + 1 i.e 6 times (3 leaves, two binary muls, one triple mul). The main worry then is unnecessary/excessive memory usage.

@philippjfr
Copy link
Member Author

I think my only worry is that is that, from what I understand, Callback will memoise at every level so dmap1 * dmap2 * dmap3 is memoising 3 + 2 + 1 i.e 6 times (3 leaves, two binary muls, one triple mul). The main worry then is unnecessary/excessive memory usage.

That's accurate although each level already caches these values in the DynamicMap cache so I don't think there will be much being cached additionally. It's also worth noting that each level simply wraps the previous level, so it's not holding copies of the data, so the overhead should simply be in the parameterized objects that are created, which should be fairly cheap.

@jlstevens
Copy link
Contributor

Tests are passing so I will merge now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants