Defining the watching and triggering semantics #269

jlstevens · 2018-09-20T19:23:59Z

Param has recently been given the ability to 'watch' parameters, triggering callbacks when parameters are set. There is an important subtlety to think about: currently, triggering occurs whenever a parameter is set, regardless of its value.

This is rather surprising: your watching callback will be triggered even if you keep setting the parameters to the same values (either with attribute setting or set_param). In my opinion, what you really want to do is watch parameters for changes.

Given these semantics, it would be nice to let users trigger watch callbacks without value changes should they decide to. Discussing this with @jbednar, we propose the following that avoids adding a boolean flag to the watch API:

1. Update the semantics of the watch mechanism to be specific to parameter changes.
2. Add a trigger method that takes a list of parameter names and triggers the (applicable) watch callbacks using the current set of parameter values. E.g. foo.trigger('a','b') would pass value Change objects to the applicable watch callbacks for the current values of parameters a and b.
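
To make the two proposed behaviours concrete, here is a minimal sketch in plain Python. It is not param's implementation: the Watched class, its set and watch methods, and the Change tuple are all hypothetical names invented for illustration, and the equality check is the naive one that is only adequate for simple literals.

```python
from collections import namedtuple

# Hypothetical stand-in for the value-change object passed to callbacks.
Change = namedtuple("Change", ["name", "old", "new"])


class Watched:
    """Toy object with change-only watch semantics (not param's code)."""

    def __init__(self, **values):
        self._values = dict(values)
        self._callbacks = {}  # parameter name -> list of callbacks

    def watch(self, callback, name):
        self._callbacks.setdefault(name, []).append(callback)

    def set(self, name, value):
        old = self._values[name]
        self._values[name] = value
        # Naive equality check: adequate for simple literals only.
        if old != value:
            self._notify(Change(name, old, value))

    def trigger(self, *names):
        # Fire callbacks with the current values; no change is required.
        for name in names:
            current = self._values[name]
            self._notify(Change(name, current, current))

    def _notify(self, change):
        for callback in self._callbacks.get(change.name, []):
            callback(change)


events = []
foo = Watched(a=1, b=2)
foo.watch(events.append, "a")
foo.set("a", 1)        # same value: no callback
foo.set("a", 3)        # changed value: callback fires
foo.trigger("a", "b")  # explicit trigger: fires for 'a'; 'b' has no watchers
```

Keeping trigger as a separate method means the watch call itself needs no boolean flag, which is the point of the proposal.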

Unfortunately, updating the watch mechanism is itself a non-trivial task, as it is hard to know when something (mutable) has changed for anything other than simple types. This is the same problem we face if we want to generalize the memoization machinery in HoloViews so it can be used in param.

Ideally, I think we would tackle the following issues in param in this order:

1. Move the memoization/equality comparison machinery out of holoviews and expose it in a useful way from param.
2. Use this machinery to implement trigger and update watch with the new semantics.
3. Continue work on batching parameter changes on top of the updated semantics.

In practice, I think we can skip item 1 for now and partially implement item 2, with an explicit note that the equality checking is not sophisticated for complex mutable types. Then I can continue working on item 3 which is the next priority. When we have a chance, we can then implement item 1 and finish off item 2 to handle more complex types.

Does this seem reasonable?


jlstevens · 2018-09-20T19:26:00Z

I'll note that equality testing is easy for the other attributes (what) supported by the watch mechanism, as they are simple immutable literals, e.g. constant is a boolean, bounds is a numeric tuple, etc. It is detecting the parameter value changes that is particularly tricky.

jbednar · 2018-09-20T19:43:31Z

That all sounds good to me.

philippjfr · 2018-09-21T12:43:37Z

I'm not at all fond of the hashing code in holoviews and would be very hesitant to port that code to param. For pretty much any array or dataframe data that code is extremely slow, so much so that I've basically had to bypass it entirely to get usable performance out of any data-based Stream classes. In fact, in practice I've found that in the vast majority of cases the hashing is so slow that it provides no benefit over just disabling the memoization.

So I agree with the order you lay out at the end: start by implementing this for cases where simple equality works, and we can decide to expand that to more complex types at a later point, hopefully with a cleaner/better/faster approach than the one currently implemented in holoviews.

philippjfr · 2018-09-21T12:48:42Z

Just to get my thoughts recorded somewhere, basically I believe any approach exclusively based on hashing is infeasible, because it requires the whole data to be hashed before any comparison can happen. The only thing I can see being feasible is a hybrid approach that performs some simple checks before a full hash is generated, e.g. things like:

Are the objects the same type?
Does the first value of the array match?
Do the dataframe column names match?
Does the array shape match?

These kinds of checks would bypass a full hash in the vast majority of cases, and only in the worst case would it fall back. The drawback is that any such code will get ugly very quickly.
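
As a rough sketch of what such a hybrid check could look like, here is a standard-library-only version. Plain lists, tuples and dicts stand in for arrays and dataframes (length for shape, dict keys for column names), and both function names are hypothetical. The full hash via pickle is just one possible expensive fallback, not what holoviews does.

```python
import hashlib
import pickle


def full_hash(obj):
    # Expensive path: serialize the whole object, then hash the bytes.
    return hashlib.sha256(pickle.dumps(obj)).hexdigest()


def probably_changed(old, new):
    """Return True if `new` should be treated as different from `old`."""
    # Cheap checks first; each one can prove a change without hashing.
    if type(old) is not type(new):
        return True
    if isinstance(old, (list, tuple)):
        if len(old) != len(new):      # analogous to an array shape check
            return True
        if old and old[0] != new[0]:  # first-value check
            return True
    if isinstance(old, dict) and old.keys() != new.keys():
        return True                   # analogous to a column-name check
    # Worst case: nothing cheap ruled it out, so fall back to the full hash.
    return full_hash(old) != full_hash(new)
```

Note that the cheap checks can only prove that two values differ. Whenever the values are in fact the same, every check passes and the full hash is still computed.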

jbednar · 2018-09-21T13:12:23Z

> These kinds of checks would bypass a full hash in the vast majority of cases

I agree that simple checks like that should be done where feasible. However, isn't it necessary to do a full, deep comparison before one can make use of anything that's been memoized? So it seems like at best such checks can help make memoization less costly when it's not useful, and they won't help any for when it is useful.

philippjfr · 2018-09-21T13:17:42Z

That's a good point, doing this is always going to be costly and the cost of doing it can often outweigh any benefit gained by not executing the function. Ideally you would only activate it in cases where it's actually preventing large computations from being rerun.

philippjfr · 2018-09-21T13:26:31Z

It'll be worth looking at joblib.Memory and more specifically MemorizedFunc which uses on-disk caching but promises to have a very fast hashing approach that we might be able to copy.

jbednar · 2018-09-21T13:40:22Z

Ideally it would just be a hash on the underlying contents of memory -- simple to calculate, and may falsely require recomputation but would always be truly accurate when it says they match. But that's out of scope for anything we implement ourselves, so hopefully that's what joblib is doing.
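
A small sketch of that idea, using only the standard library: array.array stands in for a real data buffer, and memory_hash is a hypothetical helper, not something joblib or param provides. It hashes the raw bytes directly and also shows the "may falsely require recomputation" case, where two values compare equal but are stored as different bytes.

```python
import hashlib
from array import array


def memory_hash(buf):
    # Hash the raw bytes backing the buffer, with no serialization step.
    return hashlib.sha256(memoryview(buf).cast("B")).hexdigest()


a = array("d", [0.0, 1.5])
b = array("d", [0.0, 1.5])
c = array("d", [-0.0, 1.5])  # -0.0 == 0.0, but the bit patterns differ

# Identical memory contents give identical hashes.
same_bytes = memory_hash(a) == memory_hash(b)

# Equal values stored as different bytes: the hash reports a mismatch,
# so a memoized result would be recomputed unnecessarily.
false_miss = list(a) == list(c) and memory_hash(a) != memory_hash(c)
```

A reported match is as trustworthy as the hash function itself, i.e. reliable barring a hash collision.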

jlstevens · 2018-09-21T20:29:48Z

Point 1, updating the semantics of watch, has now been addressed in #271.

As for fast equality checking, I don't think there are any easy answers. I certainly don't disagree with anything said here and I remember not being particularly happy with any approach when I implemented it. The options are:

Don't support memoization for non-trivial types.
Accept that equality checks can be expensive (normally due to serialization) and hope that it is generally faster than the recomputation you are avoiding by memoizing. This is what we opted for in holoviews.
Somehow hash the memory as Jim just mentioned. I don't know any easy way to do this in Python, but maybe some library out there can handle fast, deep equality tests for arbitrary Python objects. Even if such a library exists, it would have to become a core dependency of holoviews or param, which means even this wouldn't be an ideal solution.

One approach (which I am not necessarily recommending!) would be to say parameters shouldn't be large, complex chunks of data and should only be relatively simple literals. This is normally true but we do know that there are cases where we do want parameters that hold a fair bit of state.

philippjfr · 2018-09-25T01:42:50Z

Currently my thinking is that we should probably just provide an extensible way to define equality checks for arbitrary types. If a user wants their custom object to work with it, they can register an equality function. Testing for equality is always going to be faster than hashing and then comparing hashes, since there is a large (often huge) overhead in serialization, which is, I think, entirely unnecessary here because we have both the old and new value available for comparison. Just as a simple example here's the difference between HoloViews hashing comparisons and a simple numpy.array_equal:

I think we can all agree that 1.8 seconds (or a 3600x!!! difference to the equality check) for a fairly small array is pretty much unusable.

Defining equality functions to cover the common cases such as literals, functions, ndarrays and dataframes etc. (basically all the types that param defines explicit Parameters for) won't be difficult, and anyone else can define their own equality checks for custom objects they use.
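
One way such an extensible registry might look is sketched below. Every name here (register_equality, is_equal, the Grid class) is hypothetical and chosen for illustration only. Treating unregistered types as "changed" is an assumption of this sketch, not a decision made in this thread.

```python
import operator

_EQUALITY_CHECKS = {}  # type -> function(old, new) returning a bool


def register_equality(typ, func):
    """Register `func` as the equality check for instances of `typ`."""
    _EQUALITY_CHECKS[typ] = func


def is_equal(old, new):
    if type(old) is not type(new):
        return False
    # Walk the MRO so a check registered for a base class covers subclasses.
    for klass in type(old).__mro__:
        check = _EQUALITY_CHECKS.get(klass)
        if check is not None:
            return bool(check(old, new))
    # No registered check: conservatively report the values as different.
    return False


# Simple literals can rely on the built-in == operator.
for literal_type in (int, float, str, bytes, type(None)):
    register_equality(literal_type, operator.eq)


class Grid:
    """Example of a user-defined object holding some state."""

    def __init__(self, cells):
        self.cells = cells


# A user opts their own type in by registering a comparison function.
register_equality(Grid, lambda old, new: old.cells == new.cells)
```

With numpy installed, the same pattern would presumably let an array type be covered by a call such as register_equality(numpy.ndarray, numpy.array_equal), which compares the old and new values directly without any serialization.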

jbednar · 2018-09-25T01:45:01Z

That sounds good...

jlstevens · 2018-10-01T18:30:34Z

Just to say that a first cut at triggering was introduced in #283 and batched watching has also been merged (though that also needs to support what properly).

tonyfast · 2020-08-24T15:22:59Z

@philippjfr @jlstevens has this all been dealt with? could y'all open a new issue if there are any small changes to address.

jlstevens mentioned this issue Sep 21, 2018

Only trigger watch callbacks on parameter change #271

Merged

This was referenced Sep 25, 2018

Watch changed comparison only compares safe types #277

Closed

Add Comparator to detect changes in arbitrary objects #279

Merged

ceball mentioned this issue Apr 13, 2020

Does == work as we'd like for Parameterized instances? #26

Closed

ceball added the "component: depends/watch" and "status: discussion" labels Apr 13, 2020

tonyfast closed this as completed Aug 24, 2020
