Add reduce macros #143

slott56 · 2020-08-01T12:17:28Z

[1, 2, 3].reduce(s, x, s+x, 0) for example, would compute a sum. An alternative is L.reduce("_+_", 0) which would use implied variable bindings but severly limit the features available. The 4th parameter, the initial value, defaults to zero to make sum and count slightly simpler.

Specialized reductions would be available to avoid wordy constructs using the foundational reduce macro.

L.sum() -- implicitly L.sum(x, x)
L.min() -- L.min(x, x)
L.max() -- L.max(x, x)
L.count() -- L.count(x, x) == size(L.filter(x, x))

This would permit introduction of mean(), stdev(), variance() permitting CEL the be applied to statistically-based decisions. size(L.filter(sample, sample > mean(benchmark)+3.*stdev(benchmark))) > 1. We can then supply an appropriate benchmark value in a binding.

Update:

Additionally, an L.first() would also be helpful.

This is not based on the above reduce(). It's a kind of existence test and can use short-circuit processing to stop processing when the first value has been found.

For example, resource["Tags"].first(x, x["Key"] == "Name" ? x["Value"] : null, "Default"). This lets us examine JSON documents with a list of {"Key": x, "Value": y} values for the first instance of x == "Name" and extract the y value.

This can be slightly more pleasant than resource["Tags"].filter(x, x["Key"] == "Name")[0]["Value"] because the first() macro can return a default value instead of suffering from an index error in the event of a missing {"Key": "Name", ...} entry in the list being examined.

The text was updated successfully, but these errors were encountered:

TristonianJones · 2020-08-26T17:53:16Z

This is great use case with solid examples.

@JimLarson and I have been wanting to introduce a more robust set of macros into CEL for reducers. I'll add a mini-design doc here and start the official language change process with the CEL governance team.

// Reduction always operates on the reduced (or initial) value and the iteration value from
// the input range.
// 
// Before iterating over the range value, which must be of list type, the <reduce_var> will
// be initialized to the <init_expr>. For each iteration of the range, the current value in the
// range will be assigned to <iter_var>. The <op_expr> may refer to both the <reduce_var>
// and <iter_var> and the result of the evaluation will be assigned to the <reduce_var>.
//
// If the <range_expr> is empty the <reduce_var> value will return the <init_expr> value.
<range_expr>.reduce(<reduce_var>, <iter_var>, <init_expr>, <op_expr>)

The macros of min, max, sum, and count would be written as expansions on reduce as follows:

<range_expr>.min() -> <range_expr>.reduce(r, i, int_max, r < i ? r : i)
<range_expr>.max() -> <range_expr>.reduce(r, i, int_min, r > i ? r : i)
<range_expr>.sum() -> <range_expr>.reduce(r, i, 0, r + i)
<range_expr>.count() -> <range_expr>.size()
<range_expr>.count(<i>, filter) -> <range_expr>.reduce(r, <i>, 0, filter ? r + 1 : r)

TristonianJones · 2020-08-26T18:13:14Z

@slott56 My only question is whether you can satisfy the following equation today without the reducers and just using the existing filter macro and size method with some extensions for mean and stddev (not sure over what range of values those functions would apply based on the example though):

<list>.filter(sample, sample > mean(benchmark)+3.*stdev(benchmark))).size() > 1

slott56 · 2020-08-26T20:42:23Z

Yes. As stated above, L.count(x, condition) is effectively size(L.filter(x, condition)). And yes, this is a redundant approach. I think a symmetry between count and sum are helpful. It makes generalization to sum of squares somewhat easier to understand.

I'd prefer <range_expr>. count() -> <range_expr>.reduce(r, i, 0, r + 1) but I don't think there's a justifiable reason for this other than simple symmetry.

slott56 · 2020-09-17T13:32:48Z

The use of int_max and int_min could be a problem when working with float, duration, or timestamp values. Perhaps min() and max() could use <range_expr>.first() to grab the first value in the sequence, irrespective of type. If there is no first value (because the sequence is empty) this will raise an erroneous result.

nbryant42 · 2023-08-13T22:39:09Z

At the office I've written an almost-compliant implementation of CEL in Elixir (it does deviate from spec in a few minor areas that make it a better fit for Elixir, e.g. all integers in Elixir/Erlang are bignums, so it makes a lot of sense to keep that property)

Our users do need to do a few things like .sum() that could be implemented as a reduce with more generality. My CEL implementation supports user-defined pluggable functions and macros by simply binding anonymous functions into a context map, so I'm strongly tempted to define a reduce. The only thing stopping me is what if upstream CEL implements it later and it ends up using a different parameter order.

I could get away with just defining things like sum() but my vote goes for adding this to the common language definition

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add reduce macros #143

Add reduce macros #143

slott56 commented Aug 1, 2020 •

edited

Loading

TristonianJones commented Aug 26, 2020 •

edited

Loading

TristonianJones commented Aug 26, 2020

slott56 commented Aug 26, 2020 •

edited

Loading

slott56 commented Sep 17, 2020 •

edited

Loading

nbryant42 commented Aug 13, 2023

Add reduce macros #143

Add reduce macros #143

Comments

slott56 commented Aug 1, 2020 • edited Loading

TristonianJones commented Aug 26, 2020 • edited Loading

TristonianJones commented Aug 26, 2020

slott56 commented Aug 26, 2020 • edited Loading

slott56 commented Sep 17, 2020 • edited Loading

nbryant42 commented Aug 13, 2023

slott56 commented Aug 1, 2020 •

edited

Loading

TristonianJones commented Aug 26, 2020 •

edited

Loading

slott56 commented Aug 26, 2020 •

edited

Loading

slott56 commented Sep 17, 2020 •

edited

Loading