Allow for right-open cumulative operations #16272

t-ded · 2024-05-16T13:07:13Z

Description

Problem description:
I have repeatedly needed to compute a cum_* operation that does not take the value of the current row into account. I have not found an existing issue addressing this.

Use case:
As a simplistic example, suppose I want to compute the cumulative maximum of a column and then check whether the current value is the highest one encountered so far. I could simply filter by pl.col('A') == pl.col('A_cum_max'), but this does not handle cases where the same value has already been encountered earlier.
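To make the pitfall concrete, here is a minimal plain-Python sketch of the semantics (not Polars code; the sample values and variable names are made up for illustration). It contrasts the usual inclusive cumulative maximum with an exclusive one that ignores the current row.

```python
from itertools import accumulate

# Hypothetical sample column; the value 3 appears twice.
a = [1, 3, 2, 3]

# Inclusive (right-closed) cumulative maximum: each entry includes the current row.
cum_max = list(accumulate(a, max))

# Equality against the inclusive maximum also flags the repeated 3 in the last row.
is_max_inclusive = [x == m for x, m in zip(a, cum_max)]

# Exclusive (right-open) cumulative maximum: only the rows before the current one.
# The first row has no predecessors, so it is None here.
cum_max_open = [None] + cum_max[:-1]

# A strict comparison against the exclusive maximum flags only new records.
# Treating the first row as a new maximum is a choice made for this sketch.
is_new_max = [m is None or x > m for x, m in zip(a, cum_max_open)]

print(is_max_inclusive)
print(is_new_max)
```

The last row is flagged by the equality check but not by the strict comparison, which is the distinction the request is about.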

Proposal:
The most straightforward workaround I have found is to shift the cumulative result (i.e., use the cumulative maximum as of the previous row), which adds an extra computation step. A parameter such as right_closed=False (or perhaps last_closed, to avoid confusion when combined with the reverse parameter) on the cum_* operations would save that computation and, I believe (please correct me if I am wrong), would not be complicated to implement.
See the example for cum_max below, taken from the documentation, along with my expected result:


df.with_columns(
    pl.col("a").cum_max().alias("cum_max"),
    pl.col("a").cum_max(right_closed=False).alias("cum_max_right_open"),
)
shape: (4, 3)
┌─────┬─────────┬────────────────────┐
│ a   ┆ cum_max ┆ cum_max_right_open │
│ --- ┆ ---     ┆ ---                │
│ i64 ┆ i64     ┆ i64                │
╞═════╪═════════╪════════════════════╡
│ 1   ┆ 1       ┆ null               │
│ 2   ┆ 2       ┆ 1                  │
│ 3   ┆ 3       ┆ 2                  │
│ 4   ┆ 4       ┆ 3                  │
└─────┴─────────┴────────────────────┘

t-ded added the enhancement New feature or an improvement of an existing feature label May 16, 2024