Add Table.running function to compute grouped running statistics. #8196

Closed
jdunkerley opened this issue Oct 31, 2023 · 5 comments
Labels: -libs (Libraries: New libraries to be implemented), x-design, x-new-feature (Type: new feature request)

Comments

@jdunkerley
Member

jdunkerley commented Oct 31, 2023

Currently, we have Column.running statistic:Statistic name:Text -> Column = ..., which computes a single running statistic such as a maximum or minimum.

This ticket is to add a table version of the function.

  • Should be able to compute the same set of statistics as on a Column.
  • Must be able to compute within groups and with a specified ordering.
  • Like add_row_number, it should add a column and return the table with the rows in their original order.
  • Not supported in the database backend, at least for now.
    • At least in theory, Postgres, MySQL, and SQL Server support this via window functions (PARTITION BY and ORDER BY should handle grouping and sequence, and most functions should be supported). This may not work for all statistics, but should cover Sum, Product, etc. It will not cover Kurtosis and Skew.
  • Add the following function to Statistic:
    • Product
Suggested API:

Table.running column:Text|Integer statistic:Statistic=Statistic.Count as:Text='' group_by:Vector=[] order_by:Vector=[] -> Table
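
To make the intended semantics concrete, here is a minimal in-memory sketch in Python (not Enso, and not the actual implementation). It assumes a table is a list of dicts, that statistics are identified by name, and that order_by is a plain list of column names sorted ascending. The `as_` parameter stands in for `as` (a Python keyword), and the default output column name is invented for the example.

```python
from itertools import accumulate
import operator

# Hypothetical stand-ins for a few Statistic values. Each takes the ordered
# values of one group and returns the list of running results.
RUNNING = {
    "Count": lambda xs: list(range(1, len(xs) + 1)),
    "Sum": lambda xs: list(accumulate(xs)),
    "Product": lambda xs: list(accumulate(xs, operator.mul)),
    "Minimum": lambda xs: list(accumulate(xs, min)),
    "Maximum": lambda xs: list(accumulate(xs, max)),
}


def running(rows, column, statistic="Count", as_="", group_by=(), order_by=()):
    """Return copies of rows, in their original order, with a running statistic added."""
    # Assumed default name; the issue does not specify one.
    name = as_ or f"Running {statistic} of {column}"

    # Bucket row indices by group key (a single group if group_by is empty).
    groups = {}
    for i, row in enumerate(rows):
        key = tuple(row[g] for g in group_by)
        groups.setdefault(key, []).append(i)

    result = [dict(row) for row in rows]
    for indices in groups.values():
        # Stable sort, so ties keep their original relative order.
        ordered = sorted(indices, key=lambda i: tuple(rows[i][o] for o in order_by))
        values = RUNNING[statistic]([rows[i][column] for i in ordered])
        # Write each result back to the row's original position.
        for i, value in zip(ordered, values):
            result[i][name] = value
    return result


rows = [
    {"g": "a", "t": 2, "x": 10},
    {"g": "b", "t": 1, "x": 5},
    {"g": "a", "t": 1, "x": 3},
    {"g": "b", "t": 2, "x": 7},
]
out = running(rows, "x", "Sum", as_="rs", group_by=["g"], order_by=["t"])
print([r["rs"] for r in out])
```

Working on row indices rather than reordering the rows themselves is what lets the result come back in the original order, as with add_row_number.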
@jdunkerley jdunkerley added x-new-feature Type: new feature request -libs Libraries: New libraries to be implemented labels Oct 31, 2023
@jdunkerley jdunkerley added this to the Beta Release milestone Oct 31, 2023
@jdunkerley
Member Author

Would like to agree the design on this and then we can move forward.

@Cassandra-Clark
Contributor

In theory, we could add full function support here, but I think it opens a can of worms without very careful UX. Would like to discuss and consider whether it's worthwhile as a follow-on effort.

@Cassandra-Clark
Contributor

> In theory, we could add full function support here, but I think it opens a can of worms without very careful UX. Would like to discuss and consider whether it's worthwhile as a follow-on effort.

After discussion, this is best left to an offset/multi-row function instead of complicating the running command.

@Cassandra-Clark
Contributor

Weighted Average pulled to a separate ticket

@Cassandra-Clark
Contributor

> Add the following function to Statistic:
> Product

Should be a separate ticket.

Status: 🟢 Accepted