Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Expression: Window functions #9393

Closed
Tracked by #40313
sbelak opened this issue Feb 8, 2019 · 7 comments · Fixed by #41346
Closed
Tracked by #40313

Custom Expression: Window functions #9393

sbelak opened this issue Feb 8, 2019 · 7 comments · Fixed by #41346
Assignees
Labels
Querying/GUI Query builder catch-all, including simple mode Querying/MBQL Querying/Processor .Team/QueryProcessor :hammer_and_wrench: Type:New Feature

Comments

@sbelak
Copy link
Contributor

sbelak commented Feb 8, 2019

There are a number of requests (direct and indirect) for windows (eg. #8977, #9170, #6954 ...). Having them would certainly greatly increase what can be done using query builder. All the major SQL DBs seem to support OVER and jumping through some hoops so does Mongo (see https://stackoverflow.com/questions/25151042/moving-averages-with-mongodbs-aggregation-framework).

Proposal: add the following clause to MBQL
[:window <field> <unit> <lag>]
For simplicity's sake, limit to only time dimensions. Limiting the scope also means we can fall back to fully standard SQL using subselects for engines that don't support windows.

Some open questions:

  • do we support both overlapping and adjacent windows? If the latter, we could add an additional argument step to :window. However this further complicates the UI. Also, which is default?
  • is :window a clause you use inside :breakout, or should we add a new top-level key? I'm leaning towards the latter, as it feels like it will lead to a cleaner implementation (else we'll pay a for increased complexity in all functions dealing with breakouts).
  • How do we handle interaction with group-by? Is using PARTITION BY sufficient and always sensible.
  • Not sure what the best way to do this on the UI side is though. The grouping dropdown is already very big. Can the QB refactor help with this?

⬇️ Please click the 👍 reaction instead of leaving a +1 or update? comment

@markpaygo
Copy link

The use case for me is that on a week to week basis, the amount of money our customers spend varies. In addition, our customers make payments for our service on an irregular basis. Some pay small amounts, some pay large amounts. If I am only to look at the sum of payments, I view very lumpy data (e.g high payment followed by weeks of no payment) so I would like to smooth the lumpy data with a set period (8-weeks?) and only display the average payment in the current weekly window for each customer. I have attached 2 screenshots. The first illustrates the sum of all payments on a weekly basis. The second picture displays the weekly sum for each customer.

I've not got much input into the implementation questions that @sbelak asked above but this feature would really help understanding trends over time.

Cheers,

Mark

Sum of weekly payment history

Customer payment history

@sbelak
Copy link
Contributor Author

sbelak commented Apr 29, 2019

@markpaygo: thanks.
Btw the last couple of versions of MB come with support for trend lines https://metabase.com/docs/v0.32.0/users-guide/05-visualizing-results.html which might help you some as well.

@markpaygo
Copy link

markpaygo commented Apr 29, 2019 via email

@brunobergher brunobergher changed the title Sliding windows in MBQL Custom Expression: Window functions Jan 15, 2024
@joshstrike
Copy link

Being able to window in the builder would be awesome, but usually for me the partition would not be time; usually it's what needs to be partitioned over time. A simple use case for me is showing cumulative customer creation over time at different locations. The cumulative count function as it stands now leaves now way to properly break these out (their lines all show additive to each other, as seen in other bug reports). Window partition by location and order by (location, time) is the only way to get the data in, as it stands.

Just as a suggestion, rather than even going as far as to support native window functions, it would be sufficient if the cumulative and sum functions allowed you to group by another column (or two) or to write an expression on when/if they should add things to their tally. Something like a "filtered cumulative".

@megansmcguire
Copy link

Another use case to throw on the pile here (I can add this as its own feature request, but it ties well to the window/group by interactivity discussion here):

If I've grouped on 2 attributes (e.g. product and timestamp::month) and summed up sales, I'd love to be able to filter my display to the top N selling products per month.

Ideally, in the question editor, I'd build my question to do Sum of Sales by Product & Month; after I get to that summary, I'd limit to the top 3 rows (sorted by sales) per month, and display the products and their sales in that month.

Even without a filter, sorting based on the summary, but per grouping (Order by Month, Sum of Sales), where the sort order can be different per grouping, would be visually impactful for this use case (right now, it looks like if I do that sort order, it's sorting by the overall sum of sales, not the weekly sum).

For now, I'll write this in SQL or just display all results, but this is another case for window functions and a case where windows and groupings interact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Querying/GUI Query builder catch-all, including simple mode Querying/MBQL Querying/Processor .Team/QueryProcessor :hammer_and_wrench: Type:New Feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants