Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] Support Window Functions #7213

Closed
jackjlli opened this issue Jul 27, 2021 · 18 comments
Closed

[multistage] Support Window Functions #7213

jackjlli opened this issue Jul 27, 2021 · 18 comments
Assignees
Labels
Design Review In Progress This work is in progress. multi-stage Related to the multi-stage query engine

Comments

@jackjlli
Copy link
Member

Currently in pinot there is no way to query Pinot table in a sliding window fashion. E.g. suppose I want to get the 7-day average number of Covid-19 cases. For now in order to do this, we'd need to send multiple queries; one query for each window.

We would like Pinot to support aggregation functions (like sum, count, avg, T-digest, etc) on sliding window within one single query.

@jackjlli
Copy link
Member Author

LinkedIn Pinot team is currently working on this feature right now. We'll share the detailed design shortly.

@ashishkf
Copy link

ashishkf commented Sep 5, 2021

Can you share some high level details about the approach being taken? Is it based aggregation UDF or does it requires changes to pinot core itself?

@kishoreg
Copy link
Member

kishoreg commented Sep 5, 2021

It will be good to add an interface for sliding window aggregates functions similar to transform and aggregate function.

Once we have that, we can implement moving average etc

@jackjlli
Copy link
Member Author

jackjlli commented Nov 4, 2021

Hi all,

Here is the design document for supporting window functions in Pinot:
https://docs.google.com/document/d/17qUJl5hL1E8haB9APDsZTYa7askR6n5B0IQqyyZ3dYs/edit?usp=sharing

Please feel free to leave any comments or feedback.

Thanks,
Jack

@kishoreg
Copy link
Member

kishoreg commented Nov 4, 2021

Thanks for putting this together. Very good write up.

High level thoughts

  • can we do match recognize instead of window? It's much more modern and more powerful than window functions and is a superset
  • in the first version, let's make sure we only support functions where local aggregation is possible or the number of records returned by server is limited (this can be configured by the user).. without this, even simple window functions can bring down the server.. we will need this for joins and sub queries as well
  • should we create the query spi layer first to support FULL sql syntax parsing? This Will also allow us to build the window function operator incremental without invasive changes

Overall, this is a great initiative and let's try to think few steps ahead and lay the right foundation

@amrishlal
Copy link
Contributor

can we do match recognize instead of window?

My understanding is that Window Functions would be applied over the resultset being produced by a query to show additional column aggregations by partitioning the resultset. match_recognize seems to work over the table produced by the FROM clause to output another table that matches certain pattern sequences. So they appear to be different in that sense, some of the internal operators for window function could apply to match_recognize as well. I think one could apply window functions over the output produced by match_recognize?

@jackjlli
Copy link
Member Author

jackjlli commented Nov 4, 2021

in the first version, let's make sure we only support functions where local aggregation is possible or the number of records returned by server is limited

Totally agree on that. Window function is not the only feature facing this (e.g. post aggregation in order-by, having clause with LIMIT, etc). Same for future features like joins and sub queries.

should we create the query spi layer first to support FULL sql syntax parsing? This Will also allow us to build the window function operator incremental without invasive changes

Do we know when we aim to finish the query spi layer? If it will take a while, we can first do the implementation on the execution part with the existing way of query compilation and then wire in with the new query spi layer once that's ready.

@Jackie-Jiang
Copy link
Contributor

Jackie-Jiang commented Dec 3, 2021

@jackjlli Weixiang (@weixiangsun) is working on the gapfill function (#7422) which is quite similar to the window function. Both features require reading the results from previous rows. Can you sync up with each other on the design so that we don't duplicate the implementation? I feel a lot of the code can be shared across the 2 features.

@amrishlal
Copy link
Contributor

@kishoreg @mayankshriv @Jackie-Jiang @richardstartin we are planning to start window function implementation soon. Is there anything remaining with respect to design or any other related issue that needs to be discussed before we do so?

@kishoreg
Copy link
Member

@amrishlal

Can you please address the comments from @Jackie-Jiang in the design doc? This is a complex feature and the design doc is mostly describing how Pinot operators work. It will help me a lot if you can expand on the following topics

  • How will the Window Operator work
  • Current proposed solution of pulling all records into a broker is not a scalable solution. Pulling all records into a single broker memory is not going to scale.
  • Without a distributed solution, the response times are going to be orders of magnitude slower than running it via Presto/Trino connector
  • In a typical database, Window operators are built on solid distributed partitioning, sorting primitives, and have some commonalities with primitives needed JOIN. Adding Joins first and then the window function is better in terms of sequencing.
  • Nothing related to the design but the estimates in the design doc gives me a feeling that we are grossly underestimating the complexity here

Overall I am +1 on adding this feature but -1 on the design.

@amrishlal
Copy link
Contributor

amrishlal commented Aug 15, 2022

@kishoreg @mayankshriv @Jackie-Jiang @richardstartin

I got a bit sidetracked with other work after our last discussion on window functions, but would like to restart again. We have updated design based on the discussions that we had last time. There is also a fairly mature prototype that I am in process of cleaning up and finalizing. We would like to incorporate this code into Pinot. Please let me know your thoughts and comments.

cc: @siddharthteotia, @jackjlli, @mcvsubbu

@Jackie-Jiang
Copy link
Contributor

Are we planning to implement this on the broker side? I'm okay with that as an initial version, but that won't be the long term solution, and will have a lot of constraints due to the limited resource on a single node. Long term solution should be to implement it on the multi-stage engine. cc @walterddr to also chime in

@amrishlal
Copy link
Contributor

@Jackie-Jiang only the merge will be done on Broker side, the computation of window functions is on the server side. I think @siddharthteotia is going to be setting up time with you and @walterddr to discuss further. We do want to port the design to multi-stage engine as well.

@Jackie-Jiang
Copy link
Contributor

@amrishlal I don't see that in the design doc. If you have created a new one, please share it here.
IMO, it is not really possible to compute the window function on the server side without getting all the records first. Essentially broker needs to pull the matching records from server, then do the calculation.

@somandal
Copy link
Contributor

I'll be working on this issue

@siddharthteotia siddharthteotia added the In Progress This work is in progress. label Jan 26, 2023
@somandal
Copy link
Contributor

somandal commented Jan 26, 2023

Have a design document for phase 1 ready for review: https://docs.google.com/document/d/13CmFm4djI09JKF_Xty5acoXxJoxC9CLXAsmikgzgtIs/edit#heading=h.54yrqayotu6g

@siddharthteotia siddharthteotia added the multi-stage Related to the multi-stage query engine label Jan 26, 2023
@siddharthteotia siddharthteotia changed the title Support generic sliding window aggregations [multi stage] Support window functions Jan 26, 2023
@siddharthteotia siddharthteotia changed the title [multi stage] Support window functions [multistage] Support Window Functions Jan 26, 2023
@siddharthteotia
Copy link
Contributor

Initial set of query planning changes have been merged. Implementation for runtime is in progress

@Jackie-Jiang
Copy link
Contributor

Closing this as the basic window functions are all supported

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Design Review In Progress This work is in progress. multi-stage Related to the multi-stage query engine
Projects
None yet
Development

No branches or pull requests

7 participants