Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce support for sliding window aggregations #6873

Open
derekjn opened this issue Jan 15, 2021 · 8 comments
Open

Introduce support for sliding window aggregations #6873

derekjn opened this issue Jan 15, 2021 · 8 comments
Labels
enhancement P1 Slightly lower priority to P0 ;) sql-syntax streaming-engine Tickets owned by the ksqlDB Streaming Team

Comments

@derekjn
Copy link
Contributor

derekjn commented Jan 15, 2021

Support for sliding window aggregations has been added to Kafka Streams (KIP-450), so we should introduce support for them in the ksqlDB syntax:

SELECT x, count(*) FROM s WINDOW SLIDING (SIZE 1 SECOND) GROUP BY x EMIT CHANGES;
@derekjn derekjn changed the title Introduce support for sliding windows Introduce support for sliding window aggregations Jan 15, 2021
@mjsax mjsax added the streaming-engine Tickets owned by the ksqlDB Streaming Team label Jan 15, 2021
@vcrfxia vcrfxia added sql-syntax P1 Slightly lower priority to P0 ;) labels Jun 15, 2021
@anupsdtiwari
Copy link
Contributor

@vcrfxia - By when can we expect this feature in ksqldb ? Actually this is very common requirements and since ksqlDB doesn't support this, we have to look for some other query engine.

@vcrfxia
Copy link
Contributor

vcrfxia commented Jul 8, 2021

Hi @anupsdtiwari , this feature is among the items being considered for our roadmap but we don't have a specific release date to share yet. BTW, are you sure that your use case requires sliding windows rather than hopping windows? (There is some inconsistency in terminology within the stream processing field here.) Hopping windows are indeed a very common use case, but we have not seen as much demand for sliding windows. If you're willing to share your use case for sliding windows that'd be great as another data point. Thanks.

@anupsdtiwari
Copy link
Contributor

Hi @vcrfxia - Actually sliding window seems more common at our end.

My use case is as follows :-

At any point of time last N hours/days/minutes/months of aggregated data is needed.
Eg :- At any point of time, what was the number of clicks done by the user on a button in last 24 hours ? Similarly number of transactions / total amount added to user's wallet in last 2 days etc.

Also you can refer below example :-
Lets say at current_time 2021-07-09 10:00:01 AM for a user id 201, the number of clicks in the last 48 hours(2021-07-07 10:00:02 AM) is 50. But if i hit pull query again for this user at current_time 2021-07-09 13:00:01 PM then count should be from 2021-07-07 13:00:01 PM to current_time which might be different from 50.

@kaydraka
Copy link

Hi, I have the same requirements to @anupsdtiwari .
Alternative way to use Hopping window makes too many messages for me.

@anupsdtiwari
Copy link
Contributor

@vcrfxia @mjsax @derekjn
Do we have any updates on this ticket ?

@johnnycashash
Copy link

johnnycashash commented Dec 8, 2021

Hi, There is one requirement where it is needed to trigger different resultant events as soon as a certain type of input event count crosses multiple thresholds.

As soon as it crosses first threshold, we trigger event 1 and so on for upcoming thresholds...

We can utilize hopping window in this use case by doing a workaround of removing duplicates, but it requires some more streams to be created and processed.

By sliding window, this will be a much simpler query to emit those event.
ex:
SELECT x, type, count() FROM s WINDOW SLIDING (SIZE 1 HOUR) where type="A" GROUP BY x having count()=10 EMIT CHANGES;

SELECT x, type, count() FROM s WINDOW SLIDING (SIZE 1 HOUR) where type="A" GROUP BY x having count()=20 EMIT CHANGES;

@anupsdtiwari
Copy link
Contributor

@vcrfxia @mjsax @derekjn

Guys - Just checking, do we have any updates on this ticket ?

@cefothe
Copy link

cefothe commented Dec 30, 2022

Hey folks.

one more use case in the sliding window will be handy: I want to calculate the crypto volume for the last 24 hours.

My data looks like this:
{ "pair": "BTC/USDT", "price": 16603.04, "amount": 0.01014, "side": "sell" }

Currently, I am trying something like:
select pair ,sum(amount) as volume, WINDOWSTART AS window_start, WINDOWEND AS window_end from TRADES_STREAM WINDOW HOPPING (SIZE 24 hours, ADVANCE BY 1 seconds, GRACE PERIOD 0 MINUTES) GROUP BY pair EMIT CHANGES;

But as you can see, this hopping window will create many small windows, but I want only one moving window that always gives me the result for the last 24 hours by moving the window every 1 second.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement P1 Slightly lower priority to P0 ;) sql-syntax streaming-engine Tickets owned by the ksqlDB Streaming Team
Projects
None yet
Development

No branches or pull requests

8 participants