Create or Replace command to modify KSQL query in place #2440
Here's a little bit more insight on this one. Currently, the … One approach is to set … A proposed solution is to allow setting the Streams …
This KSQL behavior (generating a brand-new consumer group for a recreated persistent query) also has some other side effects.
We are building pipelines through KSQL to feed our analytics warehouse. Our policy for minor schema version changes is that only new fields can be added. It would be great if we could re-create the pipeline of streams without having to set the offset to earliest; instead, we would pass a starting offset to the first stream in the pipeline, so we could re-create all our streams and pick up right where we left off rather than starting at the beginning.
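One hedged workaround, outside of KSQL itself, is to seed the query's consumer group at a known offset with Kafka's standard tooling before the query starts consuming. The group name, topic, and offset below are hypothetical; ksqlDB derives the group name from the query ID, and offsets can only be reset while the group is inactive.

```shell
# Hypothetical names: adjust the group, topic, and offset to your deployment.
# Must run while the query (and thus the consumer group) is stopped.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group _confluent-ksql-default_query_CSAS_MY_STREAM_1 \
  --topic source-topic \
  --reset-offsets --to-offset 1234567 \
  --execute
```

When the query is then (re)started, it picks up from the committed offset rather than applying `auto.offset.reset`.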
@nbyrnes-acv Read this blog and see if you think it can help. If you have any questions, feel free to reach out.
I'm really surprised this issue is not rated higher. I must be missing something... are other people working around this somehow?
cc @agavra. Worth tracking this as part of the new work to support query upgrades. This could be low-hanging fruit we could tackle in an early milestone; the ability to resume could be something we fix early in the process.
Yes - I believe there is someone already working on a KLIP for this (cc @eshepelyuk), but I totally agree this is something we should implement: #4622
This will be available (in some fashion; see the documentation here: https://github.com/confluentinc/ksql/blob/master/docs/concepts/upgrades.md) in the next release (0.12.0)!
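For context, a minimal sketch of what the feature looks like, with hypothetical stream and column names (assuming ksqlDB 0.12.0+ and the additive-change restrictions described in the upgrades documentation):

```sql
-- Original persistent query:
CREATE STREAM enriched AS
  SELECT id, amount FROM orders EMIT CHANGES;

-- Later, evolve the query in place. The replacement keeps the same query ID
-- and consumer group, so processing resumes from the committed offsets:
CREATE OR REPLACE STREAM enriched AS
  SELECT id, amount, currency FROM orders EMIT CHANGES;
```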
I'm not able to access this link: https://github.com/confluentinc/ksql/blob/master/docs/concepts/upgrades.md. Instead of using CREATE OR REPLACE, will dropping and recreating the KSQL stream resume processing from the same offset as the previously terminated stream/query? We are planning to update our KSQL streams in production and would like to be mindful of any chance of data loss that would affect downstream systems. Please advise. Thank you.
My customer has a set of Kafka topics sourced from a mixture of REST API processes and Oracle GoldenGate. These processes cannot readily be suspended (well, they can, but it is not trivial given the number of moving parts). They also have some KSQL code processing and transforming data as it passes to another system. This uses multiple KSQL tables and streams.
When they need to modify the KSQL code, they have to terminate the running queries and then drop and recreate the KSQL streams and tables (CTAS and CSAS).
The problem they have is that the create queries run from the current offset of the source topic, so any data streamed between the terminate and the create is lost. They had hoped that, since the KSQL topic names were the same, the previously read offset would be reused from the underlying topic metadata. However, it seems that each new query uses its own uniquely named consumer group. The alternative of setting the offset to earliest is not viable, as this puts a large amount of already-processed data on the output streams.
Is there a way that we can associate a query to a specific start offset?
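Not directly through KSQL in the versions before CREATE OR REPLACE. One hedged workaround, under the assumption that the old query has been terminated and the new query's consumer group is not yet active, is to copy the old group's committed offsets onto the new group with standard Kafka tooling. The group and topic names below are hypothetical; ksqlDB derives group names from the query ID.

```shell
# 1) Export the old (terminated) query's committed offsets to CSV.
#    --dry-run ensures nothing is changed; --export prints topic,partition,offset.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group _confluent-ksql-default_query_CSAS_OLD_1 \
  --topic source-topic \
  --reset-offsets --to-current --export --dry-run > offsets.csv

# 2) Apply those offsets to the new query's group before it starts consuming:
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group _confluent-ksql-default_query_CSAS_NEW_2 \
  --reset-offsets --from-file offsets.csv \
  --execute
```

This makes the new query resume roughly where the old one left off, though it offers none of the schema-compatibility checks that CREATE OR REPLACE later introduced.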