Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KLIP-20: remove TERMINATE #4126

Conversation

big-andy-coates
Copy link
Contributor

@big-andy-coates big-andy-coates commented Dec 12, 2019

Description

Initial drop of KLIP-20 which proposed to remove the need for the TERMINATE statement.

@big-andy-coates big-andy-coates requested a review from a team as a code owner December 12, 2019 16:06
@big-andy-coates big-andy-coates added this to In progress in Language changes via automation Dec 16, 2019
@hjafarpour
Copy link
Contributor

I assume if we have INSERT INTO as discussed here (#4125 (comment)), we should also keep TERMINATE.


## What is in scope

* Removing `TERMINATE` from our SQL syntax.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of removing TERMINATE. Can't we just drop the requirement that the query be terminated to drop the stream/table?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot drop a stream/table if a query is writing into it. Otherwise we will have queries that write into zombie stream/tables!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjafarpour - I think @rodesai means to automatically discover and terminate the queries writing into it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not drop TERMINATE ? What use is it? Trad RDBS doesn't expose the persistent query updating materialized views to users. It's an implementation detail that shouldn't be exposed IMHO.

Copy link
Member

@mjsax mjsax Jan 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that dropping a MV, should imply that the corresponding query is stopped. However, I think it would still be useful to allow users to terminate a persistent query directly. I also commented on KLIP-17, as I think, keeping INSERT INTO as-is is useful.

This KLIP seems to relate to the "ownership model" that ksqlDB is lacking (ie, it's not well defined yet). If we assume that the output topic is not owned by ksqlDB, but a query write into it, it make total sense to allow users to just terminate the query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This KLIP seems to relate to the "ownership model" that ksqlDB is lacking

I don't see how it is...

If we assume that the output topic is not owned by ksqlDB, but a query write into it, it make total sense to allow users to just terminate the query.

To be able to have a persistent query that writes to the topic there must be a source registered in KSQL that is backed by the topic. Currently this would either be by a CREATE STREAM or CREATE STREAM AS SELECT statement. Neither of these currently embody any sense of ownership.

Dropping the source created by the CS or CSAS doesn't currently delete the topic, and that wouldn't change with this KLIP.

What this KLIP proposes is that if the user wants to stop the query they would just drop the source. The query is an implementation detail.

@agavra agavra requested a review from a team December 16, 2019 23:46
@big-andy-coates big-andy-coates changed the title chore: initial drop of KLIP-20: remove TERMINATE KLIP-20: remove TERMINATE Dec 20, 2019
@apurvam apurvam added the design-proposal Tag KLIP Prs with this label label Jan 2, 2020
Copy link
Contributor

@purplefox purplefox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 To a better user experience.

Instead of asking the question "Should we drop terminate", I'd rather ask "If we were designing ksql again now from scratch would we introduce a terminate statement?". To me that's a pretty clear no.

As Andy points out queries are an implementation detail that shouldn't be exposed to users - it just causes unnecessary confusion and damages the user experience.

Even if a separate terminate is useful for some edge cases, as a general rule we should be very cautious of damaging the 99% user experience to make things easier for the 1%.

@mjsax
Copy link
Member

mjsax commented Jan 27, 2020

Instead of asking the question "Should we drop terminate", I'd rather ask "If we were designing ksql again now from scratch would we introduce a terminate statement?". To me that's a pretty clear no.

Without an clear "ownership model" you cannot really answer this question IMHO. Before we remove something that we might need to reintroduce later again, it might be better to keep it until we have an answer.

@big-andy-coates
Copy link
Contributor Author

Not sure how the ownership model fits here. Can you explain @mjsax ?

@derekjn
Copy link
Contributor

derekjn commented Feb 19, 2020

@big-andy-coates I know this isn't currently possible, but theoretically speaking if we wanted to introduce the ability to terminate a transient query, how would we do that after removing TERMINATE?

@mjsax
Copy link
Member

mjsax commented Feb 21, 2020

@big-andy-coates If the ownership model assumes that for example all MVs are owned by ksql, than it makes sense to delete a MV when a query is stopped, and thus, we can reverse the logic and say, if the MV is dropped, the query is terminated (hence, no TERMINATE keyword needed). However, if we assume that a TABLE is owned by an external entity, and maybe there are even multiple query pushing into the TABLE (note, for this case the result of the query would not be a MV but we have multiple "update streams" that are all applied/merged into the same result TABLE) than it make totally sense to have a TERMINATE statement that allows to stop queries without deleting the TABLE. However, there is no ownership model atm in ksql that defines what we actually want to support.

@mjsax
Copy link
Member

mjsax commented Feb 21, 2020

@derekjn I am not sure if I can follow? If you have a transient query, it's either issued in the CLI and one can just hit CTRL+C or via the new client protocol and the client can just close the connection that terminates the query. There is no TERMINATE involved atm and thus removing TERMINATE does not affect transient queries.

@derekjn
Copy link
Contributor

derekjn commented Feb 21, 2020

@derekjn I am not sure if I can follow?

@mjsax the use case here would be a cluster admin terminating transient queries independently of the session that they’re running in.

@mjsax
Copy link
Member

mjsax commented Feb 21, 2020

Thanks @derekjn -- that makes a lot of sense. One more reason to keep TERMINATE and to drop this KLIP.

@big-andy-coates
Copy link
Contributor Author

big-andy-coates commented Feb 28, 2020

@derekjn

@big-andy-coates I know this isn't currently possible, but theoretically speaking if we wanted to introduce the ability to terminate a transient query, how would we do that after removing TERMINATE?

Sure, that may be a valid use of TERMINATE. However, we don't have that functionality at the moment and what this KLIP is basically proposing is that the persistent query(s) running to keep a source up-to-date are an implementation detail, and hence should not be exposed to users.

@mjsax

Thanks @derekjn -- that makes a lot of sense. One more reason to keep TERMINATE and to drop this KLIP.

Ouch! ;). As per my comment to @derekjn above, this KLIP may be badly named: it's mostly about hiding persistent queries as they are an implementation detail.

@big-andy-coates
Copy link
Contributor Author

@big-andy-coates If the ownership model assumes that for example all MVs are owned by ksql, than it makes sense to delete a MV when a query is stopped, and thus, we can reverse the logic and say, if the MV is dropped, the query is terminated (hence, no TERMINATE keyword needed). However, if we assume that a TABLE is owned by an external entity, and maybe there are even multiple query pushing into the TABLE (note, for this case the result of the query would not be a MV but we have multiple "update streams" that are all applied/merged into the same result TABLE) than it make totally sense to have a TERMINATE statement that allows to stop queries without deleting the TABLE. However, there is no ownership model atm in ksql that defines what we actually want to support.

I think there something here that I'm just not understanding. I really don't see how this is at all related to an ownership model.

However, if we assume that a TABLE is owned by an external entity ... than it make totally sense to have a TERMINATE statement that allows to stop queries without deleting the TABLE

I think you're assuming that dropping the table deletes the topic. That is not the case! Hence, for your example, if the stream/table is owned by another system, but there is a C*AS running in ksql that is writing to it, then there is still no need for TERMINATE. The user can just drop the table/stream in KSQL.

Essentially, TERMINATE leaves a C*AS in a weird zombie state. There is no way to restart the query that was terminated.

To sum up: If we get rid of INSERT INTO then I see no reason to keep TERMINATE. If we choose to keep INSERT INTO then we still need TERMINATE.

My vote is strongly for replacing INSERT INTO with a combination of UNION ALL and ALTER TABLE, and then dropping TERMINATE.

@mjsax
Copy link
Member

mjsax commented Mar 10, 2020

I think you're assuming that dropping the table deletes the topic.

Yes, obviously (if KSQL owns the TABLE). Otherwise, users need to manually cleanup/GC topics what is a pain to do.

Dropping the source created by the CS or CSAS doesn't currently delete the topic, and that wouldn't change with this KLIP.

Well, that does exactly boil down to the ownership model. If KSQL owns the MV and the MV is delete, the corresponding topic should be deleted, too. Same for a TABLE that ksql owns. However, if KSQL does not own the TABLE it should not delete the topic.

Also, if I define a TABLE over a topic that KSQL doesn't own, I can still want to keep this TABLE to be user for other queries to read from it. Or I actually want to start a new query that INSERT INTO this TABLE (I am in favor to keep INSERT INTO for that reason).

Essentially, TERMINATE leaves a C*AS in a weird zombie state. There is no way to restart the query that was terminated.

Not sure what you mean by that?

The query is an implementation detail.

IMHO, this only holds for MV, but not for TABLES -- a TABLE is an independent entity and not coupled to any query that might right into it.

@guozhangwang
Copy link
Contributor

In CCloud, a topic may be shared among multiple users or be exclusively owned by a single user account. I'm wondering for ksqlDB in CCloud, do we allow different user accounts write persistent queries piping to the same sharable topics?

If not, the the ownership is pretty straight-forward: the topics created in ksqlDB is always "exclusive" to the user who writes the persistent queries resulting them, and as long as that user wants to drop the topic, all corresponding queries could be dropped all together since they should be owned by that same user.

If yes, then it's a bit tricky: by writing my query to a sharable topic in CCloud, I'm basically creating a query not to my own, but shared in the CCloud (of course, they would still be under same org account). This means that I basically allow anyone who has authority to the topic to also have authority to terminate that query. That is okay as long as we made it clear in our doc.

@big-andy-coates big-andy-coates deleted the KLIP-20-Remove-Terminate branch June 30, 2020 16:14
Language changes automation moved this from In progress to Done Jun 30, 2020
@big-andy-coates big-andy-coates restored the KLIP-20-Remove-Terminate branch June 30, 2020 16:35
Language changes automation moved this from Done to In progress Jun 30, 2020
@spena
Copy link
Member

spena commented Aug 5, 2020

@big-andy-coates Can we implement part of this KLIP, such as terminate the internal C*AS query when issuing the DROP STREAM|TABLE, but only if there are no other queries writing or reading from it? It would improve the current UX a lot by allowing that at least.

FYI @colinhicks @agavra

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design-proposal Tag KLIP Prs with this label
Projects
Development

Successfully merging this pull request may close these issues.

None yet

10 participants