Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI hangs and throws TimeoutException in 0.10.0 #5724

Closed
mikebin opened this issue Jun 29, 2020 · 6 comments · Fixed by #5730
Closed

CLI hangs and throws TimeoutException in 0.10.0 #5724

mikebin opened this issue Jun 29, 2020 · 6 comments · Fixed by #5730
Assignees
Labels
Milestone

Comments

@mikebin
Copy link

mikebin commented Jun 29, 2020

Describe the bug
The ksqlDB CLI hangs and throws TimeoutException(s) in v0.10.0 after running multiple transient queries.

To Reproduce
Steps to reproduce the behavior, include:
Version: 0.10.0

Create a simple stream with one record to query:

create stream s (val string) with (kafka_topic='s', value_format='json', partitions=1);
insert into s (val) values ('test');

Now run multiple transient queries sequentially against this stream multiple times in close succession, ending each query with Ctrl-c:

ksql> select * from s emit changes;
+--------------------+
|VAL                 |
+--------------------+
|test                |
^CQuery terminated

Eventually, after a certain number of repetitions, the query will hang with no output, possibly for several minutes. Meanwhile, most or all of the prior queries will still be running in the background (if you run show queries in another CLI session, you'll see a big list).

Eventually the hung query will return results.

Expected behavior

Each query is terminated immediately when Ctrl-c is entered.

Actual behaviour

After entering Ctrl-c to stop the query, the CLI prompt immediately returns, but the underlying Streams app/query continues to run for quite a while (often a couple of minutes or more) before finally terminating. During that period, either before or after the Streams app finally terminates, the following exception appears in the log:

ksqldb-server       | io.confluent.ksql.util.KsqlException: java.util.concurrent.TimeoutException
ksqldb-server       |   at io.confluent.ksql.api.server.ResponseOutputStream.blockIfWriteQueueFull(ResponseOutputStream.java:80)
ksqldb-server       |   at io.confluent.ksql.api.server.ResponseOutputStream.write(ResponseOutputStream.java:62)
ksqldb-server       |   at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
ksqldb-server       |   at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
ksqldb-server       |   at io.confluent.ksql.rest.server.resources.streaming.QueryStreamWriter.write(QueryStreamWriter.java:84)
ksqldb-server       |   at io.confluent.ksql.api.server.OldApiUtils.lambda$streamEndpointResponse$2(OldApiUtils.java:122)
ksqldb-server       |   at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:316)
ksqldb-server       |   at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
ksqldb-server       |   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
ksqldb-server       |   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
ksqldb-server       |   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
ksqldb-server       |   at java.base/java.lang.Thread.run(Thread.java:834)
ksqldb-server       | Caused by: java.util.concurrent.TimeoutException
ksqldb-server       |   at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
ksqldb-server       |   at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
ksqldb-server       |   at io.confluent.ksql.api.server.ResponseOutputStream.blockIfWriteQueueFull(ResponseOutputStream.java:77)
ksqldb-server       |   ... 11 more

Additional context

The TimeoutException mentioned above appears to occur within about a minute after Ctrl-c, which may be before or after the actual Streams app terminates.

@apurvam
Copy link
Contributor

apurvam commented Jun 29, 2020

@vcrfxia @purplefox Seems to be a bug in the new vert.x server side push implementation.

@mikebin
Copy link
Author

mikebin commented Jun 29, 2020

One additional note - if I spaced out the transient queries (instead of running them in close succession), giving each more time to actually terminate, I didn't encounter the hanging issue as quickly (or at all). So there seems to be an element of timing on this. There wasn't a consistent number of transient queries that led to this issue.

@vcrfxia
Copy link
Contributor

vcrfxia commented Jun 29, 2020

Reverting #5386 appears to fix the bug. I don't understand the mechanism for how this change results in this behavior yet, though.

@big-andy-coates
Copy link
Contributor

Obviously, we can't just revert #5386 as this will reintroduces the bug it was fixing: namely stopping transient query results ballooning into memory. Though at least now we know what area to look in the code!

@apurvam
Copy link
Contributor

apurvam commented Jul 1, 2020

Repoening until this is merged to 6.0.x / 0.10.x

@apurvam apurvam reopened this Jul 1, 2020
@vcrfxia
Copy link
Contributor

vcrfxia commented Jul 1, 2020

The fix has already been backported to 6.0.x (thanks, Tim!): 663a67b

Closing again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants