Ingestion task fails with InterruptedException when handling the segments #10866
Comments
Same problem here.
I know as much as you do. No reply so far :/
> On Sun, 14 Feb 2021 at 12:48, Eray wrote:
> Is there any update about this?
I also see this issue intermittently. I am using version 0.20.1.
No acknowledgement from the Druid maintainers so far :/
Facing the same issue here :(
Can you share the full logs of the Kafka indexing task and the supervisor?
Mhmhmh... how do I get those logs?
Facing a similar issue. @tanisdlj, where the indexing task logs end up is determined by your task log configuration.
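For reference: task log storage in Druid is controlled by the `druid.indexer.logs.*` runtime properties, and a task's log can also be fetched from the Overlord HTTP API or viewed in the web console's task view. A minimal sketch, assuming the default Overlord port; `OVERLORD_HOST` and `TASK_ID` below are placeholders, not values from this thread:

```sh
# Fetch a task's log from the Overlord HTTP API.
# OVERLORD_HOST and TASK_ID are placeholders for your deployment;
# 8090 is the default Overlord plaintext port.
curl -o task.log \
  "http://OVERLORD_HOST:8090/druid/indexer/v1/task/TASK_ID/log"
```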
Thanks @bengriffin1! Right now I don't have any task failing with this specific error, so I have no access to that log. If @bengriffin1 has some logs, you might use them instead. Otherwise, the next time I see it I will post it here.
OK, another crash. Supervisor task log:
MiddleManager log:
Overlord log:
☝️ There you go, @abhishekagarwal87
I have the same issue intermittently in Druid 0.21.0 (launched in Docker containers), but only when sending very little data to Kafka (e.g. one line to insert into a new datasource). Has anyone found a cause or workaround?
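As a hedged repro sketch of the "very few records" case described above, assuming a stock Kafka install; the broker address, topic name, and record are placeholders, not the reporter's actual data:

```sh
# Push a single JSON record into the ingestion topic so that the next
# indexing task has almost nothing to read.
echo '{"timestamp":"2021-05-01T12:00:00Z","value":"example"}' | \
  kafka-console-producer.sh --bootstrap-server localhost:9092 --topic example-topic
```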
No, still happening to me. No idea how to fix it; waiting to see if we get any answer here :/
The log line above from the Overlord log looks incomplete. Was it modified?
@a2l007 no, it wasn't modified. No clue what happened there, sorry :/
I see the log messages getting clipped in multiple places. Not sure what's going on there.
Sorry, but no, we can't reduce the task count for ssp-events-hourly without losing data :(
Hello there! So, as I said, we randomly get a similar error (the logs are not exactly the same, but it seems quite similar to me) when indexing data into Druid from Kafka streaming. We are on Druid 0.21.0, but we had the same problem in 0.20.1. When creating data in Druid, we actually have 4 supervisors, all of them following a similar scheme and pointing to slightly different datasources. Example of supervisor spec:
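(The spec itself was clipped from the archive. Purely as an illustrative sketch of the general shape of a Kafka supervisor spec on this Druid version; the datasource, topic, columns, and tuning values below are placeholders, not the reporter's actual configuration:)

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "example-datasource",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["value"] },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "NONE"
      }
    },
    "ioConfig": {
      "topic": "example-topic",
      "inputFormat": { "type": "json" },
      "consumerProperties": { "bootstrap.servers": "localhost:9092" },
      "taskCount": 1,
      "taskDuration": "PT1H"
    },
    "tuningConfig": { "type": "kafka" }
  }
}
```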
Example of data in the Kafka topic:
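(Also clipped; an invented record matching the sketch above could look like:)

```json
{"timestamp": "2021-05-01T12:00:00Z", "value": "example"}
```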
Usually everything is fine, but randomly one of the 4 ingestion tasks fails (4 tasks, as we created 4 supervisors). It seems to happen only when ingesting a few lines (say, fewer than 10).
Here is the stack trace we get in Druid 0.20.1:
Here is the stack trace we get in Druid 0.21.0:
In the Broker, we have these logs:
Hope it helps.
Aren't the stack trace and description similar to #9207?
Same issue for me, on version 0.21.1.
Hi @tanisdlj, I had the same problem with Druid version 0.16. Is it solved now?
Not yet, no :(
This seems to keep reproducing randomly in Druid 0.22.0. Complete log of the task:
Many things in the logs are quite strange.
Another strange thing:
Any idea what's going on here?
Strangely, the problem disappeared in the latest version of our code. I don't really see what changed. One datasource disappeared, so we now ingest from Kafka into three datasources in parallel instead of 4, and the data sent is a little different. We also now use Druid 0.22.1, but I'm really not sure that's related; the problem seems to have disappeared a little before that. It's hard to be sure, as it happens randomly.
This issue has been marked as stale due to 280 days of inactivity. |
This issue has been closed due to lack of activity.
Affected Version
0.20.0
Description
We've seen tasks failing randomly. I didn't manage to find a pattern or cause, so I'm thinking it might be some sort of bug.
The tasks end up with the status:
When it happened, it affected three tasks that started around the same time (±10 min).
Cluster size: 27 data servers (MiddleManager + Historical), 2 Brokers, 2 Routers, 2 Overlords, 2 Coordinators, 3 ZooKeeper nodes (on dedicated hosts).
Partial log: