Middlemanager fails startup due to corrupt task files #7886
Comments
Hi @xvrl 👋
@vogievetsky looks like this startup behavior has been around since https://github.com/apache/incubator-druid/pull/5104/files . I'll have to check whether the creation of empty completed task files is a regression or not.
Yeah, the 'Ignored.' is wrong at least.
Yeah, startup should be more forgiving, and the code writing those files should also try to write them atomically as much as possible to reduce the likelihood of empty/corrupted files.
@himanshug, I have the same issue in version 0.14.2 after upgrading from 0.12.3. Which release contains that fix?
@david-z-johnson this will be made available in the next Druid release; for now, please use the workaround of deleting the corrupted files.
@himanshug, thanks, the workaround works for me.
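The suggestion in the comments to write task files atomically can be illustrated with a common write-to-temp-then-rename pattern. This is only a minimal sketch and not the code that went into Druid: the class name AtomicTaskFileWriter and the method writeAtomically are hypothetical, and the sketch assumes the temp file and the target live on the same filesystem so that the rename can be atomic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Druid: writes a file so that readers
// see either the old content or the complete new content, never a partial file.
final class AtomicTaskFileWriter {
    static void writeAtomically(Path target, byte[] content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Create the temp file next to the target so both are on the same filesystem.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content);
            // ATOMIC_MOVE asks for an atomic rename; whether an existing target is
            // replaced is implementation specific (POSIX rename does replace it).
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that this sketch does not force the data to disk before renaming, so it narrows the window for empty files after a crash but may not eliminate it on every filesystem.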
Affected Version
0.15.0-SNAPSHOT (git sha d99f77a)
Description
A middle-manager shutdown may leave empty task files in ${druid.indexer.task.baseTaskDir}/completedTasks/. This may be an issue on its own, but it could also happen for reasons beyond our control. Those empty (corrupt) files cause the middlemanager to fail on a subsequent startup, due to https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/worker/WorkerTaskManager.java#L430 rethrowing a JsonProcessingException.
The exception message also looks incorrect: it says the files will be ignored, but the exception instead aborts the entire startup sequence, and the user has to remove the corrupt files manually before startup can resume.
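For illustration, a more forgiving restore loop could log and skip unreadable files instead of rethrowing. The following is a rough sketch under stated assumptions, not the actual WorkerTaskManager code: LenientTaskRestore, its Parser interface, and restore are hypothetical names, and Parser stands in for a JSON deserializer (Jackson's JsonProcessingException is an IOException, so it would be caught by the same handler).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Druid's implementation.
final class LenientTaskRestore {
    // Stand-in for a JSON deserializer such as Jackson's ObjectMapper.
    interface Parser<T> {
        T parse(byte[] bytes) throws IOException;
    }

    // Restores every parsable file in dir; empty or corrupt files are
    // logged and skipped so that one bad file cannot abort startup.
    static <T> List<T> restore(Path dir, Parser<T> parser) throws IOException {
        List<T> restored = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                try {
                    byte[] bytes = Files.readAllBytes(file);
                    if (bytes.length == 0) {
                        throw new IOException("file is empty");
                    }
                    restored.add(parser.parse(bytes));
                } catch (IOException e) {
                    // Actually ignore the file, matching what the log message promises.
                    System.err.println("Skipping task file " + file + ": " + e.getMessage());
                }
            }
        }
        return restored;
    }
}
```

A failure to list the directory itself still propagates, since that points to a problem larger than a single corrupt file.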