Indexing Task failing : Unable to delete directory #5893

@shalini-jha

Description

We are trying to index files using the following ingestionSpec:
{
  "type" : "index_hadoop",
  "spec" : {
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : null
      }
    },
    "dataSchema" : {
      "dataSource" : null,
      "granularitySpec" : {
        "type" : "uniform",
        "rollup" : false,
        "segmentGranularity" : "minute",
        "queryGranularity" : "none",
        "intervals" : null
      },
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : null
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "interval"
          }
        }
      },
      "metricsSpec" : null
    }
  }
}
(we fill in dimensions, metrics and the other placeholder keys before posting)
We are hitting the error below (see logs), even though we have all the required permissions on the directory mentioned.
Whenever the number of rows in the file is greater than 75000, i.e. the default maxRowsInMemory value for tuningConfig, the task tries to delete its intermediate files and fails.
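For completeness, this is roughly how we fill those keys before posting (a sketch, not our exact code; the helper name and example values are illustrative):

```python
import copy
import json

# The same spec as above; None marks the placeholders we replace before
# posting (json.dumps turns None/False into JSON null/false).
SPEC_TEMPLATE = {
    "type": "index_hadoop",
    "spec": {
        "ioConfig": {
            "type": "hadoop",
            "inputSpec": {"type": "static", "paths": None},
        },
        "dataSchema": {
            "dataSource": None,
            "granularitySpec": {
                "type": "uniform",
                "rollup": False,
                "segmentGranularity": "minute",
                "queryGranularity": "none",
                "intervals": None,
            },
            "parser": {
                "type": "string",
                "parseSpec": {
                    "format": "json",
                    "dimensionsSpec": {"dimensions": None},
                    "timestampSpec": {"format": "auto", "column": "interval"},
                },
            },
            "metricsSpec": None,
        },
    },
}

def build_task_body(paths, datasource, intervals, dimensions, metrics):
    """Fill the placeholders and return the JSON body for the task POST."""
    spec = copy.deepcopy(SPEC_TEMPLATE)
    spec["spec"]["ioConfig"]["inputSpec"]["paths"] = paths
    data_schema = spec["spec"]["dataSchema"]
    data_schema["dataSource"] = datasource
    data_schema["granularitySpec"]["intervals"] = intervals
    data_schema["parser"]["parseSpec"]["dimensionsSpec"]["dimensions"] = dimensions
    data_schema["metricsSpec"] = metrics
    return json.dumps(spec)
```

The resulting body is then POSTed to the overlord's task endpoint (/druid/indexer/v1/task).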

We also tried setting the tuning config:

"tuningConfig" : {
  "type" : "hadoop",
  "leaveIntermediate" : true
}

so that Druid doesn't try to delete these intermediate files, but that doesn't seem to work: the tasks still try to delete the files.

We also tried setting "segmentWriteOutMediumFactory" : "offHeapMemory"

But temporary files are still used; the logs mention:
2018-06-20T09:29:36,396 INFO [index00000-persist] io.druid.segment.IndexMergerV9 - Using SegmentWriteOutMediumFactory[TmpFileSegmentWriteOutMediumFactory]
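One thing we are unsure about: if we read the native index task's docs correctly, segmentWriteOutMediumFactory is an object with a "type" field nested inside tuningConfig, not a bare top-level string, so our setting may simply be ignored. The snippet below shows the shape we now believe is expected (an assumption on our part, worth confirming):

```python
# Assumed shape (based on how the native index task's tuningConfig
# documents segmentWriteOutMediumFactory): an object with a "type" key,
# nested inside tuningConfig rather than set as a top-level string.
tuning_config = {
    "type": "hadoop",
    "leaveIntermediate": True,
    "segmentWriteOutMediumFactory": {"type": "offHeapMemory"},
}
```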

So these tasks fail whenever a file has more rows than the number specified by maxRowsInMemory.
What is the problem here?
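To rule out a plain permission problem (as opposed to something else keeping the directory undeletable, e.g. open file handles on an NFS mount), we ran a quick probe as the middleManager user: create a non-empty directory under the task tmp base and try to delete it. A small sketch (the probe helper is ours, not Druid code):

```python
import os
import shutil
import tempfile

def can_create_and_delete(base_dir):
    """Check whether this process can create and then delete a non-empty
    directory under base_dir, mimicking what the reducer does with its
    flush directory."""
    probe = tempfile.mkdtemp(prefix="druid-delete-probe-", dir=base_dir)
    # Put a file inside, since the flush dir is non-empty when deleted.
    with open(os.path.join(probe, "marker"), "w") as f:
        f.write("x")
    try:
        shutil.rmtree(probe)
        return True
    except OSError:
        return False

# e.g. run against the task tmp base from the logs:
# can_create_and_delete("/proj/scratch-nb-jhasha/logs/tmp/middlemanager")
```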

java.lang.Exception: java.io.IOException: Unable to delete directory /proj/scratch-nb-jhasha/logs/tmp/middlemanager/base9088930291904976125flush/final.
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.7.3.jar:?]
Caused by: java.io.IOException: Unable to delete directory /proj/scratch-nb-jhasha/logs/tmp/middlemanager/base9088930291904976125flush/final.
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1581) ~[commons-io-2.5.jar:2.5]
at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:795) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:500) ~[druid-indexing-hadoop-0.12.1.jar:0.12.1]
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.7.3.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_60]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
2018-06-20T08:26:44,453 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_local743874811_0001 failed with state FAILED due to: NA
2018-06-20T08:26:44,468 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 30

This further causes the following error:

2018-06-20T09:30:37,364 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_test_2018-06-20T09:29:17.966Z, type=index_hadoop, dataSource=test}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:222) ~[druid-indexing-service-0.12.0.jar:0.12.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:238) ~[druid-indexing-service-0.12.0.jar:0.12.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.0.jar:0.12.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.0.jar:0.12.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:219) ~[druid-indexing-service-0.12.0.jar:0.12.0]
... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.IndexGeneratorJob] failed!
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:390) ~[druid-indexing-hadoop-0.12.0.jar:0.12.0]
at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:95) ~[druid-indexing-hadoop-0.12.0.jar:0.12.0]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:293) ~[druid-indexing-service-0.12.0.jar:0.12.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_60]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_60]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_60]
at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_60]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:219) ~[druid-indexing-service-0.12.0.jar:0.12.0]
... 7 more
2018-06-20T09:30:37,374 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_test_2018-06-20T09:29:17.966Z] status changed to [FAILED].
2018-06-20T09:30:37,378 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
"id" : "index_hadoop_test_2018-06-20T09:29:17.966Z",
"status" : "FAILED",
"duration" : 73962
}
