
Peon cannot write segments to GCS but can write index logs #7522

@dgregoire


Affected Version

0.14.0-incubating


Description

At the end of a Hadoop indexing job, the peon errors out, yet the middle manager is still able to write the indexing logs to GCS:

2019-04-21T04:53:36,122 INFO [forking-task-runner-1-[index_hadoop_supply_2019-04-21T04:52:45.180Z]] org.apache.druid.indexing.overlord.ForkingTaskRunner - Process exited with status[0] for task: index_hadoop_supply_2019-04-21T04:52:45.180Z
2019-04-21T04:53:36,123 INFO [forking-task-runner-1] org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs - Writing task log to: gs://druid-XXXXXXXX/indexing-logs/index_hadoop_supply_2019-04-21T04_52_45.180Z
2019-04-21T04:53:37,445 INFO [forking-task-runner-1] org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs - Wrote task log to: gs://druid-XXXXXXXX/indexing-logs/index_hadoop_supply_2019-04-21T04_52_45.180Z
2019-04-21T04:53:37,445 INFO [forking-task-runner-1] org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs - Writing task reports to: gs://druid-XXXXXXXX/indexing-logs/index_hadoop_supply_2019-04-21T04_52_45.180Z.reports.json
2019-04-21T04:53:38,651 INFO [forking-task-runner-1] org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs - Wrote task reports to: gs://druid-XXXXXXXX/indexing-logs/index_hadoop_supply_2019-04-21T04_52_45.180Z.reports.json
2019-04-21T04:53:38,652 INFO [forking-task-runner-1] org.apache.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_supply_2019-04-21T04:52:45.180Z] status changed to [FAILED].
2019-04-21T04:53:38,653 INFO [forking-task-runner-1] org.apache.druid.indexing.overlord.ForkingTaskRunner - Removing task directory: var/druid/task/index_hadoop_supply_2019-04-21T04:52:45.180Z
2019-04-21T04:53:38,666 INFO [WorkerTaskManager-NoticeHandler] org.apache.druid.indexing.worker.WorkerTaskManager - Job's finished. Completed [index_hadoop_supply_2019-04-21T04:52:45.180Z] with status [FAILED]

The peon is definitely producing output:

2019-04-21T02:32:00,232 INFO [pool-34-thread-1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local152430381_0001_r_001394_0' to file:/var/druid/hadoop-tmp/supply/2019-04-21T023121.679Z_d3d8ec21cead4e3ea61a9e05af02f36e/_temporary/0/task_local152430381_0001_r_001394

but it finishes with this error:

Could not find job job_local812015859_0001
Finished peon task

The middle manager config looks like this:

druid.service=druid/middleManager
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=4

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx4g -XX:MaxDirectMemorySize=3g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+ExitOnOutOfMemoryError -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
druid.server.http.numThreads=25

# Processing threads and buffers on Peons
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912
druid.indexer.fork.property.druid.processing.numThreads=2

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
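
For context, task logs and segments are configured through separate properties: the `HdfsTaskLogs` lines in the log above come from the `druid.indexer.logs.*` settings, while segment pushes go through the deep storage (`druid.storage.*`) settings. A minimal sketch of the distinction, assuming the HDFS implementations with the GCS connector on the classpath — the bucket name and paths below are placeholders, not taken from this report:

```properties
# Task logs (this is the path that works in the report above):
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=gs://your-bucket/indexing-logs

# Segment pushes use deep storage config, not the log config;
# a mismatch here would let logs succeed while segments fail.
druid.storage.type=hdfs
druid.storage.storageDirectory=gs://your-bucket/segments
```

Note also that `druid.indexer.task.hadoopWorkingPath` above is a relative path, and the `FileOutputCommitter` line shows the job writing to a local `file:/var/druid/hadoop-tmp/...` URI rather than a `gs://` one.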

Is this a problem with the user (me) or some subtle path bug that I can't figure out?
