task cannot be shutdown #4136

@ieasydevops

Hi, I found the following problem. It looks like a serious bug. How can I fix it?

After I restarted all the Druid components without deleting the MySQL index (but after deleting all the files in the var directory), a task goes through the following steps:

  1. TaskQueue - Asking taskRunner to run
  2. RemoteTaskRunner - Added pending task
  3. Coordinator asking Worker to add task
  4. [rtr-pending-tasks-runner-0]io.druid.indexing.overlord.RemoteTaskRunner -
    Task index_kafka_ucar-pallas-secure_4854008e83380cc_emcfgnhn switched from pending to running
  5. io.druid.indexing.overlord.TaskRunnerUtils - Task status changed to [RUNNING]
  6. RemoteTaskRunner - Worker wrote RUNNING status for task on
    [TaskLocation{host='null', port=-1}]
  7. TaskRunnerUtils - Task location changed to [TaskLocation{host='IP', port=port}]
  8. RemoteTaskRunner - No worker selections strategy set. Using default
  9. LocalTaskActionClient - Performing action for task ....
  10. Sent shutdown message to worker: IP:PORT, status 200 OK, response: {"task":"index_kafka_ucar-pallas-secure_4854008e83380cc_emcfgnhn"}

The problem is that the task repeats step 10 forever and never shuts down successfully, and the middleManager node creates too many file descriptors.
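To put a number on the file descriptor growth, something like the following could be run on the middleManager host. This is only a rough sketch: it assumes a Linux host with `/proc` mounted, and `count_open_fds` is a hypothetical helper name, not part of Druid.

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors for a process.

    Assumption: Linux, where /proc/<pid>/fd holds one entry per open
    descriptor. Returns None if that directory cannot be read (process
    gone, no permission, or no /proc on this platform).
    """
    fd_dir = f"/proc/{pid}/fd"
    try:
        return len(os.listdir(fd_dir))
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        return None


if __name__ == "__main__":
    # Demo on the current process; replace with the middleManager JVM pid.
    print(count_open_fds(os.getpid()))
```

Sampling this once a minute next to the shutdown messages would show whether the count grows with every retry.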

The overlord sends the shutdown message:

    2017-03-31T16:29:24,476 INFO [TaskQueue-Manager] io.druid.indexing.overlord.RemoteTaskRunner - Sent shutdown message to worker: 10.204.56.39:8091, status 200 OK, response: {"task":"index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe"}
    2017-03-31T16:30:24,477 INFO [TaskQueue-Manager] io.druid.indexing.overlord.RemoteTaskRunner - Sent shutdown message to worker: 10.204.56.39:8091, status 200 OK, response: {"task":"index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe"}

The middleManager ignores the request:

    2017-03-31T16:29:24,476 INFO [qtp27971761-47] io.druid.indexing.overlord.ForkingTaskRunner - Ignoring request to cancel unknown task: index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe
    2017-03-31T16:30:24,477 INFO [qtp27971761-47] io.druid.indexing.overlord.ForkingTaskRunner - Ignoring request to cancel unknown task: index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe
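To find every task stuck in this loop, the overlord log can be scanned for shutdown messages that repeat for the same task and worker. The script below is a minimal sketch, not a Druid tool: the regular expression is written only against the two overlord lines quoted above, and `find_repeated_shutdowns` and `SAMPLE` are names made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Pattern derived from the overlord log lines quoted in this issue.
SHUTDOWN_RE = re.compile(
    r'(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+'
    r'\[TaskQueue-Manager\]\s+'
    r'io\.druid\.indexing\.overlord\.RemoteTaskRunner - '
    r'Sent shutdown message to worker: (?P<worker>\S+), '
    r'status (?P<status>.+?), response: \{"task":"(?P<task>[^"]+)"\}'
)


def find_repeated_shutdowns(lines, min_repeats=2):
    """Group shutdown messages by (task, worker); keep the repeated ones."""
    seen = defaultdict(list)
    for line in lines:
        m = SHUTDOWN_RE.search(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S,%f")
            seen[(m.group("task"), m.group("worker"))].append(ts)
    return {k: v for k, v in seen.items() if len(v) >= min_repeats}


# The two overlord lines from this issue, used as sample input.
SAMPLE = [
    '2017-03-31T16:29:24,476 INFO [TaskQueue-Manager] io.druid.indexing.overlord.RemoteTaskRunner - Sent shutdown message to worker: 10.204.56.39:8091, status 200 OK, response: {"task":"index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe"}',
    '2017-03-31T16:30:24,477 INFO [TaskQueue-Manager] io.druid.indexing.overlord.RemoteTaskRunner - Sent shutdown message to worker: 10.204.56.39:8091, status 200 OK, response: {"task":"index_kafka_ucar-pallas-secure_d25f91e8c1ca55f_hjhoakhe"}',
]

if __name__ == "__main__":
    for (task, worker), stamps in find_repeated_shutdowns(SAMPLE).items():
        print(task, worker, len(stamps))
```

On the two lines above this reports one task with two shutdown attempts about a minute apart, which matches the retry behaviour described in this issue.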
