[Bug][Dependent Task] The dependent node sometimes fails to pause #5935

@reele

Description

When running a process that contains multi-level sub-processes and a large number of cross-process dependent nodes (about 10,000+ nodes), some dependent nodes and sub_process nodes always remain in the running state after clicking Pause, even though the process itself has actually been paused. dolphinscheduler-master.log shows no log output, and the process instance stays in an inoperable state.

When I then delete all the records in t_ds_process_instance, a lot of exceptions are thrown, such as:
[ERROR] 2021-08-03 09:29:45.647 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[109] - parent work flow instance is null , please check it! work flow id 3570
[ERROR] 2021-08-03 09:29:45.648 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[76] - exception:
java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread.waitTaskQuit(SubProcessTaskExecThread.java:144)
    at org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread.submitWaitComplete(SubProcessTaskExecThread.java:58)
    at org.apache.dolphinscheduler.server.master.runner.MasterBaseTaskExecThread.call(MasterBaseTaskExecThread.java:272)
    at org.apache.dolphinscheduler.server.master.runner.MasterBaseTaskExecThread.call(MasterBaseTaskExecThread.java:50)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[ERROR] 2021-08-03 09:29:45.648 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[78] - wait task quit failed, instance id:3570, task id:79032
[ERROR] 2021-08-03 09:29:45.648 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[109] - parent work flow instance is null , please check it! work flow id 3568
[ERROR] 2021-08-03 09:29:45.649 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[76] - exception:
java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread.waitTaskQuit(SubProcessTaskExecThread.java:144)
    at org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread.submitWaitComplete(SubProcessTaskExecThread.java:58)
    at org.apache.dolphinscheduler.server.master.runner.MasterBaseTaskExecThread.call(MasterBaseTaskExecThread.java:272)
    at org.apache.dolphinscheduler.server.master.runner.MasterBaseTaskExecThread.call(MasterBaseTaskExecThread.java:50)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[ERROR] 2021-08-03 09:29:45.649 org.apache.dolphinscheduler.server.master.runner.SubProcessTaskExecThread:[78] - wait task quit failed, instance id:3568, task id:79016

and:
[ERROR] 2021-08-03 09:29:46.318 org.apache.dolphinscheduler.server.master.runner.MasterExecThread:[184] - master exec thread exception
java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.getProcessInstanceState(MasterExecThread.java:668)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.updateProcessInstanceState(MasterExecThread.java:762)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.runProcess(MasterExecThread.java:922)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.executeProcess(MasterExecThread.java:200)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.run(MasterExecThread.java:181)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[ERROR] 2021-08-03 09:29:46.318 org.apache.dolphinscheduler.server.master.runner.MasterExecThread:[185] - process execute failed, process id:3573
[ERROR] 2021-08-03 09:29:46.318 org.apache.dolphinscheduler.server.master.runner.MasterExecThread:[184] - master exec thread exception
java.lang.NullPointerException: null
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.getProcessInstanceState(MasterExecThread.java:668)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.updateProcessInstanceState(MasterExecThread.java:762)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.runProcess(MasterExecThread.java:922)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.executeComplementProcess(MasterExecThread.java:256)
    at org.apache.dolphinscheduler.server.master.runner.MasterExecThread.run(MasterExecThread.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

and:
[ERROR] 2021-08-03 09:29:46.671 - [taskAppId=TASK-7142-3574-79074]:[143] - process instance not exists , master task exec thread exit
[ERROR] 2021-08-03 09:29:46.670 - [taskAppId=TASK-7196-3590-79264]:[143] - process instance not exists , master task exec thread exit
[ERROR] 2021-08-03 09:29:46.656 - [taskAppId=TASK-7117-3579-79137]:[143] - process instance not exists , master task exec thread exit
[ERROR] 2021-08-03 09:29:46.709 - [taskAppId=TASK-7197-3587-79195]:[143] - process instance not exists , master task exec thread exit
[ERROR] 2021-08-03 09:29:46.707 - [taskAppId=TASK-7123-3575-79063]:[143] - process instance not exists , master task exec thread exit
[ERROR] 2021-08-03 09:29:46.713 - [taskAppId=TASK-7123-3575-79064]:[143] - process instance not exists , master task exec thread exit
[INFO] 2021-08-03 09:29:46.700 - [taskAppId=TASK-7142-3574-79074]:[223] - dependent task completed, dependent result:WAITING
[ERROR] 2021-08-03 09:29:46.695 - [taskAppId=TASK-7188-3585-79346]:[143] - process instance not exists , master task exec thread exit
[INFO] 2021-08-03 09:29:46.690 - [taskAppId=TASK-7187-3586-79210]:[223] - dependent task completed, dependent result:WAITING
[INFO] 2021-08-03 09:29:46.690 - [taskAppId=TASK-7188-3585-79339]:[223] - dependent task completed, dependent result:WAITING
[INFO] 2021-08-03 09:29:46.689 - [taskAppId=TASK-7117-3579-81267]:[223] - dependent task completed, dependent result:WAITING
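
The NullPointerException traces above suggest that the wait loops keep using a process instance after its record has disappeared. The following is only a minimal sketch of a polling loop that exits cleanly in that case. It is not the actual DolphinScheduler code: `WaitLoopSketch`, `Instance`, `STORE`, `findInstance` and this version of `waitTaskQuit` are all hypothetical names, and an in-memory map stands in for t_ds_process_instance.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class WaitLoopSketch {

    /** Hypothetical stand-in for one row of t_ds_process_instance. */
    public static final class Instance {
        final int id;
        final boolean finished;

        public Instance(int id, boolean finished) {
            this.id = id;
            this.finished = finished;
        }
    }

    /** Hypothetical in-memory replacement for the database table. */
    public static final Map<Integer, Instance> STORE = new ConcurrentHashMap<>();

    /** Looks up an instance; empty when the record no longer exists. */
    static Optional<Instance> findInstance(int id) {
        return Optional.ofNullable(STORE.get(id));
    }

    /**
     * Polls up to maxPolls times. Returns true if the instance is found
     * finished, false if its record is missing or the polls run out.
     */
    public static boolean waitTaskQuit(int instanceId, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            Optional<Instance> current = findInstance(instanceId);
            if (!current.isPresent()) {
                // Record was deleted: stop polling instead of dereferencing null.
                return false;
            }
            if (current.get().finished) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        STORE.put(3570, new Instance(3570, true));
        System.out.println(waitTaskQuit(3570, 3)); // record exists and is finished
        STORE.remove(3570);
        System.out.println(waitTaskQuit(3570, 3)); // record was deleted
    }
}
```

With a guard like this, deleting the records would end the loop with a normal return value instead of an exception. Whether the real threads should treat a missing record as a failure or as a pause is a separate decision that this sketch does not address.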

Which version of Dolphin Scheduler:
-[1.3.6-release]
