
Fix cast issue #626

Open
chzhoo wants to merge 1 commit into apache:master from chzhoo:cast_issue


@chzhoo
Contributor

@chzhoo chzhoo commented Sep 24, 2025

What changes were proposed in this pull request?

To exercise the container module's automatic failover, I randomly killed threads inside the container and found that the driver module crashes in an endless loop. The stack trace is as follows:

2025-09-24 15:02:22,569 [driver-executor-0] ERROR Driver:154  - driver exception
java.lang.ClassCastException: org.apache.geaflow.runtime.core.scheduler.PipelineCycleScheduler cannot be cast to org.apache.geaflow.runtime.core.scheduler.ExecutionGraphCycleScheduler
	at org.apache.geaflow.runtime.pipeline.runner.PipelineRunner.executePipelineGraph(PipelineRunner.java:80)
	at org.apache.geaflow.runtime.pipeline.runner.PipelineRunner.runPipelineGraph(PipelineRunner.java:87)
	at org.apache.geaflow.runtime.pipeline.task.PipelineTaskExecutor.execute(PipelineTaskExecutor.java:54)
	at org.apache.geaflow.runtime.pipeline.executor.PipelineExecutor.runPipelineTask(PipelineExecutor.java:76)
	at org.apache.geaflow.cluster.driver.Driver.executePipelineInternal(Driver.java:136)
	at org.apache.geaflow.cluster.driver.Driver.lambda$init$0(Driver.java:92)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

The cause is that CycleSchedulerFactory.create() may return either a PipelineCycleScheduler or an ExecutionGraphCycleScheduler, but the code only handles the ExecutionGraphCycleScheduler case.
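For readers unfamiliar with this failure mode, here is a minimal sketch of the pattern, not the actual patch. The nested types below are hypothetical stand-ins that only borrow the names from the stack trace; the `CycleScheduler` interface, the `create(boolean)` signature, and both handler methods are assumptions made for illustration and do not reflect the real GeaFlow API.

```java
class CastSketch {
    // Hypothetical stand-ins: NOT the real GeaFlow classes, only a model
    // of a factory whose return value can be one of two sibling types.
    interface CycleScheduler {}

    static class PipelineCycleScheduler implements CycleScheduler {}

    static class ExecutionGraphCycleScheduler implements CycleScheduler {}

    // Assumed stand-in for CycleSchedulerFactory.create(): either type may come back.
    static CycleScheduler create(boolean pipeline) {
        return pipeline ? new PipelineCycleScheduler() : new ExecutionGraphCycleScheduler();
    }

    // Unchecked downcast: throws ClassCastException for a PipelineCycleScheduler.
    static String unsafeHandle(CycleScheduler scheduler) {
        ExecutionGraphCycleScheduler graph = (ExecutionGraphCycleScheduler) scheduler;
        return graph.getClass().getSimpleName();
    }

    // Checked dispatch: test the runtime type before casting or branching.
    static String safeHandle(CycleScheduler scheduler) {
        if (scheduler instanceof ExecutionGraphCycleScheduler) {
            return "execution-graph";
        }
        if (scheduler instanceof PipelineCycleScheduler) {
            return "pipeline";
        }
        throw new IllegalStateException("Unexpected scheduler type: " + scheduler.getClass());
    }

    public static void main(String[] args) {
        CycleScheduler scheduler = create(true);
        System.out.println(safeHandle(scheduler));
        try {
            unsafeHandle(scheduler);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getMessage());
        }
    }
}
```

Whether the real fix branches on the type like this or moves the shared behavior onto a common interface is a design choice this sketch does not settle.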

How was this PR tested?

  • Tests have been added for the changes
  • Production environment verified

@cbqiao
Contributor

cbqiao commented Sep 24, 2025

LGTM

@chzhoo
Contributor Author

chzhoo commented Oct 14, 2025

The CI check failure doesn't seem to be caused by the code change. Could we retry it? @cbqiao @yaozhongq

