Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SplitBuilder.add(Flow) causes hung execution in some cases #3857

Closed
ee-usgs opened this issue Feb 24, 2021 · 3 comments
Closed

SplitBuilder.add(Flow) causes hung execution in some cases #3857

ee-usgs opened this issue Feb 24, 2021 · 3 comments
Labels
has: minimal-example Bug reports that provide a minimal complete reproducible example in: core type: bug
Milestone

Comments

@ee-usgs
Copy link

ee-usgs commented Feb 24, 2021

Bug description
If FlowBuild.SplitBuilder.add(Flow) is called multiple times to dynamically build a list of Flows, the result is a Flow that will hang.
Calling FlowBuild.SplitBuilder.add(Flow[]) with an array of Flows works as expected.

For instance, this does not error, but does not work as expected:

FlowBuilder<SimpleFlow> myFlow = new FlowBuilder<SimpleFlow>("DoesntWork");
FlowBuilder.SplitBuilder<SimpleFlow> split = myFlow.split(exe);

for (int i = 0; i < 10; i++) {
	split.add(createANewFlow(i)); // <<< Key change - call add() multiple times
}

However, this does work:

FlowBuilder<SimpleFlow> myFlow = new FlowBuilder<SimpleFlow>("Works");

List<Flow> flows = new ArrayList<>();
for (int i = 0; i < 10; i++) {
	flows.add(createANewFlow(i));
}

myFlow.split(exe).add(flows.toArray(new Flow[flows.size()]));   //Add flows all in one 'add' operation

Why would you call FlowBuild.SplitBuilder.add(Flow) multiple times??
I need to dynamically build my list of tasks - I don't know the size of the list or what the tasks are ahead of time. I do know, however, that they can all be run in parallel and there are a lot of them (hundreds). Calling FlowBuild.SplitBuilder.add(Flow) for each as I build up the list makes the most intuitive sense.

Environment
Note: I'm pretty sure this is related to how SpringBatch works, not any sort of JDK issue.
SpringBatchCore 4.3.1
OpenJDK 12
Mac OS 10.14

Steps to reproduce
See the code example in GitHub, below.

The first flows runs as expected, the second flow hangs after two steps are executed.

Expected behavior
Calling FlowBuild.SplitBuilder.add(Flow) should just add to the list of flows to run in parallel. It shouldn't matter if they are added all at once or add is called multiple times.

Minimal Complete Reproducible example
I created a Maven based demo project that shows this bug:
https://github.com/ee-usgs/springbatch_parallel_bug

This application should be runnable via Maven or any IDE.

@ee-usgs ee-usgs added status: waiting-for-triage Issues that we did not analyse yet type: bug labels Feb 24, 2021
@ee-usgs ee-usgs changed the title FlowBuild.split.add(Flow) causes hung execution in some cases SplitBuilder.add(Flow) causes hung execution in some cases Feb 24, 2021
@fmbenhassine
Copy link
Contributor

Thank you for reporting this issue and for providing a minimal example. I will let you know in which release this will be addressed.

@fmbenhassine fmbenhassine added has: minimal-example Bug reports that provide a minimal complete reproducible example in: core and removed status: waiting-for-triage Issues that we did not analyse yet labels Feb 25, 2021
@hpoettker
Copy link
Contributor

hpoettker commented Apr 27, 2021

Spring Batch treats the two methods of adding flows to the SplitBuilder quite differently:

  • If all flows are added at once, one SplitState with 10 flows will be created.
  • If the flows are added separately, a nested structure of 10 SplitState will be created, where each of them is linked to one StepState and the next SplitState, respectively. Except for the last SplitState which is only linked to a StepState and no further SplitState.

When Spring Batch traverses a flow structure with nested splits, it requires one thread for each nested SplitState that is visited at the same time. Thus, the job with the separate invocations of add does not block if the TaskExecutor allows at least 10 concurrent threads like e.g. a ThreadPoolTaskExecutor with a corePoolSize of 10 or more.

It should be noted, that 10 threads are not enough to run all tasks in parallel. This requires at least 19 threads. With 10 available threads, Spring Batch will start with 5 parallel tasks and steadily reduce the number of parallel tasks until it starts the last task to run for itself as the other 9 threads are used by the nested SplitState that are waiting on each other.

For hundreds of tasks, it's probably easiest to add them all at once and control the level of concurrency with the thread core pool size. If you add them individually, you must increase the thread core pool size as described above. But then you loose effective control over the maximum number of parallel steps.

@benas Have you already thought about how to address the issue? I think it would be a great enhancement, if the number of concurrent steps could be effectively controlled by the thread core pool size also in the presence of nested splits.

@fmbenhassine
Copy link
Contributor

@ee-usgs Thank you for reporting this issue and for providing a minimal example! This is a valid issue. Indeed, the execution hangs with multiple calls to SplitBuilder.add(Flow).

@hpoettker Great analysis! Thank you for sharing that and for opening a PR 👍 The fact that Spring Batch behaves differently when adding all flows at once or adding them by using multiple calls to add(Flow) is incorrect, inconsistent and should be fixed. And I agree that concurrency should be controlled by parametrizing the pool size and not by the way the flow is built. The change set in #3903 LGTM and fixes the issue based on the provided sample, which does not hang anymore (The sample is attached here for reference, which I updated to v5 and from which I removed the boot bits).


gh3857.patch

@fmbenhassine fmbenhassine modified the milestones: 5.1.0, 5.1.0-M1 Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has: minimal-example Bug reports that provide a minimal complete reproducible example in: core type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants