-
Couldn't load subscription status.
- Fork 2.5k
Performance improvement for partitions with a large number of steps #716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
New method on JobRepository that enables better performance of SimpleStepExecutionSplitter. By retrieving all step executions at the time of partitioning outside the for loop, the number of calls to the database is reduced. Also updated JSR version to be performant. Deprecated the older step execution methods and cleaned up the javadocs. Stub implementations provided for common support classes https://jira.spring.io/browse/BATCH-2716
|
@NaanProphet Please sign the Contributor License Agreement! Click here to manually synchronize the status of this Pull Request. See the FAQ for frequently asked questions. |
|
@NaanProphet Thank you for signing the Contributor License Agreement! |
|
Hi @NaanProphet , Thank you for this PR! I will add some inline comments on your changes. I created a sample application to reproduce the issue here. This app will also serve as a baseline for our performance benchmark. Kr, |
| /** | ||
| * @return all step executions, from all job executions, for the given job instance. | ||
| */ | ||
| Collection<StepExecution> getStepExecutions(JobInstance jobInstance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use Java 8 "default method in interface" for backward compatibility here. For example:
default Collection<StepExecution> getStepExecutions(JobInstance jobInstance) {
throw new UnsupportedOperationException();
}| * @return the last execution of step inside the list, null otherwise. | ||
| */ | ||
| @Nullable | ||
| StepExecution getLastStepExecution(Collection<StepExecution> stepExecutions, String stepName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering a collection of objects by name is not the job of JobRepository. This method could be in a utility class or private/protected to a given implementation, but not part of the JobRepository contract. Please remove the method from the interface and update the implementations accordingly.
Here is how I would do it in SimpleStepExecutionSplitter:
@Nullable
protected StepExecution getLastStepExecution(Collection<StepExecution> stepExecutions, String stepName) {
return stepExecutions.stream()
.filter(stepExecution -> stepExecution.getStepName().equals(stepName))
.min(new StepExecutionComparator())
.orElse(null);
}The code of the comparator is what is done here: https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/java/org/springframework/batch/core/repository/support/SimpleJobRepository.java#L236-L246
|
|
||
| @Override | ||
| public Collection<StepExecution> getStepExecutions(JobInstance jobInstance) { | ||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can be removed since we will use the default method in interface.
|
|
||
| @Override | ||
| public Collection<StepExecution> getStepExecutions(JobInstance jobInstance) { | ||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should return an empty collection.
| } | ||
|
|
||
| @Override | ||
| public Collection<StepExecution> getStepExecutions(JobInstance jobInstance) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can be removed since we will use the default method in interface.
|
|
||
| @Override | ||
| public Collection<StepExecution> getStepExecutions(JobInstance jobInstance) { | ||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can be removed since we will use the default method in interface.
| StepExecution currentStepExecution = jobExecution.createStepExecution(stepName); | ||
|
|
||
| boolean startable = isStartable(currentStepExecution, context.getValue()); | ||
| boolean startable = isStartable(currentStepExecution, context.getValue(), allPriorStepExecutions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isStartable requires the current step execution as well as the latest one with the same name. So there is no need to pass the entire collection of step executions here. You can get the last one with the protected utility method getLastStepExecution and pass it to isStartable.
|
@NaanProphet Have you got a chance to address my reviews? We are planning to release 4.2.0.M3 soon and we wanted to include this PR in it. If you agree with my comments, I can merge the PR and apply those changes when merging. Do you agree? |
|
@NaanProphet This PR will be merged as part of #725 . I implemented the minor changes discussed here on top of your work. Thank you very much for your contribution! |
|
@benas Thanks so much! I really appreciate the in-depth design comments and agree with all your suggestions. Sorry I couldn't get to it in time. Cheers! |
New method on JobRepository that enables better performance of SimpleStepExecutionSplitter. By retrieving all step executions at the time of partitioning outside the for loop, the number of calls to the database is reduced.
Also updated JSR version to be performant. Deprecated the older step execution methods and cleaned up the javadocs. Stub implementations provided for common support classes
https://jira.spring.io/browse/BATCH-2716