Optimized JdbcPagingItemReader configuration for process indicator pattern [BATCH-2317] #1284
For a new project I'm currently evaluating the performance of using JdbcPagingItemReader versus JdbcCursorItemReader.
The application will make use of a process indicator column in the input table, so saveState="false" will be configured.
In my tests the JdbcCursorItemReader is about 5x faster than the JdbcPagingItemReader. This is mostly because the JdbcCursorItemReader issues a simple "SELECT ... FROM ... WHERE ... AND processed = 0".
The JdbcPagingItemReader, however, issues a "SELECT ... FROM ..." query that also carries a sort key condition ("... > ...") and an "ORDER BY ... ASC" clause.
When working with a process indicator column, this sort key condition and the ORDER BY clause are not actually required.
After doing some local hacking to the JdbcPagingItemReader and PagingQueryProvider to remove the sort key condition and order by clause from the query, the performance is comparable to that of JdbcCursorItemReader for this scenario.
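To illustrate why the sort key condition can be dropped here, below is a minimal in-memory sketch (plain Java, no Spring Batch classes). It assumes that every row of a page is flagged as processed before the next page is fetched, and it assumes a `LIMIT`-style page size; the table and column names in the comment (`input_table`, `id`, `processed`) are hypothetical and not taken from my actual schema.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: simulates paging over a table that has a process indicator column.
final class ProcessIndicatorPagingSketch {

    // Reads all rows page by page WITHOUT a sort key condition or ORDER BY.
    // Each loop iteration stands in for a query of the (hypothetical) shape:
    //   SELECT id, payload FROM input_table WHERE processed = 0 LIMIT <pageSize>
    static List<Integer> readAll(int rowCount, int pageSize) {
        boolean[] processed = new boolean[rowCount]; // the indicator column, all 0 at start
        List<Integer> readOrder = new ArrayList<>();

        while (true) {
            // Fetch the next page: any pageSize rows that are still unprocessed.
            List<Integer> page = new ArrayList<>();
            for (int id = 0; id < rowCount && page.size() < pageSize; id++) {
                if (!processed[id]) {
                    page.add(id);
                }
            }
            if (page.isEmpty()) {
                break; // nothing left with processed = 0
            }
            // Assumption: the chunk is written and the rows are flagged
            // before the next page query runs.
            for (int id : page) {
                processed[id] = true;
                readOrder.add(id);
            }
        }
        return readOrder;
    }

    public static void main(String[] args) {
        System.out.println(readAll(10, 3));
    }
}
```

Because flagged rows drop out of the `processed = 0` filter, the same query keeps returning fresh rows, so no position has to be tracked through a sort key. If rows were not flagged between pages, this loop would return the same page forever, which is exactly the case the sort key condition protects against.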
The process indicator column is a pattern that is promoted in the Spring Batch reference manual, so I think it would be nice to have both standard JDBC reader implementations support this pattern in a performant way.
No further details from BATCH-2317
Michael Minella commented
A couple initial thoughts:
Jimmy Praet commented
While this is true, even when using the process indicator pattern, there may be cases where you still want to do the ORDER BY. That is because the record processing order can significantly impact performance as well.
So, I think a more correct approach is, when you use the process indicator pattern:
But that approach would need a new dedicated 'processIndicator'=true configuration flag on the JdbcPagingItemReader.
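As a rough sketch of what such a flag could do, assuming it would drop only the sort key condition while keeping the ORDER BY: the helper below builds the query for the pages after the first one. This is plain Java for illustration, not the actual PagingQueryProvider code. The class, the method, the `LIMIT` syntax, and the table and column names are all hypothetical.

```java
// Sketch only: hypothetical query builder, not part of Spring Batch.
final class PagingQuerySketch {

    // Builds the query used for the pages after the first one.
    // processIndicator = true drops the "sortKey > ?" condition but keeps ORDER BY.
    static String remainingPagesQuery(String select, String from, String where,
                                      String sortKey, int pageSize,
                                      boolean processIndicator) {
        StringBuilder sql = new StringBuilder()
                .append("SELECT ").append(select)
                .append(" FROM ").append(from)
                .append(" WHERE ").append(where);

        if (!processIndicator) {
            // Default paging behaviour: continue after the last sort key value read.
            sql.append(" AND ").append(sortKey).append(" > ?");
        }

        // Kept in both modes, since processing order can matter for performance.
        sql.append(" ORDER BY ").append(sortKey).append(" ASC");
        // Assumption: a database with LIMIT-style paging.
        sql.append(" LIMIT ").append(pageSize);
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(remainingPagesQuery(
                "id, payload", "input_table", "processed = 0", "id", 100, true));
    }
}
```

With the flag set, the generated query stays the same for every page, which only works as long as the rows of each page are flagged as processed before the next page is requested.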