JdbcJobInstanceDao.getJobInstances() does not optimize at the database level [BATCH-1784] #1805
Our deployment of Spring Batch is seeing monotonically increasing row reads and we were able to trace the issue down to
Backported to: 4.2.0.M3
Shaun McBride commented
We're having a similar issue with job executions that have large numbers of step executions created by partition handlers (things grind to a halt after about 2000 steps). We've worked around it in a number of ways: restarting a new instance after a step limit is reached, handling a single batch (typically a file) per job instance, or processing multiple batches in a single step. The first approach is a bit of a hack and still degrades DB performance, the second means we need to purge BATCH_JOB_INSTANCE more often than we'd like, and the third results in pretty ugly Java code.
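For the purge workaround, the deletes have to respect the foreign keys between the standard Spring Batch metadata tables (child tables first). A minimal sketch, assuming the default schema; it only builds the statements, and the helper names here are illustrative, not Spring Batch API (a real purge would also filter by job name or date rather than deleting everything):

```java
import java.util.List;

public class MetadataPurgeSketch {

    // Standard Spring Batch metadata tables, child tables first so
    // foreign-key constraints are not violated.
    static final List<String> PURGE_ORDER = List.of(
            "BATCH_STEP_EXECUTION_CONTEXT",
            "BATCH_JOB_EXECUTION_CONTEXT",
            "BATCH_STEP_EXECUTION",
            "BATCH_JOB_EXECUTION_PARAMS",
            "BATCH_JOB_EXECUTION",
            "BATCH_JOB_INSTANCE");

    // Builds one DELETE per table; executing them (e.g. via JdbcTemplate)
    // and adding WHERE clauses to keep recent instances is up to the caller.
    static List<String> purgeStatements() {
        return PURGE_ORDER.stream()
                .map(table -> "DELETE FROM " + table)
                .toList();
    }

    public static void main(String[] args) {
        purgeStatements().forEach(System.out::println);
    }
}
```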
Mahmoud Ben Hassine commented
Hi Clement Pang,
Can you please provide more details (which DB server you use, how many instances you have, how you start the next job instance)? Can you confirm the issue is still happening with v4.x?
What is the concern exactly? Is it the query that starts to become slow after a given number of instances, or is it the memory consumption that bothers you? I tried to reproduce the issue (with h2 and mysql, see this repo for detailed steps) with 100_000 instances and I don't see any performance problem with regard to the response time when starting the next instance (with both
FTR, there is an ongoing effort to optimize the aforementioned query (see here) so that the filtering is done in the database and only the last instance is fetched, but I want to make sure we address the right problem before applying any change.
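The idea behind that optimization can be sketched in plain Java. This is an illustrative simplification, not the actual `JdbcJobInstanceDao` code: the method and class names below are hypothetical, and the SQL in the comment is one possible shape (syntax varies by database):

```java
import java.util.Comparator;
import java.util.List;

public class LastInstanceSketch {

    // Current behaviour (simplified): every JOB_INSTANCE_ID for a job name
    // is loaded and the newest one is picked in Java, so row reads grow
    // linearly with the number of instances.
    static long lastInstanceInMemory(List<Long> allInstanceIds) {
        return allInstanceIds.stream()
                .max(Comparator.naturalOrder())
                .orElseThrow();
    }

    // Optimized direction: push the filtering into SQL so only one row is
    // read, e.g.
    //
    //   SELECT JOB_INSTANCE_ID FROM BATCH_JOB_INSTANCE
    //    WHERE JOB_NAME = ?
    //    ORDER BY JOB_INSTANCE_ID DESC
    //    FETCH FIRST 1 ROW ONLY
    //
    // Simulated here by receiving only the single row the database returns.
    static long lastInstanceInDatabase(List<Long> singleRowResultSet) {
        return singleRowResultSet.get(0);
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L);
        System.out.println(lastInstanceInMemory(ids));            // 5
        System.out.println(lastInstanceInDatabase(List.of(5L)));  // 5
    }
}
```

Both paths return the same instance id; the difference is that the second reads a single row instead of the whole table.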