Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OptimisticLockingFailureException updating step execution after commit failure #1826

Open
spring-projects-issues opened this issue Jul 7, 2011 · 51 comments

Comments

@spring-projects-issues
Copy link
Collaborator

spring-projects-issues commented Jul 7, 2011

Quinton McCombs opened BATCH-1767 and commented

It appears that if the commit fails, spring batch will get an OptimisticLockingFailureException when it tries to revert the changes to the step execution. In my particular case, I have a callback through EclipseLink to update history tables before a transaction is committed. If a failure occurs during this callback, the commit fails.

From looking through the code and the attached log file, the step execution is updated and committed before the main transaction is committed. When the commit fails, the old values for the step execution (including version) are updated to the values before the chuck started. When control returns to AbstractStep.execute(), the OptimisticLockingFailureException is thrown when the step execution is updated with the failed status because the new version had already been committed to the database.

2011-07-06 17:40:29,494 ERROR SimpleAsyncTaskExecutor-1 [org.springframework.batch.core.step.AbstractStep] Encountered an error saving batch meta data. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=3225 with wrong version (35), where current version is 36
   at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
   at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
   at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy77.update(Unknown Source)
   at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:244)
   at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
   at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
   at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
   at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
   at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
   at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91)
   at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:89)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at java.lang.Thread.run(Thread.java:662)

Affects: 2.1.7

Attachments:

Referenced from: pull request #591

23 votes, 31 watchers

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

You need to flush JPA changes before Spring Batch tries to commit its own meta data. The best way to do that is in the ItemWriter. Any reason you can't do that?

@spring-projects-issues
Copy link
Collaborator Author

Quinton McCombs commented

We are already using the JpaWriter supplied with Spring Batch which performs the flush. We have a requirement to audit all database changes and identify the changes made during a single transaction. For this reason, we implemented our auditing using an Eclipselink pre-commit hook (Hibernate can do this as well). While the transaction level auditing does not make much sense for a batch job, it is what is used by the rest of the application.

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

If you are using transaction hooks you are going to have to tread very carefully. You need to somehow ensure that the vendor hook runs before the Spring transaction synchronization, or else not use it. The easiest way would be to run your audit code in an ItemWriter or ItemWriteListener, but if you find a cleverer way and it can be exposed as a framework feature, please let us know. If you can provide a sample project (preferably self-contained, with an in-memory database, and built using maven, but we'll take whatever you have), that would help. Or if you want to make a suggestion about a patch you can fork on github and send a pull request (the README has more information).

@spring-projects-issues
Copy link
Collaborator Author

spring-projects-issues commented Jul 14, 2011

Ramkumar Kirhsnan commented

We've also got the same failure after the one of the steps failed to commit due to database exception thrown while writing into the database. Following is the stack trace when i tried to run the job for only one item

Jul 13 10:45:30,467 INFO  populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [org.springframework.batch.core.job.SimpleStepHandler] 133 - Executing step: [processAccountsStep]
Jul 13 10:45:30,514 DEBUG populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [com.om.dh.batch.core.partition.KeySetPartitioner] 79 - partition: partition-1, keys=[1000000069567]
Jul 13 10:45:30,576 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.core.partition.PartitionKeySetPagingItemReader] 104 - Reading page 0
Jul 13 10:45:30,576 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.item.adapter.AbstractDelegatingAdapter] 60 - processing started for 1000000069567
Jul 13 10:45:31,217 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.item.adapter.AbstractDelegatingAdapter] 103 - processing complete for 1000000069567 elapsed=641ms
Jul 13 10:45:31,279 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.hibernate.util.JDBCExceptionReporter] 465 - Data truncation: Data too long for column 'calculation_details' at row 1
Jul 13 10:45:31,279 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.hibernate.event.def.AbstractFlushingEventListener] 543 - Could not synchronize database state with session
org.hibernate.exception.DataException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:100)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:167)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:137)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:655)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:732)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:701)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.sql.BatchUpdateException: Data truncation: Data too long for column 'calculation_details' at row 1
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1666)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1082)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:719)
at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeBatch(CachedPreparedStatement.java:476)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 25 more
Jul 13 10:45:31,310 INFO  populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 327 - Commit failed while step execution data was already updated. Reverting to old version.
Jul 13 10:45:31,357 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 423 - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:417)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:128)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Jul 13 10:45:31,357 INFO  populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 327 - Commit failed while step execution data was already updated. Reverting to old version.
Jul 13 10:45:31,357 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.AbstractStep] 212 - Encountered an error executing the step
org.springframework.batch.core.step.FatalStepExecutionException: JobRepository failure forcing exit with unknown status
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:426)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:128)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:417)
... 15 more
Jul 13 10:45:31,404 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.AbstractStep] 250 - Encountered an error saving batch meta data. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:244)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Jul 13 10:45:31,404 ERROR populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [org.springframework.batch.core.step.AbstractStep] 212 - Encountered an error executing the step
org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step
at org.springframework.batch.core.partition.support.PartitionStep.doExecute(PartitionStep.java:110)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:281)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.createAndExecuteJobRunnable(JobExecutionServiceImpl.java:361)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.doExecuteJob(JobExecutionServiceImpl.java:325)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.access$000(JobExecutionServiceImpl.java:51)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl$1.run(JobExecutionServiceImpl.java:267)
at java.lang.Thread.run(Thread.java:619)

@spring-projects-issues
Copy link
Collaborator Author

Dave Syer commented

I think that problem is unavoidable if you don't flush the Hibernate session in your ItemWriter. You have made life complicated by using a partition step, but the same failure mode is easy to see in a single threaded normal step. Can you just do the flush?

@spring-projects-issues
Copy link
Collaborator Author

Kristof Buts commented

maven project for reproducing the bug

@spring-projects-issues
Copy link
Collaborator Author

Kristof Buts commented

In our current project, we are also experiencing this problem in some of our batches, but until yesterday we couldn't reproduce it.
After two days trying to reproduce the problem, I finally managed to do it. I already uploaded my little (maven) project for reproducing the bug.

So, what did I figure out? Consider the following scenario for a batch running 3 threads:
Thread # | line | What is going on? | oldVersion.version | stepExecution.version | DB version | comments
th#1 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 |
th#2 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 |
th#3 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 |
th#1 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 1 | will become 2 | step execution to be committed
th#1 | 430 | stepExecution.incrementCommitCount(); | 1 | 2 | will become 2 |
th#1 | / | transaction is commited | 1 | 2 | 2 |
th#2 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 2 | will become 3 | step execution to be committed
th#2 | 430 | stepExecution.incrementCommitCount(); | 1 | 3 | will become 3 |
th#2 | / | transaction is rolled back | 1 | 2 | 2 | for some reason tx is rolled back
th#2 | 344 | copy(oldVersion, stepExecution); | 1 | 1 | 2 | step version is reverted to 1 -> WTF?
th#3 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 1 | 2 | OptimisticLockException will be thrown
note:
-the line numbers are the ones of class org.springframework.batch.core.step.tasklet.TaskletStep and its inner class ChunkTransactionCallback
-the oldVersion.version is the one which is assigned in the current thread. Since I assume all 3 threads to first go through this piece of code, they will be 1 for all 3 threads in this case. At least thread number 2 should have value 1 for his old version to produce the bug. In that case the wrong version (1 instead of 2) will be reverted later on.
-the rollback of transaction 2 can be caused by:
-a timeout
-some trigger that still needs to be executed at commit-time
-deferred db constraint checks
-execution of some sql statements that were not flushed yet
-own registered transaction synchronization that fails
-...
-For the bug to occur, it is important that the rollback is triggered after line 422 ((stepExecutionUpdated = true), since that is the line which causes the TransactionSynchronization method to revert the stepExecution object (see if-statement at line 339 - if (stepExecutionUpdated) )

So, the problem is that copying the oldVersion variable is done outside the synchronized block. By putting it inside the block (say at line 407 instead of line 380, just after setting lock to true), I think the bug would be solved (I already tested this by debugging my test with a conditional breakpoint in eclipse which called this code at the right place and it worked).

Is it possible to get this fix in a patched version for spring-batch-core-2.1.8.RELEASE? We cannot upgrade to a higher spring version since we are only one month away from a big release.

Kind regards,
Kristof

@spring-projects-issues
Copy link
Collaborator Author

Kristof Buts commented

Added the scenario in a more readable form
(use monospace font to view it)

@spring-projects-issues
Copy link
Collaborator Author

spring-projects-issues commented Aug 13, 2013

Philippe Mouawad commented

Hello,
I am facing the same issue with Spring Batch 2.1.8 and 2.1.9.

Is there some plan to take into account what Kristof Buts.

Stacktrace:

threadPoolTaskExecutorForBatch-12] 12:07:10,363 ERROR org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:438) - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=8 with wrong version (217), where current version is 223
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor802.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at com.sun.proxy.$Proxy48.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:432)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
[threadPoolTaskExecutorForBatch-12] 12:07:10,366 ERROR com.xxx.dpas.log.Logger.error(Logger.java:89) - ----- doROLLBACK Transaction -----
[threadPoolTaskExecutorForBatch-12] 12:07:10,367 ERROR com.xxx.dpas.log.Logger.error(Logger.java:89) - ----- doCLEANUP Transaction -----
[threadPoolTaskExecutorForBatch-13] 12:07:10,372 ERROR org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:438) - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=8 with wrong version (221), where current version is 223
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor802.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at com.sun.proxy.$Proxy48.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:432)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

@spring-projects-issues
Copy link
Collaborator Author

Philippe Mouawad commented

Please find attached the spring batch classes put in INFO and DEBUG fororg.springframework.batch.core.step.tasklet.

As you can see before failure we have this:
[threadPoolTaskExecutorForBatch-2] 15:30:46,282 INFO org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.afterCompletion(TaskletStep.java:342) - Commit failed while step execution data was already updated. Reverting to old version.

@spring-projects-issues
Copy link
Collaborator Author

Philippe Mouawad commented

I have tested with last 2.2.1 and I have the same problem.

@spring-projects-issues
Copy link
Collaborator Author

Philippe Mouawad commented

Hello,
Please find pull request here:
#214

Hope you can take it into account soon.
Regards
Philippe

@spring-projects-issues
Copy link
Collaborator Author

Philippe Mouawad commented

Hello,
After further analysis, I don't understand why copy method when restoring also copies the version number of oldVersion to stepExecution ?

Could you shed some light on this ?

Thank you
Regards

@spring-projects-issues
Copy link
Collaborator Author

Michael Minella commented

I added a comment about the pull request. In short, moving the copy to where it is won't work because the StepExecution could have already been modified in the step. There are a couple tests that are failing with that update that illustrate this.

With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback.

@spring-projects-issues
Copy link
Collaborator Author

Philippe Mouawad commented

Thanks for feedback. Further testing this morning on real world project showed me fix did not correct all cases, my real world case is still broken although attached test case was fixed by it.

Does it mean you confirm this bug is a real one ? Test project attached reproduces it and my real world also does.
Is there some plan to fix it ? within which delay ?

It seems anyway that the stepexecution is modified somewhere before restoring which generates the case.

@spring-projects-issues
Copy link
Collaborator Author

Kristof Buts commented

@Michael Minella,
I'm afraid there is another (hidden) issue with this code:
When running a batch multithreaded, one thread can commit stuff from another thread which can be rollbacked at the end. Shouldn't changes to the stepExecution object be recorded and executed against the actual stepExecution object at commit time instead of instantly applying to the real StepExecution object and reverting on rollback afterwards (just wondering) ?

@spring-projects-issues
Copy link
Collaborator Author

Gérald Quintana commented

We has this OptimisticLockingFailureException, when we noticed this log just before: "Commit failed while step execution data was already updated. Reverting to old version." . Turning org.springframework.batch.core.step.tasklet and org.springframework.transaction loggers level to debug helped us find a bug in our tasklet code.

In our case:

  • The tasklet produces wrong data but the chunk is property processed and no exception is raised here
  • Chunk transaction commit triggers a deferred database constraint which prevents commit (it may also occur with JPA where transaction commit triggers flushing)
  • This exception is not logged (maybe because of the RetryTemplate even if retry is not configured)
  • The TransactionManager decides to call doRollbackOnCommitException
  • The ChunkTransactionCallback transaction synchronization is called with status=ROLLBACKED

@spring-projects-issues
Copy link
Collaborator Author

zyro commented

i think i am currently facing this (or at least sth. similar) in the following scenario:

@spring-projects-issues
Copy link
Collaborator Author

member sound commented

Is this issue still active? I'm also getting the OptimisticLockFailureException. The cause for me was an insert on a column whose length was not big enough for the inserted data, thus throwing SQL Exception and thereby the rollback:

WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper: SQL Error: 0, SQLState: 22001
ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper: FEHLER: Wert zu lang für Typ character varying(2)
INFO org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl: HHH000010: On release of batch it still contained JDBC statements
ERROR org.springframework.batch.core.step.tasklet.TaskletStep: JobRepository failure forcing rollback
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
Caused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
ERROR org.springframework.batch.core.step.AbstractStep: Encountered an error saving batch meta data for step step in job myJobConfig. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]

@spring-projects-issues
Copy link
Collaborator Author

Mohamed commented

Hi,
I have the same problem, and I don't know how to resolve it. I debuged my code but no way to find the cause.

12:21:42,400 DEBUG @ [FaultTolerantChunkProcessor] Attempting to write: [items=[StagingItem [475472]], skips=[]]
12:21:42,400 DEBUG @ [RetryTemplate] Retry: count=0
12:21:42,402 DEBUG @ [ChunkOrientedTasklet] Inputs not busy, ended: false
12:21:42,402 DEBUG @ [TaskletStep] Applying contribution: [StepContribution: read=1, written=1, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
12:21:42,402 DEBUG @ [JtaTransactionManager] Participating in existing transaction
12:21:42,404 DEBUG @ [TaskletStep] Saving step execution before commit: StepExecution: id=5026, version=14, name=step1, status=STARTED, exitStatus=EXECUTING, readCount=14, filterCount=0, writeCount=14 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=14, rollbackCount=0, exitDescription=
12:21:42,404 DEBUG @ [JtaTransactionManager] Participating in existing transaction
12:21:42,409 DEBUG @ [JtaTransactionManager] Participating transaction failed - marking existing transaction as rollback-only
12:21:42,409 DEBUG @ [JtaTransactionManager] Setting JTA transaction rollback-only
12:21:42,412 ERROR @ [TaskletStep] JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=5026 with wrong version (14), where current version is 13

Is there any way to discover the origin of this error?
Thanks in advance

@spring-projects-issues
Copy link
Collaborator Author

Adam commented

Same thing as for @Mohamed. No way to find out what is the root cause of the rollback attempt:

12:28:49.420 [SimpleAsyncTaskExecutor-10] INFO  p.d.i.i.c.IncrementalImportJobFactory - Starting the incremental documents import.
12:51:01.738 [SimpleAsyncTaskExecutor-5] INFO  p.d.i.i.c.IncrementalImportJobFactory - Finished the incremental documents import.
12:51:01.756 [SimpleAsyncTaskExecutor-5] ERROR o.s.b.core.step.tasklet.TaskletStep - JobRepository failure forcing rollback
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=23580 with wrong version (1), where current version is 2
        at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:255) ~[spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at sun.reflect.GeneratedMethodAccessor136.invoke(Unknown Source) ~[na:na]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_55]
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at com.sun.proxy.$Proxy152.update(Unknown Source) ~[na:na]
        at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:452) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) [spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:271) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:257) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:198) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:165) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:93) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:90) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
        at org.springframework.core.task.SimpleAsyncTaskExecutor$ConcurrencyThrottlingRunnable.run(SimpleAsyncTaskExecutor.java:251) [spring-core-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]

@spring-projects-issues
Copy link
Collaborator Author

Gérald Quintana commented

@Spyro @ medhedi_abd The problem occurs, not when the chunk read-process-write fails, but when the commit fails. In the past I've seen 2 reasons for that:

  • Using Hibernate or JPA: when the read-process-write ends, data has not been flushed yet, and the commit triggers the flush and may raise a an exception (SQL constraint not fulfilled...). The solution is to force a flush (call .flush() on entity manager) and the end of the writer.
  • Using Oracle deferred constraints: when the-process-writes ends, SQL constraints have not been checked by Oracle yet, and the commit triggers these SQL constraints verification and may raise an exception (same reason as above). The solution is to replace deferred constraint by immediate constraint.
    In the end, the commit shouldn't fail, if something goes wrong, it should fail before

@spring-projects-issues
Copy link
Collaborator Author

Mohamed commented

Hi,
@Gerald, I agree with you,
Indead, this error occurs also when using Jboss4 EAP. and the treatment of batch works well without any error when using "WebSphere 7".
Is there any modification into jboss to fix this problem?, or transformation in spring-batch adapting of jboss transaction?

Thanks in advance

@spring-projects-issues
Copy link
Collaborator Author

Gérald Quintana commented

@medhedi_Abd In your case it may because of a third reason: "using JTA". If the XA transactions fails to commit a resource for any resource (timeout or whatever).

Having logs of commit failure would help, but it's not the case.

@spring-projects-issues
Copy link
Collaborator Author

David J. M. Karlsen commented

Any movement on this issue?

@spring-projects-issues
Copy link
Collaborator Author

Sarvesh Katariya commented

Any updates/workaround on this issue ?

@spring-projects-issues
Copy link
Collaborator Author

Marcus Mattern commented

We did a simple workaround for us to avoid this problem.
I only patched the afterCompletion method of inner class org.springframework.batch.core.step.tasklet.TaskletStep.ChunkTransactionCallback.

{{ @Override
public void afterCompletion(int status) {
try {
if (status != TransactionSynchronization.STATUS_COMMITTED) {
if (stepExecutionUpdated) {
// Wah! the commit failed. We need to rescue the step
// execution data.
logger.info("Commit failed while step execution data was already updated. "

  • "Reverting to old version.");
    //FIX Start
    // WORKAROUND für https://jira.springsource.org/browse/BATCH-1767
    // OptimisticLockingFailureException updating step execution after commit failure
    // Description:
    // - save version number of step execution that should be updated
    // - copy old step details like before
    // - set new version number instead of old copied number
    final Integer savedVersion = stepExecution.getVersion();
    //FIX End
    copy(oldVersion, stepExecution);
    //FIX Start
    stepExecution.setVersion(savedVersion);
    //FIX End
    if (status == TransactionSynchronization.STATUS_ROLLED_BACK) {
    rollback(stepExecution);
    }
    }
    chunkListener.afterChunkError(chunkContext);
    }

      		if (status == TransactionSynchronization.STATUS_UNKNOWN) {
      			logger.error("Rolling back with transaction in unknown state");
      			rollback(stepExecution);
      			stepExecution.upgradeStatus(BatchStatus.UNKNOWN);
      			stepExecution.setTerminateOnly();
      		}
      	}
      	finally {
      		// Only release the lock if we acquired it, and release as late
      		// as possible
      		if (locked) {
      			semaphore.release();
      		}
    
      		locked = false;
      	}
      }
    

}}

Since then our problem was gone.

Marcus

@spring-projects-issues
Copy link
Collaborator Author

member sound commented

Any updates on this fix? I got the same problem...

@spring-projects-issues
Copy link
Collaborator Author

James Home commented

@Michael

Regarding the comment: "With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback."

The thing is, it's not actually a rollback in the database sense. You're really just resetting the counts the their previous values. It's a new update in a db sense, so surely it should use the appropriate version number. The policy of 'rolling back' the version number in a db update seems to be incompatible with the principle of version based optimistic locking that's being used by the dao.

@spring-projects-issues
Copy link
Collaborator Author

zyro commented

following the suggestion to flush the session in the writer, i think there is still a (slight..) chance for an exception being thrown (e.g. StaleObjectStateException) during the actual commit (which may flush again).
when using a skip policy, shouldnt that exception result in a skip and not break the whole job execution leaving it in an UNKNOWN state?

@Sashibhusan
Copy link

I am also getting
"org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=0 with wrong version (1), where current version is 2"
The cause for me was is, insert on a column whose length was not big enough for the inserted data, thus throwing SQL Exception.
Can anybody provide a work around for this or the fix release version? This is very urgent for me...

@fmbenhassine fmbenhassine removed this from the 4.3.0 milestone Jan 13, 2020
@ghost
Copy link

ghost commented Mar 24, 2020

I got the same exception
Is there a release date for this issue yet?
Still open in 4.3.0

@fmbenhassine
Copy link
Contributor

fmbenhassine commented Mar 24, 2020

No, I opened a PR with my attempt to fix the problem, see #591. I believe this is the most daunting issue in the backlog at the moment. The PR is not merged yet because we need to test the changes in depth. I added 3 integration tests to cover the case (which are passing without regressions) but I want to gather feedback from the community before applying the fix.

If you want to help moving things forward, I invite you to give the PR a try and share your feedback.

@javaHelper
Copy link

@snagunoori
Copy link

benas - Your PR still in Open status . When can we expect this fix ? even we are also getting same error .

org.springframework.batch.core.step.FatalStepExecutionException: JobRepository failure forcing rollbackrr - Job[at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:463)r at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330)r at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)r at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:271)r at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:81)at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:262)r at org.springframework.core.task.SimpleAsyncTaskExecutor$ConcurrencyThrottlingRunnable.run(SimpleAsyncTaskExecutor.java:268)r at java.lang.Thread.run(Thread.java:748)rCaused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=4465436 with wrong version (1), where current version is 2r at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:256)r at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191)r at sun.reflect.GeneratedMethodAccessor853.invoke(Unknown Source)r at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)r at java.lang.reflect.Method.invoke(Method.java:498)r at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)r at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)r at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)r at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)r at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282)r at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)r at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)r at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)r at com.sun.proxy.$Proxy70.update(Unknown Source)r at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:456)r ... 7 common frames omittedr
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=4465436 with wrong version (1), where current version is 2r at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:256)r at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobReposit

@fmbenhassine
Copy link
Contributor

@snagunoori As mentioned in my previous comment, the fix will be merged when we gather enough feedback from the community. For now, only one person tested the fix and gave feedback. If you want to help moving things forward, I invite you to give the PR a try and share your feedback as well (on the issue related to the PR, not here).

@kadamnitin06
Copy link

@benas We are also hit by the this exception, today. If you provide me the exact version to take for testing of it then we can also provide the feedback for this fix.

@fmbenhassine
Copy link
Contributor

we can also provide the feedback for this fix

That would be really appreciated! The fix is in #591 , it is not released so I can't give you an exact version to try out. You can apply the fix to the master branch, build the project and import the snapshot version in your project for testing.

Please note that fixing the problem does not mean the OptimisticLockingFailureException does not happen anymore, we can't prevent that (that's the idea behind an optimistic strategy in general). Fixing the problem means that when an optimistic locking exception happens, the job should end with a FAILED status and not UNKNOWN as it is now. This is required to be able to restart it where it left off. You can find more details here: #591 (comment).

Thank you upfront.

@darsh4tech
Copy link

Hi @benas
i am facing the same
i use spring boot and spring batch version [2.3.3] and i am trying to stop job programmatically and when i do that the step still execute and this is expected (within transaction boundries according to chunk size) but when the chunk size reach limit i expect the job will stop successfully which not happen and my log as below:

2020-09-11 14:25:27.816 ERROR 6280 --- [  restartedMain] o.s.batch.core.step.tasklet.TaskletStep  : JobRepository failure forcing rollback

org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=660 with wrong version (2), where current version is 3
	at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:279) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:196) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:367) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy93.update(Unknown Source) ~[na:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy62.update(Unknown Source) ~[na:na]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:457) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:331) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140) [spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:273) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:82) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:375) [spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215) [spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:145) [spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:258) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:208) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:410) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:136) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:319) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:147) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50) [spring-core-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:140) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy66.run(Unknown Source) [na:na]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.execute(JobLauncherApplicationRunner.java:199) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.executeLocalJobs(JobLauncherApplicationRunner.java:173) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.launchJobFromProperties(JobLauncherApplicationRunner.java:160) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:155) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:150) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:786) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:776) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:322) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at com.lagorta.demo.SpringBatchDemoApplication.main(SpringBatchDemoApplication.java:29) [classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49) [spring-boot-devtools-2.3.3.RELEASE.jar:2.3.3.RELEASE]

2020-09-11 14:25:27.827  INFO 6280 --- [  restartedMain] o.s.batch.core.step.tasklet.TaskletStep  : Commit failed while step execution data was already updated. Reverting to old version.
@AfterChunkError----->chunkContext.getStepContext().getId(): execution#660
2020-09-11 14:25:27.842 ERROR 6280 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Encountered an error executing step chunkStep in job customerReportJob

org.springframework.batch.core.step.FatalStepExecutionException: JobRepository failure forcing rollback
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:464) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:331) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:273) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:82) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:375) ~[spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215) ~[spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:145) ~[spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:258) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:208) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:410) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:136) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:319) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:147) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50) [spring-core-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:140) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy66.run(Unknown Source) [na:na]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.execute(JobLauncherApplicationRunner.java:199) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.executeLocalJobs(JobLauncherApplicationRunner.java:173) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.launchJobFromProperties(JobLauncherApplicationRunner.java:160) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:155) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:150) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:786) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:776) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:322) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at com.lagorta.demo.SpringBatchDemoApplication.main(SpringBatchDemoApplication.java:29) [classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49) [spring-boot-devtools-2.3.3.RELEASE.jar:2.3.3.RELEASE]
Caused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=660 with wrong version (2), where current version is 3
	at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:279) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:196) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:367) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy93.update(Unknown Source) ~[na:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy62.update(Unknown Source) ~[na:na]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:457) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	... 43 common frames omitted

2020-09-11 14:25:27.854  INFO 6280 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Step: [chunkStep] executed in 1m0s326ms
2020-09-11 14:25:27.873 ERROR 6280 --- [  restartedMain] o.s.batch.core.step.AbstractStep         : Encountered an error saving batch meta data for step chunkStep in job customerReportJob. This job is now in an unknown state and should not be restarted.

org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=660 with wrong version (2), where current version is 3
	at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:279) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:196) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:367) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118) ~[spring-tx-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy93.update(Unknown Source) ~[na:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy62.update(Unknown Source) ~[na:na]
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:275) ~[spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:410) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:136) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:319) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:147) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50) [spring-core-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:140) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.batch.core.configuration.annotation.SimpleBatchConfiguration$PassthruAdvice.invoke(SimpleBatchConfiguration.java:127) [spring-batch-core-4.2.4.RELEASE.jar:4.2.4.RELEASE]
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212) [spring-aop-5.2.8.RELEASE.jar:5.2.8.RELEASE]
	at com.sun.proxy.$Proxy66.run(Unknown Source) [na:na]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.execute(JobLauncherApplicationRunner.java:199) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.executeLocalJobs(JobLauncherApplicationRunner.java:173) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.launchJobFromProperties(JobLauncherApplicationRunner.java:160) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:155) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:150) [spring-boot-autoconfigure-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:786) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:776) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:322) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226) [spring-boot-2.3.3.RELEASE.jar:2.3.3.RELEASE]
	at com.lagorta.demo.SpringBatchDemoApplication.main(SpringBatchDemoApplication.java:29) [classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_261]
	at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49) [spring-boot-devtools-2.3.3.RELEASE.jar:2.3.3.RELEASE]

2020-09-11 14:25:27.899  INFO 6280 --- [  restartedMain] o.s.b.c.l.support.SimpleJobLauncher      : **Job: [SimpleJob: [name=customerReportJob]] completed with the following parameters: [{run.id=1}] and the following **status: [UNKNOWN]**** in 1m0s633ms

is there any help?
my code to stop Job as below:

public String stopJob() throws NoSuchJobExecutionException, JobExecutionNotRunningException {
		Job job = this.context.getBean("customerReportJob", Job.class);
		logger.info("stopJob - job:" + job);
		Set<JobExecution> allRunningJobExecutions = this.jobExplorer.findRunningJobExecutions(job.getName());
		logger.info("stopJob - allRunningJobExecutions:" + allRunningJobExecutions);
		for (JobExecution jobExecution : allRunningJobExecutions) {
			logger.info("stopJob -----> {} , {} , {} ", jobExecution.getId(), jobExecution.getStatus().name(),
					jobExecution.isRunning());
			if (jobExecution.isRunning() 
					|| jobExecution.getStatus().equals(BatchStatus.STARTED)
					|| jobExecution.getStatus().equals(BatchStatus.STARTING)
					|| jobExecution.getStatus() == BatchStatus.STOPPING) {
				// stop stepExecution
				List<StepExecution> stepExecutions = (List<StepExecution>) jobExecution.getStepExecutions();
				for (StepExecution stepExecution : stepExecutions) {
					BatchStatus status = stepExecution.getStatus();
					if (status.isRunning() || status == BatchStatus.STOPPING) {
						stepExecution.setStatus(BatchStatus.STOPPED);
						stepExecution.setExitStatus(ExitStatus.STOPPED);
						jobRepository.update(stepExecution);
					}
				}
				logger.info("#-------{}", jobExecution.getId());
				boolean isStopped = jobOperator.stop(jobExecution.getId());
				logger.info("#---isStopped---->{}", isStopped);
			}
		}
		return "Job Stoped";
	}

@darsh4tech
Copy link

darsh4tech commented Sep 11, 2020

hi @benas
the issue disappeared when i commented the part of code that try to stop [StepExecution] as below:

				List<StepExecution> stepExecutions = (List<StepExecution>) jobExecution.getStepExecutions();
				for (StepExecution stepExecution : stepExecutions) {
					BatchStatus status = stepExecution.getStatus();
					if (status.isRunning() || status == BatchStatus.STOPPING) {
						stepExecution.setStatus(BatchStatus.STOPPED);
						stepExecution.setExitStatus(ExitStatus.STOPPED);
						jobRepository.update(stepExecution);
					}
				}

when i commented the code works successfully and the status of job became Completed and you know the exception give a sense of meaning as there is a concurrent access to locked object which i think the [StepExecution] as i tried to update it and in the same time my stop jobOperator.stop(jobExecution.getId()); try to update it also, am i right ?
also, I have a question if i worked in a multithreaded model and tried to stop any job will this exception will appear again?
thanks in advance

@ingleivj
Copy link

@snagunoori As mentioned in my previous comment, the fix will be merged when we gather enough feedback from the community. For now, only one person tested the fix and gave feedback. If you want to help moving things forward, I invite you to give the PR a try and share your feedback as well (on the issue related to the PR, not here).

Hi, at my workplace we have seen this issue now and then over the last few years (currently on Spring 4.3.29, Spring Batch 3.0.10), but usually it has not been consistently present or we have worked around it. Last week it started to hit all the time, so I tried a patched version today - with success!

@fmbenhassine fmbenhassine added this to the 5.0.0-M1 milestone Jan 14, 2021
@fmbenhassine fmbenhassine added the has: backports Legacy label from JIRA. Superseded by "for: backport-to-x.x.x" label Jan 14, 2021
@fmbenhassine fmbenhassine modified the milestones: 5.0.0-M1, 4.3.2 Jan 21, 2021
@fmbenhassine fmbenhassine removed the has: backports Legacy label from JIRA. Superseded by "for: backport-to-x.x.x" label Jan 21, 2021
@fmbenhassine fmbenhassine changed the title OptimisticLockingFailureException updating step execution after commit failure [BATCH-1767] OptimisticLockingFailureException updating step execution after commit failure Mar 15, 2021
@fmbenhassine fmbenhassine removed this from the 4.3.2 milestone Mar 18, 2021
@currentperson
Copy link

hope this issue can be fixed

@thouet-sigma
Copy link

@snagunoori As mentioned in my previous comment, the fix will be merged when we gather enough feedback from the community. For now, only one person tested the fix and gave feedback. If you want to help moving things forward, I invite you to give the PR a try and share your feedback as well (on the issue related to the PR, not here).

Hello,
We faced this issue since a long time. I implemented your PR and released a version (based on 3.0.10) about 3 years ago at my work.
Since, we don't get anymore this issue :)

Could you consider to release this "soon" ? That would allow us to migrate from 3.0.10 to 4.X

@MoHassem
Copy link

MoHassem commented Jan 2, 2024

my 2c worth, when working on a mainframe using IMS, that also uses the concept of transactions (syncpoints), we had a similair system that we wrote inhouse to manage restartability. Instead of using a database(table) for writing the job information, which causes all sorts of issues because it participates in the syncpoint, we wrote the job information to a flat file which is associated with the job. This keeps the job restart data outside the (syncpoint) and simplifies the management of the job restart data.

@thorstenhilker
Copy link

Adding ".faultTolerant()" to the StepBuilder helped here.
No idea why. :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.