JdbcItemReader restart behavior #103

Closed
ghost opened this issue Nov 10, 2017 · 4 comments

@ghost

ghost commented Nov 10, 2017

Hi,

I think JdbcItemReader might have an issue with its checkpoint information. The method checkpointInformation() always returns currentRowNumber. So, after a transaction rollback, the stored checkpoint would be the last record in the result set that was successfully processed and committed. I'd therefore expect open() to set the result set cursor to exactly this position and continue processing (rs.next()) after the last successfully processed item.
Is there any other reason for open() to use checkpoint - 1 instead of just checkpoint?

Best,

-fd

@chengfang chengfang self-assigned this Nov 10, 2017
@chengfang
Contributor

I think the reason is, in the readItem method: https://github.com/jberet/jsr352/blob/master/jberet-support/src/main/java/org/jberet/support/io/JdbcItemReader.java#L242

we call resultSet.next(), which moves the cursor forward by one, so checkpoint - 1 compensates for that forward move.

For instance, suppose we read the first item and process it, but the writer fails, so the transaction rolls back. The reader checkpoint info is now 1. When restarting, we reposition the resultSet to checkpoint - 1, which is 0. Then, in the readItem method, calling resultSet.next() positions it on the first item, and the subsequent resultSet.getObject() returns the first item in the resultSet.
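
The cursor arithmetic described above can be traced with a small sketch. It does not use JBeret or a real database: `FakeCursor` is a hypothetical in-memory stand-in for a scrollable `ResultSet` (rows are 1-based, position 0 means "before the first row"), and `firstItemAfterRestart` is an illustrative helper, not part of `JdbcItemReader`.

```java
import java.util.List;

public class CursorSketch {
    // Hypothetical stand-in for a scrollable ResultSet.
    // Rows are numbered from 1; position 0 means "before the first row".
    static final class FakeCursor {
        private final List<String> rows;
        private int position = 0;

        FakeCursor(List<String> rows) {
            this.rows = rows;
        }

        // Mirrors the idea of ResultSet.absolute(row): jump to a given row number.
        void absolute(int row) {
            position = row;
        }

        // Mirrors the idea of ResultSet.next(): advance by one row if one exists.
        boolean next() {
            if (position < rows.size()) {
                position++;
                return true;
            }
            return false;
        }

        String current() {
            return rows.get(position - 1);
        }
    }

    // Returns the first item a reader would see after repositioning and one next() call.
    static String firstItemAfterRestart(List<String> rows, int repositionTo) {
        FakeCursor cursor = new FakeCursor(rows);
        cursor.absolute(repositionTo);
        return cursor.next() ? cursor.current() : null;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("item1", "item2", "item3");
        int checkpoint = 1; // the checkpoint value assumed in the example above

        // Repositioning to checkpoint - 1 re-reads the first item.
        System.out.println(firstItemAfterRestart(rows, checkpoint - 1)); // item1
        // Repositioning to checkpoint continues with the second item.
        System.out.println(firstItemAfterRestart(rows, checkpoint));     // item2
    }
}
```

Which of the two results is the desired one depends on whether a checkpoint value of 1 means "item 1 was committed" or "item 1 still needs to be read".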

Have you seen any actual unexpected behavior in restarting in your app? Any reproducible app will be even better for pinpointing the problem.

@ghost
Author

ghost commented Nov 13, 2017

I think the point is that, in your example, the current row is not what gets persisted as a checkpoint, because the transaction is rolled back. So the checkpoint will be the index of the last item that was successfully processed and committed, which depends on your chunk size.

I experienced unexpected behavior in the following scenario: Suppose you process 20 elements with a chunk size of 10. So for every 10 elements, the current row is used as a checkpoint and the entire chunk is passed to the writer. If processing of element 11 leads to an error, the last checkpoint is position 10. On retry, my app ends up processing element 10 twice, because checkpoint - 1 followed by resultSet.next() effectively sets the cursor to position 10 instead of 11.

The JpaItemReader has the correct positioning behavior. Therefore I'd recommend just using checkpoint instead of checkpoint - 1 to reposition the cursor.
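
To make the scenario concrete, here is a minimal simulation of the restart, assuming 20 rows and a committed checkpoint of 10. It only models the cursor arithmetic with plain integers. `itemsReadAfterRestart` is a hypothetical helper, not JBeret code, and the row number doubles as the element number.

```java
import java.util.ArrayList;
import java.util.List;

public class RestartScenario {
    // Simulates a restart: reposition the cursor, then read until the rows run out.
    // "position" plays the role of the ResultSet cursor (0 = before the first row).
    static List<Integer> itemsReadAfterRestart(int totalRows, int repositionTo) {
        List<Integer> read = new ArrayList<>();
        int position = repositionTo;      // stands in for repositioning the result set
        while (position < totalRows) {    // stands in for resultSet.next() returning true
            position++;
            read.add(position);           // element number equals row number here
        }
        return read;
    }

    public static void main(String[] args) {
        int totalRows = 20;
        int checkpoint = 10; // last committed position after the first chunk

        // checkpoint - 1: the first item read on restart is element 10 again.
        List<Integer> withMinusOne = itemsReadAfterRestart(totalRows, checkpoint - 1);
        System.out.println(withMinusOne.get(0) + " .. " + withMinusOne.size() + " items"); // 10 .. 11 items

        // checkpoint: the restart continues with element 11.
        List<Integer> withCheckpoint = itemsReadAfterRestart(totalRows, checkpoint);
        System.out.println(withCheckpoint.get(0) + " .. " + withCheckpoint.size() + " items"); // 11 .. 10 items
    }
}
```

Under these assumptions, repositioning to checkpoint - 1 yields 11 items starting at element 10, while repositioning to checkpoint yields the 10 remaining items starting at element 11.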

@chengfang
Contributor

https://issues.jboss.org/browse/JBERET-364 was created to track this issue and fix.

@chengfang
Contributor

This issue should be fixed with the above commit. Thanks for reporting it. Let us know if you run into anything else.
