Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression due to the fix of BATCH-2442 [BATCH-2840] #775

Closed
spring-issuemaster opened this issue Sep 2, 2019 · 10 comments
Closed

Regression due to the fix of BATCH-2442 [BATCH-2840] #775

spring-issuemaster opened this issue Sep 2, 2019 · 10 comments

Comments

@spring-issuemaster
Copy link
Collaborator

@spring-issuemaster spring-issuemaster commented Sep 2, 2019

Conosci opened BATCH-2840 and commented

I have a scenario where I believe this fix is causing another dangerous bug, some items don't end properly their lifecycle and are silenty ignored. The scenario is the following:

  • Job with 1 Step With Chunk Processing (ChunkSize=10)
  • Reader, Processor, Writer, all with faultTolerant() and skipPolicy(..) (Always Skip) and skipListener(..) to listen to 'onSkipInWrite' and 'onSkipInProcess'
  • Source Data with 3 Records: the first is a good record, the second is giving Processing Error and the third is giving Write Error (duplicate primary key against the first record)

At runtime, after the BATCH-2442 fix, the result is the following:

  1. Reader reads the 3 records chunk

  2. Processor start processing and meet Processing Error in the second record

  3. The System will skip (according to the configured policy) the second record and will invoke the SkipListener (onSkipInProcess) giving to the user the possibility to manage the issue

  4. The System will take the remaining 2 records (the first and the third), will process again them and will try to write them in chunk (same transaction)

  5. During the write the database will return exception (duplicate primary key)

  6. The System will go in 'recovery' mode and will start processing the two items one by one

  7. Now during the scan the following code will execute:

    1. //BATCH-2442 : do not scan skipped items
      if (!inputs.getSkips().isEmpty()) {
        if (outputIterator.hasNext()) {    outputIterator.remove();    return;   }
      }
  8. The condition is then verified, there is a skip (the first Processor error for the second record) and one record from the remaining two (first and third) is then removed without any reason!

  9. At this point only 1 record is then available in outputIterator (the first record), and it will be written successfully.

  10. The job then ends with a non consistent state (there is a writeSkipCount element missing, the totals are wrong):

    1. readCount = 3
    2. processSkipCount = 1
    3. writeCount = 1
    4. writeSkipCount = 0
    5. rollbackCount = 2
    6. commitCount = 2
  11. Basically what happened is that removing the wrong element from the outputIterator, the system didn't catch the Write Exception and didn't execute the proper lifecycle (doOnWriteError/shouldSkip/onSkipInWrite) givinig no change to the user to know about the skipped element in the skipListener

 

The same scenario is working well before the fix version, using Spring Batch version 4.0.1.RELEASE all the listeners are correctly invoked and the ending counts are correct.


Affects: 4.2.0, 4.0.3, 4.1.2

Reference URL: https://jira.spring.io/browse/BATCH-2442

Attachments:

Backported to: 4.3.0.M1, 4.2.1, 4.1.3, 4.0.4

4 votes, 6 watchers

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 17, 2019

Conosci commented

Dear Mahmoud,

attached ([^workspace.zip]) a project to replicate the issue. The output of the same job using the different versions is the following:

+AFTER BATCH2442 FIX:+

. ____ _ __ _ _
/\ / _'_ __ _ ()_ __ __ _ \ \ \
( ( )_
| '_ | '| | ' / _` | \ \ \
\/ )| |)| | | | | || (_| | ) ) ) )
' |
| .__|| ||_| |___, | / / / /
=========||==============|/=///_/
:: Spring Boot :: (v2.1.4.RELEASE)

Starting Step...
ERROR PROCESSING >>record2<<
Step ended, let's check the numbers:
readCount = 3
processSkipCount = 1
writeCount = 1
writeSkipCount = 0
rollbackCount = 2
commitCount = 2
Yes, there is something wrong with the totals! onSkipInWrite has not been called!

 

+BEFORE THE BATCH2442 FIX:+

. ____ _ __ _ _
/\ / _'_ __ _ ()_ __ __ _ \ \ \
( ( )_
| '_ | '| | ' / _` | \ \ \
\/ )| |)| | | | | || (_| | ) ) ) )
' |
| .__|| ||_| |___, | / / / /
=========||==============|/=///_/
:: Spring Boot :: (v2.0.7.RELEASE)

Starting Step...
ERROR PROCESSING >>record2<<
ERROR WRITING >>record3<<
Step ended, let's check the numbers:
readCount = 3
processSkipCount = 1
writeCount = 1
writeSkipCount = 1
rollbackCount = 3
commitCount = 2
Totals seems OK!

 

 

 

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 24, 2019

Matías Morelli commented

Dear Mahmoud ,

I have found the same issue and it's blocking my project.

Any update?

Thanks!!

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 28, 2019

wayaba commented

Hi!! i m facing the same issue :(

Any new about it??

Thanks!!

 

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 29, 2019

Mahmoud Ben Hassine commented

This is planned for the next bug fix release. Please watch the issue for any update.

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 29, 2019

tranky82 commented

Dear Mahmoud,

how to know the planning for then next bug fix release?

Is release date of Version 4.0.4 already known?

 

Thank you in advance.

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 31, 2019

Wolfgang Klaus commented

Dear Mahmoud,

before I found this issue I added a comment to the issue BATCH-2442 which contains a possible solution for the problem.

Please fix it in version 3.0.10 too.

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 31, 2019

Mahmoud Ben Hassine commented

Hi,

There is a regression indeed due to the fix of BATCH-2442. This means we have a gap in our test suite, and this is a good opportunity to improve it!

Here is the missing case (similar to the app provided by Conosci, thanks btw) to add in FaultTolerantStepIntegrationTests:

	@Test(timeout = 3000)
	public void testExceptionInProcessAndWriteDuringChunkScan() throws Exception {
		// Given
		ListItemReader<Integer> itemReader = new ListItemReader<>(Arrays.asList(1, 2, 3));

		ItemProcessor<Integer, Integer> itemProcessor = new ItemProcessor<Integer, Integer>() {
			@Override
			public Integer process(Integer item) throws Exception {
				if (item.equals(2)) {
					throw new Exception("Error during process item " + item);
				}
				return item;
			}
		};

		ItemWriter<Integer> itemWriter = new ItemWriter<Integer>() {
			@Override
			public void write(List<? extends Integer> items) throws Exception {
				if (items.contains(3)) {
					throw new Exception("Error during write");
				}
			}
		};

		Step step = new StepBuilderFactory(jobRepository, transactionManager).get("step")
				.<Integer, Integer>chunk(5)
				.reader(itemReader)
				.processor(itemProcessor)
				.writer(itemWriter)
				.faultTolerant()
				.skipPolicy(new AlwaysSkipItemSkipPolicy())
				.build();

		// When
		StepExecution stepExecution = execute(step);

		// Then
		assertEquals(BatchStatus.COMPLETED, stepExecution.getStatus());
		assertEquals(ExitStatus.COMPLETED, stepExecution.getExitStatus());
		assertEquals(3, stepExecution.getReadCount());
		assertEquals(1, stepExecution.getWriteCount());
		assertEquals(1, stepExecution.getWriteSkipCount());
		assertEquals(1, stepExecution.getProcessSkipCount());
		assertEquals(3, stepExecution.getRollbackCount());
		assertEquals(2, stepExecution.getCommitCount());
	}

This test passes with 4.0.1 and fails with 4.0.2.

Wolfgang Klaus Your fix LGTM and makes this test pass (without breaking the test that was added to cover the infinite loop reported in BATCH-2442).

Can you please open a PR with your fix as well as this new test? I will review/merge it asap.

Please fix it in version 3.0.10 too.

We already announced that 3.0.10 is the last maintenance release of the 3.x line. Any reason you can't upgrade to 4.x?

tranky82 we did not set any date for the next bug fix release yet, we will let you know asap.

 

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 31, 2019

Wolfgang Klaus commented

I opened a PR for the fix including the test from previous comment.

Mahmoud Ben Hassine we need 3.x because we are currently stuck on Java7 because of other dependencies.

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Nov 5, 2019

Mahmoud Ben Hassine commented

Thank you for opening the PR! We will merge it for the next bug fix release.

As mentioned previously, the 3.x line is not maintained anymore, so the fix will be applied for v4+. If you need support for the migration, we can help.

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Nov 14, 2019

Mahmoud Ben Hassine commented

I merged the PR, Thank you! The fix has been backported to 4.0.4, 4.1.3 and 4.2.1 which are planned to be released on December 3rd, 2019.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.