Creating a job during a Spring Boot web request with a default DB connection pool size leads to JDBC timeout [BATCH-2780] #825

Open
spring-issuemaster opened this issue Dec 16, 2018 · 2 comments

@spring-issuemaster spring-issuemaster commented Dec 16, 2018

FlorianSW opened BATCH-2780 and commented

While working with Spring Batch on a project, I ran into the following problem. The project consists of:

  • a MySQL database (mysqld 10.2.13-MariaDB)
  • Spring Boot (2.1.1.RELEASE)
  • Spring Batch (4.1.0.RELEASE)
  • Spring Batch is configured to use the same datasource as the business logic for the JobRepository
  • The Hikari connection pool for the datasource is configured with a size of 4 (the default when pushing the app to our CloudFoundry instance; it is injected during the auto-reconfiguration of Spring)

For reference, see the sample project, a minimal setup I created to reproduce the problem (see reference URL).

The problem:
Suppose you have a controller that handles one RequestMapping in which at least two things happen: the controller first performs an arbitrary action against the business model database schema (e.g. saving or loading an entity) and afterwards starts a Spring Batch job through a call to JobLauncher#run. Spring Batch is configured to run the tasks asynchronously using a ThreadPoolTaskExecutor with a pool size of 1.

This works fine if the request mapping is called only a few times per second (roughly 1-3 times). However, if the mapping is called more often than that, say 4 or more times (during testing I used between 4 and 20 requests) in a synchronized way (when testing locally), the requests run into a deadlock, and some of them are aborted with the following exception:

Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.DataAccessResourceFailureException: Could not obtain last_insert_id(); nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 5001ms.] with root cause

This problem can be mitigated when increasing the size of the connection pool to at least 7 (I don't know where this number comes from). After restarting the application and running my test JavaScript code[1] to dispatch a number of requests, I can easily increase the number of fetch requests to 150 and the application handles them without any problems (as expected). Before increasing the pool size, numerous requests ran into a timeout.

I'm not sure whether this qualifies as a bug, an improvement, or something else, and I'm not sure whether the affected component is the documentation or something else. However, my intention with this issue is:

Finding out, if Spring Batch, together with Spring Boot, requires a minimum number of available connections in the DB connection pool? If so, should this be mentioned in the documentation?
Or is this a bug, in that the JDBC connections used during request processing are not returned to the pool within a reasonably short amount of time?

[1]

for (let i = 0; i <= 150; i++) { fetch('http://localhost:8080/api/jobs/' + i, {method: 'PUT'}); }
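
For readers who would rather drive the same burst from the JVM than from a browser console, the following is a minimal sketch using only the JDK's java.net.http client (Java 11+). The class name JobRequestBurst and the helper buildRequests are hypothetical names introduced here; only the URL pattern and the PUT method come from the script above. The failure marker -1 is also just a convention of this sketch.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class JobRequestBurst {

    // Builds `count` body-less PUT requests for baseUrl + 0 .. baseUrl + (count - 1).
    // Hypothetical helper; mirrors the URL pattern of the JavaScript snippet.
    static List<HttpRequest> buildRequests(String baseUrl, int count) {
        List<HttpRequest> requests = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            requests.add(HttpRequest.newBuilder(URI.create(baseUrl + i))
                    .PUT(HttpRequest.BodyPublishers.noBody())
                    .build());
        }
        return requests;
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        List<CompletableFuture<Integer>> statuses = new ArrayList<>();

        // Fire all requests without waiting, like the fetch() loop does.
        for (HttpRequest request : buildRequests("http://localhost:8080/api/jobs/", 150)) {
            statuses.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                    .thenApply(HttpResponse::statusCode)
                    .exceptionally(error -> -1)); // -1 marks a request that failed to complete
        }

        // Wait for every request and print its HTTP status (or -1).
        statuses.forEach(status -> System.out.println(status.join()));
    }
}
```

Running main requires the application to be listening on localhost:8080; buildRequests on its own does not touch the network.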

Affects: 4.1.0

Reference URL: https://github.com/FlorianSW/spring-batch-connection-issue

@spring-issuemaster spring-issuemaster commented Dec 17, 2018

Mahmoud Ben Hassine commented

Thank you for opening this issue and for preparing a project to reproduce it. However, this is better asked on Stack Overflow, as someone else might have already run into this issue and can answer your question. Please update the description with the full stack trace.

I looked at your code and was a bit confused by the test (it uses two CountDownLatch instances to synchronize threads; I would have used a task executor instead of creating threads manually). I will dig deeper and debug your code to see if this is an issue in Spring Batch or not. In the meantime, please add a comment if you manage to fix it.

Finding out, if Spring Batch, together with Spring Boot, requires a minimum number of available connections in the DB connection pool?

No, Spring Batch does not require any specific setting on the datasource bean. It is up to you to configure the datasource according to your needs, and Spring Batch will use it as is. Naturally, you need to provide it with at least one connection (min poolSize = 1) so that the job repository can interact with the database to persist job meta-data.

@spring-issuemaster spring-issuemaster commented Dec 20, 2018

Mahmoud Ben Hassine commented

I debugged this use case and noticed that the timeout happens even without calling modelRepository.save(model) in the controller. However, I cannot confirm that the issue is with Spring Batch, because I see there are two transaction managers in the application context (transactionManager, which is autoconfigured by Spring Boot, and jpaTransactionManager, which is created in SpringBatchConfiguration).

I suspect there are two (or more) components competing for the 4 available connections from Hikari, and one of them is timing out.
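
As a rough illustration of how this kind of contention can produce timeouts, here is a toy model that uses a java.util.concurrent.Semaphore in place of the connection pool. It rests on an assumption that is not confirmed anywhere in this thread: that each request holds one connection while it waits for a second one. The class PoolContentionSketch and the method countTimeouts are hypothetical names, and the sketch does not use Hikari, Spring, or the reproduction project. It also does not explain why 7 was the threshold reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolContentionSketch {

    // Runs `requests` concurrent tasks against a "pool" of `poolSize` permits.
    // ASSUMPTION: every task keeps its first permit while waiting up to
    // `timeoutMs` for a second one. Returns how many tasks timed out.
    static int countTimeouts(int poolSize, int requests, long timeoutMs) throws Exception {
        Semaphore pool = new Semaphore(poolSize);
        // Tasks wait here until as many as possible hold their first permit.
        CountDownLatch firstPermitsHeld = new CountDownLatch(Math.min(poolSize, requests));
        ExecutorService executor = Executors.newFixedThreadPool(requests);
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                results.add(executor.submit(() -> {
                    pool.acquire(); // first "connection"
                    try {
                        firstPermitsHeld.countDown();
                        firstPermitsHeld.await();
                        // second "connection", requested while still holding the first
                        if (!pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                            return true; // timed out
                        }
                        pool.release();
                        return false;
                    } finally {
                        pool.release(); // always return the first permit
                    }
                }));
            }
            int timeouts = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    timeouts++;
                }
            }
            return timeouts;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool=4, 1 request:  timeouts=" + countTimeouts(4, 1, 200));
        System.out.println("pool=4, 4 requests: timeouts=" + countTimeouts(4, 4, 200));
        System.out.println("pool=8, 4 requests: timeouts=" + countTimeouts(8, 4, 200));
    }
}
```

In this model a single request never times out, four concurrent requests against four permits starve each other until at least one gives up, and a pool large enough to cover two permits per request avoids the problem. Whether the real application follows this pattern would have to be checked against the full stack trace.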

This problem can be mitigated when increasing the size of the connection pool

Since this is not blocking, I will resume working on this ticket once it is in our priority list. @FlorianSW In the meantime, please let us know if you found the cause.
