I have been testing the performance of ElasticsearchOperations (Spring Data Elasticsearch, which uses org.elasticsearch.client.RestClient underneath) by calling my controller endpoints, and I noticed a deadlock when using virtual threads together with a custom IOReactorConfig whose IoThreadCount is set to a low number (1 in this example). I used hatoo/oha with both the number of concurrent requests and the total number of requests set to 100, and the application deadlocked. The thread dump from jcmd showed pinned virtual threads waiting inside a synchronized block.
The Elasticsearch client thread that should have notified the threads waiting in the synchronized block was itself blocked on the ReentrantLock in AbstractNIOConnPool.
Maybe it happened because:
- virtual threads that entered AbstractNIOConnPool.lease were unmounted due to some I/O operations inside the method while still holding the AbstractNIOConnPool ReentrantLock;
- other virtual threads were mounted in their place and blocked in Object.wait(), pinning their carrier threads;
- no platform threads were left to resume the unmounted virtual threads;
- elasticsearch-rest-client-0-thread-2 could never acquire the lock it needed and was blocked forever.
This explanation may be wrong, and a different interleaving may be responsible.
Either way, a deadlock was observed, and it would be good to change the current RestClient implementation so that it cannot happen.
Tested configuration:
Dependencies:
- Spring Boot 3.3.1, Spring Data Elasticsearch, Spring Boot MVC
Other:
- IOReactorConfig with IoThreadCount set to 1
Hardware: a processor with 8 physical cores.
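For anyone trying to reproduce this, the single I/O thread setup can probably be approximated as follows. This is only a sketch: the class name, host, and port are placeholders and are not taken from the report, and in a Spring Data Elasticsearch application the same customization would normally be applied through that project's client configuration rather than by building the RestClient by hand.

```java
import org.apache.http.HttpHost;
import org.apache.http.impl.nio.reactor.IOReactorConfig;
import org.elasticsearch.client.RestClient;

// Hypothetical helper, not part of the original report.
final class SingleIoThreadClient {

    // Builds a low-level RestClient whose I/O reactor uses a single thread.
    // "localhost" and 9200 are assumed values for a local test cluster.
    static RestClient create() {
        IOReactorConfig ioConfig = IOReactorConfig.custom()
                .setIoThreadCount(1)
                .build();

        return RestClient.builder(new HttpHost("localhost", 9200, "http"))
                .setHttpClientConfigCallback(
                        http -> http.setDefaultIOReactorConfig(ioConfig))
                .build();
    }
}
```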
How to fix the problem, but lose throughput?
The problem can be worked around by wrapping every RestClient method call in a ReentrantLock.lock / ReentrantLock.unlock pair (for example, using Spring AOP). With that in place, no deadlocks were detected when testing manually with hatoo/oha at 10_000 concurrent requests. However, this solution serializes all client calls and therefore degrades overall throughput.
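A minimal sketch of that workaround, without Spring AOP, could look like the following. `SerializedCalls` and its `call` method are hypothetical names introduced for illustration; the idea is only that every client call goes through one shared ReentrantLock, so at most one caller is inside the client at a time.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper: runs every submitted action under one shared lock.
final class SerializedCalls {

    private final ReentrantLock lock = new ReentrantLock();

    // Executes the action while holding the lock and always releases it,
    // even if the action throws.
    <T> T call(Supplier<T> action) {
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would wrap each client invocation, for example `guard.call(() -> operations.search(query, MyDocument.class))`, where `operations`, `query`, and `MyDocument` stand for whatever the application already uses. Because only one request can be in flight at a time, the throughput loss described above is expected.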
Hello! As you already figured out, this is a problem with the underlying RestClient used by the Java client. Unfortunately, updating it probably won't be enough to fix this issue: many users have been reporting problems when combining virtual threads with locks or synchronized blocks, this thread being one of many examples.
This is likely going to be fixed in Java 23, as explained in this JEP draft. Until then, we have no way of ensuring compatibility with virtual threads.
Java API client version
8.13.4
Java version
21
Elasticsearch Version
8.14.3
Problem description
The thread dump also showed that some of the unmounted virtual threads were waiting for the lock in the AbstractNIOConnPool.
JVM and Spring settings used in the test:
-Djdk.virtualThreadScheduler.maxPoolSize=8
server.tomcat.threads.max: 8
spring.threads.virtual.enabled: true