redis-cache blocks the eventloop #36382

Closed
bartm-dvb opened this issue Oct 10, 2023 · 9 comments · Fixed by #36544
Labels
area/cache area/redis kind/bug Something isn't working triage/upstream Used for issues which are caused by issues in upstream projects/dependency

@bartm-dvb
Contributor

Describe the bug

The Redis cache implementation seems to block the eventloop. HTTP throughput becomes very poor, and the worker_pool_queue_delay_seconds_max metric keeps climbing; it had reached 60s by the time we stopped the load test. It also seems to cause a very high number of open files on the Linux system.

We verified this by switching to the Caffeine cache: throughput increased by a factor of 3, the number of open files dropped, and worker_pool_queue_delay_seconds_max stayed low.
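To illustrate why a queue-delay metric keeps climbing once threads are held up, here is a minimal, deterministic sketch in plain Java. It is not Quarkus or Vert.x code. It assumes the metric reflects how long a task waits before a worker thread picks it up, and the class name, method name, and all numbers (20 workers, 50 ms vs. 500 ms per task, one request every 10 ms) are hypothetical.

```java
import java.util.PriorityQueue;

public class QueueDelaySketch {

    /**
     * Simulates a fixed-size worker pool fed at a constant rate and returns
     * the longest time (in ms) any task waited for a free worker.
     * All parameters are illustrative assumptions, not measured values.
     */
    static long maxQueueDelayMillis(int workers, long serviceMillis,
                                    long arrivalIntervalMillis, int tasks) {
        // Each entry is the time at which one worker becomes free again.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < workers; i++) {
            freeAt.add(0L);
        }
        long maxDelay = 0;
        for (int i = 0; i < tasks; i++) {
            long arrival = i * arrivalIntervalMillis;
            long workerFree = freeAt.poll();            // earliest available worker
            long start = Math.max(arrival, workerFree); // wait if nobody is free yet
            maxDelay = Math.max(maxDelay, start - arrival);
            freeAt.add(start + serviceMillis);          // worker is busy until then
        }
        return maxDelay;
    }

    public static void main(String[] args) {
        // 100 req/s, each task holds a worker for 50 ms: the pool keeps up.
        System.out.println(maxQueueDelayMillis(20, 50, 10, 10_000));
        // Same load, but each task holds a worker for 500 ms: the backlog grows.
        System.out.println(maxQueueDelayMillis(20, 500, 10, 10_000));
    }
}
```

In this sketch, 20 workers at 50 ms per task can serve 400 req/s, so at 100 req/s the delay stays at zero. At 500 ms per task the same pool can only serve 40 req/s, so the delay keeps growing for as long as the load continues.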

[Images: MicrosoftTeams-image, Grafana screenshot]

Expected behavior

The Redis cache implementation does not block the eventloop.

Actual behavior

The Redis cache implementation blocks the eventloop.

How to Reproduce?

  1. Create a sample application with RESTEasy Reactive and the Redis cache
  2. Create a simple endpoint that calls a method annotated with @CacheResult
  3. Run a load test at more than 100 req/s
  4. The console should now log thread-blocked errors

Output of uname -a or ver

Linux 4.18.0-477.27.1.el8_8.x86_64

Output of java -version

openjdk version "17.0.5" 2022-10-18 LTS
OpenJDK Runtime Environment (Red_Hat-17.0.5.0.8-2.el8_6) (build 17.0.5+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.5.0.8-2.el8_6) (build 17.0.5+8-LTS, mixed mode, sharing)

GraalVM version (if different from Java)

No response

Quarkus version or git rev

3.4.1

Build tool (ie. output of mvnw --version or gradlew --version)

3.9.2

Additional information

No response

@bartm-dvb bartm-dvb added the kind/bug Something isn't working label Oct 10, 2023
@quarkus-bot

quarkus-bot bot commented Oct 10, 2023

/cc @cescoffier (redis), @gsmet (redis), @gwenneg (cache), @machi1990 (redis)

@geoand
Contributor

geoand commented Oct 10, 2023

Without an example application that exhibits this problem, it will be very hard to track down the problem

@geoand geoand added the triage/needs-reproducer We are waiting for a reproducer. label Oct 10, 2023
@bartm-dvb
Contributor Author

> Without an example application that exhibits this problem, it will be very hard to track down the problem

I will add an example application a bit later

@bartm-dvb
Contributor Author

Sample application:

https://github.com/bartm-dvb/quarkus-redis-bug

It seems to happen only when the Redis client is configured in cluster mode. Running a load test against this application causes it to run out of open files. I will run tests tomorrow with a higher open-file limit to see whether I run into thread-blocked warnings.

@cescoffier
Member

/cc @Ladicek

@geoand geoand removed the triage/needs-reproducer We are waiting for a reproducer. label Oct 11, 2023
@Ladicek
Contributor

Ladicek commented Oct 11, 2023

> It also seems to cause a very high number of open files on the Linux system.

I'm fairly sure this is caused by vert-x3/vertx-redis-client#365 (connection pooling basically doesn't work at all), which indeed only manifests when using clustered Redis client.

This should be fixed in the next Vert.x release, which should be this week. Once that is integrated in Quarkus, we can see whether there's more to this issue.
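As a rough illustration of why broken pooling can show up as a high open-file count, the following plain-Java sketch counts how many connections get opened with and without reuse. It is not the vertx-redis-client implementation, and it does not reproduce the upstream bug. `FakeConnection`, `SimplePool`, and the request count are hypothetical stand-ins, and the sketch assumes each connection corresponds to one socket (one file descriptor) that stays open.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PoolingSketch {

    /** Hypothetical stand-in for a socket; each instance counts as one open file. */
    static final class FakeConnection {}

    /** Minimal pool: hands out an idle connection if one exists, otherwise opens a new one. */
    static final class SimplePool {
        private final Deque<FakeConnection> idle = new ArrayDeque<>();
        private int opened = 0;

        FakeConnection acquire() {
            FakeConnection c = idle.poll();
            if (c == null) {
                c = new FakeConnection();
                opened++;
            }
            return c;
        }

        void release(FakeConnection c) {
            idle.push(c);
        }

        int opened() {
            return opened;
        }
    }

    /** Sequential requests that return their connection to the pool after use. */
    static int openedWithPooling(int requests) {
        SimplePool pool = new SimplePool();
        for (int i = 0; i < requests; i++) {
            FakeConnection c = pool.acquire();
            pool.release(c);
        }
        return pool.opened();
    }

    /** Same requests, but connections are never handed back, so none can be reused. */
    static int openedWithoutPooling(int requests) {
        SimplePool pool = new SimplePool();
        for (int i = 0; i < requests; i++) {
            pool.acquire(); // connection is leaked instead of released
        }
        return pool.opened();
    }

    public static void main(String[] args) {
        System.out.println("with pooling:    " + openedWithPooling(1_000));
        System.out.println("without pooling: " + openedWithoutPooling(1_000));
    }
}
```

With reuse, 1,000 sequential requests share a single connection. Without it, every request opens its own, which is the kind of growth that would eventually hit the process's open-file limit under load.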

@bartm-dvb
Contributor Author

bartm-dvb commented Oct 12, 2023

Thanks, I see the new Vert.x version was released yesterday. I will rerun the test when the new Quarkus version is released so we can close this issue

@cescoffier cescoffier added the triage/upstream Used for issues which are caused by issues in upstream projects/dependency label Oct 15, 2023
bartm-dvb added a commit to bartm-dvb/quarkus that referenced this issue Oct 16, 2023
- Fixes Redis connection pooling for quarkusio#36382
@cescoffier
Member

Vert.x 4.4.6 seems to fix the issue, or at least improve things a lot.
I still see Redis errors, but I believe that's a Docker issue rather than a Quarkus/Vert.x issue.

@bartm-dvb
Contributor Author

That's good news! Thanks for taking the effort to run the test.

@quarkus-bot quarkus-bot bot added this to the 3.6 - main milestone Oct 19, 2023
@gsmet gsmet modified the milestones: 3.6 - main, 3.5.1 Oct 26, 2023
@aloubyansky aloubyansky modified the milestones: 3.5.1, 3.2.8.Final Nov 2, 2023