Spring Data R2DBC two times slower than Spring Data JDBC for Flux response #203

Closed
Aleksandr-Filichkin opened this issue Sep 30, 2019 · 16 comments
Labels
status: declined A suggestion or change that we don't feel we should currently apply

Comments

@Aleksandr-Filichkin

Aleksandr-Filichkin commented Sep 30, 2019

Hi,

I've compared Spring Data R2DBC with Spring Data JDBC (Postgres DB) and see that add/get/delete performance is almost the same. But for the method

public Flux<User> getWithLimit(@RequestParam int limit) {
    return userRepository.findWithLimit(limit);
}

@Query("select * from blog.user limit $1")
Flux<User> findWithLimit(int limit);

the throughput is two times worse.

Here

@Aleksandr-Filichkin
Author

@mp911de Not sure if it's possible to improve the performance.
Also, maybe the problem is with postgres-r2dbc, not with this repo.

@mp911de
Member

mp911de commented Sep 30, 2019

Thanks for having a look. Please check out our performance efforts at pgjdbc/r2dbc-postgresql#158 that contain JMH benchmarks. Results clearly show that PGJDBC is significantly faster in extended query mode than R2DBC.

We're happy to receive contributions in the form of optimizations or if someone is able to pin down a specific issue that we can address.

On a related note: with a bare-bones netty client that just receives and sends Postgres frames (without decoding them and without all the reactive dance), we weren't able to significantly improve SELECT performance. The best value there was about 60% of the PGJDBC performance. We'd be happy to get you involved in that effort.

@mp911de mp911de added the status: ideal-for-contribution An issue that a contributor can help us with label Sep 30, 2019
@Aleksandr-Filichkin
Author

@mp911de, thank you. I hadn't seen these great JMH benchmarks. I got pretty much the same result.

@mp911de mp911de added status: declined A suggestion or change that we don't feel we should currently apply and removed status: ideal-for-contribution An issue that a contributor can help us with labels Oct 15, 2019
@mp911de
Member

mp911de commented Oct 15, 2019

Closing this issue as this isn't something we can work on for now.

@mp911de mp911de closed this as completed Oct 15, 2019
@marccollin

So there is actually no way to improve the performance of select queries?

@mp911de
Member

mp911de commented Oct 15, 2019

Primarily, this is a gap in the Postgres driver. On localhost, PGJDBC yields ~15,000 queries/sec and the R2DBC driver ~11,000 queries/sec (measured on my machine). Putting Postgres into Docker yields ~1360 queries/sec with PGJDBC and ~1355 queries/sec with the R2DBC driver (both on localhost). So inducing latency, as in real-world applications, narrows the gap from a ~36% difference to a ~1% difference.
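
For readers who want to check the arithmetic behind those percentages, here is a minimal sketch. The class and method names are hypothetical, and the gap is assumed to mean "how much faster PGJDBC is, relative to the R2DBC figure"; the inputs are the approximate queries/sec values quoted in the comment.

```java
// Hypothetical helper: relative gap between two throughput figures.
final class ThroughputGap {
    // How much faster `fast` is than `slow`, as a percentage of `slow`.
    static double percentFaster(double fast, double slow) {
        return (fast - slow) / slow * 100.0;
    }

    public static void main(String[] args) {
        // localhost figures: ~15,000 (PGJDBC) vs ~11,000 (R2DBC) queries/sec
        System.out.printf("localhost: %.1f%%%n", percentFaster(15_000, 11_000));
        // Docker figures: ~1360 (PGJDBC) vs ~1355 (R2DBC) queries/sec
        System.out.printf("docker:    %.1f%%%n", percentFaster(1_360, 1_355));
    }
}
```

With these inputs the localhost gap comes out at roughly 36% and the Docker gap at well under 1%, which matches the order of magnitude stated in the comment.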

Please also note that, due to heavy Mono and Flux creation, we experience increased GC pressure, and a 25% drop in queries/second for queries returning a small number of items is the normal mode of operation. The strengths of reactive data access come into play when the queried amount of data is huge, when the database requires multiple roundtrips (as with cursors), or when the application can benefit from pipelining. These aren't the cases found in archetypical benchmark comparisons.

/cc @davecramer

@marccollin

Thanks a lot for these explanations.

@davecramer

It's actually fairly difficult to get better performance than the blocking driver when testing a single connection. I don't think anyone has tried to measure concurrent bandwidth.

@marccollin

marccollin commented Oct 15, 2019

There should be some... to help people choose between the two ways of development.

@davecramer

@marccollin some?
The bigger difference is how many threads you require to read N JDBC connections vs. N R2DBC connections. JDBC is 1:1, whereas R2DBC should be 1:many.
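
The 1:1 versus 1:many point can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses no JDBC or R2DBC API at all, the class and method names are hypothetical, `Thread.sleep` stands in for a blocking read, and a timer on a single scheduler thread stands in for a non-blocking driver's I/O event loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation; no real database or driver is involved.
final class ThreadModelSketch {
    static final long LATENCY_MS = 50; // assumed per-query latency

    // Blocking style: every simulated connection parks its own thread while waiting.
    static int blockingThreadsUsed(int connections) {
        Set<String> threads = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(connections);
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < connections; i++) {
                pending.add(CompletableFuture.runAsync(() -> {
                    threads.add(Thread.currentThread().getName());
                    try {
                        Thread.sleep(LATENCY_MS); // thread is occupied for the whole wait
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, pool));
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return threads.size();
    }

    // Non-blocking style: one thread reacts to completion events for all connections.
    static int eventLoopThreadsUsed(int connections) {
        Set<String> threads = ConcurrentHashMap.newKeySet();
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < connections; i++) {
                CompletableFuture<Void> done = new CompletableFuture<>();
                // The timer firing plays the role of "response arrived on the socket".
                loop.schedule(() -> {
                    threads.add(Thread.currentThread().getName());
                    done.complete(null);
                }, LATENCY_MS, TimeUnit.MILLISECONDS);
                pending.add(done);
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            loop.shutdown();
        }
        return threads.size();
    }

    public static void main(String[] args) {
        System.out.println("blocking threads:   " + blockingThreadsUsed(8));
        System.out.println("event-loop threads: " + eventLoopThreadsUsed(8));
    }
}
```

In this simulation, eight concurrent waits occupy eight threads in the blocking variant and a single thread in the event-loop variant. It says nothing about per-query speed, only about how many threads are tied up while waiting.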

@mp911de
Member

mp911de commented Oct 15, 2019

You don't buy into reactive programming because you want your queries to run faster. You apply reactive programming patterns to improve your application scalability and resilience.

@marccollin

@davecramer concurrent bandwidth

@davecramer

@marccollin I'd love to see some numbers. I think it's probably not easy to test, given that the backend will have some effect on it as well.

@ianynchen

@mp911de And throughput too. Reactive programming in general should slow your program down slightly when the system is not under pressure, if there is a noticeable difference at all.

@wplong11

wplong11 commented Oct 29, 2022

@mp911de reply to #203 (comment)

> You don't buy into reactive programming because you want your queries to run faster. You apply reactive programming patterns to improve your application scalability and resilience.

How does reactive programming relate to application scalability and resilience? Maybe you are confusing it with Reactive Systems? Or is this a joke?

@TheOnlyGodOfCoding

@wplong11 Reactive programming and reactive systems are related concepts. Reactive programming is a programming paradigm that is designed to allow developers to build systems that are scalable, resilient, and responsive. It is based on the idea of creating software that reacts to changes in the environment, rather than being actively controlled by the developer.

Reactive systems, on the other hand, are systems that are designed and built using reactive programming principles. They are systems that are event-driven, non-blocking, and capable of handling a high volume of concurrent requests and events. Reactive systems are often characterized by their ability to scale horizontally and their use of asynchronous, message-driven communication between components.

Overall, reactive programming is a way of building software, while reactive systems are systems that are built using reactive programming principles. Reactive programming can be used to build a wide variety of systems, including web applications, distributed systems, and mobile apps. Reactive systems are often used to build highly scalable and resilient systems that can handle a high volume of concurrent requests and events.

Can you share your thoughts on why reactive programming is not related to application scalability and resilience?

7 participants