Low performance of C++ multi-channel streaming and unary scenarios on nightly benchmarks #14899
Comments
Agreed - I just noticed this as well... |
Hello, if I am not mistaken, only the C++ benchmark is unexpectedly slow, right? Not the C++ async server itself? I have tested locally using a basic bench with
With this I get around 73k-51k qps, depending on the latency and "nb of req in flight". What can I do to help? PS: just in case you made the same mistake I did: make sure to build everything in Release; Debug builds can't go much higher than 10k req/s. NOTE: tested on localhost, Haswell 4 cores, 32GB RAM, latency set with |
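The dependence of qps on latency and on the number of requests in flight follows from Little's law: in a closed-loop benchmark, throughput is bounded by outstanding requests divided by mean round-trip latency. A minimal sketch of that arithmetic follows; the function names are hypothetical and are not part of gRPC or its qps driver.

```cpp
#include <cstddef>

// Little's law for a closed-loop benchmark: with `in_flight` outstanding
// requests and a mean round-trip latency of `latency_seconds`, sustained
// throughput cannot exceed in_flight / latency_seconds.
// (Hypothetical helper, not a gRPC API.)
double max_qps(std::size_t in_flight, double latency_seconds) {
    if (latency_seconds <= 0.0) return 0.0;  // guard against division by zero
    return static_cast<double>(in_flight) / latency_seconds;
}

// Inverse form: how many requests must be outstanding to sustain
// `target_qps` at the given mean latency.
double required_in_flight(double target_qps, double latency_seconds) {
    return target_qps * latency_seconds;
}
```

As an illustration with made-up numbers (not the setup described above), 100 requests in flight at a 2 ms round trip cap throughput at about 50,000 qps, so a low result can come from too few outstanding requests rather than from a slow server.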
What's the status on this? |
Have the same question. Why, on the official gRPC benchmark, is C++ performance much poorer than Java's? |
I'm very curious about what happened - I used to see 2.5M+ QPS for C++. This should be prioritized, at least to understand what's going on and whether there's a plan to get the same performance back, or was it a fluke?
|
I didn't see much difference for my own C++ async benchmark code between grpc v.18 and v.1.6: roughly 13w/s (130k) qps for one-thread stream ping-pong and 45w/s (450k) for 4 threads. But the chart shows Java performing much better than C++, which is really confusing. |
I ran one of the scenarios on a single GCE instance and found that C++ performed better than Java in a local environment by 100,000 QPS. Also, the CPU utilization for both server and client was equivalent for C++ and Java, something I didn't observe in the dashboards. I'm continuing my investigation by setting up 1-server, 1-client GCE instances to see whether there is a regression in network communication. |
Internal issue b/125367392 |
This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions! |
this is still a problem - let's leave open. |
Lowering priority, as this is not an issue with C++ or the benchmark itself. It's a setup issue where the benchmark clients are underutilized, and we are working on root-causing it. |
Can you please elaborate? Did you not find a regression in network communication? Can we be confident that a C++ async program will perform well? |
The reason we think it's a problem with the test setup and not an actual gRPC C++ performance regression is that we have a set of internal benchmarks that perform much better than the ones available externally, and the internal benchmarks do not show the problem described in this issue. (Unfortunately, we cannot publish the internal numbers because details of the internal Google infrastructure are confidential.) We're still investigating the root cause of the unexpectedly low test results in some C++ scenarios in the external benchmark suite - it's quite complicated. We'll update this issue once we find something. |
This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 day. Thank you for your contributions! |
this is still a problem - let's leave open. |
Please answer these questions before submitting your issue.
Should this be an issue in the gRPC issue tracker?
Yes
What version of gRPC and what language are you using?
c++, grpc 1.10
What operating system (Linux, Windows, …) and version?
Linux CentOS 7.2, kernel 3.10
What runtime / compiler are you using (e.g. python version or version of gcc)
gcc 7.2
What did you do?
Perform insecure unary throughput qps test using grpc's c++ qps driver
What did you expect to see?
Reasonable qps result.
What did you see instead?
According to the gRPC performance test page for the latest release version (https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5636470266134528), C++ unary and streaming performance is much poorer than Java's (less than half of Java's).
This was not the case when I last checked this page for gRPC 1.6. C++ qps was twice Java's back then; now it's less than half. What's behind this drastic drop in throughput?
I also ran the unary qps test myself with gRPC's qps driver. The default poll engine is now epoll1 for me, and I observe heavy load on a single thread named "gpr_executor", while the other threads are relatively idle. The qps result is only 250k on two machines with 64 logical cores.
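To compare poll engines when reproducing this, gRPC core on POSIX-style systems reads the `GRPC_POLL_STRATEGY` environment variable, a comma-separated list of engines tried in priority order. The sketch below only parses that variable so a benchmark wrapper can log which engines were requested; the helper names are hypothetical, and whether a given engine is available depends on the gRPC version and platform.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated engine list such as "epoll1,poll" into its parts,
// skipping empty entries. (Hypothetical helper, not a gRPC API.)
std::vector<std::string> split_poll_strategy(const std::string& value) {
    std::vector<std::string> engines;
    std::istringstream stream(value);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (!item.empty()) engines.push_back(item);
    }
    return engines;
}

// Return the engines requested through GRPC_POLL_STRATEGY, or an empty
// vector when the variable is unset (gRPC then picks its own default).
std::vector<std::string> requested_poll_engines() {
    const char* raw = std::getenv("GRPC_POLL_STRATEGY");
    return raw ? split_poll_strategy(raw) : std::vector<std::string>{};
}
```

The variable has to be set in the environment before the benchmark process initializes gRPC, for example by exporting `GRPC_POLL_STRATEGY=poll` in the shell that launches the qps worker.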