
Low performance of C++ multi-channel streaming and unary scenarios on nightly benchmarks #14899

Open
hellishfire opened this issue Apr 2, 2018 · 16 comments


@hellishfire

Please answer these questions before submitting your issue.

Should this be an issue in the gRPC issue tracker?

Yes

What version of gRPC and what language are you using?

C++, gRPC 1.10

What operating system (Linux, Windows, …) and version?

Linux CentOS 7.2, kernel 3.10

What runtime / compiler are you using (e.g. python version or version of gcc)

gcc 7.2

What did you do?

Performed an insecure unary throughput QPS test using gRPC's C++ QPS driver.

What did you expect to see?

A reasonable QPS result.

What did you see instead?

According to the gRPC performance test page for the latest release version (https://performance-dot-grpc-testing.appspot.com/explore?dashboard=5636470266134528), C++ unary and streaming performance is much poorer than Java's (less than half).

This was not the case when I last checked this page for gRPC 1.6. C++ QPS was twice Java's back then; now it's less than half. What's with this drastic drop in throughput performance?

I also ran the unary QPS test myself with gRPC's QPS driver. The default poll engine is now epoll1 for me, and I observe heavy load on a single thread named "gpr_executor" while other threads are relatively idle. The QPS result is only 250k across two machines with 64 logical cores.
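
If the new default polling engine is a suspect, one low-effort check is to pin the engine explicitly through the GRPC_POLL_STRATEGY environment variable and rerun the measurement. Below is a minimal sketch, assuming the stock helloworld proto and a server on localhost:50051; the set of valid strategy names (e.g. epoll1, poll) varies by gRPC version and platform:

```cpp
// Minimal sketch: pin the polling engine before any gRPC object is
// created, since GRPC_POLL_STRATEGY is read once at core initialization.
// Assumes the stock helloworld proto and a server on localhost:50051.
#include <cstdlib>

#include <grpcpp/grpcpp.h>
#include "helloworld.grpc.pb.h"

int main() {
  // Candidate values include "epoll1" and "poll"; availability depends on
  // the gRPC version and platform.
  setenv("GRPC_POLL_STRATEGY", "poll", /*overwrite=*/1);

  auto stub = helloworld::Greeter::NewStub(grpc::CreateChannel(
      "localhost:50051", grpc::InsecureChannelCredentials()));

  helloworld::HelloRequest request;
  request.set_name("bench");
  helloworld::HelloReply reply;
  grpc::ClientContext context;
  grpc::Status status = stub->SayHello(&context, request, &reply);
  return status.ok() ? 0 : 1;
}
```

Running the same scenario under epoll1 and poll would show whether the gpr_executor hotspot moves with the engine.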

@Disturbing

Agreed - I just noticed this as well...

@ghost

ghost commented Jun 14, 2018

Hello,

If I am not mistaken, only the C++ benchmark is unexpectedly slow, right? Not the C++ async server itself?

I have tested locally using a basic benchmark with:

  • greeter_async_server.cc: unmodified, straight from GitHub
  • greeter_async_client2.cc: modified to send 1M requests, with a configurable number of requests in flight, waiting for all the responses at the end (a sketch of this modification follows below)

With this I get around 51k-73k QPS, depending on the latency and the number of requests in flight. That is with only one server thread. I'm guessing there is a concurrency issue in the QPS tests; 11k QPS/core seems awfully slow (= 88k QPS on 8 cores).

What can I do to help?

PS: just in case you make the same mistake I did: make sure to build everything in Release; Debug builds can't go much higher than 10k req/s.

NOTE: tested on localhost, Haswell 4 cores, 32 GB RAM; latency set with tc qdisc replace dev lo root netem delay XXms
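
For reference, here is a rough sketch of the client modification described above (not the commenter's actual code), assuming the stock helloworld proto, a server on localhost:50051, and illustrative constants kTotal and kInFlight:

```cpp
// Sketch of a modified greeter_async_client2.cc: keep kInFlight RPCs
// outstanding until kTotal responses have been drained from the
// completion queue. Assumes the stock helloworld proto.
#include <memory>

#include <grpcpp/grpcpp.h>
#include "helloworld.grpc.pb.h"

struct Call {
  helloworld::HelloReply reply;
  grpc::ClientContext context;
  grpc::Status status;
  std::unique_ptr<grpc::ClientAsyncResponseReader<helloworld::HelloReply>> rpc;
};

int main() {
  constexpr int kTotal = 1000000;  // total requests to send
  constexpr int kInFlight = 64;    // requests kept in flight

  auto stub = helloworld::Greeter::NewStub(grpc::CreateChannel(
      "localhost:50051", grpc::InsecureChannelCredentials()));
  grpc::CompletionQueue cq;

  helloworld::HelloRequest request;
  request.set_name("bench");

  int sent = 0;
  auto issue = [&] {
    auto* call = new Call;
    call->rpc = stub->AsyncSayHello(&call->context, request, &cq);
    call->rpc->Finish(&call->reply, &call->status, call);  // tag = call
    ++sent;
  };

  // Prime the pipeline, then replace each finished RPC with a new one.
  for (int i = 0; i < kInFlight && sent < kTotal; ++i) issue();

  void* tag;
  bool ok;
  int done = 0;
  while (done < kTotal && cq.Next(&tag, &ok)) {
    delete static_cast<Call*>(tag);
    ++done;
    if (sent < kTotal) issue();
  }

  cq.Shutdown();
  while (cq.Next(&tag, &ok)) {}  // drain remaining events
  return 0;
}
```

Built in Release mode, varying kInFlight reproduces the dependence on the number of requests in flight: each completed RPC is immediately replaced, so the pipeline stays full.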

@ykameshrao

What's the status on this?

@jtattermusch (Contributor)

CC @jtattermusch

@moiyer

moiyer commented Nov 11, 2018

I have the same question. Why, on the official gRPC benchmark, is C++ performance so much poorer than Java's?

@Disturbing

Disturbing commented Nov 12, 2018 via email

@moonspirit

I didn't see much difference for my own C++ async benchmark code between gRPC v.18 and v.1.6: roughly 130k QPS for single-threaded stream ping-pong and 450k QPS for 4 threads.

But the dashboard shows Java performing much better than C++, which is really confusing.
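
For anyone who wants to reproduce a comparable single-threaded stream ping-pong number, a sketch follows. The echo.proto service here is hypothetical (a Chat method taking and returning a stream of Msg, with a server that writes back each message it reads); it is not the commenter's benchmark code:

```cpp
// Hypothetical single-threaded stream ping-pong measurement. Assumes an
// echo.proto with: service Echo { rpc Chat(stream Msg) returns (stream Msg); }
// where the server echoes every message it reads.
#include <chrono>
#include <cstdio>

#include <grpcpp/grpcpp.h>
#include "echo.grpc.pb.h"

int main() {
  constexpr int kPings = 100000;
  auto stub = echo::Echo::NewStub(grpc::CreateChannel(
      "localhost:50051", grpc::InsecureChannelCredentials()));

  grpc::ClientContext ctx;
  auto stream = stub->Chat(&ctx);  // sync bidirectional stream

  echo::Msg ping, pong;
  ping.set_payload("x");

  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < kPings; ++i) {
    // One ping-pong round trip: write a message, read the echo.
    if (!stream->Write(ping) || !stream->Read(&pong)) break;
  }
  stream->WritesDone();
  while (stream->Read(&pong)) {}  // drain until the server half-closes
  grpc::Status status = stream->Finish();

  double secs = std::chrono::duration<double>(
                    std::chrono::steady_clock::now() - start).count();
  std::printf("%.0f msgs/s (%s)\n", kPings / secs,
              status.ok() ? "ok" : "error");
  return status.ok() ? 0 : 1;
}
```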

@vjpai vjpai assigned mhaidrygoog and unassigned ncteisen Jan 29, 2019
@mhaidrygoog (Member)

I ran one of the scenarios on a single GCE instance and found C++ performing better than Java in a local environment by 100,000 QPS. The CPU utilization for both server and client was also equivalent between C++ and Java, which is something I didn't observe in the dashboards. Continuing my investigation by setting up one server and one client on separate GCE instances and seeing if there is a regression in network communication.

@jtattermusch (Contributor)

Internal issue b/125367392

@stale

stale bot commented Sep 4, 2019

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions!

@jtattermusch (Contributor)

This is still a problem - let's leave it open.

@stale stale bot removed the disposition/stale label Sep 4, 2019
@mhaidrygoog mhaidrygoog changed the title Poor c++ performance benchmark result Low performance of C++ multi-channel streaming and unary scenarios on nightly benchmarks Sep 20, 2019
@mhaidrygoog (Member)

Lowering priority, as this is not an issue with C++ or the benchmark itself. It's a setup issue where the benchmark clients are underutilized, and we are working on root-causing it.

@smahapatra1

Can you please elaborate? Did you not find a regression in network communication? Can we be confident that a C++ async program will perform well?

@jtattermusch (Contributor)

Can you please elaborate? Did you not find a regression in network communication? Can we be confident that a C++ async program will perform well?

The reason we think it's a problem with the test setup and not an actual gRPC C++ performance regression is that we have a set of internal benchmarks that perform much better than the ones available externally, and the internal benchmarks don't exhibit the problem described in this issue. (Unfortunately we cannot publish the internal numbers because details of the internal Google infrastructure are confidential.) We're still investigating the root cause of the unexpectedly low test results in some C++ scenarios in the external benchmark suite - it's quite complicated. We'll update this issue once we find something.

@stale

stale bot commented May 6, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@xunliu

xunliu commented Apr 26, 2021

This is still a problem - let's leave it open.
