
grpcbox-ct: Stress test grpcbox #29

Open
wants to merge 3 commits into main

Conversation

vasu-dasari
Contributor

Added a stress_test case to stress the number of concurrent gRPC sessions that can be run. By default it is set to 10, but it can be modified via the environment variable GRPCBOX_STRESS_TEST.

# To run 25 concurrent sessions:
$ GRPCBOX_STRESS_TEST=25 rebar3 ct --suite grpcbox_SUITE --case stress_test

On my system, the above command starts failing at around 30-35 concurrent sessions.
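
For context, here is a minimal sketch (not the actual suite code in this diff) of how such a case could read GRPCBOX_STRESS_TEST and fan out concurrent calls; make_call/0 is a placeholder for whatever RPC the suite exercises:

%% Sketch only: spawn N concurrent callers and wait for each result.
stress_test(_Config) ->
    N = list_to_integer(os:getenv("GRPCBOX_STRESS_TEST", "10")),
    Parent = self(),
    Pids = [spawn_link(fun() ->
                %% make_call/0 stands in for a unary or streaming RPC
                %% against the test server started in init_per_suite.
                Parent ! {done, self(), make_call()}
            end) || _ <- lists:seq(1, N)],
    [receive
         {done, Pid, ok} -> ok
     after 5000 ->
         ct:fail({timeout, Pid})
     end || Pid <- Pids].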

Signed-off-by: Vasu Dasari vdasari@gmail.com

@tsloughter
Owner

Thanks, I had been wanting to add something for this that uses an existing stress-testing tool used by other gRPC servers.

@vasu-dasari
Contributor Author

I am using grpcbox for my project. I have a gRPC server and multiple gRPC clients (~10-15), and they use bidirectional streams for communication. When all clients initiate bidirectional transfers at the same time, I see that some calls are not successful. I am still debugging that issue. Meanwhile, I thought having some way of performing a stress test in grpcbox's ct suite would be a good idea as well, hence this PR.

As I mentioned in my commit, once concurrent sessions go beyond 30-35, failures appear. It would be interesting to see if there are any parameters I could fine-tune to raise that number, or whether there are bottlenecks in the dependencies, etc.

@tsloughter
Owner

Do you know what the error is? Maybe it is related to joedevivo/chatterbox#136 which I still haven't gotten around to fixing :(

@vasu-dasari
Contributor Author

Actually, I am debugging the h2_connection side of things. In my case, some messages/connections are getting dropped. I still need to characterize this completely.

Thanks.

@vasu-dasari
Contributor Author

@tsloughter Revisiting this issue after a long time.

Here is my theory. There is a single default_channel between the gRPC client application and the gRPC server, but there are multiple processes within the client application (C1, C2, C3) trying to make gRPC calls to the server. The default_channel is a single h2 connection and hence a single TCP connection. As the number of processes initiating gRPC calls increases, there is a possibility that one process's call (say C1's) might step on another process's call (say C2's). I believe this is causing the test case failure. I am arriving at this conclusion after matching h2_connection's received data against the data processed in h2_stream; most of the time it matches, but sometimes it does not.

           +-----------+
           |   server  |
           +-----+-----+
                 |
                 |
                 |
           +-----+-----+
   +-------+   client  +-------+
   |       +------+----+       |
   |              |            |
   |              |            |
+--+-+         +--+-+        +-+--+
| C1 |         | C2 |        | C3 |
+----+         +----+        +----+

Here is how I am thinking of fixing the issue:

  1. Without any code changes to grpcbox, have the client application specify the maximum number of channels it would like to use. In this case, it would be 3 channels configured identically but with different names, channel_1, channel_2, etc. Whenever making a unary/stream call, the caller specifies which channel to use for the call (see the config sketch after this list).
  2. Modify grpcbox to add APIs like grpcbox_channel:add_channel and delete_channel, so that the application need not know beforehand how many gRPC channels it might need, and have grpcbox manage the channel contexts. This model also makes it possible to create client gRPC channels at runtime.
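
For illustration, option 1 could look roughly like this, assuming the usual {Name, Endpoints, Options} channel tuple from grpcbox's client config; the per-call channel option name is an assumption here, not confirmed API:

%% sys.config sketch: three identically configured client channels.
{grpcbox, [
  {client, #{channels => [{channel_1, [{http, "localhost", 8080, []}], #{}},
                          {channel_2, [{http, "localhost", 8080, []}], #{}},
                          {channel_3, [{http, "localhost", 8080, []}], #{}}]}}
]}

%% Per call, the caller pins one of the channels:
routeguide_route_guide_client:get_feature(Ctx, Point, #{channel => channel_2})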

I am leaning towards 2, as it is a bit cleaner; let me know what you think.

I will put together a prototype of this and create a separate pull request.

vasu-dasari added a commit to vasu-dasari/grpcbox that referenced this pull request Nov 16, 2020
This commit addresses the stress-test failure mentioned in tsloughter#29.

Added two new APIs:
add_channel(Name, Endpoints, Options)
delete_channel(Pid)

This gives the user the ability to add and delete channels on the fly.

Also modified the stress_test test case to use this logic. Without this change, the stress test fails at around 10 simultaneous connections; with this change I can get to around 90 simultaneous connections.

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
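
A hedged sketch of how the proposed APIs from that commit might be used per caller; the return shapes, the ctx usage, and the channel option name are assumptions, not the final interface:

%% Sketch: each worker creates its own channel, makes its calls over it,
%% then tears the channel down.
run_worker(Name, Point) ->
    Endpoints = [{http, "localhost", 8080, []}],
    {ok, Pid} = grpcbox_channel:add_channel(Name, Endpoints, #{}),
    %% Calls from this worker reference the new channel explicitly so they
    %% use their own h2/TCP connection instead of sharing default_channel.
    Result = routeguide_route_guide_client:get_feature(ctx:new(), Point,
                                                       #{channel => Name}),
    ok = grpcbox_channel:delete_channel(Pid),
    Result.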
Base automatically changed from master to main March 22, 2021 19:49
@codecov

codecov bot commented Dec 27, 2021

Codecov Report

Merging #29 (c5a5d37) into main (0166760) will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##             main      #29   +/-   ##
=======================================
  Coverage   38.99%   38.99%           
=======================================
  Files          28       28           
  Lines        2090     2090           
=======================================
  Hits          815      815           
  Misses       1275     1275           

