[BUG] too many connections #125
Hi @rudy2steiner. Thanks for your interest in Confluo! Can you provide more concrete steps to reproduce the bug? E.g., what were the exact steps you took to run Confluo, and your client program to trigger the issue you have outlined above?
Sure, I will provide concrete steps to reproduce the bug @anuragkh.
@rudy2steiner Checking back on this.
OK, I will finish this in the next few days.
"Too many connections" is not a bug, but was caused by an incorrect test setup. The settings used in my experiment are shown below; it ran on a 16-core (Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz), 32-thread, 256 GB server:
I got some preliminary results (produce duration 3 minutes), as follows:
I noticed that 28 producers (concurrency) achieve the highest QPS; the top command output is shown below (CPU info shown in bold). Since we produced 86468192 * 1 KB = 86 GB of messages in memory, the performance drops to 139191/s (?) if we continue to produce (a new produce task, same as before):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
If we continue to increase the producer count up to 32, the QPS only equals that of 16 producers because the CPU is overloaded. We can run more test cases based on the benchmark branch (https://github.com/rudy2steiner/confluo/blob/benchmark).
Hi there @anuragkh.
Hey @rudy2steiner, thanks for sharing your findings. I suspect you will be able to achieve higher qps using batching. Would you be interested in adding in your implementation for the Producer/Consumer to Confluo by submitting a PR? You would need to clean up the implementation and add documentation, but I would be happy to review your code if you submit a PR.
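To illustrate the batching idea being suggested, here is a minimal sketch. Note this is a hypothetical stand-in: `RecordSink` and `writeBatch` are invented for the example and are not Confluo's actual client API. The point is that buffering records and sending them in groups amortizes the per-RPC overhead.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of client-side batching: instead of one RPC per message, buffer
// records and flush them in groups. RecordSink stands in for the real
// Confluo client (hypothetical interface, not the actual API).
class BatchingProducer {
    interface RecordSink {
        void writeBatch(List<String> records); // one RPC per batch
    }

    private final RecordSink sink;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchingProducer(RecordSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    void send(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush(); // amortize RPC cost over batchSize records
        }
    }

    void flush() {
        if (!buffer.isEmpty()) {
            sink.writeBatch(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

With a batch size of 10, sending 25 records results in 3 RPCs (two full batches plus a final flush) instead of 25.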
It's my pleasure to submit a PR; I will try.
Hi @rudy2steiner, things should improve with #136. Let me know if you are able to confirm this!
Closing this due to lack of activity. #136 improves how Confluo handles multiple client connections, and should remove this issue. |
Describe the bug
I tried to run a benchmark test on Confluo as a pub/sub system, with the default configuration:
100 producers and enough memory (more than 100 GB), with a single partition, for a duration of 5 minutes.
But the server crashed after two minutes, and I noticed two strange things, as follows:
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
2019-01-10 20:00:01 ERROR: Could not start server listening on 0.0.0.0:60088: pthread_create failed
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
ERROR: signal 11
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20957>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20958>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20959>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20960>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20961>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20962>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20963>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20964>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20965>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20966>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20967>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20968>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20969>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20970>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20971>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20972>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20973>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20974>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20975>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
SocketInfo: <Host: 10.190.90.32 Port: 20976>
PeerHost: 10.190.90.32
PeerAddress: 10.190.90.32
--------------------
[root@A02-R05-I143-108-BM9PLP2 confluo]# cat log/confluo.stderr |grep '10.190.90.32'|wc -l
2607
How to reproduce the bug?
https://github.com/rudy2steiner/confluo/blob/benchmark/javaclient/src/main/java/confluo/streaming/ConfluoProducer.java
Expected behavior
I thought the RPC client holds a long-lived connection to the Confluo server that is reused by the same producer until the end, so the number of connections should be equal (or close) to the number of producers.
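The expected pattern described above can be sketched as follows. The `Connection` class here is a hypothetical stand-in for the RPC client, used only to show that each producer should open one connection up front and reuse it for every send, so the server sees roughly one connection per producer:

```java
// Sketch of the expected behavior: one connection per producer, reused for
// every message. Connection is a stand-in, not Confluo's RPC client.
class ReusedConnectionProducer {
    static class Connection {
        static int opened = 0;               // counts connections opened
        Connection() { opened++; }
        void send(String record) { /* one RPC over the same socket */ }
    }

    private final Connection conn = new Connection(); // opened exactly once

    void produce(int numMessages) {
        for (int i = 0; i < numMessages; i++) {
            conn.send("msg-" + i);           // reuse; never reconnect
        }
    }
}
```

If instead a new connection is opened per message (or per batch), the connection count grows with the message count, which would explain the thousands of `PeerHost` entries in the log above.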
Platform Details
I run the Confluo Java client on a Mac.
Can anyone help me?