Plugout network lead to server cannot receive packet #2224
Comments
4.0.4 is very old at this point. Does this still happen with 4.2.0?
Only 4.0.4 is available as a Windows installer: http://zeromq.org/distro:microsoft-windows
Have you managed to try with 4.2.x?
Sorry for the late reply. I tested with ZeroMQ 4.2, and it doesn't work either.
Yes, using the C# version, please replace
The protocol itself is compatible between those versions. If, as you're saying, you can't reproduce the problem using only the libzmq library, then I would recommend moving the issue to the tracker of the binding you are using, as there's not much we can do here. If you can reproduce the issue using only the libzmq library, or find that the root cause is here somewhere, please feel free to reopen this issue.
Hi, there are the following test cases: Test Case 2) Test Case 3) The problem can be reproduced every time. I use Visual C++ 2013, ZeroMQ 4.2, and Win10. The sample code is similar to the following. void client_run()
} void server_run()
}
Can you provide a test case that does not use the cppzmq bindings? That's an external binding and as such could skew the results.
Thanks a lot for your continued help. I made an example based directly on libzmq in C. The server cannot receive any packets after the network link is unplugged for more than one minute and then plugged in again. The following is the code that I used. int client_dealer_dealer_c()
} void zmq_server_task_router_c()
}
I used Wireshark to capture the TCP communication, and found that
So it looks good, but the application server didn't receive any packets. Is there any timeout in ZeroMQ? If we disconnect the network and reconnect within a short time, say 10 seconds, the communication works after the network is plugged back in. I analysed the Wireshark captures (client and server both on Win10) for the 10-second and 1-minute cases, and found that after reconnecting.
I also ran the client in a virtual machine (Linux) communicating with the host (Windows 10), and found that even when it is disconnected for more than 1 minute, the communication is still okay. So it looks like there is some difference between Windows and Linux ZeroMQ communication.
I did some more tests with the following test case: Dealer (Linux VM, client), Router (Win10 host, server).
So it might be related to TCP keep-alive, but when I checked the Linux TCP parameters
So the communication loss might be related to the port change (i.e. a new TCP connection), but I don't know in detail how ZeroMQ deals with this.
I tried changing the TCP keep-alive parameters, but it didn't help. Fortunately, there is an option, ZMQ_ROUTER_HANDOVER, supported since 4.1, which can switch to a new connection with the same identity. I tested with this option, and it works! This issue can be closed, thanks for the support.
The only remaining issue is that some messages are lost during the handover; at least all messages in the buffer of the old TCP connection are lost. But I didn't find any rule for which messages are lost, whether by time or anything else.
Great, happy to know that option helps your use case, thanks for reporting back.
Hi all,
I wrote a Windows (Win 10) service in C#. The service itself is quite simple, something like a log service. The expectation is that the client sends requests to the service and the service doesn't need to respond to the client, so the dealer/router pattern was selected (http://zguide.zeromq.org/page:all#The-Asynchronous-Client-Server-Pattern), but without the worker layer, because no load balancing is needed.
The service is written in C# as a Win10 service and the client is written in C++. The ZeroMQ version is 4.0.4, which is the latest. The server code is similar to https://github.com/metadings/zguide/blob/master/examples/C%23/flserver3.cs.
It works well in the normal case. When we test some abnormal network cases:
Has anyone had a similar issue, and is there any idea how to solve it? What confuses me is why the TCP communication looks fine but the server cannot receive packets.
Best Regards,
Kevin