router stops after some time #375
Comments
Hi @guni9191,
Error case: Does the error occur only when trying to close the DDS Router application, or does the application just stop and then you are not able to stop it?
DDS network: Please let us know the data types and rates that you are using, and also the QoS of your topics.
Network architecture: Are you working locally, over WAN, or on the same host? What is your bandwidth?
All the information that you are able to give us will help us to solve your problem.
=> The application just stops, and then I am not able to stop it. I tried the echo participant you introduced to me, and it stops showing information too. When ^C is pressed, only "Stopping DDS Router" shows up.
=> I am using custom ROS 2 message types on 25 topics (nine at 2 Hz, five at 5 Hz, nine at 1 Hz, two at 0.1 Hz). I'm not sure about the data length, but Wireshark shows 7052 frames/sec, with a single frame containing 1304 bytes. Most of the QoS settings are the ROS 2 defaults, except that one topic uses liveliness QoS. This is an unusually large amount of data, since my local RTPS frames are only 300 bytes on average and there are far fewer of them (only about 200, compared to 7052 frames). Also, in Wireshark I see a single frame that contains multiple duplicated messages (TCP payload). Is this normal?
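For a rough sense of scale, the Wireshark figures quoted above can be converted into a throughput estimate. The sketch below assumes that the "about 200" local frames is also a per-second count (the comment does not say so explicitly) and that the quoted byte counts are full frame sizes. The helper name `throughput_mbps` is illustrative only.

```python
# Back-of-the-envelope throughput from the Wireshark figures in this thread.
# Assumption: "about 200" local frames is a per-second count, like the 7052.

def throughput_mbps(frames_per_sec: float, bytes_per_frame: float) -> float:
    """Return wire throughput in megabits per second (1 Mbit = 1e6 bits)."""
    return frames_per_sec * bytes_per_frame * 8 / 1e6

wan_mbps = throughput_mbps(7052, 1304)   # traffic seen on the TCP/WAN link
local_mbps = throughput_mbps(200, 300)   # traffic seen on the local RTPS side

print(f"WAN (TCP):  {wan_mbps:.2f} Mbit/s")
print(f"Local RTPS: {local_mbps:.2f} Mbit/s")
print(f"Ratio:      {wan_mbps / local_mbps:.0f}x")
```

Under these assumptions the WAN side carries roughly 73.6 Mbit/s against about 0.48 Mbit/s locally, which is close to the guessed 100 Mbps link capacity and would be worth comparing against the actual measured bandwidth.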
=> I'm not sure how to check the bandwidth, but I'm guessing it's at least 100 Mbps. There seems to be some kind of firewall on my Wi-Fi, but I'm not sure about my environment. As I said earlier, I'm using an Azure cloud server, so it's WAN. I was testing the round-trip time using system timestamps, and when it stops, the RTT reaches almost 20 seconds.
Also, my config for the TCP client is as below:
allowlist:
participants:
@guni9191, thank you for the detailed information. If you could help us further, it would be important to know whether the freeze is caused by CPU usage and/or memory usage. Finally, I guess the large size of your frames is related to TCP. Would you be able to run it with UDP?
Can you guys test the fastdds router in a heavy-traffic, low-bandwidth environment? As far as I know, DDS should work robustly in such a difficult situation, and most of all, the application should not stop. Thanks in advance for your response.
@jparisu Let me explain how I've found out.
To limit the bandwidth intentionally, I used the "wondershaper" tool to limit "PC A" to 6 Mbps download and 2 Mbps upload. Then "node C" on "PC A" received some of the messages from "topic B", and eventually it stopped receiving any messages. When I tried to stop the fastdds router on "PC A" in this state, I got the message "Stopping DDS Router", but it did not stop gracefully. If I close "node A", the router stops correctly, but closing "node C" didn't stop the router from gracefully stopping. Can you guess why "node C" gradually stopped receiving the subscribed topics and why the router also got stuck at the ^C message? If my environment has such limited bandwidth, is there another way to avoid this behavior?
Hi @guni9191,
I think we know what could be happening in your scenario. We see two problems here:
Bandwidth: In a scenario with limited bandwidth, it could happen that the DDS Router receives messages faster than it can route them. This slows down the whole application, reaching a point where some messages have to be discarded because of memory issues. Check the following documentation: https://eprosima-dds-router.readthedocs.io/en/latest/rst/user_manual/configuration.html#maximum-history-depth
DDS Router closure: We think we found a bug in the DDS Router thread management that prevents the application from closing until all messages have been forwarded. Thus, if messages arrive faster than they are delivered, this behavior could happen. (We are not sure about this, but it could be the case.)
New DDS Router update: This is not related to this issue, but we have significantly updated the DDS Router, and the core logic has been moved to a different repository (https://github.com/eProsima/DDS-Pipe).
Comment: Are you using different domains or a Discovery Server in order to force different nodes to communicate through the router?
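The bandwidth problem described in this comment (messages arriving faster than they can be forwarded, with a bounded history that discards data) can be illustrated with a toy simulation. This is only a sketch of the general idea, not the DDS Router implementation: the function `simulate`, its parameters, and the drop-oldest policy are all assumptions made for illustration.

```python
from collections import deque

def simulate(arrivals_per_tick, forwards_per_tick, ticks, max_depth):
    """Toy model of a bounded history: returns (forwarded, dropped, pending).

    Hypothetical policy: when the history is full, the oldest pending
    message is discarded to make room for the new one.
    """
    history = deque()
    next_id = 0
    forwarded = 0
    dropped = 0
    for _ in range(ticks):
        # Messages arrive from the local DDS side.
        for _ in range(arrivals_per_tick):
            if len(history) == max_depth:
                history.popleft()  # discard the oldest pending message
                dropped += 1
            history.append(next_id)
            next_id += 1
        # The slow link can only forward a few messages per tick.
        for _ in range(min(forwards_per_tick, len(history))):
            history.popleft()
            forwarded += 1
    return forwarded, dropped, len(history)

forwarded, dropped, pending = simulate(
    arrivals_per_tick=5, forwards_per_tick=2, ticks=10, max_depth=8
)
print(f"forwarded={forwarded} dropped={dropped} pending={pending}")
```

With 5 arrivals and 2 forwards per tick, the backlog fills within a few ticks and messages are then dropped steadily. In this toy model a smaller `max_depth` bounds memory and the work left at shutdown, at the cost of more dropped messages.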
Hi, after I had been using DDS Router for about 10 minutes, it just stopped and could not end its process gracefully. The DDS Router application is stuck on both Azure and the PC, and ^C does not work properly.
My test environment is an Azure Ubuntu 20.04 server with a public IP and an Ubuntu PC, connected over TCP. I'm sending some ROS 2 topics, and I am not sending anything heavy such as video.
Can anybody guess why it suddenly stops?