Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ros2 topic echo] does not echo a published message by [ros2 topic pub] if joining late, even with --qos-durability=transient_local #523

Closed
craigh92 opened this issue Jun 2, 2020 · 18 comments · Fixed by #528

Comments

@craigh92
Copy link

craigh92 commented Jun 2, 2020

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Installation type:
    • Binary
  • Version or commit hash:
    • Eloquent
  • DDS implementation:
    • Default
  • Client library (if applicable):
    • rclpy

Steps to reproduce issue

If I start a publisher with the ros2cli tool at 0.1Hz, with the option --qos-durability=transient_local, and then wait 10 seconds for it to publish a message, and then I start a topic echoer with the same qos settings, I would expect to see the last message published by the publisher, even though the echoer wasn't around when the message was published. However, this doesn't happen.

i.e: First:

ros2 topic pub -r 0.1 --qos-durability=transient_local hello std_msgs/msg/String

Then after 10 seconds in a new terminal:

ros2 topic echo --qos-durability=transient_local hello std_msgs/msg/String

Expected behavior

As soon as the ros2 topic echo node starts, it should print the message that was published by the publisher before this node was started.

Actual behavior

It ignores this first message, and only prints a new message after another 10 seconds when the publisher publishes a new message

Implementation considerations

I created my own echo node with the transient_local durability option to see if this would work. This behaves with the expected behavior I mentioned above.

This is probably an awful implementation, and it does not accept other qos options, but it proves that even when late joining, an echo node can get the last message of a publisher.

import rclpy
from rclpy.node import Node
from rclpy.executors import MultiThreadedExecutor
from rclpy.qos import QoSProfile, QoSDurabilityPolicy
import sys
from typing import List, Any

"""
Use:
    ros-topic-echo-transient-local.py  <namespace/path/topic_name> <fully/qualified/message_type>

Theres a bug with the command line utility `ros2 topic echo` when used with the `--qos-durability=transient_local` policy.
This script behaves in the same way that the command:
    ros2 topic echo --qos-durability=transient_local <namespace/path/topic_name> <fully/qualified/message_type>
should.

It uses dynamic programing to modify the source code at run time so that the node can subscribe to the type passed
in as an argument.
"""


def main(topicName : str, messageType : Any):
    """
    Start the 'topic-echo-node' and spin untill exception is thrown
    """
    rclpy.init()
    echoNode = Node('topic_echo_node')

    latching_qos = QoSProfile(depth=1,
        durability=QoSDurabilityPolicy.RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL)

    echoNode.create_subscription(messageType, topicName,
        lambda msg : print(str(msg)),
        qos_profile=latching_qos)

    exe = MultiThreadedExecutor()
    exe.add_node(echoNode)
    exe.spin()

if __name__ =="__main__":


    topicName = sys.argv[1]
    messageTypePathStr = sys.argv[2]

    #convert the fully qualified message type into a list of its path and name
    messageTypePathStr = messageTypePathStr.split('/')

    fullPath = messageTypePathStr[0] #the first part of the path
    for path in messageTypePathStr[1:-1]: #from the second to one before last element
        fullPath += '.' + path
    messageTypeStr = messageTypePathStr[-1] #the last element

    print("importing " + messageTypeStr + " from " + fullPath)
    cmd = "from " + fullPath + " import " + messageTypeStr
    exec(cmd)
    
    messageType = None
    cmd = "messageType = " + messageTypeStr
    exec(cmd)
    print("using message type " + str(messageType))
    print("subscribing to " + topicName + '\n')

    main(topicName, messageType)

This can be run with

python3 ros2-topic-echo-transient-local.py hello std_msgs/msg/String

Looking at the code: https://github.com/ros2/ros2cli/blob/master/ros2topic/ros2topic/verb/echo.py

It seems that the only key difference is that the qos profile is created with:

    qos_profile = qos_profile_from_short_keys(
        args.qos_profile, reliability=args.qos_reliability, durability=args.qos_durability)

And in my example it is:

    latching_qos = QoSProfile(depth=1,
        durability=QoSDurabilityPolicy.RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL)

So perhaps the qos_profile_from_short_keys generates a QoS profile with depth 0? That's the only thing I can think of that would cause this.

Comments

This utility would be really useful to me because it allows me to query the state of my system. For example, I have a bunch of nodes representing robots, that publish the state of the robots when they change. I want to quickly query the state of one of the robots so I use ros2 topic echo to print the messages on that topic. At the moment I will only be able to see the state of my robots if the state changes after I have run ros2 topic echo. If the transient_local policy worked I would be able to get the state even if it hasn't changed during the time the echo node has been running (as long as it is published once, for example when the publisher first starts up).

@fujitatomoya
Copy link
Collaborator

problem confirmed with source build ros2/ros2@0444bff

@clalancette
Copy link
Contributor

If you manually specify the reliability, does it work for you?

@fujitatomoya
Copy link
Collaborator

if I change the following, it works fine.

diff --git a/ros2topic/ros2topic/api/__init__.py b/ros2topic/ros2topic/api/__init__.py
index 0c8234f..682225a 100644
--- a/ros2topic/ros2topic/api/__init__.py
+++ b/ros2topic/ros2topic/api/__init__.py
@@ -25,7 +25,11 @@ from ros2cli.node.strategy import NodeStrategy
 from rosidl_runtime_py import get_message_interfaces
 from rosidl_runtime_py import message_to_yaml
 from rosidl_runtime_py.utilities import get_message
-
+from rclpy.qos import QoSDurabilityPolicy
+from rclpy.qos import QoSHistoryPolicy
+from rclpy.qos import QoSLivelinessPolicy
+from rclpy.qos import QoSProfile
+from rclpy.qos import QoSReliabilityPolicy
 
 def get_topic_names_and_types(*, node, include_hidden_topics=False):
     topic_names_and_types = node.get_topic_names_and_types()
@@ -149,12 +153,11 @@ def qos_profile_from_short_keys(
 ) -> rclpy.qos.QoSProfile:
     """Construct a QoSProfile given the name of a preset, and optional overrides."""
     # Build a QoS profile based on user-supplied arguments
-    profile = rclpy.qos.QoSPresetProfiles.get_from_short_key(preset_profile)
-    if durability:
-        profile.durability = rclpy.qos.QoSDurabilityPolicy.get_from_short_key(durability)
-    if reliability:
-        profile.reliability = rclpy.qos.QoSReliabilityPolicy.get_from_short_key(reliability)
-
+    profile = QoSProfile(
+        depth=10,
+        history=QoSHistoryPolicy.RMW_QOS_POLICY_HISTORY_KEEP_ALL,
+        reliability=QoSReliabilityPolicy.RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT,
+        durability=QoSDurabilityPolicy.RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL)
     return profile

@fujitatomoya
Copy link
Collaborator

This is because qos profile on publication(in this case ros2 topic pub) is set as system_default, and depth is 0.

changing qos profile, it can work as following,

# ros2 topic pub -r 1.0 --qos-profile=sensor_data --qos-durability=transient_local /chatter std_msgs/msg/String '{data: hello}'
...<snip>
# ros2 topic echo --qos-durability=transient_local /chatter std_msgs/msg/String
...<snip>

I do not think this is bug but just a setting.

@clalancette
Copy link
Contributor

@fujitatomoya Good work finding that! The out-of-the-box experience here is kind of bad, so I'm thinking we should change the default. What do you think about changing the publisher to use system_default, but with a depth of 1, instead?

@fujitatomoya
Copy link
Collaborator

@clalancette

The out-of-the-box experience here is kind of bad,

definitely this is important,

What do you think about changing the publisher to use system_default, but with a depth of 1, instead?

i think that may be better, i may come up with PR. btw, is that supposed to mean we want to change rmw default? or only on ros2cli pub? I believe the latter one, since rmw has effect for all of the components including application.

@fujitatomoya
Copy link
Collaborator

the question is why do we not have option to set history and depth as well? this is kind of constrain for user.

@clalancette
Copy link
Contributor

or only on ros2cli pub?

Yeah, let's just keep it confined to ros2cli pub for now.

the question is why do we not have option to set history and depth as well?

Those would also be nice to have, though they wouldn't help the out-of-the-box experience. I think the reason we don't have them is that these don't typically cause too many problems for the CLI tools, so we haven't implemented them yet.

@fujitatomoya
Copy link
Collaborator

yeah, maybe it is likely to set profile but nothing else. it is always trading-off flexibility or complication...

@fujitatomoya
Copy link
Collaborator

@clalancette

could you check #528 if you got time?

if we prefer it to adjust the depth(not exposing it), i can do that too. but it is likely that user requests the flexibility something like this, so that is the current implementation.

btw, ros2 topic info prints also depth and history could be the follow-up PR.

@clalancette
Copy link
Contributor

could you check #528 if you got time?

if we prefer it to adjust the depth(not exposing it), i can do that too. but it is likely that user requests the flexibility something like this, so that is the current implementation.

I think we should do both. That is, I can take a look at #528, but I'd prefer if we also changed the depth on ros2 topic echo by default. That way it will work better out-of-the-box.

@fujitatomoya
Copy link
Collaborator

@clalancette

I think we should do both.

sounds reasonable, will be back with fix in today.

changed the depth on ros2 topic echo by default.

actually this is ros2 topic pub, ros2 topic echo is subscription so that depth will be set to 5 in default profile which is sensor_data. the original cause is publication depth is set to 0 as default, so that subscription cannot get the published data before participation even with durability.

@craigh92
Copy link
Author

This is because qos profile on publication(in this case ros2 topic pub) is set as system_default, and depth is 0.

changing qos profile, it can work as following,

# ros2 topic pub -r 1.0 --qos-profile=sensor_data --qos-durability=transient_local /chatter std_msgs/msg/String '{data: hello}'
...<snip>
# ros2 topic echo --qos-durability=transient_local /chatter std_msgs/msg/String
...<snip>

I do not think this is bug but just a setting.

Thank you! This works for me now. I did actually try to change the qos_profile to sensor_data, but I did it for the subscriber, not the publisher.


I think we should do both. That is, I can take a look at #528, but I'd prefer if we also changed the
depth on ros2 topic echo by default. That way it will work better out-of-the-box.

As a new user, this is more intuitive to me. It's also a great way to test/demonstrate the transient_local policy without having to write any code.

@clalancette
Copy link
Contributor

actually this is ros2 topic pub

Heh, yes, good point. Sorry, this was a thinko :).

@bergercookie
Copy link

I think this issue still affects older versions of ROS2, which are still supported though, like foxy.
Could we backport the corresponding commits of #528?

@clalancette
Copy link
Contributor

I think this issue still affects older versions of ROS2, which are still supported though, like foxy.
Could we backport the corresponding commits of #528?

I think that is reasonable enough. @bergercookie would you mind opening a backport PR?

@jtbandes
Copy link
Member

jtbandes commented Dec 6, 2021

I'm trying this combination of commands on Galactic, but it's not working for me.

First I run:

ros2 topic pub --qos-history keep_last --qos-depth 1 --qos-durability transient_local -1 --keep-alive 10 /foo std_msgs/String 'data: "hi"'

I wait until I see publishing #1: std_msgs.msg.String(data='hi'), then run in another terminal:

ros2 topic echo  /foo

Is this expected to work?

@clalancette
Copy link
Contributor

Is this expected to work?

In general, yes. In Rolling, this works exactly as expected, in that no command-line arguments are required and ros2 topic echo /foo gets the data exactly as expected.

It's possible that Galactic doesn't have all of the changes necessary to make it work out-of-the-box; I don't remember at this time when exactly we put those in. That said, I can make this work fine in Galactic if I do the following:

ros2 topic echo --qos-durability transient_local --qos-reliability reliable /foo

defunctzombie added a commit to RobotWebTools/rosbridge_suite that referenced this issue Dec 9, 2021
**Public API Changes**
Subscriber QoS is automatically chosen based on available publishers to achieve compatible QoS settings in more cases.

**Description**
When looking into https://github.com/foxglove/studio/issues/1987 I discovered that rosbridge was not receiving "latched" `/tf_static` messages. This led us to learn that subscribers need specific qos profile settings for certain publishers (ros2/ros2cli#523 (comment)).

This PR attempts to use a qos profile that would accept any published message.

Closes #654
Closes #582
Fixes #551 

Co-authored-by: Jacob Bandes-Storch <jacob@foxglove.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants