-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make unpublished topics unrecorded by default #968
Make unpublished topics unrecorded by default #968
Conversation
a3e138a
to
f065db8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass
rosbag2_transport/test/rosbag2_transport/test_record_all_include_unpublished_topics.cpp
Outdated
Show resolved
Hide resolved
rosbag2_transport/test/rosbag2_transport/test_record_all_include_unpublished_topics.cpp
Outdated
Show resolved
Hide resolved
@@ -60,6 +60,11 @@ def add_arguments(self, parser, cli_name): # noqa: D102 | |||
'-x', '--exclude', default='', | |||
help='Exclude topics containing provided regular expression. ' | |||
'Works on top of --all, --regex, or topics list.') | |||
parser.add_argument( | |||
'--include-unpublished-topics', action='store_true', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theseankelly nit^2: would --wait-for-publishers
be a more accurate name for this flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new behavior (don't add unpublished topics to the bag) is off by default in the current implementation. Naming it --wait-for-publishers
would imply that the new behavior should be on by default.
Given the description in the underlying issue report, I feel like the most commonly desired behavior is to not record bags until they are published so that we avoid any QoS incompatibilities.
I could be persuaded if there's strong desire -- I do realize this change alters the default behavior of ros2 bag record -a
w.r.t unpublished topics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new behavior (don't add unpublished topics to the bag) is off by default in the current implementation. Naming it --wait-for-publishers would imply that the new behavior should be on by default.
Yeah, fair point.
I could be persuaded if there's strong desire -- I do realize this change alters the default behavior of
ros2 bag record -a
w.r.t unpublished topics.
That's true, but I agree that the current default behavior is less than intuitive at best and harmful at worst (e.g. if /tf_static
messages are silently dropped).
It does beg the question though, what would be the use case for early subscription with default QoS policies? Grabbing the first few messages sent by a publisher? If there's no other subscriber from which to derive the topic type, DDS discovery latency and graph polling will make it impossible. CC @emersonknapp for feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also interested in hearing what @emersonknapp has to say.
I think there are a couple (maybe contrived..) scenarios where the feature has value. For example, spinning up a background rosbag snapshotter as part of a robot topology - it might be useful to not have to worry about boot order of the various nodes in the system.
Definitely niche, and might only be useful with a couple other flags (--qos-profiles-override-path
and --no-discovery
or something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emersonknapp Gentle ping in this issue if you have any concerns or suggestions to the proposed solution it's good time to express it.
Otherwise if CI will be green I think we will merge it Monday or Tuesday.
IMO alternative scenario described in #967 is a valid case which should be handled correctly
1. In terminal 1 create a subscriber by running ros2 topic echo --qos-reliability best_effort /foo std_msgs/msg/Int32
2. In terminal 2 run ros2 record -a
a. NOTE in the output that the topic /foo was added to the bag -- found by the DataReader
3. In terminal 3 create a publisher via ros2 topic pub --qos-reliability best_effort /foo std_msgs/msg/Int32 '{value: 1}'
The DataWriter (and its offered QoS) will be discovered, and terminal 2 (rosbag record) will display the warning about incompatible publishers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for late reply - my perspective is that "will not create a subscription until at least one publisher is discovered" should be the default behavior of discovery That seems more intelligent behavior overall. In fact, I'm not sure I see any reason to turn it off, unless we're worried that the discovery loop is a performance overhead that a user may want to explicitly disable?
Even in that case, the default QoS is kind of an arbitrary baseline subscription to fall back to. It would probably make more sense that "if you provide a QoS override for an explicitly specified subscription topic, then the subscription will be created immediately using that override and will not use discovery".
Given that, I'm not sure we need a new commandline option for this feature! I'll note that this is a good thing probably, we already have an unwieldy number of options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emersonknapp I also don't envision the case when we might need to use --include-unpublished-topics
parameter.
@theseankelly Do you have any scenario in mind or in practice when we might need to use --include-unpublished-topics
?
If not I would ask you to remove this extra parameter from command line option.
IMO Possible concern that the discovery thread affecting performance doesn't count. Since it's very minor and used to be only for the begging of the recording.
I agree with @emersonknapp that
"if you provide a QoS override for an explicitly specified subscription topic, then the subscription will be created immediately using that override and will not use discovery".
However I would moved it as follow up feature request for the sake of saving time to merge current PR which is considering as a "bugfix".
rosbag2_transport/test/rosbag2_transport/test_record_all_include_unpublished_topics.cpp
Outdated
Show resolved
Hide resolved
Thanks very much for the review -- took care of what I could. The rest are pending the conversations above.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM but for a few comments.
rosbag2_transport/test/rosbag2_transport/test_record_all_include_unpublished_topics.cpp
Outdated
Show resolved
Hide resolved
fc14787
to
0bef227
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theseankelly Thank you for your contribution.
While implementation for fix looks good to me, I have some significant concerns in regards to the unit tests.
First of all waiting for 2-4 seconds in each test doesn't look good for me. We have hundreds of tests on CI and if wait for 4 seconds in each test our CI will be running for ever.
The second my concern is non-deterministic behavior in tests. Relying on sleep for 2 seconds in tests tend to make them flaky or false negative.
Although I have suggestion how to get rid from delay in tests and make them deterministic.
Need create MockReader
derived from rosbag2_transport::Reader
. Similar way how I made MockPlayer
I envision MockReader
to be as
class MockRecorder : public rosbag2_transport::Recorder
{
public:
MockRecorder(
std::shared_ptr<rosbag2_cpp::Writer> writer,
const rosbag2_storage::StorageOptions & storage_options,
const rosbag2_transport::RecordOptions & record_options)
: Recorder(writer, storage_options, record_options)
{}
template<typename DurationRepT = int64_t, typename DurationT = std::milli>
bool wait_for_topic_to_be_discovered(
const std::string & topic_name_to_wait_for,
std::chrono::duration<DurationRepT, DurationT> timeout = std::chrono::seconds(10))
{
bool discovered = false;
using clock = std::chrono::steady_clock;
auto start = clock::now();
do {
auto topic_names_and_types = this->get_topic_names_and_types();
for (const auto &[topic_name, topic_types] : topic_names_and_types) {
if (topic_name_to_wait_for == topic_name) {
discovered = true;
break;
}
}
std::this_thread::sleep_for(std::chrono::milliseconds(20));
} while (!discovered && (clock::now() - start) < timeout);
return discovered;
}
bool topic_available_for_recording(const std::string & topic_name)
{
bool available_for_recording = false;
auto topics_to_subscribe = get_requested_or_available_topics();
for (const auto & topic_and_type : topics_to_subscribe) {
if (topic_and_type.first == topic_name) {
available_for_recording = true;
break;
}
}
return available_for_recording;
}
};
As example one of the test rewritten with MockRecorder
TEST_F(
RecordIntegrationTestFixture,
record_all_include_unpublished_false_includes_later_published_default_qos)
{
const std::string string_topic = "/string_topic";
auto node = std::make_shared<rclcpp::Node>("test_string_msg_listener_node");
auto string_msgs_sub = node->create_subscription<test_msgs::msg::Strings>(
string_topic, 10, [](test_msgs::msg::Strings::ConstSharedPtr) {});
rosbag2_transport::RecordOptions record_options = {true, false, {}, "rmw_format", 100ms};
record_options.include_unpublished_topics = false;
auto recorder = std::make_shared<MockRecorder>(writer_, storage_options_, record_options);
recorder->record();
start_async_spin(recorder);
ASSERT_TRUE(recorder->wait_for_topic_to_be_discovered(string_topic));
ASSERT_FALSE(recorder->topic_available_for_recording(string_topic));
// Start up a publisher on our topic *after* the recording has started
auto string_message = get_messages_strings()[0];
string_message->string_value = "Hello World";
rosbag2_test_common::PublicationManager pub_manager;
// Publish 10 messages at a 30ms interval for a steady 300 milliseconds worth of data
pub_manager.setup_publisher(
string_topic, string_message, 10, rclcpp::QoS{rclcpp::KeepAll()}, 30ms);
ASSERT_TRUE(pub_manager.wait_for_matched(string_topic.c_str()));
pub_manager.run_publishers();
ASSERT_TRUE(recorder->topic_available_for_recording(string_topic));
}
Thanks very much for the detailed feedback @MichaelOrlov -- My test implementation follows the patterns established in other unit tests in this package, however I share your concerns and I ought to have not taken the easy road out. I'll dig in to the suggested |
96a5899
to
950f2ff
Compare
Thanks again, @MichaelOrlov -- I have added your |
950f2ff
to
89c1576
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@theseankelly Thank you for collaboration and support. Now it looks good enough to go. I approve this PR.
We will need to make sure that CI passing green and I think after that will be able to merge it.
@theseankelly Unfortunately CI fails with error maessage:
I was able to reproduce this error locally with Could you please dig deeper to figure out why this specific test not able to pass with default parameters? |
@jacobperron It seems you was the author of the ros2bag/test/test_record.py. Could you please explain how the test works? |
@MichaelOrlov yeah I'll dig in |
It seems with this PR, ros2bag no longer automatically subscribes to the |
@jacobperron Thanks for the explanation. |
If I understand correctly, the |
@jacobperron Thanks for the explanation now it became clear for me. @theseankelly I think it will be legitimate to add |
Hey, thanks for the details and sorry I've been slow on this one. I'm in agreement with @jacobperron that I don't understand what about this change would have caused a test regression. For example, the output of just running
Which ought to satisfy the unit test's checks. Furthermore while the recording is active, there is indeed a publisher on the topic:
I will do some further investigation into the root cause before adding this parameter. |
Test consistently passes for me.
|
Oh interesting, I can reproduce this if I use FastDDS instead of CycloneDDS. I'll keep digging now that I have a repro. |
@theseankelly Thanks for catching it up. |
Signed-off-by: Sean Kelly <skelly@seegrid.com>
Signed-off-by: Sean Kelly <skelly@seegrid.com>
Signed-off-by: Sean Kelly <skelly@seegrid.com>
89c1576
to
4c47b7b
Compare
Change made, rebased, force pushed. |
Hey @MichaelOrlov assuming the signoff is still good, could this be merged prior to humble being spun up? |
Sorry, we are in a feature and API freeze for Humble at this point. We can merge this in once Humble has been branched off. |
Thanks for the info @clalancette -- is that to say this fix will miss Humble entirely, or it will just need to come in after the Humble branches are created? |
If it doesn't break (public) API and isn't a new feature, the maintainers of rosbag2 can consider it for Humble. |
@emersonknapp @MichaelOrlov hoping to get the port-to-humble conversation started early. It seems to me that this is a pretty fundamental problem (we currently have to specify custom qos override files for all bag record/playback) and it'd be great to get it in to humble. This doesn't break public API, but I wonder if the new command line flag would be a 'new feature'. Could you comment? I'd be open to spinning up a version of this change which hard-codes the new default behavior without creating a new command line flag if that's going to be a problem. |
Running CI |
@clalancette @emersonknapp Could you please advise how to proceed with this PR? I wouldn't consider this PR as a new feature. This is more like a bug fix. BTW. I have tried to run CI for the current branch. But it seems CI launcher is broken or get stuck somewhere for unknown reason. |
Can you be more specific? The nightly builds seem to be working fine, what problem are you having with the builds? In general, the way forward with this PR is to get CI passing, get it approved, and merge it onto the master branch. Once that is done, we can consider backporting it to Humble. We are still before the release, so we can still break ABI at this point, but the time in which to do that is short; we'll stop taking ABI breaks on May 19th. |
@clalancette Thank you for information. Perhaps we will be able to merge it soon.
Sorry for the confusion. At the time when I was writing those comment the CI launcher was reporting status "not running" for all three jobs and was hanging out in such state for a 30 minutes. I want to clarify procedure for backporting PR to the Humble. |
Oh, right. We don't always have enough executors (particularly on Windows), so it can just take some time for the jobs to run.
Yes, that is generally correct. At this point of the release, the one additional thing to do is to make sure to get @audrow 's approval before merging it into the |
This is needed to try to fix Windows build
* Make unpublished topics unrecorded by default Topics with no publishers do not offer any QoS profiles. This causes the recorder to subscribe with a default QoS, which is often disadvantageous. This change makes subscribing to unpublished topics optional, off by default. Once a publisher is discovered for a topic, discovery picks it up and adds it to the bag. This addresses ros2#967 Signed-off-by: Sean Kelly <skelly@seegrid.com> * Feedback from review Signed-off-by: Sean Kelly <skelly@seegrid.com> * Additional review feedback Signed-off-by: Sean Kelly <skelly@seegrid.com> * Narrow scope of tests and make more deterministic Signed-off-by: Sean Kelly <skelly@seegrid.com> * Fix test expectation for unpublished topics Signed-off-by: Sean Kelly <skelly@seegrid.com> * Add ROSBAG2_TRANSPORT_EXPORT for get_requested_or_available_topics() This is needed to try to fix Windows build Co-authored-by: Sean Kelly <skelly@seegrid.com> Co-authored-by: Michael Orlov <morlovmr@gmail.com>
* Make unpublished topics unrecorded by default Topics with no publishers do not offer any QoS profiles. This causes the recorder to subscribe with a default QoS, which is often disadvantageous. This change makes subscribing to unpublished topics optional, off by default. Once a publisher is discovered for a topic, discovery picks it up and adds it to the bag. This addresses #967 Signed-off-by: Sean Kelly <skelly@seegrid.com> * Feedback from review Signed-off-by: Sean Kelly <skelly@seegrid.com> * Additional review feedback Signed-off-by: Sean Kelly <skelly@seegrid.com> * Narrow scope of tests and make more deterministic Signed-off-by: Sean Kelly <skelly@seegrid.com> * Fix test expectation for unpublished topics Signed-off-by: Sean Kelly <skelly@seegrid.com> * Add ROSBAG2_TRANSPORT_EXPORT for get_requested_or_available_topics() This is needed to try to fix Windows build Co-authored-by: Sean Kelly <skelly@seegrid.com> Co-authored-by: Michael Orlov <morlovmr@gmail.com> Co-authored-by: Sean Kelly <skelly@seegrid.com> Co-authored-by: Michael Orlov <morlovmr@gmail.com>
Topics with no publishers do not offer any QoS profiles. This causes the
recorder to subscribe with a default QoS, which is often disadvantageous.
This change makes subscribing to unpublished topics optional, off by
default. Once a publisher is discovered for a topic, discovery picks it
up and adds it to the bag.
This addresses #967