Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix matched event issues #683

Merged
merged 8 commits into from
Apr 12, 2023

Conversation

MiguelCompany
Copy link
Contributor

An alternative for #682 that should fix #681

@Barry-Xu-2018
Copy link
Contributor

There is a problem on take_event.
e.g.

https://github.com/eProsima/Fast-DDS/blob/master/src/cpp/fastdds/publisher/DataWriterImpl.cpp#L1190-L1208

In get_publication_matched_status(), current_count_change and total_count_change will be set as 0.

eturnCode_t DataWriterImpl::get_publication_matched_status(
        PublicationMatchedStatus& status)
{
    if (writer_ == nullptr)
    {
        return ReturnCode_t::RETCODE_NOT_ENABLED;
    }

    {
        std::unique_lock<RecursiveTimedMutex> lock(writer_->getMutex());

        status = publication_matched_status_;
        publication_matched_status_.current_count_change = 0;
        publication_matched_status_.total_count_change = 0;
    }

    user_datawriter_->get_statuscondition().get_impl()->set_status(StatusMask::publication_matched(), false);
    return ReturnCode_t::RETCODE_OK;
}

While calling take_event after matched event happen, it will call get_publication_matched_status() to get status. At this time, current_count_change and total_count_change are already set as 0.

https://github.com/eProsima/rmw_fastrtps/blob/5918930fe50e6754ddb97619a55eae1b1eb2af80/rmw_fastrtps_shared_cpp/src/custom_publisher_info.cpp#L170-L192

So I think if change may be as below (Only avoid triggering event)

    matched_status_.total_count = total_count;
    matched_status_.total_count_change += total_count_change;
    matched_status_.current_count = current_count;
    matched_status_.current_count_change += current_count_change;

    matched_changes_ = true;
  if (on_new_event_cb_[RMW_EVENT_PUBLICATION_MATCHED]) {
    trigger_event(RMW_EVENT_PUBLICATION_MATCHED);
  }

@MiguelCompany
Copy link
Contributor Author

@Barry-Xu-2018 I think you are right!

Updated the code in 0c60ec2

@fujitatomoya
Copy link
Collaborator

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@fujitatomoya
Copy link
Collaborator

CC: @iuhilnehc-ynos

@fujitatomoya
Copy link
Collaborator

fujitatomoya commented Apr 5, 2023

@MiguelCompany can you address cpplint warning?

@Barry-Xu-2018
Copy link
Contributor

I tried to run matched demo and find a new problem. It seems repeated event was received.
I will investigate it.

ros2 run demo_nodes_cpp matched_event_detect 
[INFO] [1680752983.277148039] [multi_sub_node]: Create a new subscription.
[INFO] [1680752983.278800649] [matched_event_detect_node]: First subscription is connected.
[INFO] [1680752983.278846840] [multi_sub_node]: Create a new subscription.
[INFO] [1680752983.280026632] [matched_event_detect_node]: The changed number of connected subscription is 1 and current number of connected subscription is 2.
[INFO] [1680752983.280062182] [multi_sub_node]: Destroy a subscription.
[INFO] [1680752983.281545152] [matched_event_detect_node]: The changed number of connected subscription is -1 and current number of connected subscription is 1.
[INFO] [1680752983.281580592] [multi_sub_node]: Destroy a subscription.
[INFO] [1680752983.282452240] [matched_event_detect_node]: Last subscription is disconnected.
[INFO] [1680752983.282486114] [multi_pub_node]: Create a new publisher.
[INFO] [1680752983.283396538] [multi_pub_node]: Create a new publisher.
[INFO] [1680752983.284131755] [matched_event_detect_node]: First publisher is connected.
[INFO] [1680752983.284165433] [multi_pub_node]: Destroy a publisher.
[INFO] [1680752983.285238736] [matched_event_detect_node]: The changed number of connected publisher is -1 and current number of connected publisher is 1.
[INFO] [1680752983.285273466] [multi_pub_node]: Destroy a publisher.
[INFO] [1680752983.286016443] [matched_event_detect_node]: Last publisher is disconnected.

@Barry-Xu-2018
Copy link
Contributor

Barry-Xu-2018 commented Apr 6, 2023

You can use this code to reproduce the problem.
https://github.com/Barry-Xu-2018/bug_reproduce/blob/master/rclcpp_bug/src/rmw_fastrtps_681/rmw_fastrtps_681.cpp

While creating or destroying a subscription, publisher get 2 notification of matched event.

[INFO] [1680787176.004999904] [multi_sub_node]: Create a new subscription.
[INFO] [1680787176.007356863] [matched_event_detect_node]: Matched Event: 1, 1, 1, 1
[INFO] [1680787176.007778429] [matched_event_detect_node]: Matched Event: 1, 0, 1, 0
[INFO] [1680787176.009060380] [multi_sub_node]: Destroy a subscription.
[INFO] [1680787176.010856996] [matched_event_detect_node]: Matched Event: 1, 0, 0, -1
[INFO] [1680787176.011048782] [matched_event_detect_node]: Matched Event: 1, 0, 0, 0

Repeated notification is from

https://github.com/eProsima/rmw_fastrtps/blob/0c60ec2856ce41a039fafec95319c71e5be4a79a/rmw_fastrtps_shared_cpp/src/rmw_wait.cpp#L194-L199

And repeated notification isn't from DataWriterImpl::InnerDataWriterListener::onWriterMatched().
I didn't find the root cause.

@fujitatomoya
Copy link
Collaborator

@MiguelCompany still it still generates the 2 events, is this expected behavior? or we should add if *_change is zero, it should be ignored?

@MiguelCompany
Copy link
Contributor Author

@Barry-Xu-2018 @fujitatomoya I think I finally got it right. Please check on your side.

@fujitatomoya
Copy link
Collaborator

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@Barry-Xu-2018
Copy link
Contributor

Barry-Xu-2018 commented Apr 7, 2023

@MiguelCompany
After testing, the repeat notification also be found.

In __rmw_init_event, you change code as the below.

  bool is_matched_event =
    (RMW_EVENT_PUBLICATION_MATCHED == event_type) ||
    (RMW_EVENT_SUBSCRIPTION_MATCHED == event_type);
  if (!is_matched_event) {
    eprosima::fastdds::dds::StatusMask status_mask =
      event->get_listener()->get_statuscondition().get_enabled_statuses();
    status_mask |= rmw_fastrtps_shared_cpp::internal::rmw_event_to_dds_statusmask(event_type);
    event->get_listener()->get_statuscondition().set_enabled_statuses(status_mask);
  }

According to my understanding, you don't want to get status notification for matched event.
But I found the below code also can run (active is set).

https://github.com/ros2/rmw_fastrtps/blob/master/rmw_fastrtps_shared_cpp/src/rmw_wait.cpp#L191-L199

@Barry-Xu-2018
Copy link
Contributor

@MiguelCompany

Status mask already enable matched.
e.g.

info->data_writer_ = publisher->create_datawriter(
info->topic_,
writer_qos,
info->data_writer_listener_,
eprosima::fastdds::dds::StatusMask::publication_matched());

Maybe we cannot disable it.
So maybe we have to add check at __wait_set(). Status check ignore matched event or matched event only depend on trigger value.

if (ReturnCode_t::RETCODE_OK == ret_code) {
eprosima::fastdds::dds::Entity * entity = status_condition.get_entity();
eprosima::fastdds::dds::StatusMask changed_statuses = entity->get_status_changes();
if (changed_statuses.is_active(
rmw_fastrtps_shared_cpp::internal::rmw_event_to_dds_statusmask(
event->event_type)))
{
active = true;
}
if (guard_condition.get_trigger_value()) {
active = true;
guard_condition.set_trigger_value(false);
}
}

@Barry-Xu-2018
Copy link
Contributor

Friendly ping @MiguelCompany

@Barry-Xu-2018
Copy link
Contributor

@MiguelCompany

According to the release schedule of Iron

Mon. April 10, 2023 - Alpha + RMW freeze
Preliminary testing and stabilization of ROS Base 1 packages, and API and feature freeze for RMW provider packages.

I am not sure whether this problem should be fixed today.
Maybe we can use this temporary method to resolve it (#683 (comment)).
If have better way, we can update codes late.

MiguelCompany and others added 8 commits April 10, 2023 07:16
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
…en set.

Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
@MiguelCompany
Copy link
Contributor Author

@Barry-Xu-2018 @fujitatomoya One last try before using the workaround in #683 (comment).

I rebased and changed the way the information of the event is taken, so the status on the entity is always reset.

@Barry-Xu-2018
Copy link
Contributor

@MiguelCompany Thank you. Your change fixed it after testing.
I think you can merge this fix.

@iuhilnehc-ynos
Copy link
Contributor

re-run CI based on #683 (comment)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@iuhilnehc-ynos
Copy link
Contributor

rerun CI with my repos which use the latest ros2 with current PR.

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MiguelCompany
Copy link
Contributor Author

@fujitatomoya @clalancette This bugfix is ready

@fujitatomoya fujitatomoya merged commit 91d280a into ros2:rolling Apr 12, 2023
@MiguelCompany MiguelCompany deleted the fix-matched-event-issues branch June 12, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Receive repeated matched event
4 participants