Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about conditional statements in graph.c file #934

Closed
YoungJin-gurum opened this issue Sep 14, 2021 · 11 comments
Closed

Question about conditional statements in graph.c file #934

YoungJin-gurum opened this issue Sep 14, 2021 · 11 comments
Assignees
Labels
enhancement New feature or request tests

Comments

@YoungJin-gurum
Copy link

YoungJin-gurum commented Sep 14, 2021

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • from source
  • Version or commit hash:
    • 3.2.0(master)
  • DDS implementation:
    • GurumDDS(after 2.7.2774), rmw_gurumdds_cpp(rolling, galactic)
  • Client library (if applicable):
    • rcl

Steps to reproduce issue

Execute test of rcl (colcon test) with gurumdds, rmw_gurumdds_cpp.

Expected behavior

rcl test_graph_query_functions
At check_graph_state after subscription fini
expected_in_tnat result passed (expect topic isn't in tnat)

Actual behavior

failed

Additional information

I have a question about the code at line 491 and 548 of <graph.c>

  if (expected_count <= count) {
    *success = true;
    return RCL_RET_OK;
  }

Variable count is result of count_entities_func.
After entity fini, if it has not yet received signal of deleted entity, count will be 1.
Then the condition is satisfied (expected_count 0 <= count 1).
So it will return OK immediately, then rcl_get_topic_names_and_types gets topic_cahce.
As I think, condition should be changed to

 expected_count == count

I think that makes waitset operate normally until the graph changes.
If there is a valid reason for the inequality sign in the conditional statement, please let me know.

@aprotyas
Copy link
Member

For context, directing you to #907 where this change was merged.

@YoungJin-gurum
Copy link
Author

For context, directing you to #907 where this change was merged.

Thank you for your reply. I'm already read #907, but the reason for the conditional statement is not explained.

I want to understand why success is when count is greater than expected_count.

@aprotyas
Copy link
Member

aprotyas commented Sep 14, 2021

I want to understand why success is when count is greater than expected_count.

I don't have much context, but I'm quoting a section from the rcl_wait_for_publishers function header

rcl/rcl/include/rcl/graph.h

Lines 596 to 597 in 7b899da

* This function blocks and will return when the number of publishers for `topic_name`
* is greater than or equal to the `count` parameter, or the specified `timeout` is reached.

Note that rcl_wait_for_publishers calls _rcl_wait_for_entities (which contains the code you originally referenced)

rcl/rcl/src/rcl/graph.c

Lines 594 to 611 in 7b899da

rcl_ret_t
rcl_wait_for_publishers(
const rcl_node_t * node,
rcl_allocator_t * allocator,
const char * topic_name,
const size_t expected_count,
rcutils_duration_value_t timeout,
bool * success)
{
return _rcl_wait_for_entities(
node,
allocator,
topic_name,
expected_count,
timeout,
success,
rcl_count_publishers);
}

With it passing in the expected_count, inferring from the expected behavior specified in the function header, I would expect the wait function to be successful when count >= expected_count, where count is obtained through the following snippet with count_entities_func == rcl_count_publishers

ret = count_entities_func(node, topic_name, &count);

Not sure if that addresses your question though. Let me know!

@YoungJin-gurum
Copy link
Author

YoungJin-gurum commented Sep 14, 2021

With it passing in the expected_count, inferring from the expected behavior specified in the function header, I would expect the wait function to be successful when count >= expected_count, where count is obtained through the following snippet with count_entities_func == rcl_count_publishers

I looked at all the connected functions as well.
Even the comments in the function header don't solve my question.

rcl/rcl/src/rcl/graph.c

Lines 484 to 494 in 7b899da

// We can avoid waiting if there are already the expected number of publishers
size_t count = 0u;
ret = count_entities_func(node, topic_name, &count);
if (ret != RCL_RET_OK) {
// Error message already set
return ret;
}
if (expected_count <= count) {
*success = true;
return RCL_RET_OK;
}

I understand this comment.
If there are already the expected number of entities(There is a typo in the comments, not only publishers), we don't have to wait.
I don't think there are any particular problems with normal use.

// Wait for expected number of publishers
bool success = false;
ret = rcl_wait_for_publishers(
node_ptr, &allocator, topic_name.c_str(), expected_publisher_count, timeout.count(), &success);
ASSERT_EQ(ret, RCL_RET_OK);
EXPECT_TRUE(success);
// Wait for expected number of subscribers
success = false;
ret = rcl_wait_for_subscribers(
node_ptr, &allocator, topic_name.c_str(), expected_subscriber_count, timeout.count(), &success);
ASSERT_EQ(ret, RCL_RET_OK);
EXPECT_TRUE(success);

According to the comments in the code block above, wait for a certain value of the number of publishers or subscribers.
As I mentioned at the beginning, in a situation where 0 is expected_count after using the entity fini function, I think it should not return when real_count is 1.
I don't think it's suitable as a function for testing.
If I'm thinking wrong, please let me know, thanks.

@aprotyas
Copy link
Member

aprotyas commented Sep 14, 2021

If I may rephrase your thoughts: the concern is that there may be kind of a "race" between expected_count being set to 0 (after all entity fini calls are made) and rcl_count_[pub/sub] actually returning 0? I will leave that question to the maintainers, but I think I see where you're getting at with the question.

@YoungJin-gurum
Copy link
Author

Let me summarize.
After all entity fini calls expected_count being set to 0.
But the following situations may exist: before the finalize signal is reached
count(the result of count_entities_func like rcl_count_[pub/sub]) being set to 1.

In this case, I think it is right to wait for count to be set to 0.

  if (expected_count <= count) {
    *success = true;
    return RCL_RET_OK;
  }

But by the above condition(0<=1) return OK immediately, so rcl_get_topic_names_and_types function is called before the topic name is cleared from the topic cache.
As a result, test_graph_query_function failed.

As I think, I changed the condition to the following

  if (expected_count == count) {
    *success = true;
    return RCL_RET_OK;
  }

and ran the test, and it was confirmed that it was successful.
Because the purpose of the test is to check whether the graph state is changed correctly after pub and sub fini, when it has to wait for change, I think it should wait.

@aprotyas
Copy link
Member

aprotyas commented Sep 14, 2021

As I think, I changed the condition to the following

  if (expected_count == count) {
    *success = true;
    return RCL_RET_OK;
  }

I totally see why you proposed this, but what if count > expected_count?

@fujitatomoya
Copy link
Collaborator

@YoungJin-gurum thanks for the explanation.

As I think, I changed the condition to the following

  if (expected_count == count) {
    *success = true;
    return RCL_RET_OK;
  }

I believe that rcl_wait_for_publishers and rcl_wait_for_subscribers are implemented as designed. If we change the condition statement as described above, we also need to change the design. which i do not think that we need to do.

rcl/rcl/include/rcl/graph.h

Lines 621 to 622 in eae50c9

* \param[out] success `true` if the number of publishers is equal to or greater than count, or
* `false` if a timeout occurred waiting for publishers.

According to the comments in the code block above, wait for a certain value of the number of publishers or subscribers.
As I mentioned at the beginning, in a situation where 0 is expected_count after using the entity fini function, I think it should not return when real_count is 1.
I don't think it's suitable as a function for testing.

true, this cannot test if the entity is actually changed into 0 (specified number) precisely. as you pointed out, that was the purpose for this test case. I would use rcl_count_publishers and rcl_count_subscribers instead.

There is a typo in the comments, not only publishers

could you tell us where?

@aprotyas
Copy link
Member

aprotyas commented Sep 14, 2021

There is a typo in the comments, not only publishers

could you tell us where?

I believe the typo in question is inside _rcl_wait_for_entities. The comment should read "... expected number of entities". I can submit a quick patch for this if needed.

rcl/rcl/src/rcl/graph.c

Lines 484 to 486 in 7b899da

// We can avoid waiting if there are already the expected number of publishers
size_t count = 0u;
ret = count_entities_func(node, topic_name, &count);


I would use rcl_count_publishers and rcl_count_subscribers instead.

I guess this also addresses @YoungJin-gurum's concern about the mismatch between expected_count and count, since rcl_count_[pub/sub] should return 0, as expected.

@YoungJin-gurum
Copy link
Author

@fujitatomoya galactic branch also needs the patch, I think.

@fujitatomoya
Copy link
Collaborator

I am not inclined to do that unless we have the actual problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request tests
Projects
None yet
Development

No branches or pull requests

3 participants