Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the tests on cyclonedds by translating qos duration values #606

Merged
merged 2 commits into from Jan 21, 2021

Conversation

emersonknapp
Copy link
Collaborator

@emersonknapp emersonknapp commented Jan 21, 2021

Makes the build start passing again - it has been failing since https://github.com/ros2/rosbag2/actions/runs/479072133 on 1/11 due to ros2/ros2#1058

Should unblock other open PRs - notably #603 #543 #605

Here's the full breakdown:

  • Providing 0 as input duration to a publisher results in 'unspecified' for all rmw implementations for all currently supported QoS policies
  • FastDDS reports this special value (2147483647, 4294967295) when you query a publisher that has an 'unspecified' duration
    • FastDDS accepts this value as 'unspecified' when creating a publisher (the same as it does 0) - but it WON'T parse the CycloneDDS value
  • CycloneDDS reports the special value (9223372036, 854775807) when you query a publisher that has an 'unspecified' duration
    • CycloneDDS accepts this value as 'unspecified' when creating a publisher (the same as it does 0) - but it WON'T parse the FastDDS value

The rmw_cyclonedds_cpp error looks like this when trying to read a bag recorded with rmw_fastrtps_cpp

2021-01-12T01:40:46.2371892Z 2: 1610415583.125080 [0] dq.builtin: invalid parameter list (vendor 1.16, version 2.1): pid 23 (DEADLINE) invalid, input = 3,0,0,128,6,250,130,75
2021-01-12T01:40:46.2372518Z 2: 1610415583.125090 [0] dq.builtin: SPDP (vendor 1.16): invalid qos/parameters
2021-01-12T01:40:46.2373131Z 2: 1610415583.126850 [0] dq.builtin: invalid parameter list (vendor 1.16, version 2.1): pid 23 (DEADLINE) invalid, input = 3,0,0,128,6,250,130,75
2021-01-12T01:40:46.2373762Z 2: 1610415583.126857 [0] dq.builtin: SPDP (vendor 1.16): invalid qos/parameters

The rmw_fastrtps_cpp error looks like this when trying to read a bag recorded with rmw_cyclonedds_cpp

[WARN] [1611257485.439465477] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.439495434] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.439502850] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440722168] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440737894] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440758600] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.441663946] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.441677598] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX

This is a bad situation - I think we need to definitively resolve ros2/rmw#255 in order to actually solve this problem. It seems I may need to bump up the priority on that.

FOR TODAY - I think the unblocking resolution is to just make the test-sample data use "0" across the board, which can be parsed and tested on any implementation - and track the fact that QoS profiles aren't cross-implementation parseable.

@emersonknapp emersonknapp requested a review from a team as a code owner January 21, 2021 01:19
@emersonknapp emersonknapp requested review from Karsten1987 and jikawa-az and removed request for a team January 21, 2021 01:19
@emersonknapp emersonknapp force-pushed the emersonknapp/fix-metadata-durations-cyclone branch 2 times, most recently from 38ae3c1 to 7608a8d Compare January 21, 2021 02:41
@emersonknapp
Copy link
Collaborator Author

emersonknapp commented Jan 21, 2021

Gist: https://gist.githubusercontent.com/emersonknapp/719356394006587cf55d28e9563f2153/raw/cf46bf7027e033806bf3fde565a665781f01bcae/ros2.repos
BUILD args: --packages-up-to ros2bag rosbag2_py rosbag2_transport rosbag2_tests
TEST args: --packages-select ros2bag rosbag2_py rosbag2_transport rosbag2_tests
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/7494

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Copy link
Collaborator

@Karsten1987 Karsten1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this given that it potentially unblocks other PRs. However, there's a lot of magic going on here and I would love to have this documented somehow. I feel this is all going to be pretty confusing for me if I look at this again in 6 months.

namespace
{
// These values are returned to mean "unspecified" in FastDDS
// 0 is equivalent and is portable to other implementation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 0 is equivalent, why do we need this change?
I feel generally we should document this in a more central way - also on the actual description of this PR for future references.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the full breakdown:

  • Providing 0 as input duration to a publisher results 'unspecified' for all rmw implementations for all currently supported QoS policies
  • FastDDS reports this special value (2147483647, 4294967295) when you query a publisher that has an 'unspecified' duration
    • FastDDS accepts this value as 'unspecified' when creating a publisher (the same as it does 0) - but it WON'T parse the CycloneDDS value
  • CycloneDDS reports the special value (9223372036, 854775807) when you query a publisher that has an 'unspecified' duration
    • CycloneDDS accepts this value as 'unspecified' when creating a publisher (the same as it does 0) - but it WON'T parse the FastDDS value

The rmw_cyclonedds_cpp error looks like this when trying to read a bag recorded with rmw_fastrtps_cpp

2021-01-12T01:40:46.2371892Z 2: 1610415583.125080 [0] dq.builtin: invalid parameter list (vendor 1.16, version 2.1): pid 23 (DEADLINE) invalid, input = 3,0,0,128,6,250,130,75
2021-01-12T01:40:46.2372518Z 2: 1610415583.125090 [0] dq.builtin: SPDP (vendor 1.16): invalid qos/parameters
2021-01-12T01:40:46.2373131Z 2: 1610415583.126850 [0] dq.builtin: invalid parameter list (vendor 1.16, version 2.1): pid 23 (DEADLINE) invalid, input = 3,0,0,128,6,250,130,75
2021-01-12T01:40:46.2373762Z 2: 1610415583.126857 [0] dq.builtin: SPDP (vendor 1.16): invalid qos/parameters

The rmw_fastrtps_cpp error looks like this when trying to read a bag recorded with rmw_cyclonedds_cpp

[WARN] [1611257485.439465477] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.439495434] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.439502850] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440722168] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440737894] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.440758600] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.441663946] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX
[WARN] [1611257485.441677598] [rmw_dds_common]: nanoseconds value too large for 32-bits, saturated at UINT_MAX

On this further investigation, it's actually worse than I thought - I think we need to definitively resolve ros2/rmw#255 in order to actually solve this problem.

FOR TODAY - I think the unblocking resolution would be to revert any special-case code here and just make the test-sample data use "0" across the board, which can be parsed and tested on any implementation

Edit: I updated the PR and the description, according to this analysis

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks for the clarification.

ros2bag/test/resources/incomplete_qos_duration.yaml Outdated Show resolved Hide resolved
@emersonknapp emersonknapp force-pushed the emersonknapp/fix-metadata-durations-cyclone branch from 7608a8d to e35b731 Compare January 21, 2021 19:40
…o 0 for cross-implementation readability, pending a clarification of the rmw qos duration api

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@emersonknapp emersonknapp force-pushed the emersonknapp/fix-metadata-durations-cyclone branch from e35b731 to e55fa98 Compare January 21, 2021 19:42
Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@emersonknapp
Copy link
Collaborator Author

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@emersonknapp emersonknapp merged commit de11589 into master Jan 21, 2021
@delete-merged-branch delete-merged-branch bot deleted the emersonknapp/fix-metadata-durations-cyclone branch January 21, 2021 22:10
emersonknapp added a commit that referenced this pull request Feb 2, 2021
* Change all 'fastrtps-unspecified' duration values in test resources to 0 for cross-implementation readability, pending a clarification of the rmw qos duration api

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
emersonknapp added a commit that referenced this pull request Feb 17, 2021
* Change all 'fastrtps-unspecified' duration values in test resources to 0 for cross-implementation readability, pending a clarification of the rmw qos duration api

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants