Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve wide string (de)serialization #113

Merged
merged 14 commits into from
Mar 20, 2024

Conversation

MiguelCompany
Copy link
Contributor

This PR changes (de)serialization of wide strings so they don't perform u16string <-> wstring conversions

@MiguelCompany
Copy link
Contributor Author

@clalancette This one is ready for review. It is similar to ros2/rmw_fastrtps#740 (already merged) but for the static type support. I've run the tests in the test_communication package and they passed.

We will base the upgrade to Fast CDR 2 on the changes here.

Copy link

@EduPonz EduPonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM. I'm only wondering if we should consider removing u16string_to_wstring and wstring_to_u16string

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments about the sizes of things that we should think about.

Just so I understand; is the goal here to make sure we don't convert from rosidl_runtime_c__U16String to std::wstring, and then serialize the wstring into Cdr? Instead, we serialize directly from the rosidl_runtime_c__U16String into Cdr? I don't have any problem with it, but can you explain why?

Finally, to address @EduPonz point about removing u16string_to_wstring and wstring_to_u16string; I don't think we should do that, at least not for this release. That's because they are public functions and someone may be depending on them. If we think they don't have value anymore, we can mark them as deprecated for this release, and then remove them in the next.

auto len = static_cast<uint32_t>(u16str.size);
cdr << len;
for (uint32_t i = 0; i < len; ++i) {
uint32_t c = static_cast<uint32_t>(u16str.data[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason to use uint32_t here? I know that rosidl_runtime_c__U16String uses uint_least16_t for the data, but on all platforms we support we have a true uint16_t. So it seems to me that we could change this to:

Suggested change
uint32_t c = static_cast<uint32_t>(u16str.data[i]);
uint16_t c = static_cast<uint16_t>(u16str.data[i]);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason is interoperability / backward compatibility. The serialization of the wstring is using 4 bytes per character.

I initially did what you suggest, and used a uint16_t, but changed it when the tests in test_communication failed between rmw_cyclonedds and rmw_fastrtps for WStrings type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we can change this in the future if we don't mind breaking backward compatibility, but should do it at the same time in all DDS-based RMW implementations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. OK, interoperability is a really good reason.

In that case, I'm totally fine with leaving this as-is, but please put in a comment saying that we are using uint32_t for the interoperability reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 54 to 56
uint32_t c;
cdr >> c;
u16str.data[i] = static_cast<char16_t>(c);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions here about the sizes; this seems like we should extract a uint16_t, then cast it into a uint_least16_t to put it into the array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reasoning as in here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check that throws an exception if the deserialized uint32_t does not fit into a uint16_t in commit c4eb425. Note that this would be captured up in rmw_fastrtps code, since it is a fastcdr exception

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MiguelCompany and others added 7 commits March 18, 2024 10:52
…t_fastrtps_c.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
…t_fastrtps_cpp.

Signed-off-by: Miguel Company <MiguelCompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
@MiguelCompany
Copy link
Contributor Author

Instead, we serialize directly from the rosidl_runtime_c__U16String into Cdr? I don't have any problem with it, but can you explain why?

  1. due to performance reasons, since we avoid copying into the wstring, and then into the serialization buffer.
  2. It allows us to change the serialization in the future if we want to (see my comment here)

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here looks good to me. I'm just going to request some comments about why we are using uint32_t instead of uint16_t when serializing/deserializing, as well as marking the functions we are no longer using as deprecated. Then we can run CI on this.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
@MiguelCompany
Copy link
Contributor Author

The code here looks good to me. I'm just going to request some comments about why we are using uint32_t instead of uint16_t when serializing/deserializing, as well as marking the functions we are no longer using as deprecated. Then we can run CI on this.

@clalancette I added two commits addressing the two points you mention here.

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, looking good to me now. I'll go ahead and run full CI on this.

@clalancette
Copy link
Contributor

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette
Copy link
Contributor

Oh. Right. Tests are throwing warnings because they are using the deprecated methods.

I think what we should do here is to update those tests to use the new helpers instead, which will still test what we are trying to test, and fix the warnings.

Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Miguel Company <miguelcompany@eprosima.com>
@MiguelCompany
Copy link
Contributor Author

@clalancette I updated the tests here and also fixed some linting issues.

@clalancette
Copy link
Contributor

All right, here is another shot at CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette clalancette merged commit b4f63c0 into ros2:rolling Mar 20, 2024
2 of 3 checks passed
@MiguelCompany MiguelCompany deleted the improve-u16-strings branch March 20, 2024 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants