Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuse -> ROS 2 fuse_core: Partial port of fuse_core #281

Merged

Conversation

methylDragon
Copy link
Collaborator

@methylDragon methylDragon commented Oct 8, 2022

See: #276

This PR gets fuse_core building partially. It is NOT a complete port, since there are some changes that need to be made that affect many libraries downstream (e.g. time and waitables.)

This is a common branching off point for many of those downstream features, so it'll be extremely nice to get this in so that those PRs can use this as a base.

I've incorporated a lot of changes from @BrettRD (credited in the commit messages as appropriate.) And edited some of them to fit with the rest of the commits/PRs. (I know there are some (mostly time related) changes here that are still nascent (and don't reflect the final state of either of the existing branches), I intend to tackle those and repackage the changes in a separate time related PR.)

Also, there are a whole lot of warnings that get emitted, they'll get dealt with in downstream PRs. It'll get messier before it gets neater :>

One way to help reduce the number of warnings is to get #280 merged.

Also currently I think this should play well with the rest of the PRs currently open? But it'll be good to try to get this and #280 merged as quickly as possible. (Probably before #279 , though #279 should get in immediately after, I think)

Also pinging @svwilliams for visibility.

@methylDragon
Copy link
Collaborator Author

methylDragon commented Oct 8, 2022

@BrettRD could I double check what you meant by unsafe time packing and deprecated time API here: https://github.com/locusrobotics/fuse/pull/281/files#diff-4d7269f5ad306527f06d77e64f5230b27087a3be72516585a9e0dc9286db6116 ?

I'm guessing deprecated time API is from the ROS1 split of time into nsec and sec? What about packing?

@BrettRD
Copy link
Collaborator

BrettRD commented Oct 8, 2022

Unsafe packing is endianness portability, the same timestamp will produce different results on different processor architectures

The time API is just the sec/nsec split, ROS2 and chrono offer clearer options

@methylDragon
Copy link
Collaborator Author

methylDragon commented Oct 8, 2022

Oof, I suppose it's time to use boost endian then... (we're on C++17 so we can't use endian from std yet :( )

I'm also pretty tempted to just use the nanoseconds from ROS2 instead of both nanoseconds and seconds. Does anyone from Locus know if the byte buffer inputs to the UUID generator HAVE to encode seconds AND nanoseconds? I would presume no 🤔

@BrettRD
Copy link
Collaborator

BrettRD commented Oct 8, 2022

We don't need an endianness lib, just an explicit byte packing order instead of the technically undefined behaviour of reinterpret_cast.
The compiler will produce the same behaviour and binary on little endian, but it'll have consistent output when compiled for other architectures. (Good habits from the embedded systems world)

The uuid functions as a hash, so it only needs to use every meaningful bit in the time value

Copy link
Collaborator

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start

fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
*/
ThrottledCallback(Callback&& keep_callback = nullptr, // NOLINT(whitespace/operators)
Callback&& drop_callback = nullptr, // NOLINT(whitespace/operators)
const ros::Duration& throttle_period = ros::Duration(0.0), const bool use_wall_time = false)
const rclcpp::Duration& throttle_period = rclcpp::Duration(0,0),
const bool use_wall_time = false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend requiring the clock to be passed in instead of a use_wall_time flag.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This so far has been deferred: https://github.com/locusrobotics/fuse/pull/283/files#diff-250689f3de589b83ad938a77e9a67784dd33d4321363870ba7c1cf742182c44c

The way it's implemented currently is that there's a boolean ROS parameter (throttle_use_wall_time) though (it then eventually gets passed as the flag).. If we are to pass in a clock type, does that parameter have to change as well? If so what parameter should it become? (String? Int?)

I'm inclined to keep the flag and leave the boolean ROS parameter alone so there's parameter parity between the ROS 1 and ported version of Fuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to keep the flag and leave the boolean ROS parameter alone so there's parameter parity between the ROS 1 and ported version of Fuse.

I agree about keeping the parameter throttle_use_wall_time the same.

A clock will need to be passed in to use anything other than wall time because a ROS clock needs to be attached to a time source. It can't be created inside of ThrottledCallback(). I'm not sure how to plumb that here though. Maybe fuse_core::AsyncSensorModel can have a way for subclasses to get a ROS clock off of it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, I'll be adding node interfaces and node interface getters to the classes that are currently storing node handles. I'll probably use them to obtain the NodeClockInterface and then get the clock that way 🤔

I'll defer this and add a TODO pending that change (which should be the next thing to change for fuse_core anyway!)

fuse_core/src/fuse_echo.cpp Outdated Show resolved Hide resolved
fuse_core/src/fuse_echo.cpp Outdated Show resolved Hide resolved
fuse_core/src/fuse_echo.cpp Outdated Show resolved Hide resolved
fuse_core/src/uuid.cpp Outdated Show resolved Hide resolved
@methylDragon
Copy link
Collaborator Author

methylDragon commented Oct 14, 2022

We don't need an endianness lib, just an explicit byte packing order instead of the technically undefined behaviour of reinterpret_cast.
The compiler will produce the same behaviour and binary on little endian, but it'll have consistent output when compiled for other architectures. (Good habits from the embedded systems world)

The uuid functions as a hash, so it only needs to use every meaningful bit in the time value

Apologies, I'm not too sure if I fully understand.

Do you mean iterating from the ref pointer (e.g. with an explicit for loop.. though we'd still end up needing to cast somehow, like say... (unsigned char*)&nanoseconds;, copying each time), or would moving away from reinterpret cast and into, say.. memcpy work?

    std::memcpy(
         iter, &nanoseconds, (buffer.size() - 1) * sizeof(unsigned char) + sizeof(std::array<unsigned char,1>)
    );

@BrettRD
Copy link
Collaborator

BrettRD commented Oct 14, 2022

All forms of pointer-based type-conversion magic cause undefined behaviour (including reinterpret_cast, memcpy, and Quake's fast inverse square root)
Mask and Shift is the most the most straightforward way of getting consistent behaviour across all architectures

I've set up an example over on compiler explorer: https://godbolt.org/z/hdGh73395
At optimisation level -O2, two of the three functions will produce identical binaries
foo_endian_unsafe matches foo_little_endian on x86 and ARM
foo_endian_unsafe matches foo_big_endian on Power

example source

#include <stdint.h>
#include <stddef.h>
#include <string.h>

void foo_endian_unsafe(unsigned char* buffer, uint32_t payload) {
  memcpy(buffer, reinterpret_cast<const unsigned char*>(&payload), sizeof(payload));
}


void foo_little_endian(unsigned char* buffer, uint32_t payload) {
  buffer[0] = (unsigned char)((payload & 0x000000FF) >> 0);
  buffer[1] = (unsigned char)((payload & 0x0000FF00) >> 8);
  buffer[2] = (unsigned char)((payload & 0x00FF0000) >> 16);
  buffer[3] = (unsigned char)((payload & 0xFF000000) >> 24);
}


void foo_big_endian(unsigned char* buffer, uint32_t payload) {
  buffer[0] = (unsigned char)((payload & 0xFF000000) >> 24);
  buffer[1] = (unsigned char)((payload & 0x00FF0000) >> 16);
  buffer[2] = (unsigned char)((payload & 0x0000FF00) >> 8);
  buffer[3] = (unsigned char)((payload & 0x000000FF) >> 0);
}

@methylDragon
Copy link
Collaborator Author

methylDragon commented Oct 14, 2022

All forms of pointer-based type-conversion magic cause undefined behaviour (including reinterpret_cast, memcpy, and Quake's fast inverse square root) Mask and Shift is the most the most straightforward way of getting consistent behaviour across all architectures

I've set up an example over on compiler explorer: https://godbolt.org/z/hdGh73395 At optimisation level -O2, two of the three functions will produce identical binaries foo_endian_unsafe matches foo_little_endian on x86 and ARM foo_endian_unsafe matches foo_big_endian on Power

example source

#include <stdint.h>
#include <stddef.h>
#include <string.h>

void foo_endian_unsafe(unsigned char* buffer, uint32_t payload) {
  memcpy(buffer, reinterpret_cast<const unsigned char*>(&payload), sizeof(payload));
}


void foo_little_endian(unsigned char* buffer, uint32_t payload) {
  buffer[0] = (unsigned char)((payload & 0x000000FF) >> 0);
  buffer[1] = (unsigned char)((payload & 0x0000FF00) >> 8);
  buffer[2] = (unsigned char)((payload & 0x00FF0000) >> 16);
  buffer[3] = (unsigned char)((payload & 0xFF000000) >> 24);
}


void foo_big_endian(unsigned char* buffer, uint32_t payload) {
  buffer[0] = (unsigned char)((payload & 0xFF000000) >> 24);
  buffer[1] = (unsigned char)((payload & 0x00FF0000) >> 16);
  buffer[2] = (unsigned char)((payload & 0x0000FF00) >> 8);
  buffer[3] = (unsigned char)((payload & 0x000000FF) >> 0);
}

Ahh, thank you for this, this has been very insightful. I have not worked much with embedded systems, so considerations about packing bits are a little new to me.

I suppose I'll just adapt this for the int64_t we're working with and pick one of the packing orders.

So:

void foo_little_endian(unsigned char* buffer, int64_t payload) {
  buffer[0] = (unsigned char)((payload & 0x00000000'000000FF) >> 0);
  buffer[1] = (unsigned char)((payload & 0x00000000'0000FF00) >> 8);
  buffer[2] = (unsigned char)((payload & 0x00000000'00FF0000) >> 16);
  buffer[3] = (unsigned char)((payload & 0x00000000'FF000000) >> 24);
  buffer[4] = (unsigned char)((payload & 0x000000FF'00000000) >> 32);
  buffer[5] = (unsigned char)((payload & 0x0000FF00'00000000) >> 40);
  buffer[6] = (unsigned char)((payload & 0x00FF0000'00000000) >> 48);
  buffer[7] = (unsigned char)((payload & 0xFF000000'00000000) >> 56);
}

What I am confused a little bit about though, is, wouldn't the byte representation of the number be reversed depending on the endianness of the machine? Wouldn't that mean that, if we had an explicit byte packing order (in this case little endian), then wouldn't the byte buffer that eventually makes it to the uuid generation call differ? I feel like I'm missing something 🙇

EDIT: I think I got it. I think I had a brain fart and forgot that the hex representation
(that is, the hex literal) of the number is independent of the actual in-memory representation. The array should be the same regardless of system endianness. 🤦‍♂️

@BrettRD
Copy link
Collaborator

BrettRD commented Oct 14, 2022

Yep, endianness only affects types larger than 8b
UUID expects byte-array input, so it will produce consistent output based on array index order regardless of system arch.
unsafe casts like pointer casts will reverse the string order of an integer (and so change the hash output), depending on architecture

This is unlikely to pop up as an issue in the immediate future; very few people will need to synchronise a fuse_transaction across a mixed-arch robot fleet just yet, but it's a good habit to be in, especially with micro-ros gaining popularity.

You can run into a similar issue on a single machine too when doing pointer hacks over structs; certain compiler flags will include or exclude padding bytes between struct members for memory-alignment reasons, making the raw-memory representations of a struct incompatible between builds depending on build settings.

Explicit packing and parsing is almost always worth the effort, but you might like to back-log it until the CI testing comes online.

@methylDragon
Copy link
Collaborator Author

methylDragon commented Oct 14, 2022

Explicit byte packing for uuids are in (in the time PR):
72f51e8

fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
fuse_core/CMakeLists.txt Outdated Show resolved Hide resolved
@@ -68,14 +67,16 @@ class DelayedThrottleFilter : public ros::console::FilterBase
*/
bool isEnabled() override
{
const auto now = ros::Time::now().toSec();
#warn "migrated from ros1, using default clock"
const auto now = rclcpp::Clock().now().seconds();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see throttling on system time making sense. A downside is fewer messages logged when simulating at faster than real time.

Signed-off-by: methylDragon <methylDragon@gmail.com>
Signed-off-by: methylDragon <methylDragon@gmail.com>
Co-authored-by: Brett Downing
Signed-off-by: methylDragon <methylDragon@gmail.com>
Signed-off-by: methylDragon <methylDragon@gmail.com>
Signed-off-by: methylDragon <methylDragon@gmail.com>
Signed-off-by: methylDragon <methylDragon@gmail.com>
Signed-off-by: methylDragon <methylDragon@gmail.com>

rclcpp_components_register_node(fuse_echo_component
PLUGIN "fuse_core::FuseEcho"
EXECUTABLE fuse_echo_node
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, I'd recommend keeping the executable name as fuse_echo.

TARGETS fuse_echo
RUNTIME DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION}
TARGETS fuse_echo_component
DESTINATION lib/${PROJECT_NAME}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fuse_echo_component is a library, so it should have these destinations:

install(TARGETS fuse_echo_component
  ARCHIVE DESTINATION lib
  LIBRARY DESTINATION lib
  RUNTIME DESTINATION bin)

The original rule was installing the executable fuse_echo, but the rclcpp_components_register_node macro already does that for us.

*/
ThrottledCallback(Callback&& keep_callback = nullptr, // NOLINT(whitespace/operators)
Callback&& drop_callback = nullptr, // NOLINT(whitespace/operators)
const ros::Duration& throttle_period = ros::Duration(0.0), const bool use_wall_time = false)
const rclcpp::Duration& throttle_period = rclcpp::Duration(0,0),
const bool use_wall_time = false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to keep the flag and leave the boolean ROS parameter alone so there's parameter parity between the ROS 1 and ported version of Fuse.

I agree about keeping the parameter throttle_use_wall_time the same.

A clock will need to be passed in to use anything other than wall time because a ROS clock needs to be attached to a time source. It can't be created inside of ThrottledCallback(). I'm not sure how to plumb that here though. Maybe fuse_core::AsyncSensorModel can have a way for subclasses to get a ROS clock off of it?

Signed-off-by: methylDragon <methylDragon@gmail.com>
Copy link
Collaborator

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good as a base for follow up PRs

@methylDragon methylDragon merged commit fcbc939 into locusrobotics:rolling Nov 2, 2022
@methylDragon methylDragon deleted the rolling-fuse_core-common branch November 2, 2022 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants