Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opportunistically enable SCHED_FIFO #748

Merged
merged 5 commits into from
Jul 7, 2022
Merged

Conversation

tylerjw
Copy link
Contributor

@tylerjw tylerjw commented Jun 30, 2022

Signed-off-by: Tyler Weaver maybe@tylerjw.dev

Checklist

  • Fork the repository.
  • Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
  • Ensure local tests pass. (colcon test and pre-commit run (requires you to install pre-commit by pip3 install pre-commit)
  • Commit to your fork using clear commit messages.
  • Send a pull request, answering any default questions in the pull request interface.
  • Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

Description

To use this you need a RT Kernel. The two easiest ways are either:

If you choose to go the first route it is free for non-commercial use. For commercial use the pricing starts at $225/year/server. This is the easiest solution because you can just use Ubuntu 22.04 like you normally would. Note that you'll need to run ros2_control as root for it to be able to make these syscalls successfully.

For the second option we are running Ubuntu 22.04 in a docker container. In a docker container you can send syscalls to configure your threads if you enable the capacity CAP_SYS_ADMIN and run as root.

In my testing with these changes on a 4 core intel based NUC I'm seeing ~200us of jitter. There are a ton more things that could be done to make this even more deterministic but this small change is a huge improvement for locking down the loop rate of ros2_control.

Here is an example for how you might start docker to run this:

sudo docker run -it --rm \
  --net=host \
  --privileged \
  --cap-add=CAP_SYS_ADMIN \
  ros:rolling-ros-base /bin/bash

struct sched_param schedp;
memset(&schedp, 0, sizeof(schedp));
schedp.sched_priority = priority;
if (sched_setscheduler(0, SCHED_FIFO, &schedp))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really wanted to use SCHED_DEADLINE here but the problem is that the code to configure it does not compile without the headers from the RT kernel. For that reason I used FIFO which will compile on normal linux.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this, maybe in another file or as the option somewhere users may use if they have RT kernel.

Can we use pre-processor directive to detect that and set different scheduler if possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes, I'll move these into a separate file with a header so others can easily use them in their code.
  2. You wouldn't want to do that because then binaries built in the buildfarm wouldn't have this feature.

@bmagyar bmagyar changed the title Opertunistically enable SCHED_FIFO Opportunistically enable SCHED_FIFO Jun 30, 2022
Copy link
Member

@destogl destogl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should add another node ros2_control_node_rt.cpp with this code? Probably not needed. I leave that decision to you.

Nevertheless, we should add a line about this new feature in the doc. For example, short explanation and instructions how to activate it.

struct sched_param schedp;
memset(&schedp, 0, sizeof(schedp));
schedp.sched_priority = priority;
if (sched_setscheduler(0, SCHED_FIFO, &schedp))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this, maybe in another file or as the option somewhere users may use if they have RT kernel.

Can we use pre-processor directive to detect that and set different scheduler if possible?

std::thread cm_thread(
[cm]()
{
RCLCPP_INFO(cm->get_logger(), "update rate is %d Hz", cm->get_update_rate());
if (has_realtime_kernel())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So FIFO needs also real-time kernel? but can compile also on normal kernel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it compiles just fine because the code for configuring FIFO has made it into the mainline linux headers because they are part of the POSIX standard (just a part that mainline linux will error at runtime if you try to use).

Copy link
Contributor Author

@tylerjw tylerjw Jun 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCHED_DEADLINE however has not made it into mainline linux even though it is POSIX so you wouldn't be able to compile this code with normal linux if I used DEADLINE.

How I'm using this is building it in CI into a docker where there is just normal linux, but then I pull that docker down onto my Debian machine that has a realtime kernel and it runs and uses these features.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCHED_FIFO and SCHED_RR are POSIX 1003.1, SCHED_DEADLINE is a Linux-only scheduler, and while I've head it has been ported to some other OSes, it's not part of any standard AFAIK. So you can't assume that SCHED_DEADLINE is available on other POSIXs/UNIXs like QNX or VxWorks either unfortunately.

@tylerjw
Copy link
Contributor Author

tylerjw commented Jun 30, 2022

Not sure if we should add another node ros2_control_node_rt.cpp with this code?

I don't think there is a need to add extra steps for the user, this won't harm anyone running on normal linux and will just work for those that have a RT kernel.

@tylerjw
Copy link
Contributor Author

tylerjw commented Jun 30, 2022

Nevertheless, we should add a line about this new feature in the doc. For example, short explanation and instructions how to activate it.

Specifically where in the docs should I add a note about this?

@tylerjw tylerjw force-pushed the sched_fifo branch 2 times, most recently from 7b0b812 to 4533c0f Compare June 30, 2022 22:46
@codecov-commenter
Copy link

codecov-commenter commented Jun 30, 2022

Codecov Report

Merging #748 (9782efa) into master (925f5f3) will decrease coverage by 1.62%.
The diff coverage is 36.84%.

@@            Coverage Diff             @@
##           master     #748      +/-   ##
==========================================
- Coverage   34.61%   32.98%   -1.63%     
==========================================
  Files          52       89      +37     
  Lines        2981     8542    +5561     
  Branches     1855     5693    +3838     
==========================================
+ Hits         1032     2818    +1786     
- Misses        310      662     +352     
- Partials     1639     5062    +3423     
Flag Coverage Δ
unittests 32.98% <36.84%> (-1.63%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
controller_manager/src/realtime.cpp 0.00% <0.00%> (ø)
controller_manager/src/ros2_control_node.cpp 0.00% <0.00%> (ø)
..._interface/include/hardware_interface/actuator.hpp 100.00% <ø> (ø)
...re_interface/include/hardware_interface/sensor.hpp 100.00% <ø> (ø)
...re_interface/include/hardware_interface/system.hpp 100.00% <ø> (ø)
hardware_interface/src/resource_manager.cpp 50.09% <ø> (-3.53%) ⬇️
hardware_interface/src/sensor.cpp 50.52% <ø> (ø)
hardware_interface/src/system.cpp 55.45% <ø> (ø)
...rface/test/fake_components/test_generic_system.cpp 8.60% <ø> (ø)
...dware_interface/test/test_component_interfaces.cpp 33.26% <ø> (+5.08%) ⬆️
... and 109 more

@tylerjw tylerjw requested a review from destogl July 5, 2022 17:33
@destogl
Copy link
Member

destogl commented Jul 5, 2022

Specifically where in the docs should I add a note about this?

Sorry for the late replay, a few lines about the ros2_control_node would be great to have it here: https://github.com/tylerjw/ros2_control/blob/sched_fifo/controller_manager/doc/userdoc.rst.

We are fully missing docs on it, sorry to push this on you, but I think now it's great time to add few lines, just for users to know that we have also this functionality.

@tylerjw
Copy link
Contributor Author

tylerjw commented Jul 6, 2022

@destogl I added some statements to the documentation. Lmk if you would like me to change it in any way.

Signed-off-by: Tyler Weaver <maybe@tylerjw.dev>
Signed-off-by: Tyler Weaver <maybe@tylerjw.dev>
Signed-off-by: Tyler Weaver <maybe@tylerjw.dev>
Signed-off-by: Tyler Weaver <maybe@tylerjw.dev>
@tylerjw
Copy link
Contributor Author

tylerjw commented Jul 6, 2022

I just rebased this on master with hope that it'll pass CI.


// Use nanoseconds to avoid chrono's rounding
rclcpp::Duration period(std::chrono::nanoseconds(1000000000 / cm->get_update_rate()));
rclcpp::Duration period(std::chrono::nanoseconds(1'000'000'000 / cm->get_update_rate()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rclcpp::Duration period(std::chrono::nanoseconds(1'000'000'000 / cm->get_update_rate()));
const rclcpp::Duration period(std::chrono::nanoseconds(1'000'000'000 / cm->get_update_rate()));

Signed-off-by: Tyler Weaver <maybe@tylerjw.dev>
@bmagyar
Copy link
Member

bmagyar commented Jul 7, 2022

Thanks a lot for this, very neat! I'm happy with the changes, will be fully happy when the CI turns green ;)

@tylerjw
Copy link
Contributor Author

tylerjw commented Jul 7, 2022

Thanks a lot for this, very neat! I'm happy with the changes, will be fully happy when the CI turns green ;)

Do you know what the deal is with the codecov one is, it was failing when installing ROS 2.

Also, today I'm finally going to test this on hardware to verify it. Up until today I've been testing with this patch applied to galactic.

@bmagyar
Copy link
Member

bmagyar commented Jul 7, 2022

I'm rerunning the failing CI job ATM

@tylerjw
Copy link
Contributor Author

tylerjw commented Jul 7, 2022

I tested this on hardware today and it works as expected. @destogl

Copy link
Member

@destogl destogl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very-mery much!

@destogl destogl merged commit 8283124 into ros-controls:master Jul 7, 2022
@tylerjw tylerjw deleted the sched_fifo branch July 10, 2022 17:20
tylerjw added a commit to tylerjw/ros2_control that referenced this pull request Jul 10, 2022
tylerjw added a commit to tylerjw/ros2_control that referenced this pull request Jul 10, 2022
…ortunistically enable SCHED_FIFO (ros-controls#748)

Signed-off-by: Tyler Weaver <tylerjw@gmail.com>
bmagyar pushed a commit that referenced this pull request Jul 11, 2022
…ortunistically enable SCHED_FIFO (#748) (#768)

Signed-off-by: Tyler Weaver <tylerjw@gmail.com>
@AndyZe
Copy link
Contributor

AndyZe commented Jul 15, 2022

If I'm reading the manual correctly, you can elevate the thread priority without any special privileges, i.e. without sudo:

On Linux, the RLIMIT_NICE resource limit can be used to define a
limit to which an unprivileged process's nice value can be
raised; see setrlimit(2) for details.

https://man7.org/linux/man-pages/man7/sched.7.html

I'm having some trouble determining what the default limit is, though. Regardless, I'll make an issue about this.

@ciandonovan
Copy link

ciandonovan commented Oct 26, 2022

Should the program memory also be prefaulted with mlockall to prevent pagefaults?

https://docs.ros.org/en/dashing/Tutorials/Real-Time-Programming.html#id4
https://design.ros2.org/articles/realtime_background.html#memory-management

@AndyZe
Copy link
Contributor

AndyZe commented Oct 27, 2022

@ciandonovan I don't know a ton about it but it sounds worth looking into. Can you create an issue so there's more visibility?

Posting on a merged PR is only going to trigger email notifications to a few people.

pac48 pushed a commit to pac48/ros2_control that referenced this pull request Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants