Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for timers on reset callback #1979

Merged

Conversation

mauropasse
Copy link
Collaborator

Based on ros2/rcl#995, this PR adds support for the "on reset" callback for timers.

The issue which originated this need, is to wake up an events based executor when a timer is reset (which could also be caused by a time jump).

@alsora

@alsora
Copy link
Collaborator

alsora commented Aug 2, 2022

CI including rcl PR

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm, a few comments.

rclcpp/src/rclcpp/timer.cpp Show resolved Hide resolved
rclcpp/include/rclcpp/timer.hpp Show resolved Hide resolved
rclcpp/src/rclcpp/timer.cpp Show resolved Hide resolved
@Flova
Copy link

Flova commented Oct 17, 2022

Any updates here?

@alsora
Copy link
Collaborator

alsora commented Oct 27, 2022

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@mauropasse mauropasse force-pushed the mauro/add-timer-on-reset-callback branch from 052aa82 to 0673586 Compare October 28, 2022 10:56
@clalancette clalancette added the more-information-needed Further information is required label Nov 17, 2022
@Flova
Copy link

Flova commented Nov 24, 2022

The CI fails because ros2/rcl#995 is needed.

@alsora
Copy link
Collaborator

alsora commented Nov 24, 2022

The main issue here is the Windows CI that is failing.
However, I see that the same failure is also present in a completely unrelated PR #1982

@Flova
Copy link

Flova commented Nov 24, 2022

When I looked at the Win CI output it seemed that something related to the parameters interface in tf2_ros was failing. This seems to be kind of unrelated. Can we maybe try to tun the CI against some documentation PR to see if this is an issue with the current master of tf2?

@ivanpauno
Copy link
Member

The main issue here is the Windows CI that is failing. However, I see that the same failure is also present in a completely unrelated PR #1982

I think that was fixed in #2018, the error is not happening in nightlies anymore.
It would be nice to rebase and rerun to confirm.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test would be ideal, lgtm.

@fujitatomoya
Copy link
Collaborator

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@Flova
Copy link

Flova commented Nov 28, 2022

There seems to be another Windows build issue:

19:58:26 Running handlers complete
19:58:26 [2022-11-28T18:58:25+00:00] ERROR: Exception handlers complete
19:58:26 Cinc Client failed. 65 resources updated in 12 minutes 01 seconds
19:58:26 [2022-11-28T18:58:26+00:00] FATAL: LoadError: cannot load such file -- inspec/utils/deprecation/global_method
19:58:26 The command 'cmd /S /C c:\cinc-project\cinc\bin\cinc-solo.bat -c C:\TEMP\solo.rb -Eros2ci -j C:\TEMP\install_ros2_%ROS_DISTRO%.json' returned a non-zero code: 1

@Flova
Copy link

Flova commented Nov 28, 2022

Also seems unrelated imo...

@ivanpauno
Copy link
Member

Seems like a random CI network issue, running it again:

  • Windows Build Status

@Flova
Copy link

Flova commented Nov 29, 2022

Seems like a random CI network issue, running it again:

The Windows CI still seems to have Network issues:

Relevant Jenkins Traceback
17:31:39 FATAL: command execution failed
17:33:44 java.nio.channels.ClosedChannelException
17:33:44 	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
17:33:44 	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
17:33:44 	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:790)
17:33:44 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
17:33:44 	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
17:33:44 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
17:33:44 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
17:33:44 	at java.lang.Thread.run(Thread.java:748)
17:33:44 Caused: java.io.IOException: Backing channel 'JNLP4-connect connection from ip-10-0-1-137.ec2.internal/10.0.1.137:49694' is disconnected.
17:33:44 	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:216)
17:33:44 	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
17:33:44 	at com.sun.proxy.$Proxy76.isAlive(Unknown Source)
17:33:44 	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1213)
17:33:44 	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1205)
17:33:44 	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:194)
17:33:44 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:144)
17:33:44 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
17:33:44 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
17:33:44 	at hudson.model.Build$BuildExecution.build(Build.java:198)
17:33:44 	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:514)
17:33:44 	at hudson.model.Run.execute(Run.java:1888)
17:33:44 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
17:33:44 	at hudson.model.ResourceController.execute(ResourceController.java:99)
17:33:44 	at hudson.model.Executor.run(Executor.java:432)
17:33:44 FATAL: Unable to delete script file C:\Windows\TEMP\jenkins469429339840863473.bat
17:33:44 java.nio.channels.ClosedChannelException
17:33:44 	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
17:33:44 	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:179)
17:33:44 	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:790)
17:33:44 	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
17:33:44 	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
17:33:44 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
17:33:44 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
17:33:44 	at java.lang.Thread.run(Thread.java:748)
17:33:44 Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@30159803:JNLP4-connect connection from ip-10-0-1-137.ec2.internal/10.0.1.137:49694": Remote call on JNLP4-connect connection from ip-10-0-1-137.ec2.internal/10.0.1.137:49694 failed. The channel is closing down or has closed down
17:33:44 	at hudson.remoting.Channel.call(Channel.java:994)
17:33:44 	at hudson.FilePath.act(FilePath.java:1165)
17:33:44 	at hudson.FilePath.act(FilePath.java:1154)
17:33:44 	at hudson.FilePath.delete(FilePath.java:1681)
17:33:44 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:162)
17:33:44 	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
17:33:44 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
17:33:44 	at hudson.model.Build$BuildExecution.build(Build.java:198)
17:33:44 	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:514)
17:33:44 	at hudson.model.Run.execute(Run.java:1888)
17:33:44 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
17:33:44 	at hudson.model.ResourceController.execute(ResourceController.java:99)
17:33:44 	at hudson.model.Executor.run(Executor.java:432)
17:33:44 Build step 'Execute Windows batch command' marked build as failure
17:33:44 ERROR: Step ‘Record compiler warnings and static analysis results’ aborted due to exception: 
17:33:44 java.io.IOException: No workspace found for ci_windows #18344
17:33:44 	at io.jenkins.plugins.analysis.core.steps.IssuesRecorder.perform(IssuesRecorder.java:577)
17:33:44 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
17:33:44 	at hudson.model.Build$BuildExecution.post2(Build.java:178)
17:33:44 	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
17:33:44 	at hudson.model.Run.execute(Run.java:1913)
17:33:44 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
17:33:44 	at hudson.model.ResourceController.execute(ResourceController.java:99)
17:33:44 	at hudson.model.Executor.run(Executor.java:432)
17:33:44 ERROR: Step ‘Publish Cobertura Coverage Report’ failed: no workspace for ci_windows #18344
17:33:44 ERROR: Step ‘Publish xUnit test result report’ failed: no workspace for ci_windows #18344
17:33:44 Finished: FAILURE

@ivanpauno
Copy link
Member

The Windows CI still seems to have Network issues:

Yes, I will run it again in some hours, after checking with the infra team if there are any windows CI issues.

@ivanpauno
Copy link
Member

  • Windows Build Status

@ivanpauno
Copy link
Member

it seems we are still having problems with the windows agents ...

@nuclearsandwich
Copy link
Member

Thanks for everyone's patience on these infrastructure issues. We use low-bid spot instances in order to keep costs down but these cause us trouble every year around the shopping season. The ROS infra team is currently working to add some on-demand capacity to allow jobs to continue.

In order to run that transition I am unfortunately going to abort Build Status but I will requeue it again and link it here once the transition is complete.

@nuclearsandwich
Copy link
Member

Alright. Because Build Status is running on a relatively fresh agent, there may still be issues related to other reliability issues caused by the holiday activity but this agent should at least stick around.

@Flova
Copy link

Flova commented Nov 30, 2022

The run still fails :/, but it at least failed relatively early with an infrastructure related error.

@clalancette
Copy link
Contributor

Build Status is the same job and running now.

@alsora
Copy link
Collaborator

alsora commented Dec 6, 2022

CI

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@alsora
Copy link
Collaborator

alsora commented Dec 7, 2022

CI after fixing rcl PR

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@alsora
Copy link
Collaborator

alsora commented Dec 7, 2022

CI

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@Flova
Copy link

Flova commented Dec 8, 2022

Finally 🎉

@alsora
Copy link
Collaborator

alsora commented Dec 8, 2022

@fujitatomoya @ivanpauno @hidmic it looks like all comments have been addressed; can you have a final look before we merge this PR and its related rcl PR?

@fujitatomoya
Copy link
Collaborator

I am good to go with this, but requesting another review from maintainers. CC: @ivanpauno @hidmic @wjwwood

be advised that this must be merged with ros2/rcl#995

Mauro Passerino added 2 commits December 16, 2022 11:16
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
@mauropasse mauropasse force-pushed the mauro/add-timer-on-reset-callback branch from d5f1949 to 43dee48 Compare December 16, 2022 14:52
@mauropasse
Copy link
Collaborator Author

Rebased after #2059 got merged

@alsora
Copy link
Collaborator

alsora commented Jan 22, 2023

Ping; it looks like the rcl PR (ros2/rcl#995) has been approved, can we get an additional review here?

this protects against the callback being modified while we are resetting the timer

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>
@alsora
Copy link
Collaborator

alsora commented Feb 10, 2023

Hi, any update on this?

@fujitatomoya fujitatomoya merged commit ce5a261 into ros2:rolling Feb 10, 2023
@mauropasse mauropasse deleted the mauro/add-timer-on-reset-callback branch February 13, 2023 11:05
tonynajjar pushed a commit to pixel-robotics/rclcpp that referenced this pull request Apr 3, 2023
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Alberto Soragna <alberto.soragna@gmail.com>
tonynajjar pushed a commit to pixel-robotics/rclcpp that referenced this pull request Apr 3, 2023
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Alberto Soragna <alberto.soragna@gmail.com>
alsora added a commit to irobot-ros/rclcpp that referenced this pull request Apr 29, 2023
Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Alberto Soragna <alberto.soragna@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
more-information-needed Further information is required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants