
Add debug logging #187

Merged Nov 22, 2017 (16 commits)
Conversation

@dhood (Member) commented Nov 21, 2017

Requires ros2/ament_cmake_ros#2 because ROS_PACKAGE_NAME is being used for the logger name to avoid hardcoding "rcl" everywhere

To note:

  • this is the first core package with debug logging added but I plan to add it to others
  • currently debug messages will be "compiled in" by default (and therefore their logging adds overhead), but their printing will be disabled by default. We can update the min severity compiled in at a later date if appropriate.
  • some calls could definitely be more informative, e.g. there isn't any context about which thread is doing the waiting, so the rclcpp background graph listener's waits "look like" waits from spin. I imagine that logging output from rclcpp will be helpful for providing context, but there's certainly room for additional contextual information in the future.
  • @wjwwood and I were speaking about how we might want another severity level with even more debugging output, but perhaps compiled out by default. I have tried not to add that type of detailed logging in this PR, but if there are log calls that you think would be too verbose for typical debugging, we can keep them in mind as candidates for the "superdebug"/"trace" level.
  • no subloggers are used, but we could add one for e.g. rcl.wait if we think that users will want to filter out the repetitive logging from repeated waiting. I haven't added that here: my thinking was that we can revisit this later once all debug logging is in place.

Example output of rclpy talker with rcl debug logging enabled (add rclpy.logging.set_logger_severity_threshold('rcl', rclpy.logging.LoggingSeverity.DEBUG) to the script):

$ ros2 run demo_nodes_py talker
[DEBUG] [rcl]: Initializing node 'talker' in namespace ''
[DEBUG] [rcl]: Using domain ID of '70'
[DEBUG] [rcl]: Using security: false
[DEBUG] [rcl]: Node initialized
[DEBUG] [rcl]: Initializing publisher for topic name 'chatter'
[DEBUG] [rcl]: Expanded topic name '/chatter'
[DEBUG] [rcl]: Publisher initialized
[DEBUG] [rcl]: Initializing timer with period: 1000000000ns
[DEBUG] [rcl]: Updated timer period from '1000000000ns' to '1000000000ns'
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '1' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting with timeout: 0s + 986357990ns
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: true
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '1' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting with timeout: 0s + 986276525ns
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: true
[DEBUG] [rcl]: Timer in wait set is ready
[DEBUG] [rcl]: Calling timer
Publishing: "Hello World: 0"
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '1' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting with timeout: 0s + 999647071ns
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: true
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '1' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting with timeout: 0s + 999516152ns
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: true
^C[DEBUG] [rcl]: Guard condition in wait set is ready
Traceback (most recent call last):
  File "/home/dhood/ros2_ws/install_isolated/demo_nodes_py/lib/demo_nodes_py/talker", line 11, in <module>
    load_entry_point('demo-nodes-py', 'console_scripts', 'talker')()
  File "/home/dhood/ros2_ws/build_isolated/demo_nodes_py/demo_nodes_py/topics/talker.py", line 50, in main
    rclpy.spin(node)
  File "/home/dhood/ros2_ws/install_isolated/rclpy/lib/python3.5/site-packages/rclpy/__init__.py", line 54, in spin
    executor.spin_once()
  File "/home/dhood/ros2_ws/install_isolated/rclpy/lib/python3.5/site-packages/rclpy/executors.py", line 405, in spin_once
    handler, entity, node = next(self.wait_for_ready_callbacks(timeout_sec=timeout_sec))
  File "/home/dhood/ros2_ws/install_isolated/rclpy/lib/python3.5/site-packages/rclpy/executors.py", line 345, in wait_for_ready_callbacks
    raise KeyboardInterrupt
KeyboardInterrupt
[DEBUG] [rcl]: Finalizing publisher
[DEBUG] [rcl]: Timer canceled
[DEBUG] [rcl]: Finalizing node

Example output of the rclcpp add_two_ints_client with rcl debug logging enabled (add rcutils_logging_set_logger_severity_threshold("rcl", RCUTILS_LOG_SEVERITY_DEBUG); to the code):

$ ros2 run demo_nodes_cpp add_two_ints_client
[DEBUG] [rcl]: Initializing node 'add_two_ints_client' in namespace ''
[DEBUG] [rcl]: Using domain ID of '70'
[DEBUG] [rcl]: Using security: false
[DEBUG] [rcl]: Node initialized
[DEBUG] [rcl]: Initializing publisher for topic name 'parameter_events'
[DEBUG] [rcl]: Expanded topic name '/parameter_events'
[DEBUG] [rcl]: Publisher initialized
[DEBUG] [rcl]: Initializing client for service name 'add_two_ints'
[DEBUG] [rcl]: Expanded service name '/add_two_ints'
[DEBUG] [rcl]: Client initialized
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '0' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
service not available, waiting again...
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [rcl]: Initializing wait set with '0' subscriptions, '2' guard conditions, '0' timers, '0' clients, '0' services
[DEBUG] [rcl]: Waiting without timeout
[DEBUG] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Client in wait set is ready
[DEBUG] [rcl]: Client take request succeeded: true
Result of add_two_ints: 5
[DEBUG] [rcl]: Guard condition in wait set is ready
[DEBUG] [rcl]: Finalizing client
[DEBUG] [rcl]: Finalizing publisher
[DEBUG] [rcl]: Finalizing publisher
[DEBUG] [rcl]: Finalizing node
  • Linux Build Status
  • Linux-aarch64 Build Status (this is an existing flaky test that may have been exaggerated by the debug print overhead)
  • macOS Build Status
  • Windows Build Status

@dhood added the "in review (Waiting for review, Kanban column)" label Nov 21, 2017
@dhood self-assigned this Nov 21, 2017
@mikaelarguedas (Member) left a comment

Overall the level of debug information seems appropriate to me.

currently debug messages will be "compiled in" by default (and therefore their logging adds overhead),

I think that's fine; I'd expect users to be able to turn on our debug prints with the code shipped in our binaries.
We may need to reevaluate if we see more and more performance issues when adding debug prints to the rest of the stack.

no subloggers are used, but we could add one for e.g. rcl.wait if we think that users will want to filter out the repetitive logging from repeated waiting. I haven't added that here: my thinking was that we can revisit this later once all debug logging is in place.

👍 I think we need to play with more complex high level code to see when/how to make this more user friendly

wjwwood and I were speaking about how we might want to have another severity level that has even more debugging output, but perhaps compiled out by default.

👍 I mentioned it a few times in recent PRs: the numeric values we use for the severity levels, and the macros used to compile levels in or out, should allow intermediate levels so that we and/or users can extend them easily (the severity values already allow it; the compile-in/out macros don't yet).
As a side note: DDS-Security defines 8 logging severities:

• FATAL_LEVEL – security error causing a shutdown or failure of the Domain Participant
• SEVERE_LEVEL – major security error or fault
• ERROR_LEVEL – minor security error or fault
• WARNING_LEVEL – undesirable or unexpected behavior
• NOTICE_LEVEL – important security event
• INFO_LEVEL – interesting security event
• DEBUG_LEVEL – detailed information on the flow of the security events
• TRACE_LEVEL – even more detailed information

@@ -163,6 +167,7 @@ rcl_ret_t
rcl_client_fini(rcl_client_t * client, rcl_node_t * node)
{
(void)node;
RCUTILS_LOG_DEBUG_NAMED(ROS_PACKAGE_NAME, "Finalizing client")
Member:
Should this add a "client finalized" like it's done in the init function? (same for node and other node entities below)

Member:
👍

Member Author:
done in bbece9e

@@ -250,6 +255,8 @@ rcl_take_request(
RCL_SET_ERROR_MSG(rmw_get_error_string_safe(), options->allocator);
return RCL_RET_ERROR;
}
RCUTILS_LOG_DEBUG_NAMED(
ROS_PACKAGE_NAME, "Service take request succeeded: %s", taken ? "true" : "false")
Member:
We could add a debug print to rcl_send_response as well (https://github.com/ros2/rcl/pull/187/files#diff-777efe6fef16afbd8f9a9877ffba1138R282)?

Member Author:
done in 7c7e117, and added for publish/take in 88b8bee

@@ -224,6 +229,8 @@ rcl_take(
RCL_SET_ERROR_MSG(rmw_get_error_string_safe(), options->allocator);
return RCL_RET_ERROR;
}
RCUTILS_LOG_DEBUG_NAMED(
ROS_PACKAGE_NAME, "Subscription take request succeeded: %s", taken ? "true" : "false")
Member:
copy-n-paste ?

Member Author:
fixed in 88b8bee thanks

throw std::runtime_error("error waiting for service to be ready");
}
});
size_t iteration = 0;
do {
++iteration;
if (rcl_wait_set_clear_services(&wait_set) != RCL_RET_OK) {
RCUTILS_LOG_ERROR("Error in wait_set_clear_services: %s", rcl_get_error_string_safe())
RCUTILS_LOG_ERROR_NAMED(
ROS_PACKAGE_NAME, "Error in wait_set_clear_services: %s", rcl_get_error_string_safe());
Member:
Nit: added trailing semi-colon

return false;
}
if (rcl_wait_set_add_service(&wait_set, service) != RCL_RET_OK) {
RCUTILS_LOG_ERROR("Error in wait_set_add_service: %s", rcl_get_error_string_safe())
RCUTILS_LOG_ERROR_NAMED(
ROS_PACKAGE_NAME, "Error in wait_set_add_service: %s", rcl_get_error_string_safe());
Member:
Nit: added trailing semi-colon (same multiple times in this file)

Member Author:
oops, removed in decb56a

@wjwwood (Member) left a comment

also lgtm, though some of the messages could definitely be downgraded to "trace" or "superdebug" like we had discussed.

@@ -163,6 +167,7 @@ rcl_ret_t
rcl_client_fini(rcl_client_t * client, rcl_node_t * node)
{
(void)node;
RCUTILS_LOG_DEBUG_NAMED(ROS_PACKAGE_NAME, "Finalizing client")
Member:
👍

@dhood mentioned this pull request Nov 22, 2017
@mikaelarguedas (Member) left a comment

lgtm

@dhood (Member Author) commented Nov 22, 2017

thanks for the reviews! I'll be happy to move some prints to a lower severity level if/when we go in that direction: this is just the first iteration.

I'll do some more investigating to check if this will make the ARM jobs unstable before merging.

@dhood added the "in progress (Actively being worked on, Kanban column)" label and removed the "in review (Waiting for review, Kanban column)" label Nov 22, 2017
@dhood (Member Author) commented Nov 22, 2017

I reran the ARM job and it came out green on the same node, so I must have gotten unlucky before: Build Status

@dhood merged commit 7d0045a into master Nov 22, 2017
@dhood removed the "in progress (Actively being worked on, Kanban column)" label Nov 22, 2017
@dhood deleted the debug_logging branch November 22, 2017 01:24
ivanpauno pushed a commit that referenced this pull request Jan 2, 2020
…#187)

* add pub/sub option structures to support rmw specific payload feature

Signed-off-by: William Woodall <william@osrfoundation.org>

* link to design document in TODO

Signed-off-by: William Woodall <william@osrfoundation.org>