Fix Groot2Publisher destructor infinite loop #1100

facontidavide · 2026-02-03T08:10:50Z

Summary

Fixes #1057 (Groot2Publisher enters infinite loop when exception is thrown), where the Groot2Publisher destructor could hang indefinitely
The ZMQ server thread was blocked on recv() while active_server_ remained true
The destructor now shuts down the ZMQ context to interrupt blocking operations

Changes

Set active_server_=false before joining threads
Call zmq_context.shutdown() to interrupt blocking recv()
Add try-catch around ZMQ operations to handle context termination gracefully
Reorder destructor to remove hooks after threads are joined
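
The ordering listed above can be sketched without ZMQ at all. The block below is only an illustration under stated assumptions: `FakeContext`, `ContextTerminated`, `FakePublisher`, and `runLifecycles` are hypothetical stand-ins (not the real Groot2Publisher or cppzmq types), and a condition variable plays the role of the blocking recv().

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for the error thrown when a context is shut down.
struct ContextTerminated : std::runtime_error
{
  ContextTerminated() : std::runtime_error("context terminated") {}
};

// Hypothetical stand-in for a ZMQ context/socket pair (NOT the cppzmq API):
// recv() blocks until a message arrives or shutdown() is called.
class FakeContext
{
public:
  std::string recv()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return terminated_ || !inbox_.empty(); });
    if(terminated_)
    {
      throw ContextTerminated();  // plays the role of zmq::error_t with ETERM
    }
    std::string msg = std::move(inbox_.front());
    inbox_.pop();
    return msg;
  }

  void shutdown()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      terminated_ = true;
    }
    cv_.notify_all();  // wake every thread blocked in recv()
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::string> inbox_;
  bool terminated_ = false;
};

class FakePublisher
{
public:
  FakePublisher() : server_thread_([this] { serverLoop(); }) {}

  ~FakePublisher()
  {
    active_server_ = false;  // 1. signal the loop to stop
    context_.shutdown();     // 2. interrupt the blocking recv()
    if(server_thread_.joinable())
    {
      server_thread_.join();  // 3. wait for the thread to exit
    }
    hooks_installed_ = false;  // 4. remove hooks only after the join
  }

private:
  void serverLoop()
  {
    while(active_server_)
    {
      try
      {
        context_.recv();
      }
      catch(const ContextTerminated&)
      {
        break;  // shutdown in progress: leave the loop
      }
    }
  }

  FakeContext context_;
  std::atomic<bool> active_server_{ true };
  bool hooks_installed_ = true;
  std::thread server_thread_;  // declared last so it starts after the other members
};

// Creates and destroys n publishers; returns how many destructors completed.
int runLifecycles(int n)
{
  int completed = 0;
  for(int i = 0; i < n; ++i)
  {
    { FakePublisher publisher; }
    ++completed;
  }
  return completed;
}
```

In this sketch, removing the `context_.shutdown()` call would leave the thread parked in `recv()` and the `join()` would never return, which is the same shape as the hang described in the summary.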

Test plan

Added test DestructorCompletesAfterException - verifies destructor completes even when tree throws
Added test DestructorCompletesWithMultipleNodes - verifies cleanup with complex trees
Added test RapidCreateDestroy - verifies no hangs during rapid lifecycle
All 337 tests pass
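
The test bodies are not shown in this conversation. As a rough sketch of how a "destructor completes" check can be written, the helper below runs a callable on a separate thread and reports whether it finished within a timeout. `completesWithin` is a hypothetical name chosen for illustration, not something taken from tests/gtest_groot2_publisher.cpp.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Runs fn on a detached thread and returns true if it finished before the
// timeout. A detached thread is used (rather than std::async) so that a
// hanging fn cannot block the caller when the future is destroyed.
// Assumption: fn does not throw; an escaping exception would call std::terminate.
template <typename Fn>
bool completesWithin(Fn fn, std::chrono::milliseconds timeout)
{
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> finished = done->get_future();
  std::thread([done, fn = std::move(fn)]() mutable {
    fn();
    done->set_value();
  }).detach();
  return finished.wait_for(timeout) == std::future_status::ready;
}
```

A test could wrap "build the tree, attach the publisher, let the tree throw, destroy the publisher" in the callable and assert that the helper returns true. If the callable really hangs, its thread is leaked, which is usually acceptable for a failing test but worth knowing.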

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Enhanced shutdown robustness with improved thread synchronization and error handling during cleanup to prevent potential race conditions.
Tests
- Added comprehensive test coverage for destructor behavior and rapid create/destroy scenarios under error conditions.

The destructor could hang indefinitely when the ZMQ server thread was waiting on recv() while active_server_ remained true.

Changes:
- Set active_server_=false before joining threads
- Call zmq_context.shutdown() to interrupt blocking recv()
- Add try-catch around ZMQ operations to handle context termination
- Reorder destructor to remove hooks after threads are joined

Includes tests for:
- Destructor completion after exception
- Destructor with multiple tree nodes attached
- Rapid create/destroy cycles

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai · 2026-02-03T08:11:12Z

📝 Walkthrough

Walkthrough

The PR fixes an infinite loop bug in Groot2Publisher's destructor by reordering the shutdown sequence: signaling thread stop, explicitly shutting down the ZMQ context to unblock recv(), joining threads, then removing hooks. Exception handling wraps recv() and send() calls in the server loop. Test coverage is expanded with scenarios for exception handling and rapid creation/destruction cycles.

Changes

Cohort / File(s)	Summary
Groot2Publisher Shutdown Fix `src/loggers/groot2_publisher.cpp`	Reordered destructor shutdown sequence to signal threads stop, shutdown ZMQ context, join threads, then remove hooks. Added try/catch blocks for zmq::error_t in serverLoop for recv() and send() calls to handle context/socket termination gracefully.
Test Coverage for Exception Scenarios `tests/gtest_groot2_publisher.cpp`	Added three new test cases (DestructorCompletesAfterException, DestructorCompletesWithMultipleNodes, RapidCreateDestroy) and XML fixture to verify destructor robustness, exception propagation, and synchronization under repeated creation/destruction and exception conditions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Fixed possible infinite loop in Groot2Publisher when destructor is ca… #1058: Modifies threading and shutdown behavior in Groot2Publisher with server/heartbeat synchronization and destructor sequencing changes.

Poem

🐰 A loop that spun round and round,
No shutdown in sight to be found—
Till ZMQ's context went sleep,
And threads took their exit so deep,
Now destruction completes safe and sound! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main fix: resolving a destructor infinite loop issue in Groot2Publisher.
Description check	✅ Passed	The description provides a clear summary with specific changes, explicitly references issue `#1057`, includes a detailed test plan with three tests, and confirms all 337 tests pass.
Linked Issues check	✅ Passed	The PR directly addresses `#1057` by implementing the necessary fixes: setting active_server_=false, calling zmq_context.shutdown(), adding try-catch blocks, and reordering destructor cleanup. All objectives from the issue are met.
Out of Scope Changes check	✅ Passed	All changes are focused on fixing the infinite loop issue in Groot2Publisher destructor and adding tests to verify the fix. No unrelated modifications detected.


codacy-production · 2026-02-03T08:15:52Z

Coverage summary from Codacy

Coverage variation	Diff coverage
✅ +0.00% (target: -1.00%)	✅ 50.00%

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`2c71b41`)	5395	3755	69.60%
Head commit (`ac925b0`)	5402 (+7)	3760 (+5)	69.60% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#1100)	10	5	50.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
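
The two definitions above can be checked against the numbers in the tables with a few lines of arithmetic. This is only a sketch; the function names are made up for illustration and are not part of Codacy.

```cpp
#include <cassert>
#include <cmath>

// Coverage as a percentage of coverable lines (0 when nothing is coverable).
double coveragePercent(int covered, int coverable)
{
  return coverable == 0 ? 0.0 : 100.0 * covered / coverable;
}

// Coverage variation in percentage points:
// coverage of the head commit minus coverage of the common ancestor commit.
double coverageVariation(int head_covered, int head_coverable,
                         int base_covered, int base_coverable)
{
  return coveragePercent(head_covered, head_coverable) -
         coveragePercent(base_covered, base_coverable);
}
```

With the reported figures, `coveragePercent(3755, 5395)` and `coveragePercent(3760, 5402)` both round to 69.60%, so the variation is a few thousandths of a percentage point and is displayed as +0.00%. The diff coverage is `coveragePercent(5, 10)`, i.e. 50%.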

sonarqubecloud · 2026-02-03T08:18:03Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

codecov · 2026-02-03T08:18:19Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (2c71b41) to head (ac925b0).
⚠️ Report is 1 commit behind head on master.


coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/loggers/groot2_publisher.cpp`:
- Around line 272-283: The sendErrorReply lambda currently calls
error_msg.send() without catching zmq::error_t, so during ZMQ context shutdown
an exception (ETERM) can escape and kill the thread; update sendErrorReply to
wrap the send call in a try/catch for zmq::error_t and silently handle/return on
ETERM (or log and return) to mirror the recv()/reply_msg.send() handling in the
main loop; apply the same pattern to the similar send path around the code
referenced at lines 510–518 so all zmq::multipart_t::send() calls are guarded
against ZMQ shutdown exceptions.

coderabbitai · 2026-02-03T08:19:21Z

src/loggers/groot2_publisher.cpp

+    try
    {
-      continue;
+      if(!requestMsg.recv(socket) || requestMsg.size() == 0)
+      {
+        continue;
+      }
+    }
+    catch(const zmq::error_t&)
+    {
+      // Context was terminated or socket error - exit the loop
+      break;
    }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

ZMQ zmq::multipart_t::send throw exception ETERM shutdown pattern

💡 Result:

zmq::multipart_t::send() ultimately calls zmq_send() on the underlying socket. If your process is shutting down and the context is shut down / terminated while another thread is sending, libzmq makes (almost) all further socket operations fail with ETERM, which cppzmq surfaces as zmq::error_t. This is expected behavior: context shutdown/termination forces any blocking (and many non-blocking) socket ops to return immediately with ETERM. [1], [2]

Recommended shutdown pattern (avoid “spurious” exceptions, exit cleanly)

1. Signal your worker threads to stop producing/sending (atomic flag, cancellation token, etc.).
2. Interrupt any threads blocked in recv() / send() by calling ctx.shutdown() (maps to zmq_ctx_shutdown()), which causes blocked operations to return with ETERM. [1]
3. In every send/recv loop, catch zmq::error_t and treat ETERM as a normal exit condition.
4. Close sockets (often set linger = 0 first so close/term doesn’t hang waiting for pending sends). [3]
5. Finally terminate the context (ctx.close() / zmq_ctx_term()), after sockets are closed. [2]

Minimal C++ example (cppzmq)

std::atomic_bool stopping{false};

void worker(zmq::context_t& ctx, zmq::socket_t& s)
{
  s.set(zmq::sockopt::linger, 0);  // optional but common for fast shutdown [3]
  while (!stopping.load(std::memory_order_relaxed))
  {
    try
    {
      zmq::multipart_t mp;
      // ... fill mp ...
      mp.send(s);  // may throw zmq::error_t(ETERM) during shutdown
    }
    catch (const zmq::error_t& e)
    {
      if (e.num() == ETERM)
        break;  // context shutdown/term in progress -> exit loop [1][2]
      throw;
    }
  }
  s.close();
}

// elsewhere (shutdown path):
stopping = true;
ctx.shutdown();  // interrupts blocked ops with ETERM [1]
// join worker threads
ctx.close();     // terminates context; requires sockets closed; may block per linger [2][3]

Key point: if you call ctx.shutdown() / ctx.close() before (or concurrently with) stopping your senders, you should expect multipart_t::send() to throw ETERM—handle it as a normal shutdown condition rather than an error. [1], [4]

Sources:
[1] zmq_ctx_shutdown(3) behavior and ETERM semantics https://libzmq.readthedocs.io/en/latest/zmq_ctx_shutdown.html
[2] zmq_ctx_term(3) termination steps and blocking/linger considerations https://libzmq.readthedocs.io/en/latest/zmq_ctx_term.html
[3] ZMQ_LINGER option and effect on close/termination https://api.zeromq.org/4-2:zmq-setsockopt
[4] Catching ETERM during shutdown (discussion/example) https://stackoverflow.com/questions/8022391/zeromq-socket-recv-throws-context-was-terminated-exception-why-and-how-to

🏁 Script executed:

wc -l src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 109

🏁 Script executed:

sed -n '260,295p' src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 965

🏁 Script executed:

sed -n '500,530p' src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 792

🏁 Script executed:

sed -n '250,270p' src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 566

🏁 Script executed:

sed -n '240,265p' src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 588

🏁 Script executed:

sed -n '255,265p' src/loggers/groot2_publisher.cpp

Repository: BehaviorTree/BehaviorTree.CPP

Length of output: 353

Guard sendErrorReply against ZMQ shutdown exceptions.

The recv() and reply_msg.send() calls in the main loop (lines 272–283 and 510–518) are wrapped in try/catch for zmq::error_t, but the sendErrorReply lambda (lines 256–262) sends without exception handling. During context shutdown, zmq::multipart_t::send() throws zmq::error_t (with ETERM); if that exception escapes the server thread's function uncaught, std::terminate is called and the whole process aborts. Wrap the error reply send the same way to keep shutdown behavior consistent:

Suggested fix

 auto sendErrorReply = [&socket](const std::string& msg) {
   zmq::multipart_t error_msg;
   error_msg.addstr("error");
   error_msg.addstr(msg);
+  try
+  {
     error_msg.send(socket);
+  }
+  catch(const zmq::error_t&)
+  {
+    // Ignore errors during shutdown (e.g., ETERM)
+  }
 };

Also applies to: 510–518

facontidavide merged commit fcb95c8 into master Feb 3, 2026
17 of 18 checks passed

facontidavide deleted the fix/1057-groot2-publisher-infinite-loop branch February 3, 2026 08:17

coderabbitai bot reviewed Feb 3, 2026

2 participants
