Skip to content

Conversation

maciejmajek
Copy link
Member

@maciejmajek maciejmajek commented Sep 12, 2025

Purpose

Resolves #684

Proposed Changes

Dict caching of created clients

Issues

#684

Testing

New test

Summary by CodeRabbit

  • New Features

    • Added thread-safe caching and reuse of ROS 2 service clients, improving performance for repeated calls.
    • Introduced an option to toggle client reuse per call.
    • Added the ability to release a cached client when no longer needed.
  • Tests

    • Expanded coverage for client reuse, concurrent calls, and client release scenarios.
    • Updated test helpers to support the reuse option.
  • Chores

    • Bumped package version to 2.3.0.

@maciejmajek
Copy link
Member Author

@coderabbitai full review

Copy link
Contributor

coderabbitai bot commented Sep 12, 2025

✅ Actions performed

Full review triggered.

Copy link
Collaborator

@rachwalk rachwalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a possible concurrency issue, with a shared object being accessed outside of critical section.

Comment on lines 92 to 95
if os.getenv("ROS_DISTRO") == "humble":
return service_client.call(srv_msg)
return client.call(srv_msg)
else:
return service_client.call(srv_msg, timeout_sec=timeout_sec)
return client.call(srv_msg, timeout_sec=timeout_sec)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be in self._persistent_clients_lock? It's a reference to a shared object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find

Copy link
Contributor

coderabbitai bot commented Sep 12, 2025

Walkthrough

Implements persistent, thread-safe caching of ROS 2 service clients with an API to release cached clients. Extends service call methods to accept a keyword-only reuse_client flag. Updates tests accordingly and bumps the package version to 2.3.0.

Changes

Cohort / File(s) Summary
Version bump
src/rai_core/pyproject.toml
Increment version from 2.2.1 to 2.3.0 under [tool.poetry].
ROS2 service API: client caching and release
src/rai_core/rai/communication/ros2/api/service.py
Add thread-safe persistent client cache (_persistent_clients with Lock), new release_client(service_name) -> bool, and extend call_service(..., *, reuse_client: bool = True) to reuse or create clients. Preserve Humble vs non-Humble call semantics. Minor docstring tweaks.
ROS2 service mixin: delegate options
src/rai_core/rai/communication/ros2/connectors/service_mixin.py
Add release_client(service_name) -> bool delegating to API. Extend service_call(..., *, msg_type, reuse_client: bool = True, **kwargs) and forward reuse_client.
Tests updated for client reuse and release
tests/communication/ros2/test_api.py
Replace helper with invoke_set_bool_service(..., reuse_client: bool = True). Update call sites. Add test test_ros2_service_multiple_calls_with_reused_client verifying reuse across callback groups and release_client.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Mixin as ROS2ServiceMixin
  participant API as ROS2ServiceAPI
  participant Cache as Client Cache (dict + Lock)
  participant ROS as ROS 2 Service

  Caller->>Mixin: service_call(msg, target, timeout, msg_type, reuse_client)
  Mixin->>API: call_service(service_name, service_type, request,<br/>timeout_sec, reuse_client)

  rect rgba(200,220,255,0.25)
    API->>Cache: acquire lock<br/>lookup service_name
    alt Client exists
      Cache-->>API: return cached client
    else No client
      API->>API: create client
      API->>Cache: store client
    end
    API->>Cache: release lock
  end

  API->>ROS: wait_for_service(timeout)
  alt Humble
    API->>ROS: client.call(request)
  else Non-Humble
    API->>ROS: client.call(request, timeout_sec)
  end
  ROS-->>API: response
  API-->>Mixin: response
  Mixin-->>Caller: response

  note over Caller,API: To free resources
  Caller->>API: release_client(service_name)
  API->>Cache: acquire lock, pop client if present
  Cache-->>API: bool removed
  API-->>Caller: bool
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks (3 passed, 2 warnings)

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes Check ⚠️ Warning All functional code changes relate to the caching objective, but the PR also bumps the package version in src/rai_core/pyproject.toml from 2.2.1 to 2.3.0, which is not requested by the linked issue and is unrelated to implementing client caching. This version update constitutes an out-of-scope change relative to issue #684. Tests and API modifications are otherwise in-scope. Remove the pyproject version bump from this PR or explicitly justify it in the PR description and update the changelog; alternatively split the version change into a separate release PR. After isolating or justifying the bump, proceed with merging the caching changes. If the bump is retained, ensure CI and packaging pipelines are updated accordingly.
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title "feat: service client cache" concisely and accurately summarizes the primary change—adding persistent/cached service clients to the ROS2 service API. It is short, specific, and directly reflects the modifications in src/rai_core/rai/communication/ros2/api/service.py and related modules. No extraneous noise or vagueness is present.
Linked Issues Check ✅ Passed The changes implement the linked issue's requirements: call_service gained a reuse_client flag, the class maintains a thread-safe _persistent_clients map, and a release_client API was added; the service mixin and tests were updated to exercise reuse and release. These coding changes directly satisfy the requested caching behavior and client destruction capability. I found no missing coding-related objectives from issue #684.
Description Check ✅ Passed The PR description includes all required template sections (Purpose, Proposed Changes, Issues, Testing) and links to issue #684, concisely stating the intent to add client caching and that a test was added. The content maps to the code changes and is sufficient for review. It could be improved with a brief note about the reuse_client default behavior and test coverage, but it is mostly complete.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/service-client-cache

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (9)
src/rai_core/rai/communication/ros2/connectors/service_mixin.py (2)

33-35: Expose minimal doc or type contract for release_client

Consider a short docstring clarifying that the client must not be released while in-flight calls are executing.


36-56: Remove unused kwargs or forward them intentionally

kwargs is unused (Ruff ARG002). Either drop it or forward explicitly if future options are expected.

Apply:

-    def service_call(
+    def service_call(
         self,
         message: ROS2Message,
         target: str,
         timeout_sec: float = 5.0,
         *,
         msg_type: str,
         reuse_client: bool = True,
-        **kwargs: Any,
     ) -> ROS2Message:
         msg = self._service_api.call_service(
             service_name=target,
             service_type=msg_type,
             request=message.payload,
             timeout_sec=timeout_sec,
             reuse_client=reuse_client,
         )
src/rai_core/rai/communication/ros2/api/service.py (3)

78-86: Make cache lookup type-safe and minimize lock scope

Use a composite key and keep the lock only around map ops; client creation can be outside if you re-check (double-checked pattern) to avoid creating twice.

Apply minimal change (single lock):

-        if reuse_client:
-            with self._persistent_clients_lock:
-                client = self._persistent_clients.get(service_name, None)
-                if client is None:
-                    client = self.node.create_client(srv_cls, service_name)  # type: ignore
-                    self._persistent_clients[service_name] = client
+        if reuse_client:
+            key = (service_name, service_type)
+            with self._persistent_clients_lock:
+                client = self._persistent_clients.get(key)
+                if client is None:
+                    client = self.node.create_client(srv_cls, service_name)  # type: ignore
+                    self._persistent_clients[key] = client
         else:
             client = self.node.create_client(srv_cls, service_name)  # type: ignore

If desired, we can further reduce lock hold time by creating the client outside and inserting conditionally; say the word and I’ll prepare that diff.


88-91: Use TimeoutError and keep message concise (Ruff TRY003)

ValueError isn’t ideal for timeouts. Recommend TimeoutError with a short message.

Apply:

-        if not is_service_available:
-            raise ValueError(
-                f"Service {service_name} not ready within {timeout_sec} seconds. "
-                "Try increasing the timeout or check if the service is running."
-            )
+        if not is_service_available:
+            raise TimeoutError(f"Service '{service_name}' not ready in {timeout_sec}s")

92-96: Normalize ROS_DISTRO env for robustness

Safer to lowercase and default to empty.

Apply:

-        if os.getenv("ROS_DISTRO") == "humble":
+        if os.getenv("ROS_DISTRO", "").lower() == "humble":
             return client.call(srv_msg)
         else:
             return client.call(srv_msg, timeout_sec=timeout_sec)
tests/communication/ros2/test_api.py (4)

141-149: Helper looks good; add timeout to surface hangs faster

Consider passing a small timeout_sec to catch regressions earlier during flakiness.

Example:

-    response = service_api.call_service(
+    response = service_api.call_service(
         service_name,
         service_type="std_srvs/srv/SetBool",
         request={"data": True},
-        reuse_client=reuse_client,
+        reuse_client=reuse_client,
+        timeout_sec=2.0,
     )

197-217: Strengthen release assertions

Also assert that releasing twice returns False to prove idempotency, and that a subsequent call succeeds (re-creates client).

Apply:

         for _ in range(3):
             invoke_set_bool_service(service_name, service_api, reuse_client=True)
-        assert service_api.release_client(service_name), "Client not released"
+        assert service_api.release_client(service_name), "Client not released"
+        assert not service_api.release_client(service_name), "Second release should be False"
+        # Re-create on demand after release
+        invoke_set_bool_service(service_name, service_api, reuse_client=True)

239-240: Threaded test: consider parametrizing reuse_client to exercise both modes

Running with reuse_client=False in a separate parameter set will validate concurrency without shared Client instances.


270-271: Note on multiprocessing lambda

This lambda closes over a non-picklable object (service_api) and would fail if unskipped; fine as the test is skipped, but leave a comment to avoid accidental unskip.

Apply:

-            pool.map(
-                lambda _: invoke_set_bool_service(service_name, service_api), range(10)
-            )
+            # NOTE: service_api is not picklable; keep this test skipped.
+            pool.map(lambda _: invoke_set_bool_service(service_name, service_api), range(10))
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6074a1b and c1501cf.

📒 Files selected for processing (4)
  • src/rai_core/pyproject.toml (1 hunks)
  • src/rai_core/rai/communication/ros2/api/service.py (4 hunks)
  • src/rai_core/rai/communication/ros2/connectors/service_mixin.py (1 hunks)
  • tests/communication/ros2/test_api.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
src/rai_core/rai/communication/ros2/connectors/service_mixin.py

44-44: Unused method argument: kwargs

(ARG002)

src/rai_core/rai/communication/ros2/api/service.py

88-91: Avoid specifying long messages outside the exception class

(TRY003)

tests/communication/ros2/test_api.py

203-203: Unused function argument: ros_setup

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-and-test-ros2 (humble)
  • GitHub Check: build-and-test-ros2 (jazzy)
🔇 Additional comments (4)
src/rai_core/pyproject.toml (1)

7-7: Document behavior change: reuse_client now True by default

Add a 2.3.0 changelog/migration note stating "reuse_client now defaults to True (persistent clients used by default); this changes client lifecycle and may increase persistent connections/resource usage." Repo scan shows internal callers use keyword args or pass reuse_client explicitly (tests/communication/ros2/test_api.py; src/rai_core/rai/communication/ros2/connectors/service_mixin.py) — no positional reuse_client usages detected. Confirm no external consumers rely on per-call client creation; if they do, require them to set reuse_client=False.

src/rai_core/rai/communication/ros2/api/service.py (1)

63-65: Public API change — no positional callers found for keyword-only reuse_client

call_service is defined at src/rai_core/rai/communication/ros2/api/service.py:57 (4 allowed positional args; reuse_client is kw-only). Scanned 5 call sites; 0 suspicious callers found.

tests/communication/ros2/test_api.py (2)

168-171: LGTM: single-call path now exercises cached client by default


191-195: Good coverage for non-reuse path

This ensures parity with previous behavior (new client per call).

Copy link
Collaborator

@rachwalk rachwalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maciejmajek maciejmajek merged commit f3a9413 into main Sep 12, 2025
6 checks passed
@maciejmajek maciejmajek deleted the feat/service-client-cache branch September 12, 2025 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Caching ROS 2 service clients in ROS2ServiceAPI
2 participants