-
Notifications
You must be signed in to change notification settings - Fork 51
feat: service client cache #685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a possible concurrency issue, with a shared object being accessed outside of critical section.
if os.getenv("ROS_DISTRO") == "humble": | ||
return service_client.call(srv_msg) | ||
return client.call(srv_msg) | ||
else: | ||
return service_client.call(srv_msg, timeout_sec=timeout_sec) | ||
return client.call(srv_msg, timeout_sec=timeout_sec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be in self._persistent_clients_lock? It's a reference to a shared object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find
WalkthroughImplements persistent, thread-safe caching of ROS 2 service clients with an API to release cached clients. Extends service call methods to accept a keyword-only reuse_client flag. Updates tests accordingly and bumps the package version to 2.3.0. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Mixin as ROS2ServiceMixin
participant API as ROS2ServiceAPI
participant Cache as Client Cache (dict + Lock)
participant ROS as ROS 2 Service
Caller->>Mixin: service_call(msg, target, timeout, msg_type, reuse_client)
Mixin->>API: call_service(service_name, service_type, request,<br/>timeout_sec, reuse_client)
rect rgba(200,220,255,0.25)
API->>Cache: acquire lock<br/>lookup service_name
alt Client exists
Cache-->>API: return cached client
else No client
API->>API: create client
API->>Cache: store client
end
API->>Cache: release lock
end
API->>ROS: wait_for_service(timeout)
alt Humble
API->>ROS: client.call(request)
else Non-Humble
API->>ROS: client.call(request, timeout_sec)
end
ROS-->>API: response
API-->>Mixin: response
Mixin-->>Caller: response
note over Caller,API: To free resources
Caller->>API: release_client(service_name)
API->>Cache: acquire lock, pop client if present
Cache-->>API: bool removed
API-->>Caller: bool
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks (3 passed, 2 warnings)❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (9)
src/rai_core/rai/communication/ros2/connectors/service_mixin.py (2)
33-35
: Expose minimal doc or type contract for release_clientConsider a short docstring clarifying that the client must not be released while in-flight calls are executing.
36-56
: Remove unused kwargs or forward them intentionallykwargs is unused (Ruff ARG002). Either drop it or forward explicitly if future options are expected.
Apply:
- def service_call( + def service_call( self, message: ROS2Message, target: str, timeout_sec: float = 5.0, *, msg_type: str, reuse_client: bool = True, - **kwargs: Any, ) -> ROS2Message: msg = self._service_api.call_service( service_name=target, service_type=msg_type, request=message.payload, timeout_sec=timeout_sec, reuse_client=reuse_client, )src/rai_core/rai/communication/ros2/api/service.py (3)
78-86
: Make cache lookup type-safe and minimize lock scopeUse a composite key and keep the lock only around map ops; client creation can be outside if you re-check (double-checked pattern) to avoid creating twice.
Apply minimal change (single lock):
- if reuse_client: - with self._persistent_clients_lock: - client = self._persistent_clients.get(service_name, None) - if client is None: - client = self.node.create_client(srv_cls, service_name) # type: ignore - self._persistent_clients[service_name] = client + if reuse_client: + key = (service_name, service_type) + with self._persistent_clients_lock: + client = self._persistent_clients.get(key) + if client is None: + client = self.node.create_client(srv_cls, service_name) # type: ignore + self._persistent_clients[key] = client else: client = self.node.create_client(srv_cls, service_name) # type: ignoreIf desired, we can further reduce lock hold time by creating the client outside and inserting conditionally; say the word and I’ll prepare that diff.
88-91
: Use TimeoutError and keep message concise (Ruff TRY003)ValueError isn’t ideal for timeouts. Recommend TimeoutError with a short message.
Apply:
- if not is_service_available: - raise ValueError( - f"Service {service_name} not ready within {timeout_sec} seconds. " - "Try increasing the timeout or check if the service is running." - ) + if not is_service_available: + raise TimeoutError(f"Service '{service_name}' not ready in {timeout_sec}s")
92-96
: Normalize ROS_DISTRO env for robustnessSafer to lowercase and default to empty.
Apply:
- if os.getenv("ROS_DISTRO") == "humble": + if os.getenv("ROS_DISTRO", "").lower() == "humble": return client.call(srv_msg) else: return client.call(srv_msg, timeout_sec=timeout_sec)tests/communication/ros2/test_api.py (4)
141-149
: Helper looks good; add timeout to surface hangs fasterConsider passing a small timeout_sec to catch regressions earlier during flakiness.
Example:
- response = service_api.call_service( + response = service_api.call_service( service_name, service_type="std_srvs/srv/SetBool", request={"data": True}, - reuse_client=reuse_client, + reuse_client=reuse_client, + timeout_sec=2.0, )
197-217
: Strengthen release assertionsAlso assert that releasing twice returns False to prove idempotency, and that a subsequent call succeeds (re-creates client).
Apply:
for _ in range(3): invoke_set_bool_service(service_name, service_api, reuse_client=True) - assert service_api.release_client(service_name), "Client not released" + assert service_api.release_client(service_name), "Client not released" + assert not service_api.release_client(service_name), "Second release should be False" + # Re-create on demand after release + invoke_set_bool_service(service_name, service_api, reuse_client=True)
239-240
: Threaded test: consider parametrizing reuse_client to exercise both modesRunning with reuse_client=False in a separate parameter set will validate concurrency without shared Client instances.
270-271
: Note on multiprocessing lambdaThis lambda closes over a non-picklable object (service_api) and would fail if unskipped; fine as the test is skipped, but leave a comment to avoid accidental unskip.
Apply:
- pool.map( - lambda _: invoke_set_bool_service(service_name, service_api), range(10) - ) + # NOTE: service_api is not picklable; keep this test skipped. + pool.map(lambda _: invoke_set_bool_service(service_name, service_api), range(10))
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/rai_core/pyproject.toml
(1 hunks)src/rai_core/rai/communication/ros2/api/service.py
(4 hunks)src/rai_core/rai/communication/ros2/connectors/service_mixin.py
(1 hunks)tests/communication/ros2/test_api.py
(5 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
src/rai_core/rai/communication/ros2/connectors/service_mixin.py
44-44: Unused method argument: kwargs
(ARG002)
src/rai_core/rai/communication/ros2/api/service.py
88-91: Avoid specifying long messages outside the exception class
(TRY003)
tests/communication/ros2/test_api.py
203-203: Unused function argument: ros_setup
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-and-test-ros2 (humble)
- GitHub Check: build-and-test-ros2 (jazzy)
🔇 Additional comments (4)
src/rai_core/pyproject.toml (1)
7-7
: Document behavior change: reuse_client now True by defaultAdd a 2.3.0 changelog/migration note stating "reuse_client now defaults to True (persistent clients used by default); this changes client lifecycle and may increase persistent connections/resource usage." Repo scan shows internal callers use keyword args or pass reuse_client explicitly (tests/communication/ros2/test_api.py; src/rai_core/rai/communication/ros2/connectors/service_mixin.py) — no positional reuse_client usages detected. Confirm no external consumers rely on per-call client creation; if they do, require them to set reuse_client=False.
src/rai_core/rai/communication/ros2/api/service.py (1)
63-65
: Public API change — no positional callers found for keyword-onlyreuse_client
call_service is defined at src/rai_core/rai/communication/ros2/api/service.py:57 (4 allowed positional args;
reuse_client
is kw-only). Scanned 5 call sites; 0 suspicious callers found.tests/communication/ros2/test_api.py (2)
168-171
: LGTM: single-call path now exercises cached client by default
191-195
: Good coverage for non-reuse pathThis ensures parity with previous behavior (new client per call).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Purpose
Resolves #684
Proposed Changes
Dict caching of created clients
Issues
#684
Testing
New test
Summary by CodeRabbit
New Features
Tests
Chores