Member

@njooma njooma commented Apr 18, 2023

This PR tackles a few issues:

  1. When a robot client loses connection and reconnects, failed reconnection attempts could leave behind orphaned sockets
  2. After a successful reconnection, existing resource clients were left holding dead connections and were unusable. This PR introduces a ReconfigurableResourceRPCClientBase type, which all built-in resource clients use, so that their channels can be reset after a reconnect
  3. [flyby] Fix readme typo
  4. [flyby] Add import sorting to our format function (and VSCode settings)
  5. [flyby] Update pyproject dependencies
  6. [flyby] Fix some Board type checker issues

For reviewers, the main file you should be checking is robot.client
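
To illustrate issue 1 above, here is a minimal sketch of a reconnect loop that closes every channel whose connection attempt fails, so that no socket is left behind. `FakeChannel`, `ping`, `dial`, and `reconnect` are hypothetical names used only for illustration; they are not the SDK's API and this is not the code in this PR.

```python
import asyncio


class FakeChannel:
    """Hypothetical stand-in for a gRPC channel; not the SDK's class."""

    def __init__(self, healthy: bool) -> None:
        self.healthy = healthy
        self.closed = False

    async def ping(self) -> None:
        # Simulates a connectivity check that fails while the robot is down.
        if not self.healthy:
            raise ConnectionError("robot unreachable")

    def close(self) -> None:
        self.closed = True


async def reconnect(dial, attempts: int):
    """Dial up to `attempts` times, closing every channel that fails."""
    for _ in range(attempts):
        channel = dial()
        try:
            await channel.ping()
        except ConnectionError:
            # Without this close, the failed attempt's socket would be orphaned.
            channel.close()
            continue
        return channel
    return None


# Usage: the first two attempts fail, the third succeeds.
opened = []
outcomes = iter([False, False, True])


def dial() -> FakeChannel:
    channel = FakeChannel(next(outcomes))
    opened.append(channel)
    return channel


live = asyncio.run(reconnect(dial, attempts=5))
```

Only the channel returned by `reconnect` stays open; the two failed attempts are closed before the next dial.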

@njooma njooma requested a review from a team as a code owner April 18, 2023 22:25
@github-actions
Contributor

Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!

| component | function |
| --- | --- |
| base | is_moving |
| board | gpio_pin_by_name |
| camera | get_image |
| motor | is_moving |
| sensor | get_readings |
| servo | get_position |
| arm | get_end_position |
| gantry | get_lengths |
| gripper | is_moving |
| movement_sensor | get_linear_acceleration |
| input_controller | get_controls |
| audio | get_properties |
| pose_tracker | get_poses |
| motion | get_pose |
| vision | get_detector_names |

@njooma njooma requested a review from stuqdog April 18, 2023 22:25
@njooma njooma force-pushed the RSDK-2485/client-reconnect branch from dbf458b to d309d83 Compare April 18, 2023 22:25
Contributor

@maximpertsov maximpertsov left a comment

nice improvements! just some minor suggestions

Comment on lines 199 to 214
if rname in self._manager.resources:
    res = self._manager.get_resource(ResourceBase, rname)
    if isinstance(res, ReconfigurableResourceRPCClientBase):
        res.reset_channel(self._channel)
    else:
        self._manager.remove_resource(rname)
        self._manager.register(
            Registry.lookup_subtype(Subtype.from_resource_name(rname)).create_rpc_client(rname.name, self._channel)
        )
else:
    try:
        self._manager.register(
            Registry.lookup_subtype(Subtype.from_resource_name(rname)).create_rpc_client(rname.name, self._channel)
        )
    except ResourceNotFoundError:
        pass

[nit] it feels like this could be factored out into a helper method - _create_or_reset_client or something like that?
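
A rough sketch of what such a helper could look like, using simplified stand-ins rather than the SDK's real manager and registry. `ReconfigurableClient`, `LegacyClient`, `RobotClientSketch`, and the `factories` mapping are all hypothetical; only the name `_create_or_reset_client` and the `reset_channel` call come from this thread. Unlike the snippet under review, this version silently skips unknown resource types in both branches.

```python
class ReconfigurableClient:
    """Hypothetical stand-in for a client that can swap its channel in place."""

    def __init__(self, name: str, channel: str) -> None:
        self.name = name
        self.channel = channel

    def reset_channel(self, channel: str) -> None:
        self.channel = channel


class LegacyClient:
    """Hypothetical stand-in for a client that cannot be reconfigured."""

    def __init__(self, name: str, channel: str) -> None:
        self.name = name
        self.channel = channel


class RobotClientSketch:
    """Simplified stand-in: a dict replaces the resource manager and registry."""

    def __init__(self, channel: str, factories: dict) -> None:
        self._channel = channel
        self._factories = factories  # resource name -> client factory (assumption)
        self._resources = {}

    def _create_or_reset_client(self, name: str) -> None:
        existing = self._resources.get(name)
        if isinstance(existing, ReconfigurableClient):
            # Keep the same object so references held by callers stay valid.
            existing.reset_channel(self._channel)
            return
        factory = self._factories.get(name)
        if factory is None:
            # No known client type for this resource: skip it.
            return
        # Either a new resource or a non-reconfigurable one: (re)create it.
        self._resources[name] = factory(name, self._channel)
```

The caller would then loop over the resource names and invoke the helper once per name, which removes the duplicated `register(...)` call from the two branches.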

Member

@cheukt cheukt left a comment

a few comments + how hard is it to add tests around this behavior?

"""Start the module service and gRPC server"""
try:
    await self.server.serve(log_level=self._log_level, path=self._address)
except SystemExit as e:
Member

@cheukt cheukt Apr 20, 2023

what does this do and why do we need it? maybe leave a comment around this bit

Member Author

comments added

Member

@benjirewis benjirewis Apr 21, 2023

It does not resolve RSDK-2551, and I'm actually not sure we should add this @njooma (totally open to other opinions). The except here doesn't actually catch anything from grpclib. It's confusing, but the only error you can catch from grpclib's second stage of graceful exit is asyncio.CancelledError. From what I can tell, there's no way to catch the raised SystemExit(143).
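
A small, self-contained illustration of the general point, using plain asyncio rather than grpclib: when a task is cancelled, an `except SystemExit` clause is never entered, while `except asyncio.CancelledError` is. `serve_forever` is a hypothetical stand-in for a long-running server coroutine; this does not reproduce grpclib's graceful-exit handling or the `SystemExit(143)` it raises.

```python
import asyncio


async def serve_forever(caught: list) -> None:
    """Hypothetical stand-in for a long-running server coroutine."""
    try:
        await asyncio.Event().wait()  # blocks until the task is cancelled
    except SystemExit:
        caught.append("SystemExit")  # not reached on cancellation
    except asyncio.CancelledError:
        caught.append("CancelledError")
        raise  # re-raise so the task is still marked as cancelled


async def main() -> list:
    caught = []
    task = asyncio.create_task(serve_forever(caught))
    await asyncio.sleep(0)  # give the task a chance to start and block
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return caught


result = asyncio.run(main())
```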

Member Author

Removed

@njooma njooma force-pushed the RSDK-2485/client-reconnect branch from 42f4d29 to e2286c0 Compare April 20, 2023 21:03
@njooma njooma requested review from cheukt and maximpertsov April 20, 2023 21:55
@biotinker biotinker removed their request for review April 20, 2023 22:32
@biotinker
Member

Removing myself from reviewers since I have insufficient background on this codebase to be super productive; happy to test this on prior failure modes once it's merged though

@njooma
Member Author

njooma commented Apr 21, 2023

> Removing myself from reviewers since I have insufficient background on this codebase to be super productive; happy to test this on prior failure modes once it's merged though

Yup, just put you on so you could track progress

        with self._lock:
            return [r for r in self._resource_names]

    def _close_channel(self, *, tab_count=0):
Member

(q) I'm not sure I get why the tab_count exists. Why do we want to indent when calling close() specifically?

Member Author

Because I call this internal function from two different places, and I want the logs to reflect a certain level of indentation depending on where it was called from (e.g. if it was called as part of a task that closes a lot of things, it should be indented; if it was called on its own, it should not be indented)
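
A minimal sketch of that idea, assuming a keyword-only `tab_count` that prefixes each log line with tabs. The function names, the `log` callable, and the log messages are hypothetical and are not taken from the PR.

```python
def close_channel(log, *, tab_count: int = 0) -> None:
    """Log the close steps, indented by `tab_count` tabs (hypothetical messages)."""
    tabs = "\t" * tab_count
    log(f"{tabs}Closing channel...")
    log(f"{tabs}Closed channel")


def close_all(log) -> None:
    """A larger close task: nested steps are logged one level deeper."""
    log("Closing robot client...")
    close_channel(log, tab_count=1)
    log("Closed robot client")


# Usage: collect the lines instead of sending them to a real logger.
lines = []
close_channel(lines.append)  # called on its own: no indentation
close_all(lines.append)      # called as part of a larger task: indented
```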

Member

@stuqdog stuqdog left a comment

lgtm, thanks for digging into this!

@njooma njooma merged commit eb33276 into viamrobotics:main Apr 21, 2023
@njooma njooma deleted the RSDK-2485/client-reconnect branch April 21, 2023 17:59