zycrobot/SIRIL

Imitation Learning for Dexterous Robot Micromanipulation of Deformable Cell

🎯

  • We propose skill information representation imitation learning (SIRIL), an algorithm for long-horizon dexterous robot micromanipulation tasks. SIRIL quantifies the expert's skill information as the temporal log-likelihood of discrete latent codes and extracts safety action constraints, which suppresses compounding errors.
  • A 2-DOF microgripper for dexterous cell manipulation was designed and integrated into each robotic arm. A master-slave, dual-arm robotic micromanipulation system with visual and force feedback was constructed.
  • Physical experiments show that SIRIL can complete dexterous membrane-stripping surgery on deformable zebrafish embryonic cells and outperforms existing algorithms.

1. Robot design 🤖

Robot overview

a, Hardware structure diagram of the robot system. The system consists primarily of two symmetric three-axis micromanipulation robotic arms, motor controllers, two force-feedback devices, and a micromanipulation end-effector mounted at the end of each robotic arm. b, Microscope images of the robot performing the collaborative, long-horizon dexterous micromanipulation task of cell membrane peeling. The task comprises three sub-tasks: reaching and pushing the cell, grasping the cell, and tearing off the cell membrane. The scale bar is 800 $\mu$m.

End-effector

a, Schematic of the mechanical structure of the microgripper. The microgripper has two degrees of freedom: a gripping action and a wrist-like rotational motion. Axial rotation is achieved by DC motor 1 driving a set of rotating gears. The gripping action is performed by DC motor 2, which pushes a sliding sleeve through a cylindrical cam mechanism. Three Fiber Bragg Gratings (FBGs) are integrated on the sliding sleeve to sense bending deformation, providing force sensing and collision detection to prevent breakage of the microgripper. b, Diagram of the data communication nodes in the robotic system. The software and control components of the system are integrated in Robot Operating System (ROS) Melodic.

DoF1: grasp DoF2: rotate

2. Imitation Learning

👨‍⚕️Get expert data

a, The expert teleoperated the robotic system with the force-feedback devices to collect demonstration data. During each demonstration, the video stream, end-effector trajectories, axial rotation angles, grasping actions, and multi-joint trajectories of the robotic arms were recorded on the ROS platform. The scale bar is 800 $\mu$m. b, Visualization of the recorded demonstration data. (a) Data from the robot's left arm: the three-axis robotic arm trajectory (left y-axis), the microgripper's axial rotation angle (right y-axis), and the microgripper's gripping actions. (b) Data from the robot's right arm: the manipulator trajectory values are on the left axis; microgripper rotation, grasp, and release are on the right axis.
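One way to picture a single recorded time step is as a small record per arm. The sketch below is illustrative only: the class names, field names, and units are assumptions, not the repository's actual ROS message layout, and the multi-joint trajectories mentioned above are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ArmSample:
    """State of one arm at one time step (hypothetical field names and units)."""
    position: Tuple[float, float, float]  # three-axis arm position
    rotation: float                       # microgripper axial rotation angle
    gripper_closed: bool                  # True = grasp, False = release


@dataclass
class DemoStep:
    """One synchronized sample of a demonstration (hypothetical layout)."""
    t: float          # timestamp in seconds
    frame_id: int     # index of the matching microscope video frame
    left: ArmSample
    right: ArmSample


def to_action_vector(step: DemoStep) -> List[float]:
    """Flatten both arms into (x, y, z, rotation, grip) for left, then right."""
    vec: List[float] = []
    for arm in (step.left, step.right):
        vec.extend(arm.position)
        vec.append(arm.rotation)
        vec.append(1.0 if arm.gripper_closed else 0.0)
    return vec
```

A flat vector like this is a common input format for behavior cloning, since each video frame can then be paired with one fixed-length action target.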

🧠SIRIL

a, Schematic of the SIRIL network structure. (i) VQ-GAN encoder and decoder architecture: the input is workspace video frames and the output is the corresponding generated images. The encoder compresses each video frame into discrete latent codes. (ii) Autoregressive transformer model: the input is the latent code of the video frame from the previous time step, and the output is the predicted latent code for the next time step. This allows the likelihood of past latent codes to be calculated relative to the current time step. (iii) Network structure for controlling the robot's left and right manipulators and microgrippers: each video frame is encoded by the pre-trained VQ-GAN encoder, and the transformer predicts the latent code for the next time step. The encoded tensor is flattened, fused with the robot action from the previous time step, and fed to a multilayer perceptron that generates the final robot actions. b, Overview of the SIRIL training and inference pipeline. Step 1: train a VQ-GAN on expert demonstration videos to extract discrete latent codes. Step 2: use these codes to pretrain an autoregressive transformer for skill information representation. Step 3: train SIRIL with behavior cloning and save the policy model. Step 4: at inference, actions are predicted, checked against the safety constraints, and executed only if they comply.
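The two quantities named in this caption, the likelihood of latent codes under the autoregressive model and the safety check applied before execution, can be sketched as follows. This is a minimal sketch, not the repository's implementation: the function names, the averaging over time steps, and the per-dimension box form of the constraint are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one step's codebook logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def temporal_log_likelihood(
    logits_per_step: Sequence[Sequence[float]], codes: Sequence[int]
) -> float:
    """Mean log-probability the model assigns to the observed code at each step.

    logits_per_step[t] are the model's logits for step t (hypothetical input);
    codes[t] is the index of the code actually observed at step t.
    """
    if not codes or len(logits_per_step) != len(codes):
        raise ValueError("need one observed code per step")
    total = sum(log_softmax(l)[c] for l, c in zip(logits_per_step, codes))
    return total / len(codes)


def gate_action(
    action: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> Optional[List[float]]:
    """Return the action if every dimension lies within its bounds, else None.

    The box-shaped bounds are an assumed form of the safety constraint.
    """
    safe = all(lo <= a <= hi for a, lo, hi in zip(action, lower, upper))
    return list(action) if safe else None
```

A caller would execute the action only when `gate_action` returns a list. What happens to a rejected action (hold, retry, or stop) is not specified in the text above and is left open here.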

🚀Result

SIRIL and Baselines Success Rate

| Method | PushCell (%) | GraspCell (%) | PeelCell (%) | Mean (%) | Final (%) |
|---|---|---|---|---|---|
| BC | 93.3 (28/30) | 32.1 (9/28) | 66.7 (6/9) | 64.0 | 20.0 (6/30) |
| ACT | 96.7 (29/30) | 58.6 (17/29) | 76.5 (13/17) | 77.2 | 43.3 (13/30) |
| VINN | 96.7 (29/30) | 31.0 (9/29) | 22.2 (2/9) | 50.0 | 6.67 (2/30) |
| Diffusion | 76.7 (23/30) | 26.1 (6/23) | N/A (0/6) | 34.3 | N/A (0/30) |
| Beginner RC | 83.3 (25/30) | 72.0 (18/25) | 88.9 (16/18) | 81.4 | 53.3 (16/30) |
| SIRIL(0.8) | 96.7 (29/30) | 82.8 (24/29) | 79.2 (19/24) | 86.2 | 63.3 (19/30) |
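The columns appear to be related as follows: each stage rate is conditional on success in the previous stage, Mean is the average of the three stage rates, and Final is the number of peeling successes over all 30 trials (equivalently, the product of the three stage rates). The short sketch below recomputes the SIRIL row from its counts under that reading. The helper name `summarize` is hypothetical.

```python
from typing import List, Tuple


def summarize(counts: List[Tuple[int, int]]) -> Tuple[List[float], float, float]:
    """Compute per-stage rates, their mean, and the end-to-end rate, in percent.

    counts holds (successes, attempts) per stage, where each stage is
    attempted only on trials that passed the previous stage.
    """
    rates = [100.0 * s / n if n else 0.0 for s, n in counts]
    mean = sum(rates) / len(rates)
    final = 100.0 * counts[-1][0] / counts[0][1]
    return rates, mean, final


# SIRIL(0.8) counts from the table: PushCell, GraspCell, PeelCell.
siril_counts = [(29, 30), (24, 29), (19, 24)]
rates, mean, final = summarize(siril_counts)
print([round(r, 1) for r in rates], round(mean, 1), round(final, 1))
```

Because the stages are conditional, a method with a high mean can still have a low final rate if one stage fails often, which is why both columns are reported.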
Images of the SIRIL policy controlling the robotic system during the cell membrane tearing process. Trials 1, 2, 3, and 4 are successful; trial 5 is a failed trial. The scale bar is 800 $\mu$m. Trajectories of the SIRIL policy controlling the robotic system during the cell membrane tearing process. The top four are successful trials 1-4, and the bottom is the failed trial.

📹Automated surgery

Trial 1 Trial 2

Source code for robot system

  1. Phantom Touch as master device
  2. Eppendorf TransferMan 4r as slave device (velocity control)
  3. Self-designed 2-DOF forceps
  4. Micro machine vision system
  5. SIRIL
  • Master-slave control for the two arms
  • Master-slave control for the forceps
  • Imitation learning
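Since the slave device is velocity-controlled, one common way to couple it to a position-sensing master is to convert the master's displacement over each control period into a scaled, clamped velocity command. The sketch below illustrates that idea only. The function name, the motion scale, the units, and the velocity limit are hypothetical and are not taken from this repository.

```python
from typing import List, Sequence


def master_to_slave_velocity(
    master_delta_mm: Sequence[float],
    dt_s: float,
    scale: float = 0.01,          # assumed motion scaling (master mm -> slave mm)
    v_max_um_s: float = 500.0,    # assumed per-axis speed limit
) -> List[float]:
    """Map master displacement over one period to a slave velocity in um/s."""
    if dt_s <= 0.0:
        raise ValueError("dt_s must be positive")
    command = []
    for delta in master_delta_mm:
        v = scale * delta * 1000.0 / dt_s  # scale down, convert mm to um, per second
        command.append(max(-v_max_um_s, min(v_max_um_s, v)))  # clamp for safety
    return command
```

Scaling down the operator's hand motion and clamping the commanded speed are typical choices in micromanipulation teleoperation, where workspace sizes differ by orders of magnitude between master and slave.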
