Jing Zhang1,*,†, Duojie Chen1,2,*, Wentao Jiang1, Zihan Lou1, Jianxin Liu3, Xinwu Cui4, Qinghong Zhao5, Bo Du1,†, Christoph F. Dietrich6, Dacheng Tao7,†
1 School of Computer Science, Wuhan University, China,
2 Hubei Center for Applied Mathematics, Wuhan University, China,
3 Department of Ultrasound, The Central Hospital of Wuhan, China,
4 Department of Medical Ultrasound, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, China,
5 Department of Ultrasound in Medicine, Renmin Hospital of Wuhan University, China,
6 Department General Internal Medicine (DAIM), Hospitals Hirslanden Bern Beau Site, Salem and Permanence, Bern, Switzerland,
7 College of Computing and Data Science, Nanyang Technological University, Singapore
A plug-in anatomical prior layer that grounds clinical intent and a single external body image into patient-specific anatomical hypotheses and control-facing 6-DoF probe initialization cues.
Introduction | Key Features | Method Overview | Clinical Semantics Grounding (CSG) | Anatomical Representation Instantiation (ARI) | Actionable Target Initialization (ATI) | Main Results | Getting Started | Release Status | Citation
2026.04
- Repository structure and overview materials published.
- Manuscript under review.
Robotic ultrasound has made substantial progress in local image-driven control, contact regulation, and view optimization. However, current systems still lack an explicit anatomical prior layer for deciding what to scan, where to start, and how to place the probe before local refinement begins.
SAMe bridges this gap as a target-to-anatomy-to-action framework for anatomy-aware scan initialization. It provides a plug-in prior layer that grounds clinical intent into structured anatomical targets, instantiates patient-specific anatomy from a single external body image, and converts the result into robot-readable, contact-aware 6-DoF initialization cues — without preoperative CT/MRI or additional registration sweeps.
- 🧠 Structured Anatomy Mapping: Decomposed into 3 coupled modules (CSG → ARI → ATI) spanning complaint grounding to probe initialization.
- 📷 Single RGB Input: One monocular external body image — no preoperative CT/MRI and no additional registration sweep at scan time.
- ⚡ Lightweight Inference: Full-organ prior inference in 0.76 s on CPU; liver-only in 0.08 s.
- 🤖 Real-Robot Validation: 97.3% organ-hit rate for liver and 81.7% for kidney initialization on a physical platform.
| Item | Description |
|---|---|
| Input | Clinical complaint + single monocular RGB body image |
| Core Output | Grounded organ/ROI, patient-specific anatomical hypothesis, control-facing 6-DoF initialization cues |
| Design Goal | Anatomy-aware scan initialization before local image-based refinement |
| No Extra Acquisition | No CT/MRI; no registration sweep |
| Inference Cost | 0.76 s (full organ set, CPU) / 0.08 s (liver only, CPU) |
| Real-Robot Hit Rate | 97.3% liver / 81.7% kidney |
SAMe is organized into three coupled components that form a target-to-anatomy-to-action pipeline:
Overview of the SAMe pipeline from semantic grounding to anatomical instantiation and robot-facing initialization.
| Module | Core Question | Input | Output | Key Capability |
|---|---|---|---|---|
| CSG | What should be scanned? | Complaint text | Organ- and anatomy-level targets | Clinical semantics grounding via retrieval-augmented prior |
| ARI | Where is the target anatomy? | Single RGB body image | Patient-specific anatomical hypothesis | Skeleton-conditioned organ instantiation with uncertainty |
| ATI | How should scanning begin? | Anatomical hypothesis | Candidate contacts, target rays, 6-DoF probe states | Contact-aware initialization for downstream control |
SAMe is not a full autonomous scanning system. It is an explicit anatomical prior layer designed to improve initialization and remain compatible with downstream control.
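As a rough illustration of how the three modules in the table chain together, the sketch below wires CSG, ARI, and ATI as plain callables. Every class, field, and function name here (`GroundedTarget`, `AnatomicalHypothesis`, `ProbeInit`, `run_pipeline`) is hypothetical and chosen only for illustration; the actual SAMe interfaces have not been released and may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GroundedTarget:
    """Hypothetical CSG output: an organ- and anatomy-level target."""
    organ: str
    location: str


@dataclass
class AnatomicalHypothesis:
    """Hypothetical ARI output: organ centroid plus a scalar uncertainty."""
    organ: str
    centroid_mm: Vec3
    sigma_mm: float  # isotropic uncertainty is an assumption of this sketch


@dataclass
class ProbeInit:
    """Hypothetical, simplified ATI output: a surface contact and an entry ray."""
    contact_mm: Vec3
    entry_ray: Vec3
    organ: str


def run_pipeline(
    complaint: str,
    image: object,
    csg: Callable[[str], List[GroundedTarget]],
    ari: Callable[[object, GroundedTarget], AnatomicalHypothesis],
    ati: Callable[[AnatomicalHypothesis], List[ProbeInit]],
) -> List[ProbeInit]:
    """Chain the modules: complaint -> targets -> hypotheses -> initialization cues."""
    inits: List[ProbeInit] = []
    for target in csg(complaint):
        hypothesis = ari(image, target)
        inits.extend(ati(hypothesis))
    return inits
```

Passing the modules in as callables keeps the prior layer decoupled from any particular downstream controller, which matches the plug-in role described above.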
CSG grounds under-specified clinical complaints into explicit organ- and anatomy-level targets via a structured semantic prior.
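To make the idea of grounding a complaint through retrieval more concrete, here is a toy sketch that ranks entries of a small knowledge base by token overlap (Jaccard similarity) with the complaint text. The knowledge base `KB`, its entries, and the `ground` function are hypothetical stand-ins; they are not the contents of SAMe-DB, and the retrieval mechanism actually used by CSG is likely more sophisticated.

```python
import re
from typing import List, Set, Tuple

# Hypothetical toy knowledge base: (description, organ, location).
# These entries are illustrative only and are NOT taken from SAMe-DB.
KB: List[Tuple[str, str, str]] = [
    ("right upper quadrant pain after fatty meals", "gallbladder", "right upper quadrant"),
    ("flank pain with blood in urine", "kidney", "flank"),
    ("abdominal swelling and jaundice", "liver", "right upper quadrant"),
]


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def ground(complaint: str, k: int = 1) -> List[Tuple[float, str, str]]:
    """Return the top-k (score, organ, location) entries by Jaccard similarity."""
    query = tokens(complaint)
    scored = []
    for description, organ, location in KB:
        entry = tokens(description)
        union = query | entry
        score = len(query & entry) / len(union) if union else 0.0
        scored.append((score, organ, location))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

Calling `ground("sharp flank pain and blood in my urine")` ranks the kidney entry first, since it shares the most tokens with the complaint.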
ARI instantiates a patient-specific anatomical representation from a single RGB body image through skeleton-conditioned organ placement with uncertainty estimation.
Offline organ-layer modeling and rig-anchored prior learning used to support patient-specific anatomical instantiation.
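One simple way to picture skeleton-conditioned placement is to build a torso frame from four keypoints and express each organ centroid as a normalized offset in that frame, with uncertainty scaled by torso length. The sketch below does exactly that. It is an assumption-laden illustration: the keypoint choice, the frame construction, the offset values, and the names `torso_frame` and `place_organ` are all hypothetical and do not describe the learned prior used by ARI.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def unit(a: Vec3) -> Vec3:
    n = math.sqrt(dot(a, a))
    if n < 1e-9:
        raise ValueError("degenerate vector; cannot normalize")
    return scale(a, 1.0 / n)


def torso_frame(l_shoulder: Vec3, r_shoulder: Vec3, l_hip: Vec3, r_hip: Vec3):
    """Return (origin, (lateral, up, depth) axes, torso length) from 3D keypoints."""
    origin = scale(add(l_hip, r_hip), 0.5)
    shoulder_mid = scale(add(l_shoulder, r_shoulder), 0.5)
    up_vec = sub(shoulder_mid, origin)
    length = math.sqrt(dot(up_vec, up_vec))
    up = unit(up_vec)
    lateral_raw = sub(l_hip, r_hip)
    # Gram-Schmidt: remove the component of the hip line along the up axis.
    lateral = unit(sub(lateral_raw, scale(up, dot(lateral_raw, up))))
    depth = cross(lateral, up)  # completes a right-handed frame
    return origin, (lateral, up, depth), length


def place_organ(frame, offset: Vec3, rel_sigma: float) -> Tuple[Vec3, float]:
    """Map a normalized offset (in torso lengths) to a centroid and an uncertainty."""
    origin, axes, length = frame
    centroid = origin
    for axis, coeff in zip(axes, offset):
        centroid = add(centroid, scale(axis, coeff * length))
    return centroid, rel_sigma * length
```

Expressing offsets in units of torso length is what lets the same prior adapt to different body sizes in this toy model.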
ATI converts internal target hypotheses into candidate surface contacts, target-directed entry rays, and contact-aware 6-DoF probe initialization states for downstream robotic execution.
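The geometric core of this step can be sketched as follows: given a candidate surface contact and an internal target point, the entry ray is the unit vector from contact to target, and a probe orientation can be built by aligning one axis with that ray. The code below is a minimal sketch under stated assumptions: the probe's z axis points along the entry ray, the x axis is derived from a hypothetical reference direction `ref_axis`, and candidates are ranked by how closely the ray aligns with the inward surface normal. None of these conventions or names (`probe_pose`, `pick_contact`) are confirmed by the source.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def unit(a: Vec3) -> Vec3:
    n = math.sqrt(dot(a, a))
    if n < 1e-9:
        raise ValueError("degenerate vector; cannot normalize")
    return scale(a, 1.0 / n)


def probe_pose(contact: Vec3, target: Vec3, ref_axis: Vec3 = (0.0, 1.0, 0.0)):
    """Return (position, (x, y, z) axes) with z along the contact-to-target ray."""
    z = unit(sub(target, contact))
    # Project the reference direction onto the plane perpendicular to the ray.
    x = unit(sub(ref_axis, scale(z, dot(ref_axis, z))))
    y = cross(z, x)  # right-handed: x cross y equals z
    return contact, (x, y, z)


def pick_contact(candidates: List[Tuple[Vec3, Vec3]], target: Vec3) -> Tuple[Vec3, Vec3]:
    """Pick the (point, outward_normal) whose entry ray best matches the inward normal."""
    def alignment(candidate: Tuple[Vec3, Vec3]) -> float:
        point, outward_normal = candidate
        ray = unit(sub(target, point))
        return dot(ray, scale(unit(outward_normal), -1.0))
    return max(candidates, key=alignment)
```

Position plus the three orthonormal axes gives a full 6-DoF state; a real system would also need to account for contact force, reachability, and acoustic windows, which this sketch ignores.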
| System / Approach | Anatomical Prior | Input Modality | Online Initialization | Complaint-Driven |
|---|---|---|---|---|
| Surface-heuristic baselines | ❌ | RGB / none | ❌ | |
| CT/MRI-registered methods | ✅ Volumetric | Preoperative CT/MRI | ✅ | ❌ |
| Image-based controllers | ❌ | Ultrasound B-mode | ✅ | ❌ |
| SAMe (Ours) | ✅ Skeleton-conditioned | Single RGB | ✅ | ✅ |
| Component | Evaluation Setting | Key Result |
|---|---|---|
| CSG | 1,000 held-out symptom descriptions | Location-level F1: 0.357 (macro) / 0.356 (micro) with SAMe-DB |
| ARI | 35 held-out cases × 11 organs (385 pairs) | Mean centroid error: 22.55 mm; support IoU: 0.391 |
| ARI Efficiency | CPU inference | Full-organ: 0.76 s; liver-only: 0.08 s |
| ATI | Real-robot ultrasound | Organ-hit rate: 97.3% (liver) / 81.7% (kidney) |
Additional findings include:
- Centroid-based SAMe initialization outperformed surface-heuristic baselines for both liver and kidney.
- The explicit low-dimensional representation improved point localization, organ support estimation, and downstream initialization usability.
- The formulation preserved robustness across substantial body-habitus variation while remaining lightweight enough for deployment.
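For readers unfamiliar with the metrics reported above, the following sketch shows common definitions of centroid error, intersection-over-union, and macro versus micro F1. It assumes that organ support is represented as a set of occupied grid cells and that F1 is computed from per-label true-positive, false-positive, and false-negative counts. The exact evaluation protocol behind the reported numbers is not described here, so treat these as standard textbook definitions rather than the authors' evaluation code.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple


def centroid_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Euclidean distance between predicted and reference centroids (same units as input)."""
    return math.dist(pred, gt)  # requires Python 3.8+


def support_iou(pred_cells: Iterable[tuple], gt_cells: Iterable[tuple]) -> float:
    """Intersection-over-union of two sets of occupied grid cells."""
    pred_set, gt_set = set(pred_cells), set(gt_cells)
    union = pred_set | gt_set
    return len(pred_set & gt_set) / len(union) if union else 1.0


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from raw counts; defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """counts maps each label to (tp, fp, fn); returns (macro F1, micro F1)."""
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return macro, f1(tp, fp, fn)
```

Macro F1 averages per-label scores so that rare locations count as much as common ones, while micro F1 pools all counts first; similar macro and micro values, as reported for CSG, suggest performance is not dominated by a few frequent labels.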
Coming soon...
If you find SAMe useful for your research, please cite:
@misc{same2026,
title={SAMe: A Semantic Anatomy Mapping Engine for Robotic Ultrasound},
author={Jing Zhang and Duojie Chen and Wentao Jiang and Zihan Lou and Jianxin Liu and Xinwu Cui and Qinghong Zhao and Bo Du and Christoph F. Dietrich and Dacheng Tao},
year={2026},
note={Under review}
}

Maintained by the Echo-SAMe Team


