HiSync

Official repository for HiSync: Spatio-Temporally Aligning Hand Motion from Wearable IMU and On-Robot Camera for Command Source Identification in Long-Range HRI.

This work has been accepted at CHI 2026.

Dataset page: https://huggingface.co/datasets/Octopus1/HiSync

arXiv preprint: https://arxiv.org/abs/2603.11809

News

2026/04/15 HiSync will be presented at CCIB, P1 - Room 116.
2026/03/16 Dataset temporarily switched to grant-only access due to incomplete face anonymization in some samples. Public access is expected to reopen by the end of March.
2026/03/13 Paper posted on arXiv and dataset released on Hugging Face.
2026/03/02 CHI 2026 paper accepted.
2026/02/02 Public data release planned before the CHI 2026 conference.

Abstract

Long-range Human-Robot Interaction (HRI) remains underexplored. Within it, Command Source Identification (CSI) - determining who issued a command - is especially challenging due to multi-user and distance-induced sensor ambiguity. We introduce HiSync, an optical-inertial fusion framework that treats hand motion as binding cues by aligning robot-mounted camera optical flow with hand-worn IMU signals. We first elicit a user-defined (N=12) gesture set and collect a multimodal command gesture dataset (N=38) in long-range multi-user HRI scenarios. Next, HiSync extracts frequency-domain hand motion features from both camera and IMU data, and a learned CSINet denoises IMU readings, temporally aligns modalities, and performs distance-aware multi-window fusion to compute cross-modal similarity of subtle, natural gestures, enabling robust CSI. In three-person scenes up to 34m, HiSync achieves 92.32% CSI accuracy, outperforming the prior SOTA by 48.44%. HiSync is also validated on real-robot deployment. By making CSI reliable and natural, HiSync provides a practical primitive and design guidance for public-space HRI.

Dataset Structure

Published data is organized by collection batch ID.

HiSync_publish/
├── 1/                         # Batch ID
│   ├── user1_20250726_132447/ # Sample directory
│   │   ├── cam_1/
│   │   ├── cam_2/
│   │   ├── cam_3/
│   │   ├── person_keypoints.json
│   │   └── meta.json
│   ├── user2_20250726_135244/
│   │   └── ...
│   └── IMU/
│       ├── IMU_Palm/
│       │   └── *.csv
│       ├── IMU_Ring/
│       │   └── *.csv
│       └── IMU_Wrist/
│           └── *.csv
├── 2/
│   └── ...
└── 18/
    └── ...

Notes:

Each sample directory is named userX_YYYYMMDD_HHMMSS.
Each sample directory contains camera data, person_keypoints.json, and meta.json.
IMU data is aggregated per batch under batch_id/IMU/IMU_{Palm|Ring|Wrist}, not in individual sample directories.
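
The layout above can be walked with a few lines of Python. This is a minimal sketch, not an official loader: `parse_sample_name` and `iter_samples` are hypothetical helper names, and the code assumes that batch directories are purely numeric and that every sample directory matches `userX_YYYYMMDD_HHMMSS` exactly. Anything else, including the per-batch `IMU/` directory, is skipped.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed pattern for sample directories: userX_YYYYMMDD_HHMMSS
SAMPLE_DIR_RE = re.compile(r"^(user\d+)_(\d{8})_(\d{6})$")


def parse_sample_name(name):
    """Split a sample directory name into (user, datetime); None if it does not match."""
    match = SAMPLE_DIR_RE.match(name)
    if match is None:
        return None
    user, date, time = match.groups()
    return user, datetime.strptime(date + time, "%Y%m%d%H%M%S")


def iter_samples(root):
    """Yield (batch_id, user, recorded_at, sample_path) for each sample directory."""
    batch_dirs = [p for p in Path(root).iterdir() if p.is_dir() and p.name.isdigit()]
    # Sort numerically so that batch "2" comes before batch "10".
    for batch_dir in sorted(batch_dirs, key=lambda p: int(p.name)):
        for sample_dir in sorted(batch_dir.iterdir()):
            parsed = parse_sample_name(sample_dir.name) if sample_dir.is_dir() else None
            if parsed is None:
                continue  # skips IMU/ and any unexpected entries
            user, recorded_at = parsed
            yield batch_dir.name, user, recorded_at, sample_dir
```

For example, `list(iter_samples("HiSync_publish"))` would return one tuple per sample directory, ordered by batch ID.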

Data Format

`meta.json` Example

{
    "user": "user10",
    "action": "Right",
    "perspective": "Eye-level",
    "distance": "10-15m",
    "camera": {
        "cam_0": "telephone",
        "cam_2": "iphone",
        "cam_1": "cam"
    },
    "IMU": {
        "Palm": {
            "timestamp": "5/IMU/IMU_Palm/calibrated_imu_20250727_150254.csv"
        },
        "Ring": {
            "timestamp": "5/IMU/IMU_Ring/calibrated_imu_20250727_150254.csv"
        },
        "Wrist": {
            "timestamp": "5/IMU/IMU_Wrist/calibrated_imu_20250727_150254.csv"
        }
    }
}

Field constraints:

action is one of: Right, Left, Approach, Retreat, Summon, Ascend, Descend, No-Gesture.
perspective is one of: Upward, Eye-level, Downward.
distance is one of: 3-5m, 5-10m, 10-15m, 15-20m, 20-25m, 25-34m.
IMU.*.timestamp may be null; parsers should handle null safely.

A small portion of samples may be missing camera or IMU modalities. Check that each camera directory and IMU file exists before reading it.
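
The constraints above can be checked when loading a sample. The sketch below is illustrative only: `load_meta`, `check_meta`, and `resolve_imu_paths` are hypothetical helpers, and it assumes that the `timestamp` paths in `meta.json` are relative to the dataset root (the directory containing the batch folders), as the example value `5/IMU/IMU_Palm/...` suggests.

```python
import json
from pathlib import Path

# Allowed values, copied from the field constraints listed above.
ACTIONS = {"Right", "Left", "Approach", "Retreat", "Summon", "Ascend", "Descend", "No-Gesture"}
PERSPECTIVES = {"Upward", "Eye-level", "Downward"}
DISTANCES = {"3-5m", "5-10m", "10-15m", "15-20m", "20-25m", "25-34m"}


def load_meta(sample_dir):
    """Read meta.json from a sample directory."""
    with open(Path(sample_dir) / "meta.json", encoding="utf-8") as fh:
        return json.load(fh)


def check_meta(meta):
    """Return a list of constraint violations; an empty list means all fields are valid."""
    problems = []
    for field, allowed in (("action", ACTIONS), ("perspective", PERSPECTIVES), ("distance", DISTANCES)):
        if meta.get(field) not in allowed:
            problems.append(f"{field}={meta.get(field)!r} is not an allowed value")
    return problems


def resolve_imu_paths(meta, root):
    """Map each IMU position to an existing CSV path, or None if null or missing on disk."""
    paths = {}
    for position, entry in (meta.get("IMU") or {}).items():
        rel = (entry or {}).get("timestamp")  # may be null in the JSON
        path = Path(root) / rel if rel else None
        paths[position] = path if path is not None and path.is_file() else None
    return paths
```

Returning `None` for a null or missing file lets downstream code skip an unavailable modality explicitly instead of failing partway through a batch.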

`person_keypoints.json` Example

{
    "0": [
        {
            "frame_idx": 0,
            "filename": "frame_0000.png",
            "keypoints": [
                [1204.31, 181.52, 0.9949],
                [1208.77, 170.23, 0.9781],
                [1198.12, 170.84, 0.9675],
                [1216.43, 181.65, 0.8360],
                [1187.55, 183.20, 0.6951]
            ],
            "bbox": [1133, 106, 1373, 813]
        },
        {
            "frame_idx": 1,
            "filename": "frame_0001.png",
            "keypoints": [
                [1204.88, 181.46, 0.9955],
                [1209.06, 170.34, 0.9761],
                [1197.93, 170.95, 0.9748]
            ],
            "bbox": [1132, 106, 1374, 813]
        }
    ],
    "1": [
        {
            "frame_idx": 0,
            "filename": "frame_0000.png",
            "keypoints": [],
            "bbox": [0, 0, 0, 0]
        }
    ]
}

Notes:

Top-level keys (for example, "0", "1") are person IDs in string format.
Each camera maps to a frame list with frame_idx, filename, keypoints, and bbox.
Each keypoint follows [x, y, score] in COCO-17 order.
keypoints can be empty when no person is detected in a frame.
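
A reader for this file needs to tolerate frames with an empty `keypoints` list. The following is a minimal sketch under that assumption; `load_keypoints`, `detected_frames`, and `mean_score` are hypothetical helper names, and the code deliberately does not enforce a keypoint count per frame.

```python
import json


def load_keypoints(path):
    """Read person_keypoints.json into a dict of top-level key -> list of frame entries."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def detected_frames(entries):
    """Return {top_level_key: [frame_idx, ...]} for frames with a non-empty keypoints list."""
    return {
        key: [frame["frame_idx"] for frame in frames if frame.get("keypoints")]
        for key, frames in entries.items()
    }


def mean_score(frame):
    """Average the score over a frame's [x, y, score] keypoints; None if nothing was detected."""
    keypoints = frame.get("keypoints") or []
    if not keypoints:
        return None
    return sum(score for _x, _y, score in keypoints) / len(keypoints)
```

Applied to the example above, `detected_frames` would list frames 0 and 1 under key `"0"` and no frames under key `"1"`, whose only frame has no keypoints.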

Citation

If you find HiSync useful, please cite:

@article{zhang2026hisync,
  title={HiSync: Spatio-Temporally Aligning Hand Motion from Wearable IMU and On-Robot Camera for Command Source Identification in Long-Range HRI},
  author={Zhang, Chengwen and Yu, Chun and Zhuang, Borong and Jin, Haopeng and Wan, Qingyang and Li, Zhuojun and He, Zhe and Ye, Zhoutong and Mei, Yu and Liu, Chang and others},
  journal={arXiv preprint arXiv:2603.11809},
  year={2026}
}

License

Please follow the dataset license and usage terms on the Hugging Face dataset page.
