UNIC: Learning Unified Multimodal Extrinsic Contact Estimation

Abstract

Contact-rich manipulation requires reliable estimation of extrinsic contacts—the interactions between a grasped object and its environment—which provide essential contextual information for planning, control, and policy learning. However, existing approaches often rely on restrictive assumptions, such as predefined contact types, fixed grasp configurations, or camera calibration, that hinder generalization to novel objects and deployment in unstructured environments.

UNIC is a unified multimodal framework for extrinsic contact estimation that operates without any prior knowledge or camera calibration. UNIC directly encodes visual observations in the camera frame and integrates them with proprioceptive and tactile modalities in a fully data-driven manner. It introduces a unified contact representation based on scene affordance maps that captures diverse contact formations and employs a multimodal fusion mechanism with random masking, enabling robust multimodal representation learning.

Demo Videos

Diverse Contact Types

UNIC handles multiple contact scenarios including single-object contact, multi-object interactions, and no-contact states.

Contact Estimation Under Robot Motion

UNIC performs reliable estimation during robot motion and in-hand object slip.

Real-time Estimation with Dynamic Camera

UNIC adapts to dynamic camera viewpoints without requiring recalibration.

Robustness Across Configurations

UNIC generalizes to diverse object configurations and contact locations.

Generalization to Unseen Objects

UNIC demonstrates strong generalization to objects not seen during training.

Method Overview

Architecture

UNIC integrates four sensing modalities:

🔵 Point clouds - 3D information from RGB-D camera
🟠 Tactile signals - Marker displacement maps from GelSight sensors
🟢 Force-torque - 6D wrench from wrist-mounted sensor
🟡 Proprioception - End-effector rotation
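To make the four inputs concrete, the sketch below bundles one time step into a single record. It is only an illustration: the class name `Observation`, the field names, the 2D marker displacements, and the quaternion rotation are assumptions, not the repository's actual data layout.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class Observation:
    """One time step of the four input modalities (hypothetical layout)."""

    point_cloud: List[Vector]  # N x 3 points, expressed in the camera frame
    tactile: List[Vector]      # M x 2 marker displacements (dx, dy), assumed 2D
    wrench: Vector             # 6D force-torque: fx, fy, fz, tx, ty, tz
    ee_rotation: Vector        # end-effector rotation, assumed to be a quaternion

    def validate(self) -> None:
        """Raise ValueError if any modality has an unexpected dimension."""
        if any(len(p) != 3 for p in self.point_cloud):
            raise ValueError("each point must have 3 coordinates")
        if any(len(d) != 2 for d in self.tactile):
            raise ValueError("each marker displacement must have 2 components")
        if len(self.wrench) != 6:
            raise ValueError("wrench must have 6 components")
        if len(self.ee_rotation) != 4:
            raise ValueError("rotation is assumed to be a quaternion (4 values)")
```

Because the point cloud stays in the camera frame, no field holds a camera-to-robot transform, which matches the calibration-free design described above.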

Key Technical Contributions

Prior-free Contact Affordance Representation
- Unified representation based on scene affordance maps
- Captures diverse contact types: point, line, patch
- Models complex contact chains (gripper–object–object–environment)
- No camera calibration or object geometry required

Masked Multimodal Fusion
- Random masking during training
- Learns robust cross-modal representations
- Enables reliable estimation even with missing modalities at deployment
- Flexible sensor configuration without retraining

Efficient Sampling Strategy
- Decouples global multimodal fusion from point-wise affordance generation
- Lightweight point-wise computation
- Supports real-time inference (>600 Hz)
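The random masking idea can be illustrated with a minimal sketch. It assumes each modality has already been encoded into a fixed-length embedding, masks whole modalities independently, and uses a zero vector as a stand-in for whatever mask token the model actually learns. The function name `mask_modalities` and its parameters are hypothetical, and the masking scheme in the UNIC code may differ.

```python
import random
from typing import Dict, List, Optional

Embedding = List[float]


def mask_modalities(
    embeddings: Dict[str, Embedding],
    mask_prob: float,
    rng: random.Random,
    always_keep: Optional[str] = None,
) -> Dict[str, Embedding]:
    """Return a copy where each modality is zeroed with probability mask_prob.

    A zero vector stands in for a learned mask token (an assumption). At least
    one modality is always left intact so the input is never empty.
    """
    names = sorted(embeddings)
    masked = {n for n in names if n != always_keep and rng.random() < mask_prob}
    if len(masked) == len(names):
        # Everything was masked: restore one modality chosen at random.
        masked.remove(rng.choice(names))
    return {
        n: [0.0] * len(embeddings[n]) if n in masked else list(embeddings[n])
        for n in names
    }


# Toy embeddings; real ones would come from the per-modality encoders.
example = mask_modalities(
    {"point_cloud": [0.2, 0.1], "tactile": [0.5, -0.3],
     "wrench": [1.0, 2.0], "proprio": [0.7, 0.7]},
    mask_prob=0.3,
    rng=random.Random(0),
)
```

Training on inputs like `example` exposes the fusion module to every subset of sensors, which is one plausible reason a model trained this way can tolerate a missing modality at deployment without retraining.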

Installation

Setup

Install Miniforge (recommended). Miniforge is the conda-forge–recommended installer and includes mamba out of the box.
Create conda environment:

mamba env create -f conda_env.yaml
conda activate unic

Install third-party dependencies:

bash third_party.sh

Dataset

For all dataset merge and usage instructions, see dataset_readme.md.

The training dataset is a Zarr archive (~98 GB unzipped). For distribution it is split into two balanced zip parts, each ~48.9 GB, hosted on Zenodo:

Part 1 — split_training_part1.zip (DOI 10.5281/zenodo.20127326)
Part 2 — split_training_part2.zip (DOI 10.5281/zenodo.20287722)

The dataset is released under CC-BY-SA-4.0.
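Given these sizes, it can be worth checking free disk space before downloading. The sketch below assumes that both zip parts and the extracted archive sit on the same volume at the same time, which gives a peak of roughly 2 × 48.9 + 98 = 195.8 GB. The helper names are hypothetical, the figures are the approximate ones quoted above, and the merge procedure in dataset_readme.md may need additional temporary space.

```python
import shutil

GB = 10**9  # decimal gigabytes; the text does not say whether GB or GiB is meant

ZIP_PART_GB = 48.9  # approximate size of each of the two zip parts
UNZIPPED_GB = 98.0  # approximate size of the extracted Zarr archive


def peak_bytes() -> int:
    """Estimated peak usage: both zips plus the extracted archive."""
    return round((2 * ZIP_PART_GB + UNZIPPED_GB) * GB)


def final_bytes() -> int:
    """Estimated usage once the zip parts have been deleted."""
    return round(UNZIPPED_GB * GB)


def has_enough_space(path: str = ".") -> bool:
    """Compare free space on the volume holding `path` with the peak estimate."""
    return shutil.disk_usage(path).free >= peak_bytes()
```

Calling `has_enough_space("/path/to/data")` before starting the download avoids running out of space midway through extraction.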

Training

Train the UNIC model with:

python train.py --config-dir=./unic/config --config-name=train_unic

Training Configuration

Training configurations are located in unic/config.

Monitoring Training

Training logs and metrics are automatically tracked with Weights & Biases (wandb). Checkpoints are saved periodically in the output directory specified in the config.
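On a machine without network access, wandb logging can usually be kept local by setting the `WANDB_MODE=offline` environment variable, which the wandb client reads at start-up. The sketch below builds the training command shown earlier together with such an environment. The helper name `build_train_command` is hypothetical, and if the training config sets the wandb mode explicitly, that setting may take precedence over the environment variable.

```python
import os
import sys
from typing import Dict, List, Tuple


def build_train_command(offline: bool = False) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for the training command from the Training section."""
    argv = [
        sys.executable,
        "train.py",
        "--config-dir=./unic/config",
        "--config-name=train_unic",
    ]
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    if offline:
        # Ask the wandb client to write runs to local files only.
        env["WANDB_MODE"] = "offline"
    return argv, env
```

Passing the result to `subprocess.run(argv, env=env, check=True)` from the repository root would start training. Runs logged offline can later be uploaded with `wandb sync`.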

Citation

If you find this work useful, please consider citing:

@inproceedings{xu2026unic,
    author = {Xu, Zhengtong and Shirai, Yuki},
    title = {UNIC: Learning Unified Multimodal Extrinsic Contact Estimation},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = {2026}
}

License

Released under the AGPL-3.0-or-later license; see the LICENSE.md file.

All files:

Copyright (C) 2025 Mitsubishi Electric Research Laboratories (MERL)

SPDX-License-Identifier: AGPL-3.0-or-later

Contact

For questions or issues, please contact:

Zhengtong Xu (Purdue University): xu1703@purdue.edu
Yuki Shirai (MERL): yukishirai1926@gmail.com
Diego Romeres (MERL): romeres@merl.com
