VLA trained on egocentric data

Papers are grouped by institution and sorted chronologically by the earliest paper in each group.

Year	Org.	Acronym	Paper	EEF	Observations	Alignment	Model	Training	Eval	Comments
2023.10	GIT	MimicPlay	Long-Horizon Imitation Learning by Watching Human Play	Human: hand; robot: grippers	Robot: head and wrist cams; human: third-person-view video	NA	Simple BC models	Stage 1: train a high-level planner on human data only; Stage 2: train a visuomotor policy on robot data only
2024.10	GIT	EgoMimic	Scaling Imitation Learning via Egocentric Video	Human: hand; robot: grippers	Human: head cam, hand pose (estimated); Robot: head and wrist cams, proprioceptive EEF poses, joint positions	Hardware: 1) identical Aria glasses, 2) teleop device similar to the human upper body; Data processing: 1) unify action frames, 2) align action distributions, 3) mask out robot and human arms	Improved ACT	Co-train (1 hr human data + 2 hrs robot data)	Generalizes to new objects/scenes/tasks seen only in human data	Worth a close read; the drawback is the small amount of data
2025.09	GIT	ImMimic	Cross-Domain Imitation from Human Videos via Mapping and Interpolation	Human: hand; robot: grippers or hands; retargeting required	Human: head cam, hand pose (algorithm-estimated); Robot: head and wrist cams, proprioception	Head-view observation alignment	DP	Co-train on robot data + interpolated human data
2025.09	GIT	EgoBridge	Domain Adaptation for Generalizable Imitation from Egocentric Human Data	Same as EgoMimic	Same as EgoMimic	Align latent representations from human and robot domains	Transformer-based design	Co-training with an optimal transport (OT) loss	Generalizes to new objects/scenes/tasks seen only in human data	Builds on EgoMimic by adding latent-representation alignment and a correspondingly improved co-training scheme
2025.12	GIT	EMMA	Scaling Mobile Manipulation via Egocentric Human Data	Same as EgoMimic	Same as EgoMimic + navigation	Optimization-based retargeting for navigation and coordinate-space alignment for manipulation	Decoder-only transformer	Co-train human full-body motion data with static robot data	1) Direct transfer of navigation skills from human data to robot; 2) co-training scales up full mobile manipulation policy performance	Builds on EgoMimic by introducing a mobile base; learns robot locomotion from human movement data
2025.12	GIT, PI	Human2robo	Emergence of Human to Robot Transfer in VLAs				pi0.5
2026.02	GIT, Nvidia	EgoScale	Scaling Human Video to Unlock Dexterous Robot Intelligence
2025.07	UCSD	EgoVLA	Learning Vision-Language-Action Models from Egocentric Human Videos	Human: hand; robot: hand	Human: head cam; robot: head cam	Unified action space + fine-tuning on robot data	VLM + action head	Pretrain on human data only in a unified action space, then fine-tune on robot data
2025.11	UCSD	In-N-On	Scaling Egocentric Manipulation with in-the-wild and on-task Data	Human: hand; robot: hand			VLM + action head	Pretrain on human and robot data in a unified action space		Adversarial domain adaptation
2025.08	Tsinghua	MotionTrans	Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies
2026.02	Microsoft	VITRA	Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos

Egocentric data collection methods and datasets

Year	Org.	Acronym	Paper	Scale	Cameras	Comments
2022.03	Tsinghua	HOI4D	HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction	4000 seqs
2023.09	Microsoft	HoloAssist	[HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World](https://arxiv.org/abs/2309.17024)	166 hrs
2024.01	Meta	HOT3D	Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking	52 hrs
2024.03	ShanghaiAI	TACO	TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding	2317 seqs
2025.08	Apple	EgoDex	Learning Dexterous Manipulation from Large-Scale Egocentric Video	800 hrs
2026.02	Ant Group	AoE	Always-on Egocentric Human Video Collection for Embodied AI		Neck-mounted camera	Scene reconstruction, hand motion detection, etc.

