Skip to content

yunongLiu1/IKEA-Manuals-at-Work

Repository files navigation

IKEA Manuals at Work

4D Grounding of Assembly Instructions on Internet Videos

NeurIPS 2024 Datasets and Benchmarks

Yunong Liu1, Cristobal Eyzaguirre1, Manling Li1, Shubh Khanna1, Juan Carlos Niebles1, Vineeth Ravi2, Saumitra Mishra2, Weiyu Liu1*, Jiajun Wu1*

1Stanford University    2J.P. Morgan AI Research
*Equal advising

[Project Website] [Paper] [Dataset Setup Guide] [Notebook]

Overview

The IKEA-Manuals-at-Work dataset provides detailed annotations for aligning 3D models, instructional manuals, and real-world assembly videos. This is the first dataset to provide 4D grounding of assembly instructions on Internet videos, offering high-quality, spatial-temporal alignments between assembly instructions, 3D models, and real-world internet videos.

Key Features

  • 🪑 36 furniture models from 6 categories
  • 🎥 98 assembly videos from the Internet
  • 🔄 Dense spatio-temporal alignments between instructions and videos
  • 📊 Rich annotations including part segmentation, 6D poses, and temporal alignments

Getting Started

Installation

# Create and activate conda environment
conda create -n IKEAVideo python=3.8
conda activate IKEAVideo

# Install dependencies
pip install -r requirements.txt

# Set PYTHONPATH
export PYTHONPATH="./src:$PYTHONPATH"

Dataset Structure

data/
├── data.json           # Main annotation file
├── parts/             # 3D model files
├── manual_img/        # Instruction manual images
├── pdfs/              # Original PDF manuals
└── videos/            # Assembly videos

Dataset Contents

The dataset includes:

  • 3D Models: Detailed 3D models of furniture parts
  • Instruction Manuals: Step-by-step assembly instructions
  • Assembly Videos: Real-world assembly videos from the Internet
  • Rich Annotations:
    • ⏱️ Temporal step alignments
    • 🔄 Temporal substep alignments
    • 🎯 2D-3D part correspondences
    • 🎨 Part segmentations
    • 📐 Part 6D poses
    • 📷 Estimated camera parameters

For detailed information about the dataset, please refer to our datasheet.

Dataset Setup

  1. Download Required Files:
  • Annotation file: data/data.json
  • Assembly videos: Stanford Digital Repository
  • Clone the repo to obtain other resources (e.g. 3D models, manual images)
  • Place downloads in their respective directories as shown in Dataset Structure
  1. Explore the Dataset: Check our tutorial notebook: notebooks/data_viz.ipynb

Applications

The dataset supports various research directions:

  • 🔍 Assembly plan generation
  • 🎯 Part-conditioned segmentation
  • 📐 Part-conditioned pose estimation
  • 🎥 Video object segmentation
  • 🛠️ Shape assembly with instruction videos

4D IKEA Video Annotation Interface

We also provide the annotation tools used to create the IKEA Manuals at Work dataset.

Tools Included

This repository contains two annotation interfaces:

  1. Main Annotation Interface - For creating 2D masks and initial 3D poses
  2. Pose Refinement Interface - For fine-tuning 3D poses after initial estimation

Please see the respective directories for setup and usage instructions:

FAQ

How do I project mesh vertices to 2D?

You can use the provided render_part.py script (in this repo) to project mesh vertices onto 2D images. Ensure you have the correct intrinsic and extrinsic matrices.

Are segmentation masks and part poses available for all parts?

No, annotations are provided only for parts being interacted with in each frame. This aligns with the manual’s structure and supports assembly plan generation.

For more details, check the paper or feel free to ask!

License

This dataset is released under the CC-BY-4.0 license.

Citation

If you find this dataset useful for your research, please cite:

  @inproceedings{
  liu2024ikea,
  title={{IKEA} Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos},
  author={Yunong Liu and Cristobal Eyzaguirre and Manling Li and Shubh Khanna and Juan Carlos Niebles and Vineeth Ravi and Saumitra Mishra and Weiyu Liu and Jiajun Wu},
  booktitle={The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2024}
  }

Contact

For questions and feedback:

About

IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors