LeapLabTHU/Deep-Incubation

Deep Incubation

This repository contains the official code for Deep Incubation: Training Large Models by Divide-and-Conquering.

Title:  Deep Incubation: Training Large Models by Divide-and-Conquering
Authors:  Zanlin Ni*, Yulin Wang*, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang (Corresponding Author)
Institute: Tsinghua University and Beijing Academy of Artificial Intelligence (BAAI)
Published: arXiv preprint (arXiv:2212.04129)
Contact:  nzl22 at mails dot tsinghua dot edu dot cn

News

  • Dec 22, 2022: release all pre-trained models on ImageNet-1K, including models tuned at higher resolutions.
  • Dec 15, 2022: release pre-trained models for ViT-B, ViT-L and ViT-H on ImageNet-1K.
  • Dec 13, 2022: release pre-trained meta models for ViT-B, ViT-L and ViT-H on ImageNet-1K.
  • Dec 10, 2022: release code for training ViT-B, ViT-L and ViT-H on ImageNet-1K.

Our final models and the pre-trained meta models are all available at 🤗 Hugging Face. Please follow the instructions in EVAL.md and TRAINING.md for their usage.

Overview

In this paper, we present a divide-and-conquer strategy for training large models. Our algorithm, Deep Incubation, divides a large model into smaller modules, optimizes them independently, and then assembles them into the full model. Though conceptually simple, our method significantly outperforms end-to-end (E2E) training in terms of both training efficiency and final accuracy. For example, on ViT-H, Deep Incubation outperforms E2E training by 2.7% accuracy, or achieves comparable performance with 4x less training time.
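As a toy sketch of the three-stage idea (train a small meta model, incubate each module independently against it, then assemble), the following is purely illustrative: the "modules" are scalar gains and the meta model is fixed by construction, none of which reflects the actual ViT training code in this repo.

```python
import random

# Toy sketch of Deep Incubation (illustrative only, not the repo's code).
# Target map: y = 8 * x, to be realized by a chain of K = 3 modules.
K, TARGET_GAIN = 3, 8.0

def chain(gains, x):
    # A "model" is a chain of scalar-gain modules applied in sequence.
    for g in gains:
        x = g * x
    return x

# Stage 1: a small meta model (here simply fixed: 2 * 2 * 2 = 8).
meta = [2.0] * K

def incubate(i, steps=300, lr=0.1):
    # Stage 2: train module i alone, with the other slots filled by the
    # frozen meta modules, so it learns to be compatible with them.
    random.seed(i)
    w = random.uniform(0.5, 1.5)
    others = 1.0
    for j in range(K):
        if j != i:
            others *= meta[j]
    for _ in range(steps):
        x = random.uniform(-1.0, 1.0)
        err = chain(meta[:i] + [w] + meta[i + 1:], x) - TARGET_GAIN * x
        w -= lr * err * others * x  # gradient of 0.5 * err**2 w.r.t. w
    return w

# Stage 3: assemble the independently trained modules into one model.
modules = [incubate(i) for i in range(K)]
print(modules, chain(modules, 1.0))  # each gain is close to 2; assembled gain close to 8
```

Because every module is trained against the same frozen meta model, the independently optimized pieces remain mutually compatible when assembled, which is the property the real method relies on at scale.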

Data Preparation

  • The ImageNet dataset should be prepared as follows:
data
├── train
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 2)
│   ├── ...
├── val
│   ├── folder 1 (class 1)
│   ├── folder 2 (class 2)
│   ├── ...
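A quick sanity check for this layout can be written with the standard library alone. The helper below is hypothetical (not part of this repo); the training scripts load the data through their own pipeline.

```python
from pathlib import Path

def check_imagenet_layout(root):
    """Verify the data/{train,val}/<one folder per class> layout above.

    Hypothetical helper for illustration; not part of the repo.
    """
    root = Path(root)
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing directory: {split_dir}")
        classes = [p for p in split_dir.iterdir() if p.is_dir()]
        if not classes:
            raise ValueError(f"no class folders under {split_dir}")
        print(f"{split}: {len(classes)} class folders")
    return True
```

For ImageNet-1K, each split contains 1000 class folders (named by WordNet ID, e.g. n01440764), which torchvision-style `ImageFolder` loaders consume directly.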

Pre-trained Models & Evaluation

See EVAL.md for the pre-trained models and the evaluation instructions.

Training

See TRAINING.md for the training instructions.

Results

Results on ImageNet-1K

Semantic Segmentation on ADE20K

Object Detection and Instance Segmentation on COCO

Training Efficiency

Data Efficiency

Citation

If you find our work helpful, please star🌟 this repo and cite📑 our paper. Thanks for your support!

@article{Ni2022Incub,
  title={Deep Incubation: Training Large Models by Divide-and-Conquering},
  author={Ni, Zanlin and Wang, Yulin and Yu, Jiangwei and Jiang, Haojun and Cao, Yue and Huang, Gao},
  journal={arXiv preprint arXiv:2212.04129},
  year={2022}
}

Acknowledgements

Our implementation is mainly based on deit. We thank the authors for their clean codebase.

Contact

If you have any questions or concerns, please send mail to nzl22@mails.tsinghua.edu.cn.
