This repository contains the official code for Deep Incubation: Training Large Models by Divide-and-Conquering.
Title: Deep Incubation: Training Large Models by Divide-and-Conquering
Authors: Zanlin Ni*, Yulin Wang*, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang (Corresponding Author)
Institute: Tsinghua University and Beijing Academy of Artificial Intelligence (BAAI)
Publication: arXiv preprint (arXiv:2212.04129)
Contact: nzl22 at mails dot tsinghua dot edu dot cn
- Dec 22, 2022: release all pre-trained models on ImageNet-1K, including models tuned at higher resolutions.
- Dec 15, 2022: release pre-trained models for ViT-B, ViT-L and ViT-H on ImageNet-1K.
- Dec 13, 2022: release pre-trained meta models for ViT-B, ViT-L and ViT-H on ImageNet-1K.
- Dec 10, 2022: release code for training ViT-B, ViT-L and ViT-H on ImageNet-1K.
Our final models and the pre-trained meta models are all available at 🤗 Hugging Face. Please follow the instructions in EVAL.md and TRAINING.md for their usage.
In this paper, we present a divide-and-conquer strategy for training large models. Our algorithm, Deep Incubation, divides a large model into smaller modules, optimizes them independently, and then assembles them. Though conceptually simple, our method significantly outperforms end-to-end (E2E) training in terms of both training efficiency and final accuracy. For example, on ViT-H, Deep Incubation outperforms E2E training by 2.7% in accuracy, or achieves comparable performance with 4x less training time.
- The ImageNet dataset should be prepared as follows:
data
├── train
│ ├── folder 1 (class 1)
│ ├── folder 2 (class 2)
│ ├── ...
├── val
│ ├── folder 1 (class 1)
│ ├── folder 2 (class 2)
│ ├── ...
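This layout follows the standard one-folder-per-class convention (as expected by, e.g., torchvision's `ImageFolder`). As a quick sanity check before training, the following sketch (stdlib only; the `data` root path and function name are our own, not part of this repo) verifies that both splits exist and contain the same set of class folders:

```python
from pathlib import Path

def check_imagenet_layout(root):
    """Verify that root/train and root/val exist and contain
    the same set of class subfolders (one folder per class)."""
    root = Path(root)
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Collect the class folder names under this split.
        classes[split] = {d.name for d in split_dir.iterdir() if d.is_dir()}
    if classes["train"] != classes["val"]:
        raise ValueError("train and val class folders do not match")
    return sorted(classes["train"])
```

For example, `check_imagenet_layout("data")` returns the sorted list of class folder names (1000 entries for ImageNet-1K) or raises an error describing what is missing.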
See EVAL.md for the pre-trained models and the evaluation instructions.
See TRAINING.md for the training instructions.
If you find our work helpful, please star🌟 this repo and cite📑 our paper. Thanks for your support!
@article{Ni2022Incub,
title={Deep Incubation: Training Large Models by Divide-and-Conquering},
author={Ni, Zanlin and Wang, Yulin and Yu, Jiangwei and Jiang, Haojun and Cao, Yue and Huang, Gao},
journal={arXiv preprint arXiv:2212.04129},
year={2022}
}
Our implementation is mainly based on deit. We thank the authors for their clean codebase.
If you have any questions or concerns, please send mail to nzl22@mails.tsinghua.edu.cn.