Authors: Qing Zhou, Tao Yang, Bingxuan Zhao, Hongyuan Zhang, Junyu Gao, Qi Wang
RePB achieves lossless dynamic pruning by resolving score drift and gradient bias, requiring only 3 lines of code to integrate.
Supported pruning methods: Random, InfoBatch, SeTa, and RePB.
```shell
pip install git+https://github.com/mrazhou/RePB
```

Or you can clone this repo and install it locally:
```shell
git clone https://github.com/mrazhou/RePB
cd RePB
pip install -e .
```

To adapt your code to RePB/SeTa/InfoBatch, change just the following three lines:
```python
from repb import prune

# 1. Wrap the dataset
train_data = prune(train_data, args)  # args: prune_type, epochs, ...

# 2. Pass the sampler to the DataLoader
train_loader = DataLoader(train_data, sampler=train_data.sampler)

for epoch in range(args.epochs):
    for batch in train_loader:
        ...
        # 3. Update the loss
        loss = train_data.update(loss)
```

In this repository, we provide two examples to demonstrate the usage of RePB: CIFAR10/CIFAR100 (supporting resnet18/50/101) and ImageNet (supporting various CNN/Transformer/Mamba models).
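For intuition, here is a minimal self-contained sketch of how a score-based dynamic-pruning wrapper of this shape might work internally. The class name `PruneWrapper`, the `keep_ratio` argument, and the per-sample `update(index, loss)` signature are illustrative assumptions for this sketch only, not the actual RePB implementation or API.

```python
# Illustrative sketch of a dynamic-pruning dataset wrapper.
# NOT the real RePB code: names and signatures are hypothetical.
import random


class PruneWrapper:
    """Wraps a dataset and exposes a score-based subset each epoch."""

    def __init__(self, data, keep_ratio=0.6):
        self.data = list(data)
        self.keep_ratio = keep_ratio
        # One score per sample; start uniform so every sample may be kept.
        self.scores = [1.0] * len(self.data)

    @property
    def sampler(self):
        """Return shuffled indices of the kept (unpruned) subset."""
        k = max(1, int(len(self.data) * self.keep_ratio))
        ranked = sorted(range(len(self.data)),
                        key=lambda i: self.scores[i], reverse=True)
        kept = ranked[:k]          # keep the highest-scoring samples
        random.shuffle(kept)
        return kept

    def update(self, index, loss):
        """Record the latest per-sample loss as its pruning score."""
        self.scores[index] = loss
        return loss


# Usage: iterate the kept subset, train, and feed losses back.
data = [(x, x % 2) for x in range(10)]        # toy (input, label) pairs
train_data = PruneWrapper(data, keep_ratio=0.5)
for index in train_data.sampler:
    x, y = train_data.data[index]
    loss = abs(x - y) / 10.0                  # stand-in for a real loss
    train_data.update(index, loss)
```

The sketch only conveys the control flow (wrap, sample, update); the actual library additionally has to correct the score-drift and gradient-bias effects described above to make pruning lossless.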
- CIFAR10/CIFAR100 (for exploratory research):

```shell
# Lossless pruning with RePB
bash scripts/cifar.sh RePB 0.6
# With SeTa
bash scripts/cifar.sh SeTa 0.1
# With InfoBatch
bash scripts/cifar.sh InfoBatch 0.5
# Static random pruning
bash scripts/cifar.sh Static
```

- ImageNet (for large-scale, cross-architecture validation):
```shell
# Lossless pruning
bash scripts/imagenet.sh
# CNNs
bash scripts/imagenet.sh RePB mobilenetv3_small_050
# Transformers
bash scripts/imagenet.sh RePB vit_tiny_patch16_224
# Vim: refer to https://github.com/hustvl/Vim for more details
```

If you find this repository helpful, please consider citing our paper:
```bibtex
@inproceedings{zhou2026inconsistency,
  title={Inconsistency Biases in Dynamic Data Pruning},
  author={Qing Zhou and Tao Yang and Bingxuan Zhao and Hongyuan Zhang and Junyu Gao and Qi Wang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
```

and the SeTa and InfoBatch papers:
```bibtex
@inproceedings{zhou2025scale,
  title={Scale Efficient Training for Large Datasets},
  author={Zhou, Qing and Gao, Junyu and Wang, Qi},
  booktitle={CVPR},
  pages={20458--20467},
  year={2025}
}

@inproceedings{qin2024infobatch,
  title={InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning},
  author={Qin, Ziheng and Wang, Kai and Zheng, Zangwei and Gu, Jianyang and Peng, Xiangyu and Xu, Zhaopan and Zhou, Daquan and Shang, Lei and Sun, Baigui and Xie, Xuansong and You, Yang},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}
```

Many thanks to InfoBatch for providing the basic three-lines-of-code framework for dynamic pruning.

