The code used in the following paper:
Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling (ICCVW 2025)
This method samples the final distilled dataset from a larger image pool obtained by SOTA generative dataset distillation methods, guided by the concept of DIFFICULTY, which is defined as the opposite of the classification probability.
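As a rough illustration of the difficulty definition above (a hedged sketch, not the repository's implementation): given a classifier's logits, difficulty can be computed as one minus the softmax probability of the true class, so poorly classified images score higher. The function name and shapes here are assumptions for the example.

```python
import numpy as np

def difficulty(logits: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """logits: (N, C) classifier outputs; labels: (N,) true class ids.

    Returns a per-sample difficulty score in [0, 1], defined as the
    opposite of the classification probability of the true class.
    """
    # numerically stable softmax over classes
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p_true = probs[np.arange(len(labels)), labels]
    return 1.0 - p_true  # lower true-class probability -> higher difficulty

# a confidently classified sample (low difficulty) vs. an ambiguous one
scores = difficulty(np.array([[4.0, 0.0], [0.5, 0.4]]), np.array([0, 1]))
```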
- Set up the virtual environment with conda.
```shell
git clone https://github.com/SumomoTaku/DiffGuideSamp.git
cd DiffGuideSamp
conda env create -f environment.yml
conda activate DiffGuide
```
- Get the image pool (IP).
This project doesn't include the code for creating the image pool.
You can refer to the pages of other SOTA methods, such as Minimax.
The recommended size of the image pool is $5 \times$ IPC, following the experiments in the paper.
- Sampling.
You can find the basic implementation in ./scripts/sample.sh.
Set the paths of the image pool, the original dataset, and the output, as well as the IPC (the size of the distilled dataset).
```shell
cd scripts
sh sample.sh
```
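To make the sampling step concrete, here is an illustrative sketch only (sample.sh implements the paper's actual strategy): from a per-class pool of $5 \times$ IPC candidates with precomputed difficulty scores, pick IPC images whose picks are spread evenly across the sorted difficulty range. The function name and the even-spread heuristic are assumptions for this example.

```python
import numpy as np

def sample_by_difficulty(scores: np.ndarray, ipc: int) -> np.ndarray:
    """Pick `ipc` pool indices spread evenly over the difficulty range.

    scores: (pool_size,) difficulty score of each candidate image.
    Returns indices into the pool, from easiest to hardest pick.
    """
    order = np.argsort(scores)  # candidate indices, easy -> hard
    # evenly spaced positions along the sorted pool
    picks = np.linspace(0, len(scores) - 1, ipc).round().astype(int)
    return order[picks]

# pool of size 5 * IPC for one class, with IPC = 10
pool_scores = np.random.rand(50)
idx = sample_by_difficulty(pool_scores, ipc=10)
```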
- Train the downstream model.
This project doesn't include the code for training the downstream model.
You can refer to the code of other SOTA methods by placing the distilled dataset under their output path.
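When reusing another method's training code, the distilled dataset typically needs to sit in the class-per-subfolder layout that common pipelines (e.g. torchvision's ImageFolder) expect. The sketch below, under that assumed layout (distilled/&lt;class_name&gt;/&lt;image&gt;.png), indexes such a folder into (path, label) pairs; the function name and demo layout are hypothetical.

```python
import tempfile
from pathlib import Path

def index_distilled(root):
    """Index a class-per-subfolder dataset into (path, label) pairs."""
    root_p = Path(root)
    classes = sorted(d.name for d in root_p.iterdir() if d.is_dir())
    class_to_idx = {c: i for i, c in enumerate(classes)}
    samples = [
        (str(p), class_to_idx[c])
        for c in classes
        for p in sorted((root_p / c).glob("*.png"))
    ]
    return samples, class_to_idx

# demo on a tiny synthetic layout: 2 classes x 2 images
tmp = Path(tempfile.mkdtemp())
for cls in ["cat", "dog"]:
    (tmp / cls).mkdir()
    for i in range(2):
        (tmp / cls / f"{i}.png").touch()

samples, class_to_idx = index_distilled(tmp)
```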
If you find this paper useful for your research, please cite it with the following BibTeX entry.
```bibtex
@inproceedings{li2025diff,
  title={Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling},
  author={Li, Mingzhuo and Li, Guang and Mao, Jiafeng and Ye, Linfeng and Ogawa, Takahiro and Haseyama, Miki},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  year={2025}
}
```