# Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing (ELECT)

Official PyTorch implementation of
"Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing" (ICCV 2025).
Despite impressive advances in diffusion models, instruction-guided image editing often fails (e.g., distorted backgrounds) due to the stochastic nature of sampling.
ELECT (Early-timestep Latent Evaluation for Candidate Selection) tackles this by:
- Multiple-seed baseline: Uses a Background Inconsistency Score (BIS) to reach Best-of-N performance without supervision.
- Zero-shot ranking: Estimates background mismatch at early diffusion timesteps, selecting seeds that keep the background while editing only the foreground.
- Cost savings: Cuts sampling compute by 41 % on average (up to 61 %).
- Higher success: Recovers ~40 % of previously failed edits, boosting background consistency & instruction adherence.
- Extensibility: Integrates into instruction-guided pipelines and MLLMs for joint seed & prompt selection when seed-only isn’t enough.
All benefits come without additional training or external supervision.
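ELECT's selection rule can be summarized as: among N candidate seeds, keep the one whose early-timestep preview changes the background the least, as measured by the Background Inconsistency Score (BIS). Below is a minimal sketch of that idea, using plain Python lists in place of latents and a hypothetical pixel-level score; the actual implementation scores denoised previews of diffusion latents under a relevance mask, and all names here are illustrative:

```python
def background_inconsistency_score(source, edited, background_mask):
    """Mean absolute difference between source and edited images,
    restricted to background pixels (mask value 1 = background).
    Hypothetical stand-in for the latent-space BIS used by ELECT."""
    diffs = [
        abs(s - e)
        for s, e, m in zip(source, edited, background_mask)
        if m == 1
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def select_best_seed(source, previews_by_seed, background_mask):
    """Pick the candidate seed whose early-timestep preview changes
    the background the least (lowest BIS)."""
    return min(
        previews_by_seed,
        key=lambda seed: background_inconsistency_score(
            source, previews_by_seed[seed], background_mask
        ),
    )


# Toy example: two background pixels (mask=1), two foreground pixels (mask=0).
# Seed 0 leaves the background untouched, seed 1 perturbs it, so seed 0 wins.
source = [1, 1, 1, 1]
mask = [1, 1, 0, 0]
previews = {0: [1, 1, 5, 5], 1: [3, 3, 5, 5]}
best = select_best_seed(source, previews, mask)  # -> 0
```

Because the score is computed at an early timestep, the remaining denoising steps only need to be run for the winning seed, which is where the compute savings come from.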
## News

| Date | Event |
|---|---|
| 2025-04-19 | 📚 arXiv pre-print released |
| 2025-06-26 | 🏆 Accepted to ICCV 2025 |
| 2025-07-26 | 💻 Initial code release |
## Installation

1. Clone the repo.

2. Create the environment (example using conda):

   ```bash
   conda create -n elect python=3.9
   conda activate elect
   pip install -r requirements.txt
   ```

3. (Optional) Use InstructDiffusion:
   - `git clone https://github.com/cientgu/InstructDiffusion`
   - Download `v1-5-pruned-emaonly-adaption-task.ckpt` from that repo and move it to `./checkpoints`.

4. (Optional) Use MGIE:
   - Follow the MGIE setup guide.
   - Place the official LLaVA-Lightning-7B in `./checkpoints/LLaVA-7B-v1`.
   - Put `mllm.pt` and `unet.pt` in `./checkpoints/mgie_7b`.

5. Datasets for evaluation:

   | Dataset | Link |
   |---|---|
   | PIE-Bench | https://github.com/cure-lab/PnPInversion |
   | MagicBrush test set | https://osu-nlp-group.github.io/MagicBrush/ |
## Inference

- Single image:

  ```bash
  python inference.py \
    --run_type run_single_image \
    --input_path ./images/cat_to_bear.png \
    --instruction "Replace the cat with a bear" \
    --model {instructpix2pix | magicbrush | instructdiffusion | mgie | ultraedit}
  ```

- Dataset:

  ```bash
  python inference.py \
    --run_type run_dataset \
    --dataset_dir ./datasets/PIE-bench \
    --model {instructpix2pix | magicbrush | instructdiffusion | mgie | ultraedit}
  ```
- Seed selection with ELECT:

  ```bash
  python inference.py \
    --run_type run_single_image \
    --input_path ./images/cat_to_bear.png \
    --instruction "Replace the cat with a bear" \
    --model {instructpix2pix | magicbrush | instructdiffusion | mgie | ultraedit} \
    --select_one_seed \
    --num_random_candidates 10
  ```
- Arguments
  - `select_one_seed`: If set, select the best seed from the candidate seeds based on Background Inconsistency Scores.
  - `num_random_candidates`: Number of random candidate seeds used for inference.
  - `candidate_seeds`: List of candidate seeds for fixed-seed inference (used when `num_random_candidates` is 0).
  - `stopping_step`: The step at which the best seed is selected (default: 40, assuming `--inference_step 100`; scale proportionally if you use a different total number of inference steps).
  - `first_step_for_mask_extraction`: First step for relevance-mask extraction, used to accumulate the relevance map (default: 0).
  - `last_step_for_mask_extraction`: Last step for relevance-mask extraction, used to accumulate the relevance map (default: 20).
  - `visualize_all_seeds`: If set, save every seed's output; otherwise only the best seed's output is saved.
  - `output_dir`: Directory where edited images are saved (default: `./outputs`).
    - If `visualize_all_seeds` is not set, the result is saved as `{input_name}_output_{selected_seed}.png`.
    - If `visualize_all_seeds` is set, every seed's output is saved as `{input_name}_output_{seed}.png`, and the elected one is additionally stored as `{input_name}_output-best_seed-{seed}.png`.
    - The relevance map used for the Background Inconsistency Score is saved as `{input_name}_output-mask.png`.
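As a concrete illustration of the conventions above, the selection step scales with the total number of inference steps, and output files follow fixed naming patterns. The helpers below are hypothetical (not part of the repo) and simply restate those rules in code:

```python
def scaled_stopping_step(inference_steps, default_stop=40, default_total=100):
    """stopping_step=40 assumes --inference_step 100; for a different
    total step count, scale the selection step proportionally."""
    return round(default_stop * inference_steps / default_total)


def output_name(input_name, seed, best=False):
    """File-name patterns described above (illustrative only)."""
    if best:
        return f"{input_name}_output-best_seed-{seed}.png"
    return f"{input_name}_output_{seed}.png"


# e.g. with --inference_step 50, the best seed is selected at step 20:
step = scaled_stopping_step(50)  # -> 20
name = output_name("cat_to_bear", 7)  # -> "cat_to_bear_output_7.png"
```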
## TODO

- Release implementation code for seed selection
- Release implementation code for prompt selection
- Release evaluation code
## Acknowledgements

We gratefully acknowledge the creators of the following projects and datasets, whose contributions laid the foundation for this work.
| Category | Repositories |
|---|---|
| Models | InstructPix2Pix · InstructDiffusion · MGIE · UltraEdit |
| Datasets | PIE-Bench · MagicBrush |
## Citation

```bibtex
@article{kim2025early,
  title   = {Early timestep zero-shot candidate selection for instruction-guided image editing},
  author  = {Kim, Joowon and Lee, Ziseok and Cho, Donghyeon and Jo, Sanghyun and Jung, Yeonsung and Kim, Kyungsu and Yang, Eunho},
  journal = {arXiv preprint arXiv:2504.13490},
  year    = {2025}
}
```