Awesome-Diffusion-Model-Based-Image-Editing-Methods

This repository is based on our survey Diffusion Model-Based Image Editing: A Survey.

Yi Huang*, Jiancheng Huang*, Yifan Liu*, Mingfu Yan*, Jiaxi Lv*, Jianzhuang Liu*, Wei Xiong, He Zhang, Liangliang Cao, Shifeng Chen

Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS), Adobe Inc, Apple Inc, Southern University of Science and Technology (SUSTech)

Abstract

Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning to reverse the process of gradually adding noise to images, allowing them to generate high-quality samples from a complex distribution. In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field. We delve into a thorough analysis and categorization of these works from multiple perspectives, including learning strategies, user-input conditions, and the array of specific editing tasks that can be accomplished. In addition, we pay special attention to image inpainting and outpainting, and explore both earlier traditional context-driven and current multimodal conditional methods, offering a comprehensive analysis of their methodologies. To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval, featuring an innovative metric, LMM Score. Finally, we address current limitations and envision some potential directions for future research.
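The core idea summarized in the abstract, learning to reverse a process that gradually adds noise to images, can be sketched in a few lines. Below is a minimal, hypothetical NumPy illustration of the closed-form DDPM forward (noising) step; the linear schedule values are common defaults for illustration and are not taken from the survey.

```python
import numpy as np

# Sketch of the DDPM forward (noising) process: an image x0 is gradually
# corrupted toward Gaussian noise; a diffusion model is trained to reverse
# this. Schedule values below are illustrative defaults.
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def add_noise(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = np.zeros((8, 8))               # stand-in for a normalized image
x_mid = add_noise(x0, 500, rng)     # partially noised
x_end = add_noise(x0, 999, rng)     # nearly pure Gaussian noise
```

At t near T, `alphas_bar[t]` is close to zero, so `x_t` is almost pure noise; editing methods exploit intermediate steps of this trajectory.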

🔖 News

📌 We actively track the latest research and welcome contributions to this repository and our survey paper. If your work is relevant, please feel free to contact us.

📰 2024-03-06: We have established a template for paper submissions. To use it, click the New Issue button under Issues, select the Paper Submission Form, and complete it following the guidelines provided.

📰 2024-02-28: Our comprehensive survey paper, summarizing related methods published before February 1, 2024, is now available!

🔍 BibTeX

@article{huang2024diffusion,
  title={Diffusion Model-Based Image Editing: A Survey},
  author={Huang, Yi and Huang, Jiancheng and Liu, Yifan and Yan, Mingfu and Lv, Jiaxi and Liu, Jianzhuang and Xiong, Wei and Zhang, He and Chen, Shifeng and Cao, Liangliang},
  journal={arXiv preprint arXiv:2402.17525},
  year={2024}
}


Papers

Training-Based

Training-Based: Domain-Specific Editing with Weak Supervision

| Title | Pub | Release Date |
|---|---|---|
| CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation | NeurIPS 2023 | 2023.10 |
| Stylediffusion: Controllable disentangled style transfer via diffusion models | ICCV 2023 | 2023.08 |
| Hierarchical diffusion autoencoders and disentangled image manipulation | WACV 2024 | 2023.04 |
| Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models | arXiv 2023 | 2023.04 |
| Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models | CVPR 2023 Workshop | 2022.12 |
| Diffstyler: Controllable dual diffusion for text-driven image stylization | TNNLS 2024 | 2022.11 |
| Diffusion Models Already Have A Semantic Latent Space | ICLR 2023 | 2022.10 |
| Egsde: Unpaired image-to-image translation via energy-guided stochastic differential equations | NeurIPS 2022 | 2022.07 |
| Diffusion autoencoders: Toward a meaningful and decodable representation | CVPR 2022 | 2021.11 |
| Unit-ddpm: Unpaired image translation with denoising diffusion probabilistic models | arXiv 2021 | 2021.04 |
| Diffusionclip: Text-guided diffusion models for robust image manipulation | CVPR 2022 | 2021.01 |

Training-Based: Reference and Attribute Guidance via Self-Supervision

| Title | Pub | Release Date |
|---|---|---|
| SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control | arXiv 2023 | 2023.12 |
| A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting | arXiv 2023 | 2023.12 |
| DreamInpainter: Text-Guided Subject-Driven Image Inpainting with Diffusion Models | arXiv 2023 | 2023.12 |
| Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model | ACM MM 2023 | 2023.10 |
| Face Aging via Diffusion-based Editing | BMVC 2023 | 2023.09 |
| Anydoor: Zero-shot object-level image customization | arXiv 2023 | 2023.07 |
| Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model | arXiv 2023 | 2023.06 |
| Text-to-image editing by image information removal | WACV 2024 | 2023.05 |
| Reference-based Image Composition with Sketch via Structure-aware Diffusion Model | arXiv 2023 | 2023.04 |
| Pair-diffusion: Object-level image editing with structure-and-appearance paired diffusion models | arXiv 2023 | 2023.03 |
| Imagen editor and editbench: Advancing and evaluating text-guided image inpainting | CVPR 2023 | 2022.12 |
| Smartbrush: Text and shape guided object inpainting with diffusion model | CVPR 2023 | 2022.12 |
| ObjectStitch: Object Compositing With Diffusion Model | CVPR 2023 | 2022.12 |
| Paint by example: Exemplar-based image editing with diffusion models | CVPR 2023 | 2022.11 |

Training-Based: Instructional Editing via Full Supervision

| Title | Pub | Release Date |
|---|---|---|
| SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models | CVPR 2024 | 2023.12 |
| InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following | arXiv 2023 | 2023.12 |
| Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation | CVPR 2024 | 2023.12 |
| Emu edit: Precise image editing via recognition and generation tasks | arXiv 2023 | 2023.11 |
| Guiding instruction-based image editing via multimodal large language models | arXiv 2023 | 2023.09 |
| Instructdiffusion: A generalist modeling interface for vision tasks | arXiv 2023 | 2023.09 |
| MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers | arXiv 2023 | 2023.09 |
| ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation | NeurIPS 2023 | 2023.08 |
| Inst-Inpaint: Instructing to Remove Objects with Diffusion Models | arXiv 2023 | 2023.04 |
| HIVE: Harnessing Human Feedback for Instructional Visual Editing | arXiv 2023 | 2023.03 |
| DialogPaint: A Dialog-based Image Editing Model | arXiv 2023 | 2023.01 |
| Learning to Follow Object-Centric Image Editing Instructions Faithfully | ACL 2023 | 2023.01 |
| Instructpix2pix: Learning to follow image editing instructions | CVPR 2023 | 2022.11 |

Training-Based: Pseudo-Target Retrieval with Weak Supervision

| Title | Pub | Release Date |
|---|---|---|
| Text-Driven Image Editing via Learnable Regions | arXiv 2023 | 2023.11 |
| iEdit: Localised Text-guided Image Editing with Weak Supervision | arXiv 2023 | 2023.05 |
| ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation | arXiv 2023 | 2023.05 |

Testing-Time Finetuning

Testing-Time Finetuning: Denoising Model Finetuning

| Title | Pub | Release Date |
|---|---|---|
| Kv inversion: Kv embeddings learning for text-conditioned real image action editing | arXiv 2023 | 2023.09 |
| Custom-edit: Text-guided image editing with customized diffusion models | arXiv 2023 | 2023.05 |
| Unitune: Text-driven image editing by fine tuning an image generation model on a single image | arXiv 2022 | 2022.10 |

Testing-Time Finetuning: Embeddings Finetuning

| Title | Pub | Release Date |
|---|---|---|
| Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing | NeurIPS 2023 | 2023.09 |
| Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models | ICCV 2023 | 2023.05 |
| Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models | CVPR 2023 | 2022.12 |
| Null-text inversion for editing real images using guided diffusion models | CVPR 2023 | 2022.11 |

Testing-Time Finetuning: Guidance with Hypernetworks

| Title | Pub | Release Date |
|---|---|---|
| StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing | arXiv 2023 | 2023.05 |
| Inversion-based creativity transfer with diffusion models | CVPR 2023 | 2022.11 |

Testing-Time Finetuning: Latent Variable Optimization

| Title | Pub | Release Date |
|---|---|---|
| Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing | arXiv 2023 | 2023.11 |
| MagicRemover: Tuning-free Text-guided Image inpainting with Diffusion Models | arXiv 2023 | 2023.10 |
| Dragondiffusion: Enabling drag-style manipulation on diffusion models | arXiv 2023 | 2023.07 |
| DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | arXiv 2023 | 2023.06 |
| Delta denoising score | ICCV 2023 | 2023.04 |
| Diffusion-based Image Translation using disentangled style and content representation | ICLR 2023 | 2022.09 |

Testing-Time Finetuning: Hybrid Finetuning

| Title | Pub | Release Date |
|---|---|---|
| Forgedit: Text Guided Image Editing via Learning and Forgetting | arXiv 2023 | 2023.09 |
| LayerDiffusion: Layered Controlled Image Editing with Diffusion Models | arXiv 2023 | 2023.05 |
| Sine: Single image editing with text-to-image diffusion models | CVPR 2023 | 2022.12 |
| Imagic: Text-Based Real Image Editing With Diffusion Models | CVPR 2023 | 2022.10 |

Training and Finetuning Free

Training and Finetuning Free: Input Text Refinement

| Title | Pub | Release Date |
|---|---|---|
| User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques | arXiv 2023 | 2023.06 |
| ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation | arXiv 2023 | 2023.05 |
| InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions | arXiv 2023 | 2023.05 |
| Preditor: Text guided image editing with diffusion prior | arXiv 2023 | 2023.02 |

Training and Finetuning Free: Inversion/Sampling Modification

| Title | Pub | Release Date |
|---|---|---|
| Inversion-Free Image Editing with Natural Language | CVPR 2024 | 2023.12 |
| Fixed-point Inversion for Text-to-image diffusion models | arXiv 2023 | 2023.12 |
| Tuning-Free Inversion-Enhanced Control for Consistent Image Editing | arXiv 2023 | 2023.12 |
| The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing | arXiv 2023 | 2023.11 |
| LEDITS++: Limitless Image Editing using Text-to-Image Models | arXiv 2023 | 2023.11 |
| A latent space of stochastic diffusion models for zero-shot image editing and guidance | ICCV 2023 | 2023.10 |
| Effective real image editing with accelerated iterative diffusion inversion | ICCV 2023 | 2023.09 |
| Fec: Three finetuning-free methods to enhance consistency for real image editing | arXiv 2023 | 2023.09 |
| Iterative multi-granular image editing using diffusion models | arXiv 2024 | 2023.09 |
| ProxEdit: Improving Tuning-Free Real Image Editing With Proximal Guidance | WACV 2024 | 2023.06 |
| Diffusion self-guidance for controllable image generation | arXiv 2023 | 2023.06 |
| Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images | arXiv 2023 | 2023.06 |
| Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models | arXiv 2023 | 2023.05 |
| An Edit Friendly DDPM Noise Space: Inversion and Manipulations | arXiv 2023 | 2023.04 |
| Training-Free Content Injection Using H-Space in Diffusion Models | WACV 2024 | 2023.03 |
| Edict: Exact diffusion inversion via coupled transformations | CVPR 2023 | 2022.11 |
| Direct inversion: Optimization-free text-driven real image editing with diffusion models | arXiv 2022 | 2022.11 |

Training and Finetuning Free: Attention Modification

| Title | Pub | Release Date |
|---|---|---|
| HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | arXiv 2023 | 2023.12 |
| Tf-icon: Diffusion-based training-free cross-domain image composition | ICCV 2023 | 2023.07 |
| Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models | NeurIPS 2023 | 2023.06 |
| Conditional Score Guidance for Text-Driven Image-to-Image Translation | NeurIPS 2023 | 2023.05 |
| MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | arXiv 2023 | 2023.04 |
| Localizing Object-level Shape Variations with Text-to-Image Diffusion Models | ICCV 2023 | 2023.03 |
| Zero-shot image-to-image translation | ACM SIGGRAPH 2023 | 2023.02 |
| Shape-Guided Diffusion With Inside-Outside Attention | WACV 2024 | 2022.12 |
| Plug-and-play diffusion features for text-driven image-to-image translation | CVPR 2023 | 2022.11 |
| Prompt-to-prompt image editing with cross attention control | ICLR 2023 | 2022.08 |

Training and Finetuning Free: Mask Guidance

| Title | Pub | Release Date |
|---|---|---|
| ZONE: Zero-Shot Instruction-Guided Local Editing | CVPR 2024 | 2023.12 |
| Watch your steps: Local image and scene editing by text instructions | arXiv 2023 | 2023.08 |
| Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models | NeurIPS 2023 | 2023.06 |
| Differential Diffusion: Giving Each Pixel Its Strength | arXiv 2023 | 2023.06 |
| PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing | arXiv 2023 | 2023.06 |
| FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference | AAAI 2023 | 2023.05 |
| Inpaint anything: Segment anything meets image inpainting | arXiv 2023 | 2023.04 |
| Region-aware diffusion for zero-shot text-driven image editing | CVM 2023 | 2023.02 |
| Text-guided mask-free local image retouching | ICME 2023 | 2022.12 |
| Blended diffusion for text-driven editing of natural images | CVPR 2022 | 2021.11 |
| DiffEdit: Diffusion-based semantic image editing with mask guidance | ICLR 2023 | 2022.10 |
| Blended latent diffusion | SIGGRAPH 2023 | 2022.06 |

Training and Finetuning Free: Multi-Noise Redirection

| Title | Pub | Release Date |
|---|---|---|
| Object-aware Inversion and Reassembly for Image Editing | arXiv 2023 | 2023.10 |
| Ledits: Real image editing with ddpm inversion and semantic guidance | arXiv 2023 | 2023.07 |
| Sega: Instructing diffusion using semantic dimensions | arXiv 2023 | 2023.01 |
| The stable artist: Steering semantics in diffusion latent space | arXiv 2022 | 2022.12 |
