FreeStyleFreeLunch/Research-Papers

Reading list for research on generative models.

We list the most popular methods for generative models. If we missed something, please submit a request. (Note: We show the date on which the first version of each paper was submitted to arXiv; the linked paper may be a later version.)

Backbone:

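| Date | Method | Conference | Title | Code/Project Page | Abstract |
|---|---|---|---|---|---|
| 2024 | VisionLLaMA | ECCV 2024 | VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks | code | Explores a unified backbone for vision and language large models; the main contribution is a 2D positional encoding. |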

Autoregressive generation model:

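| Date | Method | Conference | Title | Code/Project Page | Abstract |
|---|---|---|---|---|---|
| Jun 2024 | LlamaGen | CVPR Jun 2024 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | code | Uses the same network architecture as Llama for autoregressive image generation; significant for unifying the language and vision modeling paradigms. |
| 28 Jul 2024 | MAR | ARXIV 28 Jul 2024 | Autoregressive Image Generation without Vector Quantization | code | An autoregressive generation method that removes the vector quantization (VQ) step. |
| Jun 2024 | VAR | PR Jun 2024 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | code | Autoregressive generation that uses images at different scales as the autoregressive unit. |
| Oct 2024 | MovieGen | Oct 2024 | Movie Gen: A Cast of Media Foundation Models | poster | Meta's video generation model, released after Sora. |
| Oct 2024 | DART | Oct 2024 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | poster | Proposes a diffusion generation model whose structure is unified with mainstream LLM architectures. (Many derivations; not understood yet, reading VDM first.) |
| 2 Oct 2024 | ControlVAR | 2 Oct 2024 | ControlVAR: Exploring Controllable Visual Autoregressive Modeling | code | coming soon |
| 7 Oct 2024 | CAR | Arxiv 7 Oct 2024 | CAR: Controllable Autoregressive Modeling for Visual Generation | code | Controllable generation on the autoregressive model VAR, following the ControlNet approach. |
| 14 Oct 2024 | HART | Arxiv 14 Oct 2024 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | code | Non-discrete autoregressive image generation built on the VAR model. |
| 10 Oct 2024 | Meissonic | Arxiv 10 Oct 2024 | Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis | code | An efficient text-to-image model that raises non-autoregressive masked image modeling (MIM) text-to-image generation to a level comparable with state-of-the-art diffusion models such as SDXL, while greatly improving performance and efficiency. |
| 17 Oct 2024 | Fluid | Arxiv 17 Oct 2024 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | None | An in-depth study, for autoregressive generation, of continuous versus discrete tokens and of raster-order versus random-mask prediction. |
| 27 Sep 2024 | Emu3 | Arxiv 27 Sep 2024 | Emu3: Next-Token Prediction is All You Need | poster | The first model to unify text, image, and video using the Llama 2 architecture (looks genuinely impressive). |
| 24 Oct 2024 | FairQueue | Arxiv 24 Oct 2024 | FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation | None | coming soon |
| 30 Apr 2024 | None | Arxiv 30 Apr 2024 | Better & Faster Large Language Models via Multi-token Prediction | None | Accelerated autoregressive inference (NLP). |
| 29 Oct 2024 | None | Arxiv 29 Oct 2024 | Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective | None | A discussion of unifying language and image modeling under autoregression. |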

Diffusion model:

| Date | Method | Conference | Title | Code/Project Page/Abstract |
|---|---|---|---|---|
| 2015-xx-xx | Diffusion models | ICML 2015 | Deep Unsupervised Learning using Nonequilibrium Thermodynamics | None |
| 2020-06-09 | Denoising Diffusion models | NeurIPS 2020 | Denoising Diffusion Probabilistic Models | Diffusion Models |
| 2020-10-06 | DDIM | ICLR 2021 | Denoising Diffusion Implicit Models | None |
| 2020-11-26 | SDE | ICLR 2021 (Oral) | Score-Based Generative Modeling through Stochastic Differential Equations | None |
| 2021-02-18 | improved-diffusion | Arxiv 2021 | Improved Denoising Diffusion Probabilistic Models | improved-diffusion |
| 2021-05-11 | guided-diffusion | NeurIPS 2021 | Diffusion Models Beat GANs on Image Synthesis | guided-diffusion |
| 2021-05-30 | cascaded diffusion models | Arxiv 2021 | Cascaded Diffusion Models for High Fidelity Image Generation | None |
| 2021-07-01 | Variational Diffusion Models | NeurIPS 2021 | Variational Diffusion Models | Variational Diffusion Models |
| 2021-09-28 | Classifier-Free Diffusion | NeurIPS 2021 Workshop | Classifier-Free Diffusion Guidance | None |
| 2021-10-06 | DiffusionCLIP | Arxiv 2021 | DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation | DiffusionCLIP |
| 2021-11-10 | Palette | Arxiv 2021 | Palette: Image-to-Image Diffusion Models | None |
| 2021-11-29 | Blended diffusion | Arxiv 2021 | Blended Diffusion for Text-driven Editing of Natural Images | Blended Diffusion |
| 2021-12-20 | GLIDE | ICML 2022 | GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | GLIDE |
| 2022-01-24 | RePaint | CVPR 2022 | RePaint: Inpainting using Denoising Diffusion Probabilistic Models | RePaint |
| 2022-04-06 | KNN-Diffusion | Arxiv 2022 | KNN-Diffusion: Image Generation via Large-Scale Retrieval | None |
| 2022-04-13 | DALL·E 2 | Arxiv 2022 | Hierarchical Text-Conditional Image Generation with CLIP Latents | None |
| 2022-05-23 | Imagen | Arxiv 2022 | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | None |
| 2022-06-01 | -- | Arxiv 2022 | Elucidating the Design Space of Diffusion-Based Generative Models | None |
| 2022-06-03 | Composable-Diffusion | Arxiv 2022 | Compositional Visual Generation with Composable Diffusion Models | Composable-Diffusion |
| 2022-06-22 | videoDiffusion | NeurIPS 2022 | Video Diffusion Models | None |
| 2022-08-03 | PDDPM | Arxiv 2022 | Pyramidal Denoising Diffusion Probabilistic Models | None |
| 2022-08-25 | DreamBooth | Arxiv 2022 | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | None |
| 2022-11-17 | UViT | Arxiv 2022 | All are Worth Words: A ViT Backbone for Diffusion Models | UViT |
| 2022-11-17 | RenderDiffusion | Arxiv 2022 | RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation | RenderDiffusion |
| 2022-11-17 | Null-text Inversion | Arxiv 2022 | Null-text Inversion for Editing Real Images using Guided Diffusion Models | Null-text Inversion |
| 2022-11-25 | 3DDesigner | Arxiv 2022 | 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models | 3DDesigner |
| 2024-x-x | DEADiff | CVPR 2024 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | DEADiff |
| 2024-8-14 | TurboEdit | ARXIV 2024 | TurboEdit: Instant text-based image editing | Target decoupling and image editing with diffusion models. |
| 2021-X-X | VDM | NeurIPS 2021 | Variational Diffusion Models | Seems to re-derive DDPM from a different angle; not fully understood yet, since it contains many derivations. |
| 23 Oct 2024 | None | Arxiv 23 Oct 2024 | Scalable Ranked Preference Optimization for Text-to-Image Generation | HFRL on diffusion models. |
| 23 Oct 2024 | None | Arxiv 23 Oct 2024 | How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? | Not read yet. |
| 2025 | None | CVPR 2025 | ACE: Anti-Editing Concept Erasure in Text-to-Image Models | Related to unlearning. |
| 2025 | None | CVPR 2025 | Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models | Dataset attack; security related. |

Segmentation:

| Date | Method | Conference | Title | Code |
|---|---|---|---|---|
| 2021-12-06 | ddpm-segmentation | ICLR 2022 | Label-Efficient Semantic Segmentation with Diffusion Models | ddpm-segmentation |

Other Discriminative Tasks:

| Date | Method | Conference | Title | Code |
|---|---|---|---|---|
| 2023-05-18 | ~ | Arxiv 2023 | Discriminative Diffusion Models as Few-shot Vision and Language Learners | None |

Survey:

| Date | Conference | Title |
|---|---|---|
| 2022-09-02 | Arxiv 2022 | Diffusion Models: A Comprehensive Survey of Methods and Applications |
| 2022-09-10 | Arxiv 2022 | Diffusion Models in Vision: A Survey |
| 2022-09-12 | Arxiv 2022 | A Survey on Generative Diffusion Model |
| 2023-04-02 | Arxiv 2023 | Text-to-image Diffusion Models in Generative AI: A Survey |
| Sep 2024 | AI 2024 | Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond |

Large Language Models:

| Date | Method | Conference | Title | Code |
|---|---|---|---|---|
| Feb 2023 | LLaMA | Arxiv Feb 2023 | LLaMA: Open and Efficient Foundation Language Models | code |

Model Merging:

| Date | Method | Conference | Title | Code/Project Page | Abstract |
|---|---|---|---|---|---|
| Sep 2024 | None | ML Sep 2024 | Realistic Evaluation of Model Merging for Compositional Generalization | None | None |
| Jun 2024 | WATT | CVPR Jun 2024 | WATT: Weight Average Test-Time Adaptation of CLIP | None | Uses model merging in test-time adaptation to improve model generalization. |
| 9 Oct 2024 | DECOUPLE-THEN-MERGE | 9 Oct 2024 | Decouple-Then-Merge: Towards Better Training for Diffusion Models | None | Trains diffusion models separately for different timesteps, then merges them to reduce the parameter conflicts caused by different timesteps. |
| 6 Jun 2024 | None | 6 Jun 2024 | B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data | None | reading |
| 27 Sept 2024 | None | 27 Sept 2024 | LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging | None | Key idea: deep and shallow layers of a network play different roles during fine-tuning. |
| 8 Dec 2022 | None | 8 Dec 2022 | Editing Models with Task Arithmetic | None | The paper that introduced task vectors. |
| 16 Oct 2023 | None | 16 Oct 2023 | Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards | None | Rewarded Soup |
| 24 Nov 2024 | None | 24 Nov 2024 | Less is More: Efficient Model Merging with Binary Task Switch | None | A dynamic merging method; in practice it somewhat resembles pruning. |

HFRL:

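| Date | Method | Conference | Title | Code/Project Page | Abstract |
|---|---|---|---|---|---|
| 6 Oct 2024 | TIS-DPO | arxiv 6 Oct 2024 | TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights | None | Currently reading. |
| 17 Oct 2024 | DRAKES | arxiv 17 Oct 2024 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | code | Proposes an RL algorithm that, combined with discrete diffusion methods from the protein domain, optimizes the model directly against rewards and addresses the non-differentiability problem. |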

Tutorial:

CVPR 2022 Tutorial: Denoising Diffusion-based Generative Modeling: Foundations and Applications
