CVPR-2023-Papers


Image and Video Synthesis and Generation

Title Repo Paper Video
Towards Universal Fake Image Detectors That Generalize Across Generative Models GitHub thecvf
arXiv
Implicit Diffusion Models for Continuous Super-Resolution GitHub thecvf
arXiv
High-Fidelity Guided Image Synthesis With Latent Diffusion Models GitHub Page
GitHub
thecvf
arXiv
YouTube
DBARF: Deep Bundle-Adjusting Generalizable Neural Radiance Fields GitHub thecvf
arXiv
YouTube
Deep Arbitrary-Scale Image Super-Resolution via Scale-Equivariance Pursuit GitHub thecvf YouTube
Balanced Spherical Grid for Egocentric View Synthesis GitHub thecvf
arXiv
YouTube
SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation GitHub thecvf
arXiv
YouTube
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
CVPR - Award
GitHub Page thecvf
arXiv
YouTube
Self-Guided Diffusion Models GitHub thecvf
arXiv
YouTube
Multi-Concept Customization of Text-to-Image Diffusion GitHub Page
GitHub
thecvf
arXiv
YouTube
3D-Aware Conditional Image Synthesis GitHub thecvf
arXiv
YouTube
QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity GitHub thecvf
arXiv
YouTube
SceneComposer: Any-Level Semantic Image Synthesis
CVPR - Highlight
GitHub Page
GitHub
thecvf
arXiv
YouTube
DiffCollage: Parallel Generation of Large Content With Diffusion Models Web Page thecvf
arXiv
Putting People in Their Place: Affordance-Aware Human Insertion Into Scenes GitHub thecvf
arXiv
Hybrid Neural Rendering for Large-Scale Scenes With Motion Blur GitHub thecvf
arXiv
YouTube
Binary Latent Diffusion GitHub thecvf
arXiv
YouTube
StyleRes: Transforming the Residuals for Real Image Editing With StyleGAN GitHub thecvf
arXiv
YouTube
KD-DLGAN: Data Limited Image Generation via Knowledge Distillation GitHub thecvf
arXiv
SeaThru-NeRF: Neural Radiance Fields in Scattering Media GitHub Page
GitHub
thecvf
arXiv
YouTube
PointAvatar: Deformable Point-Based Head Avatars From Videos GitHub thecvf
arXiv
YouTube
3DAvatarGAN: Bridging Domains for Personalized Editable Avatars GitHub Page thecvf
arXiv
Neural Preset for Color Style Transfer GitHub thecvf
arXiv
YouTube
Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning GitHub thecvf
arXiv
YouTube
DyNCA: Real-Time Dynamic Texture Synthesis Using Neural Cellular Automata GitHub Page
GitHub
thecvf
arXiv
YouTube
Exploring Incompatible Knowledge Transfer in Few-Shot Image Generation GitHub Page
GitHub
thecvf
arXiv
YouTube
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model With Discrete and Continuous Denoising GitHub thecvf
arXiv
YouTube
Towards Accurate Image Coding: Improved Autoregressive Image Generation With Dynamic Vector Quantization
CVPR - Highlight
GitHub thecvf
arXiv
YouTube
RiDDLE: Reversible and Diversified De-Identification With Latent Encryptor GitHub thecvf
arXiv
YouTube
LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation GitHub thecvf
arXiv
YouTube
LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook GitHub thecvf
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation GitHub thecvf
arXiv
YouTube
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis GitHub thecvf
arXiv
YouTube
High-Fidelity Generalized Emotional Talking Face Generation With Multi-Modal Emotion Space Learning thecvf
arXiv
YouTube
Consistent View Synthesis With Pose-Guided Diffusion Models GitHub Page thecvf
arXiv
YouTube
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator GitHub thecvf
arXiv
YouTube
Imagic: Text-Based Real Image Editing With Diffusion Models GitHub Page thecvf
arXiv
YouTube
Large-Capacity and Flexible Video Steganography via Invertible Neural Network GitHub thecvf
arXiv
Quantitative Manipulation of Custom Attributes on 3D-Aware Image Synthesis thecvf
Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis From Monocular Image GitHub thecvf
arXiv
YouTube
CF-Font: Content Fusion for Few-Shot Font Generation GitHub thecvf
arXiv
YouTube
One-Shot High-Fidelity Talking-Head Synthesis With Deformable Neural Radiance Field Web Page thecvf
arXiv
YouTube
Unsupervised Domain Adaption With Pixel-Level Discriminator for Image-Aware Layout Generation thecvf
arXiv
YouTube
Diffusion Probabilistic Model Made Slim thecvf
arXiv
YouTube
Collaborative Diffusion for Multi-Modal Face Generation and Editing GitHub Page
GitHub
thecvf
arXiv
YouTube
High-Fidelity Facial Avatar Reconstruction From Monocular Video With Generative Priors GitHub thecvf
arXiv
YouTube
Network-Free, Unsupervised Semantic Segmentation With Synthetic Images thecvf
Amazon Science
YouTube
Visual Prompt Tuning for Generative Transfer Learning GitHub thecvf
arXiv
YouTube
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style GitHub Page
GitHub
thecvf YouTube
Catch Missing Details: Image Reconstruction With Frequency Augmented Variational Autoencoder GitHub thecvf
arXiv
YouTube
Towards Bridging the Performance Gaps of Joint Energy-Based Models GitHub thecvf
arXiv
GLeaD: Improving GANs With a Generator-Leading Task GitHub thecvf
arXiv
YouTube
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction GitHub thecvf
arXiv
YouTube
SPARF: Neural Radiance Fields From Sparse and Noisy Poses
CVPR - Highlight
GitHub thecvf
arXiv
YouTube
DeltaEdit: Exploring Text-Free Training for Text-Driven Image Manipulation GitHub thecvf
arXiv
YouTube
Inferring and Leveraging Parts From Object Shape for Improving Semantic Image Synthesis GitHub thecvf
arXiv
YouTube
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation thecvf
arXiv
YouTube
MaskSketch: Unpaired Structure-Guided Masked Image Generation
CVPR - Highlight
thecvf
arXiv
YouTube
Affordance Diffusion: Synthesizing Hand-Object Interactions GitHub thecvf
arXiv
YouTube
Interactive Cartoonization With Controllable Perceptual Factors thecvf
arXiv
YouTube
MetaPortrait: Identity-Preserving Talking Head Generation With Fast Personalized Adaptation GitHub thecvf
arXiv
YouTube
Paint by Example: Exemplar-Based Image Editing With Diffusion Models GitHub thecvf
arXiv
GLIGEN: Open-Set Grounded Text-to-Image Generation GitHub thecvf
arXiv
YouTube
L-CoIns: Language-Based Colorization With Instance Awareness GitHub thecvf
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation GitHub thecvf
arXiv
YouTube
Evading DeepFake Detectors via Adversarial Statistical Consistency thecvf
arXiv
GlassesGAN: Eyewear Personalization Using Synthetic Appearance Discovery and Targeted Subspace Modeling thecvf
arXiv
YouTube
GP-VTON: Towards General Purpose Virtual Try-On via Collaborative Local-Flow Global-Parsing Learning GitHub thecvf
arXiv
YouTube
Where Is My Spot? Few-Shot Image Generation via Latent Subspace Optimization GitHub thecvf
Regularized Vector Quantization for Tokenized Image Synthesis thecvf
arXiv
EDICT: Exact Diffusion Inversion via Coupled Transformations GitHub thecvf
arXiv
YouTube
Scaling Up GANs for Text-to-Image Synthesis
CVPR - Highlight
GitHub Page thecvf
arXiv
YouTube
Shape-Aware Text-Driven Layered Video Editing GitHub Page thecvf
arXiv
A Unified Pyramid Recurrent Network for Video Frame Interpolation GitHub thecvf
arXiv
YouTube
TAPS3D: Text-Guided 3D Textured Shape Generation From Pseudo Supervision GitHub thecvf
arXiv
YouTube
Fine-Grained Face Swapping via Regional GAN Inversion GitHub thecvf
arXiv
YouTube
OTAvatar: One-Shot Talking Face Avatar With Controllable Tri-Plane Rendering GitHub thecvf
arXiv
YouTube
Deep Stereo Video Inpainting thecvf
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer GitHub Page thecvf
arXiv
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences Between Pretrained Generative Models GitHub thecvf
arXiv
Unsupervised Volumetric Animation GitHub thecvf
arXiv
SINE: SINgle Image Editing With Text-to-Image Diffusion Models GitHub thecvf
arXiv
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis GitHub thecvf
arXiv
YouTube
CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer GitHub thecvf
arXiv
YouTube
DeepVecFont-v2: Exploiting Transformers To Synthesize Vector Fonts With Higher Quality GitHub thecvf
arXiv
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization GitHub thecvf
arXiv
YouTube
SINE: Semantic-Driven Image-Based NeRF Editing With Prior-Guided Editing Field GitHub thecvf
arXiv
YouTube
Exploring Intra-Class Variation Factors With Learnable Cluster Prompts for Semi-Supervised Image Synthesis thecvf
Image Cropping With Spatial-Aware Feature and Rank Consistency thecvf
thecvf
YouTube
Picture That Sketch: Photorealistic Image Generation From Abstract Sketches GitHub thecvf
arXiv
YouTube
MonoHuman: Animatable Human Neural Field From Monocular Video GitHub thecvf
arXiv
YouTube
PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing
CVPR - Highlight
thecvf
arXiv
YouTube
Neural Pixel Composition for 3D-4D View Synthesis From Multi-Views Web Page thecvf
arXiv
YouTube
SpaText: Spatio-Textual Representation for Controllable Image Generation Web Page thecvf
arXiv
YouTube
Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation thecvf
arXiv
YouTube
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation GitHub thecvf
arXiv
YouTube
Synthesizing Photorealistic Virtual Humans Through Cross-Modal Disentanglement GitHub Page thecvf
arXiv
YouTube
Video Probabilistic Diffusion Models in Projected Latent Space Web Page
GitHub
thecvf
arXiv
Variational Distribution Learning for Unsupervised Text-to-Image Generation thecvf
arXiv
Linking Garment With Person via Semantically Associated Landmarks for Virtual Try-On Web Page thecvf
UV Volumes for Real-Time Rendering of Editable Free-View Human Performance GitHub thecvf
arXiv
YouTube
NULL-Text Inversion for Editing Real Images Using Guided Diffusion Models GitHub Page
GitHub
thecvf
arXiv
YouTube
Polynomial Implicit Neural Representations for Large Diverse Datasets
CVPR - Highlight
GitHub thecvf
arXiv
YouTube
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation GitHub thecvf
arXiv
YouTube
Conditional Image-to-Video Generation With Latent Flow Diffusion Models GitHub thecvf
arXiv
YouTube
Local 3D Editing via 3D Distillation of CLIP Knowledge thecvf
arXiv
Private Image Generation With Dual-Purpose Auxiliary Classifier
CVPR - Highlight
thecvf YouTube
MAGVIT: Masked Generative Video Transformer
CVPR - Highlight
Web Page
GitHub
thecvf
arXiv
Dimensionality-Varying Diffusion Process thecvf
arXiv
YouTube
VIVE3D: Viewpoint-Independent Video Editing Using 3D-Aware GANs GitHub thecvf
arXiv
YouTube
LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data GitHub thecvf
arXiv
DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model GitHub thecvf
arXiv
YouTube
Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint GitHub thecvf
arXiv
YouTube
High-Fidelity and Freely Controllable Talking Head Video Generation thecvf
arXiv
YouTube
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation GitHub Page
GitHub
thecvf
arXiv
YouTube
StyleRF: Zero-Shot 3D Style Transfer of Neural Radiance Fields GitHub Page
GitHub
thecvf
arXiv
YouTube
MOSO: Decomposing MOtion, Scene and Object for Video Prediction GitHub thecvf
arXiv
YouTube
Multi Domain Learning for Motion Magnification GitHub thecvf YouTube
GazeNeRF: 3D-Aware Gaze Redirection With Neural Radiance Fields GitHub thecvf
arXiv
YouTube
Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding GitHub thecvf
arXiv
Blemish-Aware and Progressive Face Retouching With Limited Paired Data thecvf YouTube
Text-Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation thecvf
NeuralField-LDM: Scene Generation With Hierarchical Latent Diffusion Models Web Page thecvf
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation GitHub thecvf
arXiv
YouTube
Class-Balancing Diffusion Models GitHub thecvf
arXiv
YouTube
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing GitHub thecvf
arXiv
YouTube
Inversion-Based Style Transfer With Diffusion Models GitHub thecvf
arXiv
YouTube
Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model thecvf
arXiv
YouTube
FlowGrad: Controlling the Output of Generative ODEs With Gradients GitHub thecvf
Graph Transformer GANs for Graph-Constrained House Generation thecvf
arXiv
YouTube
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer thecvf
arXiv
YouTube
Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
CVPR - Highlight
GitHub Page
GitHub
thecvf
arXiv
YouTube
Ham2Pose: Animating Sign Language Notation Into Pose Sequences GitHub Page
GitHub
thecvf
arXiv
YouTube
Neural Transformation Fields for Arbitrary-Styled Font Generation GitHub thecvf YouTube
LayoutDM: Transformer-Based Diffusion Model for Layout Generation GitHub thecvf
arXiv
YouTube
Removing Objects From Neural Radiance Fields GitHub thecvf
arXiv
YouTube
Person Image Synthesis via Denoising Diffusion Model GitHub thecvf
arXiv
YouTube
AdaptiveMix: Improving GAN Training via Feature Space Shrinkage GitHub thecvf
arXiv
YouTube
Learning Joint Latent Space EBM Prior Model for Multi-Layer Generator GitHub Page
GitHub
thecvf
arXiv
YouTube
3D Neural Field Generation Using Triplane Diffusion GitHub thecvf
arXiv
YouTube
OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis thecvf
arXiv
YouTube
RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-Ray Security Image Synthesis thecvf YouTube
ObjectStitch: Object Compositing With Diffusion Model thecvf
arXiv
YouTube
Persistent Nature: A Generative Model of Unbounded 3D Worlds GitHub Page thecvf
arXiv
YouTube
Masked and Adaptive Transformer for Exemplar Based Image Translation GitHub thecvf
arXiv
YouTube
Spider GAN: Leveraging Friendly Neighbors To Accelerate GAN Training GitHub thecvf
arXiv
YouTube
Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild GitHub thecvf
arXiv
YouTube
Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models GitHub thecvf
arXiv
All Are Worth Words: A ViT Backbone for Diffusion Models GitHub thecvf
arXiv
YouTube
Few-Shot Semantic Image Synthesis With Class Affinity Transfer GitHub thecvf
arXiv
Blowing in the Wind: CycleNet for Human Cinemagraphs From Still Images GitHub Page thecvf
arXiv
YouTube
StyleGene: Crossover and Mutation of Region-Level Facial Genes for Kinship Face Synthesis
CVPR - Highlight
GitHub Page
GitHub
thecvf YouTube
MixNeRF: Modeling a Ray With Mixture Density for Novel View Synthesis From Sparse Inputs GitHub thecvf
arXiv
YouTube
MoStGAN-V: Video Generation With Temporal Motion Styles GitHub thecvf
arXiv
YouTube
Frame Interpolation Transformer and Uncertainty Guidance GitHub thecvf YouTube
Towards End-to-End Generative Modeling of Long Videos With Memory-Efficient Bidirectional Transformers GitHub thecvf
arXiv
YouTube
HOLODIFFUSION: Training a 3D Diffusion Model Using 2D Images GitHub thecvf
arXiv
YouTube
Neural Texture Synthesis With Guided Correspondence GitHub thecvf YouTube
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° GitHub thecvf
arXiv
YouTube
InstructPix2Pix: Learning To Follow Image Editing Instructions
CVPR - Highlight
Web Page
GitHub
thecvf
arXiv
YouTube
Unpaired Image-to-Image Translation With Shortest Path Regularization GitHub thecvf YouTube
Freestyle Layout-to-Image Synthesis
CVPR - Highlight
GitHub thecvf
arXiv
YouTube
On Distillation of Guided Diffusion Models
CVPR - Award
thecvf
arXiv
Single Image Backdoor Inversion via Robust Smoothed Classifiers GitHub thecvf
arXiv
YouTube
Make-a-Story: Visual Memory Conditioned Consistent Story Generation GitHub thecvf
arXiv
YouTube
Towards Practical Plug-and-Play Diffusion Models GitHub thecvf
arXiv
YouTube
Efficient Scale-Invariant Generator With Column-Row Entangled Pixel Synthesis GitHub thecvf
arXiv
YouTube
Wavelet Diffusion Models Are Fast and Scalable Image Generators GitHub thecvf
arXiv
YouTube
3D GAN Inversion With Facial Symmetry Prior GitHub thecvf
arXiv
YouTube
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert GitHub thecvf
arXiv
YouTube
PCT-Net: Full Resolution Image Harmonization Using Pixel-Wise Color Transformations GitHub thecvf
ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model With Knowledge-Enhanced Mixture-of-Denoising-Experts
CVPR - Highlight
GitHub Page thecvf
arXiv
Video Compression With Entropy-Constrained Neural Representations thecvf YouTube
Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models GitHub thecvf
arXiv
YouTube
CoralStyleCLIP: Co-Optimized Region and Layer Selection for Image Editing GitHub thecvf
arXiv
YouTube
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding GitHub thecvf
arXiv
YouTube
Sequential Training of GANs Against GAN-Classifiers Reveals Correlated Knowledge Gaps Present Among Independently Trained GAN Instances thecvf
arXiv
YouTube
Attribute-Preserving Face Dataset Anonymization via Latent Code Optimization
CVPR - Highlight
GitHub thecvf
arXiv
YouTube
Shifted Diffusion for Text-to-Image Generation GitHub thecvf
arXiv
HandsOff: Labeled Dataset Generation With No Additional Human Annotations
CVPR - Highlight
GitHub Page
GitHub
thecvf
arXiv
YouTube
Lookahead Diffusion Probabilistic Models for Refining Mean Estimation GitHub thecvf
arXiv
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
CVPR - Highlight
Web Page thecvf
arXiv
Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration GitHub thecvf
BBDM: Image-to-Image Translation With Brownian Bridge Diffusion Models GitHub thecvf
arXiv
YouTube
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models GitHub thecvf
arXiv