ICCV-2023-Papers

Vision and Audio

Section Papers Preprint Papers Papers with Open Code Papers with Video

| Title | Repo | Paper | Video |
|-------|------|-------|-------|
| Sound Source Localization is All About Cross-Modal Alignment | | thecvf, arXiv | |
| Class-Incremental Grouping Network for Continual Audio-Visual Learning | GitHub | thecvf, arXiv | |
| Audio-Visual Class-Incremental Learning | GitHub | thecvf, arXiv | |
| DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-Guided Speaker Embedding | | thecvf, arXiv | |
| The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion | GitHub Page, GitHub | thecvf, arXiv | |
| SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning | | thecvf, Amazon Science | |
| On the Audio-Visual Synchronization for Lip-to-Speech Synthesis | | thecvf, arXiv | |
| Be Everywhere - Hear Everything (BEE): Audio Scene Reconstruction by Sparse Audio-Visual Samples | | thecvf | |
| Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation | GitHub Page, GitHub | thecvf, arXiv | |
| Hyperbolic Audio-Visual Zero-Shot Learning | | thecvf, arXiv | |
| AdVerb: Visually Guided Audio Dereverberation | GitHub Page, GitHub | thecvf, arXiv | YouTube |
| Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation | GitHub Page, GitHub | thecvf, arXiv | |