[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Updated Aug 18, 2025 - Python
HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
Text and image to video generation: Kandinsky 4.0 (2024)
Official implementation of the pipeline presented in "I hear your true colors: Image Guided Audio Generation"
Extract, timestamp, and analyze specific content from video collections using LLM-powered audio/video processing.
Generate subtitles for all the videos in a folder with OpenAI's Whisper, privately on your computer.
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
[NeurIPS 2024] Code, Dataset, Samples for the VATT paper “Tell What You Hear From What You See - Video to Audio Generation Through Text”
[AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
A minimalistic wasm-based web application for instant audio extraction; video to audio conversion.
The code and weights for LoVA, a novel model for long-form video-to-audio generation. Based on the Diffusion Transformer (DiT) architecture, LoVA proves to be more effective at generating long-form audio than existing autoregressive models and UNet-based diffusion models.
JavaScript video to audio. Front-end video-to-audio conversion: load the video with FileReader, decode it with decodeAudioData, re-render it with OfflineAudioContext, and finally convert the AudioBuffer to WAV.
Luma AI Video + Audio + Image Generation and RunwayML Video Generation from Image and Text
Downloads public Instagram content
A simple video-to-audio converter
Video2Sound - Bring sound to your videos