MMControl enables fine-grained, multi-modal control over joint audio-video generation through reference images, reference audio, depth maps, and pose sequences.
- Project repository initialized.
- Project page and paper links will be updated upon public release.
Once they are released, please refer to our project page and paper for more results and details.
- Coming soon...
License information will be added upon release.
BibTeX will be added after the paper is publicly released.
