LORIS

This is the official implementation of "Long-Term Rhythmic Video Soundtracker" (ICML 2023).

Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, and Yu Qiao.

OpenGVLab, Shanghai Artificial Intelligence Laboratory

Introduction

We present Long-Term Rhythmic Video Soundtracker (LORIS), a novel framework that synthesizes long-term conditional waveforms in sync with visual cues. The framework is built on a latent conditional diffusion probabilistic model that performs waveform synthesis. A series of context-aware conditioning encoders takes temporal information into account for long-term generation. We also extend the model's applicability from dances to multiple sports scenarios such as floor exercise and figure skating. For comprehensive evaluation, we establish a benchmark for rhythmic video soundtracks that includes the pre-processed dataset, improved evaluation metrics, and robust generative baselines.

How to Start

pip install -r requirements.txt

Training

bash scripts/loris_{subset}_s{length}.sh

Inference

bash scripts/infer_{subset}_s{length}.sh
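
The training and inference commands share one naming template, so the script path can be derived from the two placeholders. The following is a minimal sketch of that substitution, not part of the repository: the helper names `script_path` and `build_command` are hypothetical, and the example values `"dance"` and `25` are placeholders that this README does not confirm. Check the `scripts` folder for the subsets and lengths that actually exist.

```python
from pathlib import PurePosixPath
from typing import List

# Prefixes taken from the two script-name templates in this README:
#   scripts/loris_{subset}_s{length}.sh  (training)
#   scripts/infer_{subset}_s{length}.sh  (inference)
SCRIPT_PREFIXES = {"train": "loris", "infer": "infer"}


def script_path(mode: str, subset: str, length: int) -> str:
    """Fill in the {subset} and {length} placeholders for the given mode."""
    if mode not in SCRIPT_PREFIXES:
        raise ValueError(f"mode must be one of {sorted(SCRIPT_PREFIXES)}, got {mode!r}")
    name = f"{SCRIPT_PREFIXES[mode]}_{subset}_s{length}.sh"
    return str(PurePosixPath("scripts") / name)


def build_command(mode: str, subset: str, length: int) -> List[str]:
    """Return the argument list for running the script with bash."""
    return ["bash", script_path(mode, subset, length)]


# "dance" and 25 are illustrative values only (assumption, not from the README).
print(build_command("train", "dance", 25))
print(build_command("infer", "dance", 25))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the script is confirmed to exist.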

Dataset

The dataset is available on Hugging Face.

from datasets import load_dataset
dataset = load_dataset("OpenGVLab/LORIS")
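
Because this README does not list the dataset's splits or columns, it can help to inspect what `load_dataset` returned before writing any training code. The helper below is a small sketch under that assumption: `summarize_splits` is a hypothetical name, and it only relies on the returned object behaving like a mapping from split names to sized objects, as a `DatasetDict` does.

```python
from typing import Any, Dict, Mapping


def summarize_splits(dataset: Mapping[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Report the row count and column names of every split.

    Works on any mapping of split name -> sized object; `column_names`
    is read if present (as on `datasets.Dataset`) and is None otherwise.
    """
    summary = {}
    for name, split in dataset.items():
        summary[name] = {
            "num_rows": len(split),
            "columns": getattr(split, "column_names", None),
        }
    return summary
```

Calling `summarize_splits(dataset)` on the object loaded above returns one entry per split, which shows which fields are available without assuming their names in advance.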

Citation

@inproceedings{Yu2023Long,
title={Long-Term Rhythmic Video Soundtracker},
author={Yu, Jiashuo and Wang, Yaohui and Chen, Xinyuan and Sun, Xiao and Qiao, Yu},
booktitle={International Conference on Machine Learning (ICML)},
year={2023}
}

Acknowledgement

We would like to thank the authors of previous related projects for generously sharing their code and insights: audio-diffusion-pytorch, CDCD, D2M-GAN, VQ-Diffusion, and JukeBox.
