In this paper, we propose learning the alignment between audio and lyrics via contrastive learning to produce higher-quality music captions.
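At its core, the alignment objective can be written as a symmetric contrastive (InfoNCE-style) loss over paired audio and lyrics embeddings. The following is a minimal sketch of that idea, not the exact ALCAP implementation; the function name, temperature value, and embedding shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def audio_lyrics_contrastive_loss(audio_emb, lyrics_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    audio_emb, lyrics_emb: (batch, dim) outputs of the audio and lyrics
    encoders. Row i of each tensor is a matched pair; all other rows in
    the batch serve as in-batch negatives.
    """
    audio_emb = F.normalize(audio_emb, dim=-1)
    lyrics_emb = F.normalize(lyrics_emb, dim=-1)
    # (batch, batch) cosine-similarity logits, sharpened by the temperature.
    logits = audio_emb @ lyrics_emb.t() / temperature
    targets = torch.arange(audio_emb.size(0), device=audio_emb.device)
    # Average the audio-to-lyrics and lyrics-to-audio directions.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```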
Due to copyright considerations, we can only provide the Song Interpretation dataset, not the NetEase dataset.
- Download the metadata to data/music4all.
- Download the song waveforms to data/music4all/audios.
- (Optional) Download the song embeddings to data/music4all/audios. If they are not downloaded, the code will generate the embeddings from scratch.
- (Optional) Download the CNN music encoder to ckp/. A quick sanity check of this layout is sketched after this list.
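Before training, verifying the layout above can save a failed run. This is only a convenience sketch; the directory names come from the list above, and nothing else is assumed about the files inside them.

```python
from pathlib import Path

# Directories from the download steps above. ckp/ and the precomputed
# embeddings are optional: missing embeddings are generated from scratch.
required = [Path("data/music4all"), Path("data/music4all/audios")]
optional = [Path("ckp")]

for p in required:
    assert p.is_dir(), f"missing required directory: {p}"
for p in optional:
    if not p.is_dir():
        print(f"optional directory not found: {p}")
```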
python run_train.py
Try different corpora and random seeds.
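One convenient way to sweep corpora and seeds is a small driver script. The `--dataset` and `--seed` flags below are hypothetical; replace them with however run_train.py actually takes its configuration.

```python
import itertools
import subprocess

corpora = ["music4all"]  # extend with other corpora as available
seeds = [0, 1, 2]

for corpus, seed in itertools.product(corpora, seeds):
    # Hypothetical flags -- adapt to run_train.py's real interface.
    subprocess.run(
        ["python", "run_train.py", "--dataset", corpus, "--seed", str(seed)],
        check=True,
    )
```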
python run_eval.py
As with training, try different corpora and random seeds.
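When evaluating across several seeds, reporting the mean and standard deviation is a common convention. The scores below are placeholders, and how run_eval.py emits its metrics is an assumption.

```python
import statistics

# Placeholder per-seed scores collected from separate run_eval.py runs.
scores_by_seed = {0: 0.412, 1: 0.405, 2: 0.419}

values = list(scores_by_seed.values())
print(f"mean={statistics.mean(values):.3f} "
      f"std={statistics.stdev(values):.3f} over {len(values)} seeds")
```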
@inproceedings{he2023alcap,
title={ALCAP: Alignment-Augmented Music Captioner},
author={He, Zihao and Hao, Weituo and Lu, Wei-Tsung and Chen, Changyou and Lerman, Kristina and Song, Xuchen},
booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
pages={16501--16512},
year={2023}
}