Using pretrained encoder and language models to generate captions from multimedia inputs.
Updated Mar 11, 2023 - Python
Audio Captioning datasets for PyTorch.
Python code for handling the Clotho dataset.
Song Describer is a data collection platform for annotating music with textual descriptions.
Code base for WaveTransformer: A novel architecture for automated audio captioning
Audio captioning baseline system for DCASE 2020 challenge.
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
Second-place solution for DCASE 2020 Challenge Task 6 (audio captioning). http://dcase.community/challenge2020/task-automatic-audio-captioning-results#wuyusong2020_t6
Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, together with the benchmark datasets AudioCaps-Eval and Clotho-Eval.
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
Solution for Task 6 of DCASE 2020.
PyTorch dataloader for Clotho dataset.
Code for use with the Clotho dataset.
DCASE2024 Challenge Task 6 baseline system (Automated Audio Captioning)
This repository contains code for Weakly Supervised Automated Audio Captioning via Text Only Training
IRIT-UPS DCASE 2021 AUDIO CAPTIONING SYSTEM