Original project:
[Project Page] • [ArXiv]
This demo is for educational purposes. It shows how a neural vocoder transforms a mel spectrogram into audio.
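To make the mel-to-audio step concrete, here is a minimal sketch of the interface a neural vocoder exposes. It is not the vocoder used by the original project: `ToyVocoder` is a hypothetical, untrained network, and the values of 80 mel bins and a hop of 256 samples per frame are assumptions chosen for illustration. With random weights the output is noise; the point is only that a stack of transposed convolutions upsamples each spectrogram frame into a fixed number of waveform samples.

```python
import torch
import torch.nn as nn


class ToyVocoder(nn.Module):
    """Hypothetical, untrained vocoder: (batch, n_mels, frames) -> (batch, 1, samples)."""

    def __init__(self, n_mels=80, channels=64, strides=(8, 8, 2, 2)):
        super().__init__()
        layers = [nn.Conv1d(n_mels, channels, kernel_size=7, padding=3)]
        for s in strides:
            # kernel_size=2*s with padding=s//2 upsamples the time axis by exactly s
            layers += [
                nn.LeakyReLU(0.2),
                nn.ConvTranspose1d(channels, channels, kernel_size=2 * s,
                                   stride=s, padding=s // 2),
            ]
        # Collapse the channels to a single waveform bounded to [-1, 1]
        layers += [nn.LeakyReLU(0.2),
                   nn.Conv1d(channels, 1, kernel_size=7, padding=3),
                   nn.Tanh()]
        self.net = nn.Sequential(*layers)

    def forward(self, mel):
        return self.net(mel)


vocoder = ToyVocoder()
mel = torch.randn(1, 80, 10)  # stand-in for a real mel spectrogram with 10 frames
with torch.no_grad():
    audio = vocoder(mel)
# 10 frames * (8 * 8 * 2 * 2 = 256) samples per frame = 2560 samples
print(audio.shape)
```

A trained vocoder would have the same input and output shapes, with weights learned so that the waveform matches the audio the spectrogram was computed from.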
@InProceedings{SpecVQGAN_Iashin_2021,
title={Taming Visually Guided Sound Generation},
author={Iashin, Vladimir and Rahtu, Esa},
booktitle={British Machine Vision Conference (BMVC)},
year={2021}
}