Learning to Localize Sound Source in Visual Scenes [CVPR 2018]

The Sound Localization dataset can be downloaded from the following link:

https://drive.google.com/open?id=1P93CTiQV71YLZCmBbZA0FvdwFxreydLt

This dataset contains 5k image-sound pairs and their annotations in XML format. Each XML file holds the annotations from three annotators.

The test_list.txt file lists the ID of every pair used for testing.
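As a rough sketch of how the test list and the annotations might be consumed, the Python helpers below read the IDs and inspect an annotation file. They assume that test_list.txt holds one ID per line, which this README does not confirm. The XML schema is also not documented here, so `summarize_annotation` only counts element tags rather than assuming any tag names. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def parse_test_ids(text):
    """Return the non-empty, stripped lines of a test-list file.

    Assumption: one pair ID per line (not confirmed by the README).
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def summarize_annotation(xml_text):
    """Count element tags in an annotation file without assuming a schema."""
    root = ET.fromstring(xml_text)
    return Counter(elem.tag for elem in root.iter())


def split_ids(all_ids, test_ids):
    """Partition IDs into (test, remaining), keeping the original order."""
    test_set = set(test_ids)
    test = [i for i in all_ids if i in test_set]
    rest = [i for i in all_ids if i not in test_set]
    return test, rest
```

To use it, pass the contents of test_list.txt (for example `open("test_list.txt").read()`) to `parse_test_ids`. Running `summarize_annotation` on one downloaded XML file is a quick way to discover the actual tag names before writing a dedicated parser.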

If you end up using the dataset, we ask you to cite the following paper:

@InProceedings{Senocak_2018_CVPR,
author = {Senocak, Arda and Oh, Tae-Hyun and Kim, Junsik and Yang, Ming-Hsuan and So Kweon, In},
title = {Learning to Localize Sound Source in Visual Scenes},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

The image-sound pairs were collected from the Flickr-SoundNet dataset, so please cite the Yahoo dataset and the SoundNet paper as well.

The dataset must be used for research purposes only.
