A collection of important research papers on weak supervision across image, audio, and video.
Papers covering multiple sub-areas are listed in each relevant section. If I missed any areas, papers, or datasets, please let me know or feel free to make a pull request. If you find this collection useful, please star the repository.
Exploring the limits of Weakly Supervised Pretraining
Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019
Weakly supervised scalable audio content analysis, ICME 2016
Audio Event Detection using Weakly Labeled Data, 24th ACM Multimedia Conference 2016
An approach for self-training audio event detectors using web data, 25th EUSIPCO 2017
A joint detection-classification model for audio tagging of weakly labelled data, ICASSP 2017
Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling, ICASSP 2019
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition, ICML 2020
Non-Negative Matrix Factorization-Convolutional Neural Network (NMF-CNN) For Sound Event Detection, ArXiv 2020
Duration robust weakly supervised sound event detection, ICASSP 2020
SeCoST: Sequential Co-Supervision for Large Scale Weakly Labeled Audio Event Detection, ICASSP 2020
Guided Learning for Weakly-Labeled Semi-Supervised Sound Event Detection, ICASSP 2020
Unsupervised Contrastive Learning of Sound Event Representations, ICASSP 2021
Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events, ICASSP 2021
Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks, ICASSP 2017
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data, NIPS Workshop on Machine Learning for Audio 2017
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network, ICASSP 2018
Orthogonality-Regularized Masked NMF for Learning on Weakly Labeled Audio Data, ICASSP 2018
Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019
Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes, ICASSP 2019
Sound Event Detection of Weakly Labelled Data With CNN-Transformer and Automatic Threshold Optimization, TASLP 2020
DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification, ArXiv 2020
Effective Perturbation based Semi-Supervised Learning Method for Sound Event Detection, INTERSPEECH 2020
Weakly-Supervised Sound Event Detection with Self-Attention, ICASSP 2020
Improving Deep Learning Sound Events Classifiers using Gram Matrix Feature-wise Correlations, ICASSP 2021
An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection, ICASSP 2021
AST: Audio Spectrogram Transformer, ArXiv 2021
Frequency-dependent auto-pooling function for weakly supervised sound event detection, EURASIP Journal on Audio, Speech, and Music Processing
Adaptive Pooling Operators for Weakly Labeled Sound Event Detection, TASLP 2018
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks, INTERSPEECH 2018
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling, ICASSP 2019
Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection, INTERSPEECH 2019
Weakly labelled AudioSet tagging with attention neural networks, TASLP 2019
Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020
Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020
Acoustic Scene Generation with Conditional SampleRNN, ICASSP 2019
Contrastive Predictive Coding of Audio with an Adversary, INTERSPEECH 2020
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection, ICASSP 2021
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020
Multi-Task Learning and post processing optimisation for sound event detection, DCASE 2019
Label-efficient audio classification through multitask learning and self-supervision, ICLR 2019
Transfer learning of weakly labelled audio, WASPAA 2017
Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes, ICASSP 2018
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition, TASLP 2020
A first attempt at polyphonic sound event detection using connectionist temporal classification, ICASSP 2017
Polyphonic Sound Event Detection with Weak Labeling, Thesis 2018
Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy, DCASE 2019
Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection, WASPAA 2019
Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection, TASLP 2020
A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data, ICASSP 2018
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling, INTERSPEECH 2020
Impact of Sound Duration and Inactive Frames on Sound Event Detection Performance, ICASSP 2021
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging, ICASSP 2018
Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data, IJCAI 2020
Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events, ICASSP 2021
Audio event and scene recognition: A unified approach using strongly and weakly labeled data, IJCNN 2017
Sound Event Detection Using Point-Labeled Data, WASPAA 2019
DCASE 2019 Task 4: Sound event detection in domestic environments
FSD50K: an open dataset of human-labeled sound events
AudioSet: A large-scale dataset of manually annotated audio events
List of past workshops (archived) and ongoing workshops, conferences, and journals:
Machine Learning for Audio Signal Processing, NIPS 2017 workshop
MLSP: Machine Learning for Signal Processing
WASPAA: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
ICASSP: IEEE International Conference on Acoustics, Speech and Signal Processing
TASLP: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Computational Analysis of Sound Scenes and Events
By Ankit Shah
Some of the links here were taken from a repository by Soham Deshmukh