Repository consisting of important research papers on weak supervision - Image, Audio, Video

ankitshah009/awesome-weak-supervision

awesome-weak-supervision

Repository consisting of important research papers on weak supervision - Image, Audio, Video.

Papers covering multiple sub-areas are listed under each relevant section. If I have missed any areas, papers, or datasets, please let me know or open a pull request. If you find this collection useful, please star the repository.

Weak Supervision in Images

Research Papers

Exploring the limits of Weakly Supervised Pretraining

GitHub Repositories

WSLImages

Weak Supervision in Video

Weakly Supervised Sound Event Detection

Table of Contents

Research papers

Survey papers

Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019

Core Areas

Learning formulation

Weakly supervised scalable audio content analysis, ICME 2016

Audio Event Detection using Weakly Labeled Data, 24th ACM Multimedia Conference 2016

An approach for self-training audio event detectors using web data, 25th EUSIPCO 2017

A joint detection-classification model for audio tagging of weakly labelled data, ICASSP 2017

Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling, ICASSP 2019

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020

A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition, ICML 2020

Non-Negative Matrix Factorization-Convolutional Neural Network (NMF-CNN) For Sound Event Detection, ArXiv 2020

Duration robust weakly supervised sound event detection, ICASSP 2020

SeCoST: Sequential Co-Supervision for Large Scale Weakly Labeled Audio Event Detection, ICASSP 2020

Guided Learning for Weakly-Labeled Semi-Supervised Sound Event Detection, ICASSP 2020

Unsupervised Contrastive Learning of Sound Event Representations, ICASSP 2021

Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events, ICASSP 2021

Network Architecture

Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks, ICASSP 2017

Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data, NIPS Workshop on Machine Learning for Audio 2017

Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network, ICASSP 2018

Orthogonality-Regularized Masked NMF for Learning on Weakly Labeled Audio Data, ICASSP 2018

Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019

Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes, ICASSP 2019

Sound Event Detection of Weakly Labelled Data With CNN-Transformer and Automatic Threshold Optimization, TASLP 2020

DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification, ArXiv 2020

Effective Perturbation based Semi-Supervised Learning Method for Sound Event Detection, INTERSPEECH 2020

Weakly-Supervised Sound Event Detection with Self-Attention, ICASSP 2020

Improving Deep Learning Sound Events Classifiers using Gram Matrix Feature-wise Correlations, ICASSP 2021

An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection, ICASSP 2021

AST: Audio Spectrogram Transformer, ArXiv 2021

Pooling functions

Frequency-dependent auto-pooling function for weakly supervised sound event detection, EURASIP Journal on Audio, Speech, and Music Processing

Adaptive Pooling Operators for Weakly Labeled Sound Event Detection, TASLP 2018

Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks, INTERSPEECH 2018

A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling, ICASSP 2019

Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection, INTERSPEECH 2019

Weakly labelled audioset tagging with attention neural networks, TASLP 2019

Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020

Missing or noisy audio

Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020

Generative Learning

Acoustic Scene Generation with Conditional SampleRNN, ICASSP 2019

Representation Learning

Contrastive Predictive Coding of Audio with an Adversary, INTERSPEECH 2020

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection, ICASSP 2021

Multi-Task Learning

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection, ArXiv 2020

Multi-Task Learning and post processing optimisation for sound event detection, DCASE 2019

Label-efficient audio classification through multitask learning and self-supervision, ICLR 2019

Knowledge Transfer

Transfer learning of weakly labelled audio, WASPAA 2017

Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes, ICASSP 2018

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition, TASLP 2020

Polyphonic SED

A first attempt at polyphonic sound event detection using connectionist temporal classification, ICASSP 2017

Polyphonic Sound Event Detection with Weak Labeling, Thesis 2018

Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy, DCASE 2019

Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection, WASPAA 2019

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection, TASLP 2020

Joint task

A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data, ICASSP 2018

A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling, INTERSPEECH 2020

Loss function

Impact of Sound Duration and Inactive Frames on Sound Event Detection Performance, ICASSP 2021

Extension

Multimodal Audio and Visual

A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging, ICASSP 2018

Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data, IJCAI 2020

Multimodal Audio and Text

Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events, ICASSP 2021

Strongly and Weakly labelled data

Audio event and scene recognition: A unified approach using strongly and weakly labeled data, IJCNN 2017

Others

Sound Event Detection Using Point-Labeled Data, WASPAA 2019

Dataset

DCASE 2019 Task 4: Sound event detection in domestic environments

DCASE 2018 Task 4: Large-scale weakly labeled semi-supervised sound event detection in domestic environments

FSD50K: an open dataset of human-labeled sound events

AudioSet: A large-scale dataset of manually annotated audio events

Workshops/Conferences/Journals

List of old workshops (archived) and ongoing workshops, conferences, and journals:

Machine Learning for Audio Signal Processing, NIPS 2017 workshop

MLSP: Machine Learning for Signal Processing

WASPAA: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

ICASSP: IEEE International Conference on Acoustics, Speech and Signal Processing

INTERSPEECH

IEEE/ACM Transactions on Audio, Speech and Language Processing

DCASE

Tutorials

Resources

Computational Analysis of Sound Scenes and Events

Credits

By Ankit Shah

Some of the links here were taken from a repository by Soham Deshmukh.
