Harper812/FFDConv

Full-frequency Dynamic Convolution

The official implementation of Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection. (Submitted to ICME 2024)
Authors: Haobo Yue, Zhicheng Zhang, Da Mu, Yonghao Dang, Jianqin Yin, Jin Tang

Issues 😊 | Lab 👏 | Contact 📫

Updating

Code will be released soon!

Introduction

Frequency-dependent modeling

Full-frequency dynamic convolution (FFDConv) is proposed as the first fully dynamic method for sound event detection (SED). It generates a separate convolution kernel for every frequency band, so frequency-dependent modeling is built directly into the structure of the layer. In this way, FFDConv physically equips 2D convolution with the capability of frequency-dependent modeling.
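
To make the idea of "one kernel per frequency band" concrete, here is a minimal pure-Python sketch. It is not the released implementation: the function names, the 3x3 kernel size, and the toy kernel generator (a base kernel shifted by the band's time-averaged value) are all assumptions chosen for illustration, and the loops favor readability over speed.

```python
def generate_frequency_kernels(spec, base, delta):
    """Return one 3x3 kernel per frequency band, conditioned on the input.

    Hypothetical generator: kernel_f = base + mean_over_time(band_f) * delta.
    """
    kernels = []
    for band in spec:
        context = sum(band) / len(band)  # time-averaged content of this band
        kernels.append([[b + context * d for b, d in zip(base_row, delta_row)]
                        for base_row, delta_row in zip(base, delta)])
    return kernels


def frequency_dependent_conv(spec, kernels):
    """Zero-padded 3x3 convolution where band f uses its own kernel kernels[f]."""
    n_freq, n_time = len(spec), len(spec[0])
    out = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        kernel = kernels[f]  # differs from band to band
        for t in range(n_time):
            acc = 0.0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    ff, tt = f + i, t + j
                    if 0 <= ff < n_freq and 0 <= tt < n_time:
                        acc += kernel[i + 1][j + 1] * spec[ff][tt]
            out[f][t] = acc
    return out


# Toy spectrogram: 3 frequency bands x 3 frames.
spec = [[1.0, 2.0, 3.0],
        [0.0, 0.0, 0.0],
        [4.0, 4.0, 4.0]]
# Identity base kernel; delta only modulates the center tap.
base = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
delta = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

kernels = generate_frequency_kernels(spec, base, delta)
out = frequency_dependent_conv(spec, kernels)
```

With these toy inputs each band is scaled by a different center weight, which is the frequency-dependent behavior that a single shared 2D kernel cannot express.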

Fine-grained temporal coherence

Most SED models are trained with frame-level supervision, which leads to features and outputs that are discrete over time. FFDConv can alleviate this through frequency-dependent modeling. In addition, the convolution kernel that FFDConv generates for a frequency band is shared across all frames, which produces temporally coherent representations. This is consistent with both the continuity of the sound waveform and the vocal continuity of sound events.
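
One way to see what sharing a band's kernel across all frames means is the sketch below. It is an illustration only, not the released code: the function name, the 3-tap smoothing kernel, and the toy signal are assumptions. Because the same weights are applied at every frame, shifting the input by one frame shifts the output by one frame, and neighboring frames are processed identically.

```python
def band_conv_shared_over_time(band, kernel):
    """Apply one temporal kernel to every frame of a single band (zero padded)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(band)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            tt = t + j - half
            if 0 <= tt < len(band):
                acc += weight * band[tt]  # same weights at every frame t
        out.append(acc)
    return out


# A single event onset in one band, and the same event one frame later.
band = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
kernel = [0.25, 0.5, 0.25]  # hypothetical kernel for this band

y = band_conv_shared_over_time(band, kernel)
y_shifted = band_conv_shared_over_time(shifted, kernel)

# Sharing over time: the response simply moves with the event.
moves_with_event = y_shifted[1:] == y[:-1]
```

A kernel that changed from frame to frame would not give this guarantee, since the same event could produce a different response depending on when it occurs.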

Performance

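FFDConv is evaluated on DESED:

| Model   | PSDS1 | PSDS2 | EB-F1 | IB-F1 |
|---------|-------|-------|-------|-------|
| CRNN    | 0.370 | 0.579 | 0.469 | 0.714 |
| DDFConv | 0.387 | 0.624 | 0.467 | 0.720 |
| FTDConv | 0.395 | 0.651 | 0.495 | 0.740 |
| FFDConv | 0.436 | 0.685 | 0.526 | 0.751 |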
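
For readers who want the gap to the CRNN baseline, the short sketch below computes absolute differences from the numbers reported above. The dictionary layout and variable names are illustrative assumptions; the values are copied from the table without modification.

```python
# Scores copied from the table above (DESED).
metrics = ["PSDS1", "PSDS2", "EB-F1", "IB-F1"]
results = {
    "CRNN":    [0.370, 0.579, 0.469, 0.714],
    "DDFConv": [0.387, 0.624, 0.467, 0.720],
    "FTDConv": [0.395, 0.651, 0.495, 0.740],
    "FFDConv": [0.436, 0.685, 0.526, 0.751],
}

# Absolute gain of FFDConv over the CRNN baseline, per metric.
gains = {
    name: ffd - crnn
    for name, ffd, crnn in zip(metrics, results["FFDConv"], results["CRNN"])
}

for name in metrics:
    print(f"{name}: {gains[name]:+.3f}")
```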

Reference

Our code builds on FDY-SED and ddfnet.
Specifically, the experimental environment is based on FDY-SED, and the model structure is based on ddfnet.
Thanks for their great work!

Citation

If this repository helps your work, please cite the paper below! 😘

@article{yue2024fullfrequency,
      title={Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection}, 
      author={Haobo Yue and Zhicheng Zhang and Da Mu and Yonghao Dang and Jianqin Yin and Jin Tang},
      journal={arXiv preprint arXiv:2401.04976},
      year={2024},
}
