DPSL-ASR (Dual-Path Style Learning for End-to-End Noise-Robust Automatic Speech Recognition)

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition

Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition

Introduction

DPSL-ASR is a novel method for end-to-end noise-robust speech recognition. It extends our prior work, IFF-Net (Interactive Feature Fusion Network), with dual-path inputs and style learning, and achieves better ASR performance on the RATS Channel-A dataset and the CHiME-4 1-Channel Track dataset.

Left figure: (a) joint SE-ASR approach, (b) IFF-Net baseline, (c) the proposed DPSL-ASR approach.

Right figure: back-end ASR module with style learning and consistency loss in our DPSL-ASR. The dashed lines denote sharing parameters.

If you find DPSL-ASR useful in your research, please use the following BibTeX entry for citation:

@article{hu2022dualpath,
  title={Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition}, 
  author={Hu, Yuchen and Hou, Nana and Chen, Chen and Chng, Eng Siong},
  journal={arXiv preprint arXiv:2203.14838},
  year={2022}
}

@article{hu2021interactive,
  title={Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition},
  author={Hu, Yuchen and Hou, Nana and Chen, Chen and Chng, Eng Siong},
  journal={arXiv preprint arXiv:2110.05267},
  year={2021}
}

Usage

Our implementation is based on ESPnet. You can install it directly from the ESPnet (v0.9.6) folder provided in this repo, or install ESPnet from the official website and then add the files from our repo. In either case, run the command pip install -e . to install ESPnet.

In our folder, the run scripts are at egs2/rats_chA/asr_with_enhancement/{run_rats_chA_dpsl_asr, rats_chA_dpsl_asr}.sh, and the network code is under espnet2/{asr/, enh/, layers/}.

Tips:

To go over the entire project, please start from the script egs2/rats_chA/asr_with_enhancement/run_rats_chA_dpsl_asr.sh
To read the network code only, please start from the script espnet2/asr/dpsl_asr.py
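
Before launching the recipe, it can help to confirm that a checkout actually contains the entry points named above. The following is a minimal sketch, not part of the repo: the helper name missing_paths and the EXPECTED list are hypothetical, and the list only restates the paths quoted in this README (with the brace pattern expanded into two script names).

```python
from pathlib import Path

# Paths quoted in the Usage section, relative to the ESPnet root of this repo.
# Assumption: the brace pattern in the README expands to these two scripts.
RECIPE_DIR = Path("egs2/rats_chA/asr_with_enhancement")
EXPECTED = [
    RECIPE_DIR / "run_rats_chA_dpsl_asr.sh",  # entry point for the whole project
    RECIPE_DIR / "rats_chA_dpsl_asr.sh",      # second script named in the README
    Path("espnet2/asr/dpsl_asr.py"),          # starting point for the network code
    Path("espnet2/enh"),                      # network code directories
    Path("espnet2/layers"),
]


def missing_paths(root):
    """Return the expected paths (as POSIX strings) that do not exist under root."""
    root = Path(root)
    return [p.as_posix() for p in EXPECTED if not (root / p).exists()]
```

Calling missing_paths(".") from the ESPnet root returns an empty list when every path is present; any entries it returns point to files that still need to be copied in, for example when ESPnet was installed from the official website rather than from the provided folder.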

