Junkai Wu1, Jonah Casebeer1, Nicholas J. Bryan2, and Paris Smaragdis1
1 Department of Computer Science, University of Illinois at Urbana-Champaign
2 Adobe Research
Adaptive filters are applicable to many signal processing tasks including acoustic echo cancellation, beamforming, and more. Adaptive filters are typically controlled using algorithms such as least mean squares (LMS), recursive least squares (RLS), or Kalman filter updates. For simplicity, such models are often applied in the frequency domain, assume frequency-independent processing, and do not exploit higher-order frequency dependencies. Recent work on meta-adaptive filters, however, has shown that we can control filter adaptation using neural networks without manual derivation, motivating new work to exploit such information. In this work, we present higher-order meta-adaptive filters, a key improvement to meta-adaptive filters that incorporates higher-order frequency dependencies. We demonstrate our approach on acoustic echo cancellation and develop a family of filters that yield multi-dB improvements over competitive baselines while being at least an order of magnitude less complex. Moreover, we show our improvements hold with or without a downstream speech enhancer.
For more details, please see: "Meta-Learning for Adaptive Filters with Higher-Order Frequency Dependencies", Junkai Wu, Jonah Casebeer, Nicholas J. Bryan, and Paris Smaragdis, IWAENC, 2022. If you use ideas or code from this work, please cite our paper:
@inproceedings{wu2022metalearning,
title={Meta-Learning for Adaptive Filters with Higher-Order Frequency Dependencies},
author={Wu, Junkai and Casebeer, Jonah and Bryan, Nicholas J. and Smaragdis, Paris},
booktitle={IEEE International Workshop on Acoustic Signal Enhancement (IWAENC)},
year={2022},
}
For audio demonstrations of this work, please check out our demo website. You'll be able to find demos for AEC as well as joint AEC and speech enhancement.
This folder contains the implementation of our paper "Meta-Learning for Adaptive Filters with Higher-Order Frequency Dependencies". It leverages the metaaf Python package. This directory contains all necessary code to reproduce and run our experiments. The core file is hoaec.py, which contains the main AEC code and training configuration. The hoaec_joint.py file contains code for joint AEC and speech enhancement.
Run this command to train a Meta-AEC model with a banded dependency structure, a group_size of 3, and a hidden_size of 32. Change the arguments to train other models.
python hoaec.py \
--n_frames 1 --window_size 4096 --hop_size 2048 --n_in_chan 1 --n_out_chan 1 \
--is_real --n_devices 1 --batch_size 16 \
--h_size 32 --n_layers 2 --total_epochs 500 \
--val_period 1 --reduce_lr_patience 5 --early_stop_patience 16 \
--name aec_32_banded3 \
--unroll 20 --double_talk --group_mode banded \
--group_size 3 --lr 0.0001 --random_roll
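For intuition, the --group_mode banded --group_size 3 setting means the learned optimizer for each frequency bin also sees its immediate neighbor bins, rather than processing each bin independently. Below is a minimal NumPy sketch of this grouping idea; the banded_groups function and its exact behavior are illustrative assumptions, and the actual implementation lives in the metaaf package.

```python
import numpy as np

def banded_groups(x, group_size=3):
    """Stack each frequency bin with its neighbors (zero-padded at the
    edges) so a per-bin optimizer sees a band of adjacent bins.

    x: (F,) per-frequency values (e.g. filter gradients).
    Returns: (F, group_size) array; row f holds bins f-1, f, f+1 for size 3.
    """
    half = group_size // 2
    padded = np.pad(x, (half, half))  # zero-pad the band edges
    # Sliding window of width group_size over the padded axis
    return np.stack([padded[f:f + group_size] for f in range(len(x))])

grads = np.arange(8, dtype=float)          # stand-in for per-bin gradients
bands = banded_groups(grads, group_size=3)
print(bands.shape)   # (8, 3)
print(bands[0])      # [0. 0. 1.]  -- bin 0 is zero-padded on the left
```

With group_size 1 this degenerates to the usual frequency-independent update; larger group sizes let the optimizer exploit correlations between neighboring bins.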
Replace the date and epoch arguments and use this command to evaluate the trained Meta-AEC model.
python hoaec_eval.py \
--name aec_32_banded3 \
--date <date> --epoch <epoch> \
--save_metrics --save_outputs
You can also run /hometa_aec/fig3.sh to train and evaluate all models shown in fig. 3 of the paper at once. To compute the SERLE metric used in the paper, check out /hometa_aec/aec_results.ipynb.
Replace the date and epoch arguments and use this command to store the aec outputs of the trained Meta-AEC model.
python hoaec_eval.py \
--name aec_32_banded --date <aec_date> --epoch <aec_epoch> \
--generate_aec_data --fix_train_roll
Then run this command to train the DNN speech enhancer.
python hoaec_joint.py \
--window_size 4096 --hop_size 256 --is_real \
--n_devices 1 --batch_size 32 \
--h_size 32 --n_layers 2 \
--total_epochs 200 \
--val_period 1 --reduce_lr_patience 5 --early_stop_patience 16 \
--unroll 150 --double_talk \
--group_mode block --group_size 4 \
--lr 0.0006 \
--m_n_frames 1 --m_window_size 512 --m_hop_size 256 \
--m_n_in_chan 1 --m_n_out_chan 1 --m_is_real \
--joint_mode aec_res --aec_res_mode train \
--name res_32_banded --aec_name aec_32_banded
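Conceptually, the aec_res joint mode is a two-stage pipeline: the meta-adaptive filter first cancels the linear echo, and the DNN enhancer then cleans up the residual. The sketch below illustrates that data flow with stand-in callables; none of these function names come from the codebase.

```python
import numpy as np

def joint_aec_enhance(mic, far_end, aec_filter, enhancer):
    """Two-stage sketch: subtract the adaptive filter's echo estimate
    from the mic signal, then pass the residual through an enhancer."""
    echo_estimate = aec_filter(far_end)   # linear echo-path estimate
    residual = mic - echo_estimate        # AEC output: near-end + residual echo
    return enhancer(residual)             # DNN suppresses what is left

# Toy usage with placeholder callables standing in for the trained models
mic = np.full(4, 2.0)                    # microphone = near-end + echo
far = np.ones(4)                         # far-end reference signal
out = joint_aec_enhance(mic, far, aec_filter=lambda x: x, enhancer=lambda r: 0.5 * r)
print(out)  # [0.5 0.5 0.5 0.5]
```

This is why the AEC outputs must be generated and stored first: the enhancer is trained on the residual signals the AEC stage produces.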
Replace the date and epoch arguments and run this command to evaluate the trained DNN-RES model.
python hoaec_joint_eval.py \
--mode res --name res_32_banded --date <date> --epoch <epoch> \
--aec_name aec_32_banded --save_metrics --save_outputs
Follow the instructions here to get and download the Microsoft acoustic echo cancellation challenge dataset. Unzip the dataset and set the base path for AEC_DATA_DIR in /zoo/__config__.py. Also set RES_DATA_DIR in /zoo/__config__.py to a directory where you want to store the aec outputs for res training and evaluation.
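For reference, a plausible /zoo/__config__.py might look like the following sketch; the two variable names come from the text above, but the paths are placeholders you must replace with your own locations.

```python
# /zoo/__config__.py -- example values only; replace with your own paths
AEC_DATA_DIR = "/data/AEC-Challenge"  # base path of the unzipped Microsoft AEC dataset
RES_DATA_DIR = "/data/aec_outputs"    # where aec outputs for res training/eval are stored
```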
You can download the checkpoints for the Meta-AEC models and DNN speech enhancers in fig. 4 of the paper from the tagged release here. Unzip them and make sure the folders aec_diag, aec_banded9, aec_banded3, res_diag, res_banded9, and res_banded3 are under the /hometa_aec/ckpts folder.
You can use /hometa_aec/aec_ckpts_run.sh to run the released aec models and /hometa_aec/res_ckpts_run.sh to run the released res models. Please keep the --iwaenc_release flag on to use the correct filter. In the released code we added anti-aliasing for gradient and error terms, which the released models were not trained with.
All core utility code within the metaaf folder is licensed via the University of Illinois Open Source License. All code within the zoo and hometa_aec folders, as well as the model weights, is licensed via the Adobe Research License. Copyright (c) Adobe Systems Incorporated. All rights reserved.