Unofficial implementation of phase reconstruction method based on recurrent phase unwrapping with DNNs

This repository provides an unofficial implementation of phase reconstruction based on recurrent phase unwrapping (RPU) with deep neural networks (DNNs) [1].

A weighted RPU [2] is also implemented.

Licence

MIT licence.

Dependencies

We tested the implementation on Ubuntu 22.04 with Python 3.10.12. The following modules are required:

hydra-core
joblib
librosa
numpy
progressbar2
pydub
pypesq
pyroomacoustics
pystoi
scikit-learn
scipy
soundfile
torch
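
Before running the recipes, it can help to confirm that the modules above are importable. The following is a minimal sketch and is not part of this repository. The helper name find_missing is hypothetical, and the import names are assumed to be the usual ones for these packages (three differ from the package name: hydra-core imports as hydra, progressbar2 as progressbar, and scikit-learn as sklearn).

```python
from importlib.util import find_spec

# Import names for the packages listed above. These are assumed to be
# the usual import names; comments mark where they differ from the
# package name.
REQUIRED_MODULES = [
    "hydra",  # hydra-core
    "joblib",
    "librosa",
    "numpy",
    "progressbar",  # progressbar2
    "pydub",
    "pypesq",
    "pyroomacoustics",
    "pystoi",
    "sklearn",  # scikit-learn
    "scipy",
    "soundfile",
    "torch",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only locates each module without importing it, so it runs quickly even for large packages such as torch.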

Datasets

You need to prepare the following two datasets from the JSUT corpus.

basic5000: for training
onomatopee300: for evaluation

Recipes

Download the two datasets and put them in /root_dir/trainset_dir/ and /root_dir/evalset_dir/, respectively.
Modify config.yaml according to your environment. It contains the settings for the experimental conditions. For immediate use, you mainly need to edit the directory paths.
Run preprocess.py. It performs preprocessing steps.
Run training.py. It performs model training.
Run evaluate_scores.py. It generates reconstructed audio data and computes objective scores (PESQ, STOI, LSC). In this script, the function compute_rpu implements both RPU and weighted RPU.
Run evaluate_scores_zerophase.py. It generates reconstructed audio data and computes the same objective scores (PESQ, STOI, LSC), with the phase spectrum set to zero (zero-phase).
Run evaluate_scores_randomphase.py. It generates reconstructed audio data and computes the same objective scores (PESQ, STOI, LSC), with the phase spectrum sampled uniformly between $-\pi$ and $\pi$ (random-phase).
Run plot_boxplot.py. It draws box plots of the objective scores.
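
The zero-phase and random-phase baselines mentioned in the recipes can be illustrated as follows. This is a minimal sketch using only NumPy, not the code in evaluate_scores_zerophase.py or evaluate_scores_randomphase.py. The function names zero_phase_spectrum and random_phase_spectrum and the array shape are assumptions for illustration. Each function attaches a phase to a magnitude spectrogram; the resulting complex spectrogram would then be passed to an inverse STFT to obtain audio.

```python
import numpy as np


def zero_phase_spectrum(magnitude):
    """Attach an all-zero phase to a magnitude spectrogram."""
    phase = np.zeros_like(magnitude)
    return magnitude * np.exp(1j * phase)


def random_phase_spectrum(magnitude, rng=None):
    """Attach a phase drawn uniformly from [-pi, pi) to a magnitude spectrogram."""
    rng = np.random.default_rng() if rng is None else rng
    phase = rng.uniform(-np.pi, np.pi, size=magnitude.shape)
    return magnitude * np.exp(1j * phase)


# Hypothetical magnitude spectrogram: 5 frequency bins x 4 frames.
magnitude = np.abs(np.random.default_rng(0).standard_normal((5, 4)))

spec_zero = zero_phase_spectrum(magnitude)
spec_rand = random_phase_spectrum(magnitude, rng=np.random.default_rng(1))
```

Both baselines keep the magnitude unchanged, so any difference in their objective scores relative to RPU comes from the phase alone.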

References

[1] Y. Masuyama, K. Yatabe, Y. Koizumi, Y. Oikawa and N. Harada, "Phase reconstruction based on recurrent phase unwrapping with deep neural networks," IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2020.

[2] N. B. Thien, Y. Wakabayashi, K. Iwai and T. Nishiura, "Inter-frequency phase difference for phase reconstruction using deep neural networks and maximum likelihood," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 31, pp. 1667-1680, 2023, doi: 10.1109/TASLP.2023.3268577.
