SESNNet: Speech Enhancement using Spiking Neural Networks

1. Introduction

Welcome to the SESNNet repository, which showcases research conducted during my master's program at Université de Sherbrooke. This project focuses on speech enhancement using spiking neural networks (SNNs). Our objective is to develop an SNN model that can rival the performance of conventional artificial neural networks (ANNs) while demonstrating superior energy efficiency.

For an in-depth exploration of our findings, refer to our conference paper available at https://doi.org/10.1109/CCECE58730.2023.10288830.

2. Dataset

The audio dataset used in this project is available at https://datashare.ed.ac.uk/handle/10283/2791

3. Installation

To set up the essential dependencies, create a Python environment and use pip along with the provided requirements file:

$ python -m pip install --upgrade pip
$ pip install -r requirements.txt

4. Web application using Gradio

Explore the capabilities of the proposed speech enhancement model based on spiking neural networks (SNNs) through the Gradio-powered web interface. Execute the following command to activate the web interface:

$ gradio app.py

NOTE: The audio examples used in this interactive interface are taken from the dataset described in Section 2 (Dataset). See the dataset page linked there for details about its origin and composition.

5. Experiment tracking using Comet ML

Sign up for an account on comet.ml.

6. Main Module Details: main.py

The main.py program serves as a versatile platform for conducting experiments with diverse speech enhancement models, employing either spiking neural networks or conventional artificial neural networks.

Basic arguments

Short	Long	Type	Default	Description
`-m`	`--model_name`	str	UNetSNN	Name of the model
`-tr`	`--train_flag`	boolean	False	Boolean that indicates whether to train the model
`-e`	`--nb_epochs`	int	30	Number of training epochs
`-b`	`--batch_size`	int	32	Number of data samples per batch
`-lr`	`--learning_rate`	float	0.0002	Learning rate of the model during training
`-tnp`	`--train_neuron_parameters`	boolean	False	Boolean that indicates whether to train neuron parameters
`-r`	`--recurrent_flag`	boolean	False	Boolean that indicates whether to add a recurrence term to the input current equation
`-dr`	`--detach_reset`	boolean	False	Boolean that indicates whether to detach the reset term from the computation graph in the backward pass
`-dist`	`--use_ddp`	boolean	False	Boolean that indicates whether to use the PyTorch Distributed Data Parallel (DDP) library
`-pm`	`--pin_memory`	boolean	False	Boolean that sets the pin_memory parameter of the data loader
`-eval`	`--evaluate_flag`	boolean	False	Boolean that indicates whether to evaluate the model using the validation set during training
`-pmf`	`--perceptual_metric_flag`	boolean	False	Boolean that indicates whether to compute a perceptual metric
`-sme`	`--save_mem`	boolean	False	Boolean that indicates whether to log hidden activations
`-det`	`--deterministic`	boolean	False	Boolean that indicates whether to use deterministic mode for reproducibility
`-pr`	`--pretrained_flag`	boolean	False	Boolean that indicates whether to load a pretrained model
`-dbg`	`--debug_flag`	boolean	False	Boolean that indicates whether to use the debugging dataset
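
The boolean options in the table default to False and are passed without a value in the commands of the Usage section, which is how argparse's `store_true` action behaves. The sketch below illustrates that behavior for a subset of the options. It is an assumption about how such a parser could be written, not a copy of the parser in main.py; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the options in the table."""
    parser = argparse.ArgumentParser(description="Speech enhancement experiments (sketch)")
    # Valued options: a type and a default taken from the table.
    parser.add_argument("-m", "--model_name", type=str, default="UNetSNN")
    parser.add_argument("-e", "--nb_epochs", type=int, default=30)
    parser.add_argument("-b", "--batch_size", type=int, default=32)
    parser.add_argument("-lr", "--learning_rate", type=float, default=0.0002)
    # Boolean options: False unless the flag is present on the command line.
    parser.add_argument("-tr", "--train_flag", action="store_true")
    parser.add_argument("-r", "--recurrent_flag", action="store_true")
    parser.add_argument("-dbg", "--debug_flag", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--train_flag", "--nb_epochs", "5", "--batch_size", "4"])
print(args.model_name, args.train_flag, args.nb_epochs, args.batch_size, args.recurrent_flag)
```

With this style of parser, omitting a flag such as `--recurrent_flag` leaves it at False, so there is no need to pass `True` or `False` explicitly.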

Usage

To train a model, use the following command:

$ python main.py --model_name UNetSNN --train_flag --nb_epochs 5 --batch_size 4 --learning_rate 0.0004 --train_neuron_parameters --recurrent_flag --detach_reset --use_ddp --pin_memory --evaluate_flag --perceptual_metric_flag --save_mem --deterministic

To test a pretrained model, use the following command:

$ python main.py --model_name UNetSNN --pretrained_flag --batch_size 4 --train_neuron_parameters --recurrent_flag --detach_reset --use_ddp --pin_memory --save_mem --deterministic

Help

Run python main.py -h or python main.py --help for more information.

Debugging

For debugging purposes, include the --debug_flag argument.

Experiment tracking

For experiment tracking using Comet ML, include the following arguments: --workspace <Your Workspace> --api_key <Your API Key> --project_name <Your Project Name>
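
As an illustration of how these three arguments might be consumed, the sketch below collects them from the command line and returns them as a dictionary. The helper name `comet_settings` and the rule "no tracking when none of the options is given" are assumptions made for this example. How main.py actually handles them is not shown here. The resulting keys match keyword arguments accepted by `comet_ml.Experiment` (`api_key`, `project_name`, `workspace`).

```python
import argparse


def comet_settings(argv):
    """Return the Comet ML options found in argv, or None if none were given."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    for name in ("workspace", "api_key", "project_name"):
        parser.add_argument(f"--{name}", type=str, default=None)
    # parse_known_args skips the other main.py options in this sketch.
    known, _unknown = parser.parse_known_args(argv)
    settings = {key: value for key, value in vars(known).items() if value is not None}
    return settings or None


example = comet_settings([
    "--model_name", "UNetSNN", "--train_flag",
    "--workspace", "my-workspace",
    "--api_key", "my-api-key",
    "--project_name", "my-project",
])
print(example)
```

The placeholder values above stand in for `<Your Workspace>`, `<Your API Key>`, and `<Your Project Name>`.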

7. Documentation

For additional details, refer to the documentation file available at Docs.

8. Project Files Structure

Speech_enhancement_SNN/
├── app.py
├── constants.py
├── experiments/
├── main.py
├── requirements.txt
├── SpeechEnhancement_data/
│   ├── audio/
│   │   ├── __audio_info__/
│   │   ├── clean_test/
│   │   ├── clean_train/
│   │   ├── clean_valid/
│   │   ├── noisy_test/
│   │   ├── noisy_train/
│   │   └── noisy_valid/
│   └── STFT_4s_nfft=512_wl=512_hl=256/
│       ├── coefficients/
│       ├── metadata/
│       └── reconstruction/
├── src/
│   ├── data/
│   │   ├── constants.py
│   │   ├── DatasetManager.py
│   │   └── TransformManager.py
│   ├── evaluation/
│   │   ├── composite.py
│   │   ├── DNSMOS/
│   │   └── EvaluationManager.py
│   ├── model/
│   │   ├── ArtificialBlock.py
│   │   ├── ArtificialModel.py
│   │   ├── constants.py
│   │   ├── LossManager.py
│   │   ├── SpeechEnhancer.py
│   │   ├── SpikingBlock.py
│   │   ├── SpikingLayer.py
│   │   ├── SpikingModel.py
│   │   ├── SurrogateGradient.py
│   │   ├── TrainValidTestManager.py
│   │   └── utils.py
│   ├── stft/
│   │   ├── constants.py
│   │   ├── Stft.py
│   │   └── StftManager.py
│   └── visualization/
│       └── VisualizationManager.py
├── docsrc/
│   ├── make.bat
│   ├── Makefile
│   └── source/
├── docs/
└── trained_models/
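
The folder name `STFT_4s_nfft=512_wl=512_hl=256` appears to encode the STFT settings used to precompute the coefficients. The sketch below parses such a name and derives a few quantities from it. Reading `wl` as window length and `hl` as hop length is an assumption, as are the helper names and the 16000 Hz sample rate used in the last step, which is not stated in this README.

```python
import re


def parse_stft_dirname(name):
    """Parse a name like 'STFT_4s_nfft=512_wl=512_hl=256' into integers."""
    match = re.fullmatch(r"STFT_(\d+)s_nfft=(\d+)_wl=(\d+)_hl=(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected STFT folder name: {name!r}")
    duration_s, n_fft, win_length, hop_length = map(int, match.groups())
    return {
        "duration_s": duration_s,
        "n_fft": n_fft,
        "win_length": win_length,
        "hop_length": hop_length,
    }


def centered_frame_count(duration_s, sample_rate, hop_length):
    """Frame count of a centered STFT: 1 + floor(num_samples / hop_length)."""
    return 1 + (duration_s * sample_rate) // hop_length


params = parse_stft_dirname("STFT_4s_nfft=512_wl=512_hl=256")
# A one-sided spectrum of a real signal has n_fft / 2 + 1 frequency bins.
freq_bins = params["n_fft"] // 2 + 1
# Fraction of each window shared with the next one.
overlap = 1 - params["hop_length"] / params["win_length"]
# Assumed sample rate (hypothetical, for illustration only).
frames = centered_frame_count(params["duration_s"], 16000, params["hop_length"])
print(freq_bins, overlap, frames)
```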

9. Cite

@INPROCEEDINGS{10288830,
  author={Riahi, Abir and Plourde, Éric},
  booktitle={2023 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)}, 
  title={Single Channel Speech Enhancement Using U-Net Spiking Neural Networks}, 
  year={2023},
  volume={},
  number={},
  pages={111-116},
  doi={10.1109/CCECE58730.2023.10288830}}
