
Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health

This repository contains the code used to reproduce the speaker recognition results in the paper "Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health". The masking kernel is introduced to optimize the window length and sampling rate of the input speech as energy-efficiency parameters, jointly with the other parameters of the DNN model.
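To make the idea of a masking kernel concrete, here is a minimal sketch. It assumes the mask is an element-wise multiplication of an input frame by a window function whose width is a parameter; the function names `hamming_mask` and `apply_mask` are hypothetical and do not come from this repository. The width is fixed by hand here, whereas the paper optimizes it together with the network parameters.

```python
import math


def hamming_mask(frame_len, width):
    """Return a mask of `frame_len` samples: a centred Hamming window
    of `width` samples, with zeros everywhere else.

    Illustrative assumption only, not the repository's implementation.
    """
    if not 1 <= width <= frame_len:
        raise ValueError("width must be between 1 and frame_len")
    if width == 1:
        window = [1.0]
    else:
        # Standard Hamming window: 0.54 - 0.46 * cos(2*pi*n / (M - 1))
        window = [
            0.54 - 0.46 * math.cos(2 * math.pi * n / (width - 1))
            for n in range(width)
        ]
    start = (frame_len - width) // 2
    return [0.0] * start + window + [0.0] * (frame_len - start - width)


def apply_mask(frame, width):
    """Multiply a frame of samples element-wise by the mask."""
    mask = hamming_mask(len(frame), width)
    return [x * m for x, m in zip(frame, mask)]
```

In this sketch, a smaller `width` leaves fewer non-zero samples in each frame, which is the sense in which the mask controls the effective window length.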

To demonstrate the compatibility of our method with various speech features and DNN models, we include the AM-MobileNet1D model for MFCC, spectrogram, and raw audio inputs, and SincNet for raw audio. For SincNet, please see the separate link provided in the manuscript.

For results and limitations, please check the README file in the SincNet branch.

Instructions

Dependencies

PyTorch 1.10
pysoundfile
SciPy
NumPy
THOP
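
Before training, it can help to confirm that the packages above are importable. The sketch below is not part of the repository. It assumes the usual import names for each package (`torch`, `soundfile`, `scipy`, `numpy`, `thop`) and does not check versions; the helper name `missing_dependencies` is hypothetical.

```python
from importlib.util import find_spec

# Display name -> import name (assumed to be the usual module names).
REQUIRED = {
    "PyTorch": "torch",
    "pysoundfile": "soundfile",
    "SciPy": "scipy",
    "NumPy": "numpy",
    "THOP": "thop",
}


def missing_dependencies(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [name for name, module in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```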

How to run

Generate the normalized TIMIT files by following the SincNet instructions.
Edit 'PATH_TO_NORMALIZED_TIMIT_DATASET' on line 12 of dataloader.py so that it points to the location of your TIMIT dataset.
Run the speaker recognition experiments with the following command:
```
  python train.py --model MobileNetV2 --mask hamming --hard_mask True --sampling FFT --penalty 0.1
```
The supported models are MobileNetV2, MobileNetV2_MFCC, and MobileNetV2_spectrogram, with a choice of masking filter, e.g. gaussian, hamming, or hann.
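
To compare the models and masking filters listed above, the training command can be generated for each combination. The sketch below only builds and prints the command lines; it reuses the flags from the example command and assumes that every model accepts every mask with `--sampling FFT`, which this README does not confirm. The helper name `build_commands` is hypothetical.

```python
import itertools
import shlex

MODELS = ["MobileNetV2", "MobileNetV2_MFCC", "MobileNetV2_spectrogram"]
MASKS = ["gaussian", "hamming", "hann"]


def build_commands(models=MODELS, masks=MASKS, penalty=0.1):
    """Build one train.py argument list per (model, mask) pair."""
    commands = []
    for model, mask in itertools.product(models, masks):
        commands.append([
            "python", "train.py",
            "--model", model,
            "--mask", mask,
            "--hard_mask", "True",
            "--sampling", "FFT",
            "--penalty", str(penalty),
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.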

For more parameter configurations (such as learning rate and number of training epochs), see the argument parser in train.py. A sample training result is provided in the Results folder of the SincNet branch.


aditthapron/windowMasking
