Attacking Speaker Recognition with Deep Generative Models
README.md
audio_processing.py
data_processing.py
demo_spectrograms.png
gan_attack.ipynb
gan_synthesis.ipynb
gan_train.py
layers.py
logger.py
models.py
plotting_utils.py
requirements.txt
sr_train.py
utils.py

Attacking Speaker Recognition Systems with Deep Generative Models

PyTorch implementation of Attacking Speaker Recognition Systems with Deep Generative Models.

Real and Fake Spectrograms

Pre-requisites

  1. NVIDIA GPU + CUDA cuDNN
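A quick way to confirm the prerequisite above before training is a small check script. This is a minimal sketch, not part of the repo: the function name check_prerequisites is hypothetical, and it assumes that finding nvidia-smi on the PATH is a reasonable proxy for an installed NVIDIA driver.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which GPU-related prerequisites look available on this machine."""
    report = {
        # nvidia-smi ships with the NVIDIA driver; finding it suggests a usable GPU.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # find_spec only locates the package; it does not import it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
            report["cudnn_available"] = bool(torch.backends.cudnn.is_available())
        except Exception:
            # A broken PyTorch install is reported as "no CUDA" rather than crashing.
            report["cuda_available"] = False
            report["cudnn_available"] = False
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {ok}")
```

If cuda_available is reported as False, training would likely fall back to CPU or fail, depending on how the training scripts select a device.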

Data and pre-trained models:

Setup

  1. Clone this repo: git clone https://github.com/rafaelvalle/asrgen.git
  2. Change into the repo directory: cd asrgen
  3. Download and unzip audio data into this repo
  4. Install python requirements: pip install -r requirements.txt
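After the setup steps above, it can help to confirm that you are in the right directory. The sketch below checks for a few files from the repository listing; the helper missing_files and the choice of which files to check are illustrative assumptions, not something the repo provides.

```python
from pathlib import Path

# A subset of the files listed in the repository (chosen for illustration).
EXPECTED_FILES = (
    "requirements.txt",
    "gan_train.py",
    "sr_train.py",
    "models.py",
    "gan_synthesis.ipynb",
)


def missing_files(repo_dir="."):
    """Return the expected files that are not present in repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files found.")
```

The check does not cover the audio data from step 3, since the README does not name the directory it unzips into.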

Training

  1. python gan_train.py
  2. (OPTIONAL) tensorboard --logdir=./
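The two training commands above can also be launched from a single Python script. This is a sketch under stated assumptions: the helpers training_commands and run_training are hypothetical, the script is assumed to run from the repo root, and tensorboard is assumed to be on the PATH.

```python
import subprocess
import sys


def training_commands(logdir="./"):
    """Build the training and (optional) TensorBoard commands from the README."""
    train = [sys.executable, "gan_train.py"]
    board = ["tensorboard", f"--logdir={logdir}"]
    return train, board


def run_training(with_tensorboard=False, logdir="./"):
    """Run gan_train.py, optionally with TensorBoard running alongside it."""
    train, board = training_commands(logdir)
    # TensorBoard runs in the background so it can be watched during training.
    monitor = subprocess.Popen(board) if with_tensorboard else None
    try:
        # check=True raises CalledProcessError if training exits with an error.
        subprocess.run(train, check=True)
    finally:
        if monitor is not None:
            monitor.terminate()
```

Calling run_training(with_tensorboard=True) starts both processes and stops TensorBoard when training ends or fails.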

Synthesize audio samples with a Generator

  1. jupyter notebook --ip=127.0.0.1 --port=31337
  2. Open gan_synthesis.ipynb in the notebook interface

Acknowledgements

This implementation uses code from the following repos: [NVIDIA's Tacotron 2](https://github.com/nvidia/tacotron2), Martin Arjovsky and Prem Seetharaman.

We are thankful to Prem Seetharaman and Markus Rabe for their feedback on the early draft of this paper.

We are grateful to NVIDIA for donating the Titan X used in this research.