Robust Audio Adversarial Example for a Physical Attack

This repository contains the implementation of our paper, Robust Audio Adversarial Example for a Physical Attack.

Usage

Preparation

  1. Install the dependencies
  • librosa
  • numpy
  • pyaudio
  • scipy
  • tensorflow (<= 1.8.0)
  2. Clone Mozilla's DeepSpeech into a directory named DeepSpeech
$ git clone https://github.com/mozilla/DeepSpeech.git
  3. Check out version v0.1.0 of DeepSpeech
$ cd DeepSpeech
$ git checkout tags/v0.1.0
$ cd ..
  4. Download the DeepSpeech model
$ wget https://github.com/mozilla/DeepSpeech/releases/download/v0.1.0/deepspeech-0.1.0-models.tar.gz
$ tar -xzf deepspeech-0.1.0-models.tar.gz
  5. Verify that you have downloaded the correct file
$ md5sum models/output_graph.pb
08a9e6e8dc450007a0df0a37956bc795  models/output_graph.pb
  6. Convert the .pb to a TensorFlow checkpoint file (a conceptual sketch of this conversion appears after this list)
$ python make_checkpoint.py
  7. Collect room impulse responses and convert them into 16kHz mono wav files (see the resampling sketch after this list).

Many impulse response databases are referenced in this Kaldi script: https://github.com/kaldi-asr/kaldi/blob/master/egs/aspire/s5/local/multi_condition/prepare_impulses_noises.sh
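
The conversion in step 6 is implemented by make_checkpoint.py. The script's actual logic may differ, but as a rough conceptual sketch (with the checkpoint path models/session_dump as a hypothetical name), a frozen .pb can be turned into a checkpoint by evaluating its constants and re-saving them as variables with the TensorFlow 1.x API:

```python
import tensorflow as tf

# Load the frozen inference graph shipped with DeepSpeech v0.1.0.
graph_def = tf.GraphDef()
with tf.gfile.GFile("models/output_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Evaluate every constant node to recover the trained weights.
with tf.Graph().as_default() as g:
    tf.import_graph_def(graph_def, name="")
    const_names = [n.name for n in graph_def.node if n.op == "Const"]
    with tf.Session(graph=g) as sess:
        values = sess.run([g.get_tensor_by_name(n + ":0") for n in const_names])

# Re-create the weights as variables and save them as a checkpoint,
# so that the attack can restore and differentiate through the model.
with tf.Graph().as_default() as g:
    variables = [tf.Variable(v, name=n) for n, v in zip(const_names, values)]
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        tf.train.Saver(variables).save(sess, "models/session_dump")
```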
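
For step 7, librosa and scipy (both already in the dependency list) can resample the collected impulse responses. A minimal sketch, assuming the raw recordings sit in a hypothetical rir_raw/ directory and the converted files go to rir/:

```python
import glob
import os

import librosa
import numpy as np
import scipy.io.wavfile

os.makedirs("rir", exist_ok=True)

# Resample every collected impulse response to 16kHz mono.
for path in glob.glob("rir_raw/*.wav"):
    audio, _ = librosa.load(path, sr=16000, mono=True)  # float32 in [-1, 1]
    pcm = (audio * 32767).astype(np.int16)              # back to 16-bit PCM
    scipy.io.wavfile.write(os.path.join("rir", os.path.basename(path)), 16000, pcm)
```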

Generate adversarial examples

Let sample.wav be a 16kHz mono wav file to which perturbations will be added, and rir be a directory containing room impulse responses.

Then, you can generate audio adversarial examples recognized as "hello world" as follows:

$ mkdir results
$ python attack.py --in sample.wav --imp rir/*.wav --target "hello world" --out results
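
The optimization itself is implemented in attack.py. The core idea of the paper is to make the perturbation survive over-the-air playback by optimizing it over convolutions with the supplied room impulse responses. The following numpy sketch illustrates that transformation only; simulate_playback and rir_list are illustrative names, not the script's actual API:

```python
import numpy as np

def simulate_playback(audio, rir):
    """Approximate over-the-air playback by convolving the waveform
    with a room impulse response, then renormalizing."""
    reverbed = np.convolve(audio, rir)[:len(audio)]
    return reverbed / (np.max(np.abs(reverbed)) + 1e-8)

# During the attack, each optimization step would evaluate the loss of the
# target phrase on simulate_playback(x + delta, random.choice(rir_list)),
# so the perturbation delta remains effective after physical playback.
```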

Test generated adversarial examples

  1. Play back and record the generated adversarial example (a sketch of such a playback-and-record loop appears after this list)
$ python record.py ae.wav ae_recorded.wav
  2. Recognize the recording with DeepSpeech
$ python recognize.py models/output_graph.pb ae_recorded.wav models/alphabet.txt models/lm.binary models/trie
  3. Check the obtained transcription
$ cat ae_recorded.txt
hello world
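
record.py performs the playback-and-record step above. A minimal sketch of such a loop with pyaudio follows; it is only loosely synchronized chunk by chunk, and record.py itself may be implemented differently:

```python
import wave

import pyaudio

CHUNK = 1024
pa = pyaudio.PyAudio()

wf = wave.open("ae.wav", "rb")
speaker = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                  channels=wf.getnchannels(), rate=wf.getframerate(),
                  output=True)
mic = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
              input=True, frames_per_buffer=CHUNK)

# Alternate writing a chunk to the speaker and reading one from the mic.
frames = []
data = wf.readframes(CHUNK)
while data:
    speaker.write(data)
    frames.append(mic.read(CHUNK))
    data = wf.readframes(CHUNK)

speaker.stop_stream(); speaker.close()
mic.stop_stream(); mic.close()
pa.terminate()

# Save the recording as a 16kHz mono wav for recognize.py.
out = wave.open("ae_recorded.wav", "wb")
out.setnchannels(1)
out.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
out.setframerate(16000)
out.writeframes(b"".join(frames))
out.close()
```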

Notice

Most of the source code in this repository is based on https://github.com/carlini/audio_adversarial_examples and is distributed under the same license (2-clause BSD), except for the following files.

Citation

If you use this code for your research, please cite this paper:

@article{yakura2018robust,
  title={Robust Audio Adversarial Example for a Physical Attack},
  author={Yakura, Hiromu and Sakuma, Jun},
  journal={arXiv preprint arXiv:1810.11793},
  year={2018}
}