ERAN

High Level

ETH Robustness Analyzer for Neural Networks (ERAN) is a state-of-the-art sound, precise, and scalable analyzer based on abstract interpretation. ERAN is developed at the SRI Lab, Department of Computer Science, ETH Zurich, as part of the Safe AI project. The goal of ERAN is to automatically verify the robustness of neural networks with feedforward, convolutional, and residual layers against input perturbations (e.g., L∞-norm attacks or geometric transformations).

ERAN supports networks with ReLU, Sigmoid and Tanh activations and is sound under floating-point arithmetic. It employs custom abstract domains that are specifically designed for the setting of neural networks and that aim to balance scalability and precision. ERAN is based on two such abstract domains:

  • DeepZ [NIPS'18]: Zonotope domain containing specialized abstract transformers for handling ReLU, Sigmoid and Tanh activation functions.

  • DeepPoly [POPL'19]: A domain that combines floating point Polyhedra with Intervals.

Both domains are implemented in the ELINA library for numerical abstractions. More details can be found in the publications below.
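
For intuition, the sketch below shows the simplest analysis in this style: sound interval bound propagation of an L∞-ball through one affine layer followed by a ReLU. It is a toy NumPy example (the weights and function names are hypothetical, not ERAN's code); DeepZ and DeepPoly track far more relational information than plain intervals:

import numpy as np

def affine_bounds(lb, ub, W, b):
    # Sound interval propagation through y = W x + b: positive weights
    # take the matching bound, negative weights take the opposite one.
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lb + W_neg @ ub + b, W_pos @ ub + W_neg @ lb + b

def relu_bounds(lb, ub):
    # ReLU is monotone, so it maps interval bounds elementwise.
    return np.maximum(lb, 0.0), np.maximum(ub, 0.0)

# L∞-ball of radius eps around a toy two-pixel input x.
x, eps = np.array([0.2, 0.8]), 0.1
W, b = np.array([[1.0, -1.0], [0.5, 2.0]]), np.zeros(2)
lb, ub = relu_bounds(*affine_bounds(x - eps, x + eps, W, b))
# Robustness is certified when such output bounds prove that the score
# of the correct class always exceeds the scores of all other classes.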

Requirements

GNU C compiler, ELINA

python3.6 or higher, tensorflow 1.11 or higher, numpy.

Installation

Clone the ERAN repository via git as follows:

git clone https://github.com/eth-sri/ERAN.git
cd ERAN

The dependencies for ERAN can be installed step by step as follows (sudo rights might be required):

Install m4:

wget ftp://ftp.gnu.org/gnu/m4/m4-1.4.1.tar.gz
tar -xvzf m4-1.4.1.tar.gz
cd m4-1.4.1
./configure
make
make install
cp src/m4 /usr/bin
cd ..
rm m4-1.4.1.tar.gz

Install gmp:

wget https://gmplib.org/download/gmp/gmp-6.1.2.tar.xz
tar -xvf gmp-6.1.2.tar.xz
cd gmp-6.1.2
./configure --enable-cxx
make
make install
cd ..
rm gmp-6.1.2.tar.xz

Install mpfr:

wget https://www.mpfr.org/mpfr-current/mpfr-4.0.1.tar.xz
tar -xvf mpfr-4.0.1.tar.xz
cd mpfr-4.0.1
./configure
make
make install
cd ..
rm mpfr-4.0.1.tar.xz

Install ELINA:

git clone https://github.com/eth-sri/ELINA.git
cd ELINA
./configure
make
make install

Update the LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

We also provide an install script that will install ELINA and all the necessary dependencies. One can run it as follows:

sudo ./install.sh

To install the python dependencies (numpy and tensorflow), type:

pip3 install -r requirements.txt

ERAN may not be compatible with older versions of tensorflow (we have tested ERAN with versions >= 1.11.0). If you have an older version and want to keep it, we recommend installing tensorflow for ERAN inside a Python virtual environment.
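
For example, a virtual environment can be set up as follows (a sketch assuming the standard venv module is available; the environment name eran-env is arbitrary):

python3 -m venv eran-env
source eran-env/bin/activate
pip3 install -r requirements.txt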

Usage

cd tf-verify

python3 . <netname> <epsilon> <domain> <dataset>
  • <netname>: path to the network file.

  • <epsilon>: a float value between 0 and 1 specifying the bound of the L∞-norm based perturbation (see the sketch after this list for the perturbation region it defines).

  • <domain>: can be either 'deepzono' (for the DeepZ domain) or 'deeppoly' (for the DeepPoly domain). Note that residual layers are currently supported only with the DeepZ domain.

  • <dataset>: can be either 'mnist' or 'cifar10'.
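
Concretely, for a test image x, the analyzer tries to prove that every image in the L∞-ball of radius <epsilon> around x receives the same classification as x. A minimal sketch of that input region (linf_region is a hypothetical helper, not part of ERAN; the clipping assumes pixel values normalized to [0, 1]):

import numpy as np

def linf_region(x, eps):
    # All images x' with ||x' - x||_inf <= eps, kept inside the
    # (assumed) valid normalized pixel range [0, 1].
    return np.clip(x - eps, 0.0, 1.0), np.clip(x + eps, 0.0, 1.0)

lb, ub = linf_region(np.zeros(784), 0.1)  # e.g., a flattened MNIST image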

Example

python3 . ../nets/pytorch/mnist/convBig__DiffAI.pyt 0.1 deepzono mnist

will evaluate the local robustness of the MNIST convolutional network (up to 35K neurons) with ReLU activations trained using DiffAI, on the first 100 MNIST test images. In the above setting, epsilon=0.1 and the analyzer uses the deepzono domain. Our analyzer will print the following:

  • 'Verified' for an image when it can prove the robustness of the network and 'Failed' when it cannot. It also prints an error message when the network misclassifies an image.

  • The analysis time in seconds.

  • The number of images on which the network is verified robust, relative to the number of images it classifies correctly.

Publications

  • Fast and Effective Robustness Certification. Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev. NeurIPS 2018.

  • An Abstract Domain for Certifying Neural Networks. Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev. POPL 2019.

Neural Networks and Datasets

We provide a number of pretrained MNIST and CIFAR10 defended and undefended feedforward and convolutional neural networks with ReLU, Sigmoid and Tanh activations, trained with the PyTorch and TensorFlow frameworks. The adversarial training for the defended networks was performed using PGD and DiffAI.

Dataset Model Type #units #layers Activation Training defense
MNIST 3x100 fully connected 310 3 ReLU None
6x100 fully connected 610 6 ReLU None
9x200 fully connected 1,810 9 ReLU None
6x500 fully connected 3,010 6 ReLU None
6x500 fully connected 3,010 6 ReLU PGD ε=0.1
6x500 fully connected 3,010 6 ReLU PGD ε=0.3
6x500 fully connected 3,010 6 Sigmoid None
6x500 fully connected 3,010 6 Sigmoid PGD ε=0.1
6x500 fully connected 3,010 6 Sigmoid PGD ε=0.3
6x500 fully connected 3,010 6 Tanh None
6x500 fully connected 3,010 6 Tanh PGD ε=0.1
6x500 fully connected 3,010 6 Tanh PGD ε=0.3
4x1024 fully connected 4,106 4 ReLU None
ConvSmall convolutional 3,604 3 ReLU None
ConvSmall convolutional 3,604 3 ReLU PGD
ConvSmall convolutional 3,604 3 ReLU DiffAI
ConvMed convolutional 4,804 3 ReLU None
ConvMed convolutional 4,804 3 ReLU PGD ε=0.1
ConvMed convolutional 4,804 3 ReLU PGD ε=0.3
ConvMed convolutional 4,804 3 Sigmoid None
ConvMed convolutional 4,804 3 Sigmoid PGD ε=0.1
ConvMed convolutional 4,804 3 Sigmoid PGD ε=0.3
ConvMed convolutional 4,804 3 Tanh None
ConvMed convolutional 4,804 3 Tanh PGD ε=0.1
ConvMed convolutional 4,804 3 Tanh PGD ε=0.3
ConvMaxpool convolutional 13,798 9 ReLU None
ConvBig convolutional 34,688 6 ReLU DiffAI
ConvSuper convolutional 88,500 6 ReLU DiffAI
Skip residual 71,650 9 ReLU DiffAI
CIFAR10 4x100 fully connected 410 4 ReLU None
6x100 fully connected 610 6 ReLU None
9x200 fully connected 1,810 9 ReLU None
6x500 fully connected 3,010 6 ReLU None
6x500 fully connected 3,010 6 ReLU PGD ε=0.0078
6x500 fully connected 3,010 6 ReLU PGD ε=0.0313
6x500 fully connected 3,010 6 Sigmoid None
6x500 fully connected 3,010 6 Sigmoid PGD ε=0.0078
6x500 fully connected 3,010 6 Sigmoid PGD ε=0.0313
6x500 fully connected 3,010 6 Tanh None
6x500 fully connected 3,010 6 Tanh PGD ε=0.0078
6x500 fully connected 3,010 6 Tanh PGD ε=0.0313
7x1024 fully connected 7,178 7 ReLU None
ConvSmall convolutional 4,852 3 ReLU None
ConvSmall convolutional 4,852 3 ReLU PGD
ConvSmall convolutional 4,852 3 ReLU DiffAI
ConvMed convolutional 6,244 3 ReLU None
ConvMed convolutional 6,244 3 ReLU PGD ε=0.0078
ConvMed convolutional 6,244 3 ReLU PGD ε=0.0313
ConvMed convolutional 6,244 3 Sigmoid None
ConvMed convolutional 6,244 3 Sigmoid PGD ε=0.0078
ConvMed convolutional 6,244 3 Sigmoid PGD ε=0.0313
ConvMed convolutional 6,244 3 Tanh None
ConvMed convolutional 6,244 3 Tanh PGD ε=0.0078
ConvMed convolutional 6,244 3 Tanh PGD ε=0.0313
ConvMaxpool convolutional 53,938 9 ReLU None
ConvBig convolutional 62,464 6 ReLU DiffAI

We provide the first 100 images from the test sets of both the MNIST and CIFAR10 datasets in the 'data' folder. Our analyzer first checks whether the neural network classifies an image correctly before performing the robustness analysis.
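
A sketch of the resulting evaluation loop (predict and analyze are hypothetical stand-ins for the network's forward pass and for the abstract analysis; this mirrors the reported statistics, not ERAN's actual code):

def evaluate(net, test_set, epsilon, domain, predict, analyze):
    # predict(net, image) returns the predicted class;
    # analyze(net, image, epsilon, domain) returns 'Verified' or 'Failed'.
    correct = verified = 0
    for image, label in test_set:
        if predict(net, image) != label:
            continue  # misclassified images are excluded from the statistics
        correct += 1
        if analyze(net, image, epsilon, domain) == 'Verified':
            verified += 1
    return verified, correct  # the reported ratio is verified/correct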

Experimental Results

We ran our experiments for the feedforward networks on a 3.3 GHz 10-core Intel i9-7900X Skylake CPU with 64 GB of main memory, whereas our experiments for the convolutional networks were run on a 2.6 GHz 14-core Intel Xeon E5-2690 CPU with 512 GB of main memory. We first compare the precision and performance of DeepZ and DeepPoly against Fast-Lin on the MNIST 6x100 network in single-threaded mode. As the plot below shows, DeepZ has the same precision as Fast-Lin, whereas DeepPoly is more precise while also being faster.

[Plot: precision and runtime of DeepZ, DeepPoly and Fast-Lin on the MNIST 6x100 network]

In the following, we compare the precision and performance of DeepZ and DeepPoly on a subset of the neural networks listed above in multi-threaded mode. It can be seen that DeepPoly is overall more precise than DeepZ, but slower than DeepZ on the convolutional networks.

[Plots: precision and runtime of DeepZ and DeepPoly on the MNIST and CIFAR10 networks]

The table below compares the performance and precision of DeepZ and DeepPoly on our large networks trained with DiffAI.

Dataset Model ε % Verified (DeepZ / DeepPoly) Average runtime in s (DeepZ / DeepPoly)
MNIST ConvBig 0.1 97 / 97 5 / 50
ConvBig 0.2 79 / 78 7 / 61
ConvBig 0.3 37 / 43 17 / 88
ConvSuper 0.1 97 / 97 133 / 400
Skip 0.1 95 / N/A 29 / N/A
CIFAR10 ConvBig 0.006 50 / 52 39 / 322
ConvBig 0.008 33 / 40 46 / 331

More experimental results can be found in our papers.

Contributors

License and Copyright