Evaluating Robustness of Neural Networks with Mixed Integer Programming

A package for evaluating the robustness of neural networks using Mixed Integer Programming (MIP). See the companion paper for full details and results.

Evaluating Robustness of Neural Networks with Mixed Integer Programming. Vincent Tjeng, Kai Xiao, Russ Tedrake. https://arxiv.org/abs/1711.07356

Getting Started

See the documentation for installation instructions, a quick-start guide, and additional examples. Installation should only take a couple of minutes, including installing Julia itself.

Why Verify Neural Networks?

Neural networks trained only to optimize for training accuracy have been shown to be vulnerable to adversarial examples, with small perturbations to the input potentially leading to large changes in the output. In the context of image classification, the perturbed input is often indistinguishable from the original input, but can lead to misclassifications into any target category chosen by the adversary.

There is now a large body of work proposing defense methods to produce classifiers that are more robust to adversarial examples. However, as long as a defense is evaluated only via attacks that find local optima, we have no guarantee that the defense actually increases the robustness of the classifier produced.

Fortunately, we can evaluate robustness to adversarial examples in a principled fashion. One option is to determine (for each test input) the minimum distance to the closest adversarial example, which we call the minimum adversarial distortion. The second option is to determine the adversarial test accuracy, which is the proportion of the test set for which no bounded perturbation causes a misclassification. An increase in the mean minimum adversarial distortion or in the adversarial test accuracy indicates an improvement in robustness.
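
In symbols (a sketch; the notation here is ours, and the norm is whichever one the evaluator chooses), the minimum adversarial distortion of an input $x$ with true label $y$ under classifier $f$ is

$$\epsilon^{*}(x) = \min_{\delta} \|\delta\| \quad \text{subject to} \quad \arg\max_{i} f_{i}(x + \delta) \neq y,$$

and the adversarial test accuracy at budget $\epsilon$ is the fraction of test inputs $x$ for which every $\delta$ with $\|\delta\| \le \epsilon$ leaves the prediction correct.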

Determining the minimum adversarial distortion for some input (or proving that no bounded perturbation of that input causes a misclassification) corresponds to solving an optimization problem. For piecewise-linear neural networks, the optimization problem can be expressed as a mixed-integer linear programming (MILP) problem.
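
As an illustration of why piecewise linearity matters: a ReLU unit $y = \max(x, 0)$ whose input is known to satisfy $l \le x \le u$ with $l < 0 < u$ admits an exact encoding with a single binary indicator variable $a$ (this is the standard tight big-M formulation; the formulation used in the paper is of this flavor):

$$y \ge x, \qquad y \ge 0, \qquad y \le x - l\,(1 - a), \qquad y \le u\,a, \qquad a \in \{0, 1\}.$$

Setting $a = 1$ forces $y = x$ (the unit is active), while $a = 0$ forces $y = 0$. Tighter values of $l$ and $u$ yield a tighter linear relaxation, which is why spending time on bounds tightening pays off.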


MIPVerify.jl translates a query about the robustness of a neural network for some input into an MILP problem, which can then be solved by any solver supported by JuMP. Efficient solves are enabled by tight formulations of the ReLU and maximum constraints, and by a progressive bounds-tightening approach: time is spent refining bounds only when doing so could provide additional information to improve the problem formulation.

The package provides

  • High-level abstractions for common types of neural network layers:
    • Layers that are linear transformations (fully-connected, convolution, and average-pooling layers)
    • Layers that use piecewise-linear functions (ReLU and maximum-pooling layers)
  • Support for multiple perturbation families:
    • Perturbations with bounded l-infinity norm
    • Perturbations where the image is convolved with an adversarial blurring kernel
  • Utility functions for:
    • Evaluating the robustness of a network on multiple samples in a dataset, with good support for pausing and resuming evaluation or running solvers with different parameters
  • MNIST and CIFAR10 datasets for verification
  • Sample neural networks, including the networks verified in our paper.
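
A typical session looks roughly like the following (a sketch modeled on the package's quick-start guide; the function names below appear in this README or its documentation, but the exact keyword arguments are assumptions and may differ between versions):

```julia
using MIPVerify
using Cbc  # any MILP solver supported by JuMP works; commercial solvers are faster

# Load a sample dataset and one of the networks verified in the paper.
mnist = MIPVerify.read_datasets("MNIST")
nn = MIPVerify.get_example_network_params("MNIST.WK17a_linf0.1_authors")
sample_image = MIPVerify.get_image(mnist.test.images, 1)

# Ask whether some perturbation with l-infinity norm at most 0.1 makes the
# network classify this image as the target label (keyword names assumed).
d = MIPVerify.find_adversarial_example(
    nn, sample_image, 10, CbcSolver(),
    pp = MIPVerify.LInfNormBoundedPerturbationFamily(0.1),
    norm_order = Inf,
)
```

If the solve proves infeasibility, no such perturbation exists for this input and target label; otherwise the result contains the perturbed input and solve statistics.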

Results in Brief

Below is a modified version of Table 1 from our paper, reporting bounds on the adversarial error of classifiers under perturbations with l-infinity norm bounded by eps. For our verifier, a time limit of 120s per sample is imposed. Gaps between our lower and upper bounds correspond to cases where the solver reached the time limit for some samples. Error is measured over the full MNIST test set of 10,000 samples.

Dataset  Training Approach          eps  Lower Bound^  Lower Bound  Upper Bound^  Upper Bound  Name in package*
                                         (SOA)         (PGD Error)  (SOA)         (Ours)
MNIST    Wong et al. (2017)         0.1  4.11%         4.38%        5.82%         4.38%        MNIST.WK17a_linf0.1_authors
MNIST    Raghunathan et al. (2018)  0.1  11.51%        14.36%       34.77%        30.81%       MNIST.RSL18a_linf0.1_authors

^ Values in these columns represent the previous state-of-the-art (SOA), as described in our paper.
* Neural network available for import via listed name using get_example_network_params.

Citing this Library

@article{tjeng2017evaluating,
  title={Evaluating Robustness of Neural Networks with Mixed Integer Programming},
  author={Tjeng, Vincent and Xiao, Kai and Tedrake, Russ},
  journal={arXiv preprint arXiv:1711.07356},
  year={2017}
}