<h1>Toy Example for Breaking Linear Classifiers</h1>

The process for breaking linear classifiers is most easily understood through the simplest example that exhibits the problem. Here we use a binary logistic regression classifier that maps an input to one of two classes, class 1 or class 0.

First, we define the input and the weights the classifier will apply to it.

In [None]:
import numpy

# input vector
x = numpy.array([2, -1, 3, -2, 2, 2, 1, -4, 5, 1])
# weights
w = numpy.array([-1, -1, 1, -1, 1, -1, 1, 1, -1, 1])

In the next step, we take the dot product of the input and the weights and pass the result through a sigmoid function, which gives the probability that the input belongs to class 1.

In [None]:
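# dot product of the input and the weights
prod = numpy.dot(x, w)
# get the classifier probability
# (sigmoid and probability are defined in the full listing further down)
probability(sigmoid(prod))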

The *probability* function tells us whether the classifier assigns the input to class 1 or class 0, and how confident the classifier is in that prediction. The dot product comes out to **-3**, and applying the sigmoid function to it gives **0.0474**. Since this is below 0.5, the classifier picks class 0 with confidence 1 - 0.0474, so the full listing below prints `Classifier is 95.26% certain that value is class 0`:

In [None]:
import numpy
import math

# sigmoid function
def sigmoid(z):
    return 1/(1 + math.exp(-z))

# classifier probability
def probability(sig_val):
    if sig_val < 0.5:
        result = "Classifier is {}% certain that value is class 0".format(round((1-sig_val)*100, 2))
        print(result)
    else:
        result = "Classifier is {}% certain that value is class 1".format(round(sig_val*100, 2))
        print(result)


def main():
    # input vector
    x = numpy.array([2, -1, 3, -2, 2, 2, 1, -4, 5, 1])
    # weights
    w = numpy.array([-1, -1, 1, -1, 1, -1, 1, 1, -1, 1])

    # dot product
    prod = numpy.dot(x, w)
    # get the classifier probability
    probability(sigmoid(prod))

main()
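
As a check on the numbers quoted above, the arithmetic can be written out by hand. The symbols $z$ and $\sigma$ are shorthand introduced here for the dot product and the sigmoid function; they do not appear in the code.

$$
z = w \cdot x = (-2) + 1 + 3 + 2 + 2 + (-2) + 1 + (-4) + (-5) + 1 = -3
$$

$$
\sigma(z) = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{3}} \approx 0.0474, \qquad 1 - \sigma(z) \approx 0.9526
$$

The second quantity is the confidence reported for class 0.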

To break the classifier, we shift the values of the input in a chosen direction by the smallest amount that gets the classifier to predict the wrong class. We keep the shift small because we want the input to appear as untouched as possible.

For example, imagine the input were the pixel values of an image. We would then want to change each value by so little that the image looks the same to the naked eye but still tricks the classifier.

The dot product is the sum of each input value multiplied by its weight, so it grows whenever an input value moves in the direction of its weight's sign. To increase the result, we therefore increase the input by 0.5 wherever the weight is positive and decrease it by 0.5 wherever the weight is negative. Because every weight here is +1 or -1, this is the same as adding 0.5 times the weight vector. Our new input looks as follows:

In [None]:
import numpy

# input vector
x = numpy.array([2, -1, 3, -2, 2, 2, 1, -4, 5, 1])
# weights
w = numpy.array([-1, -1, 1, -1, 1, -1, 1, 1, -1, 1])

# shift each input value by 0.5 in the direction of its weight
xad = x + 0.5*w

print(xad)
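
The effect of this shift on the dot product can be worked out directly. Here $x_{ad}$ stands for the shifted input `xad`; the notation is assumed for this derivation and is not taken from the code.

$$
w \cdot x_{ad} = w \cdot (x + 0.5\,w) = w \cdot x + 0.5\,(w \cdot w) = -3 + 0.5 \cdot 10 = 2
$$

Each of the ten weights is $\pm 1$, so $w \cdot w = 10$, and a shift of 0.5 per element adds 5 to the score.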

The dot product is now **2**, up from the previous result of **-3**, even though we changed each input value by only 0.5. Applying sigmoid to this result yields **0.88**. Running the following code, the probability function now reports that the classifier is 88.08% certain the input is class 1.

In [None]:
import numpy
import math

# sigmoid function
def sigmoid(z):
    return 1/(1 + math.exp(-z))

# classifier probability
def probability(sig_val):
    if sig_val < 0.5:
        result = "Classifier is {}% certain that value is class 0".format(round((1-sig_val)*100, 2))
        print(result)
    else:
        result = "Classifier is {}% certain that value is class 1".format(round(sig_val*100, 2))
        print(result)


def main():
    # input vector
    x = numpy.array([2, -1, 3, -2, 2, 2, 1, -4, 5, 1])
    # weights
    w = numpy.array([-1, -1, 1, -1, 1, -1, 1, 1, -1, 1])

    # shift each input value by 0.5 in the direction of its weight
    xad = x + 0.5*w

    # get the new classifier probability
    prod = numpy.dot(xad, w)
    probability(sigmoid(prod))

main()
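
The step size of 0.5 used above is one choice among many. As a rough sketch of how small the shift could be while still flipping the prediction, the cell below tries several step sizes. The names `perturb`, `eps`, and `flip_threshold` are hypothetical helpers introduced only for this illustration, and using `numpy.sign(w)` is an assumption that generalizes the example; it matches `0.5*w` here only because every weight is +1 or -1.

```python
import math
import numpy

# sigmoid function, as defined earlier in this example
def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# input vector and weights from the example
x = numpy.array([2, -1, 3, -2, 2, 2, 1, -4, 5, 1])
w = numpy.array([-1, -1, 1, -1, 1, -1, 1, 1, -1, 1])

# hypothetical helper: move each input value by eps
# in the direction of its weight's sign
def perturb(x, w, eps):
    return x + eps * numpy.sign(w)

# The shift raises the dot product by eps * sum(|w|), so the score
# crosses zero once eps exceeds this value (assumes the score starts negative).
flip_threshold = -numpy.dot(x, w) / numpy.abs(w).sum()
print("prediction flips once eps exceeds", flip_threshold)

# Try step sizes on either side of the threshold.
for eps in (0.1, 0.2, 0.4, 0.5):
    z = numpy.dot(perturb(x, w, eps), w)
    label = 1 if sigmoid(z) >= 0.5 else 0
    print("eps={:.1f}  score={:.2f}  P(class 1)={:.4f}  predicted class {}".format(
        eps, z, sigmoid(z), label))
```

Under these assumptions, any step size above roughly 0.3 is enough to move the score past zero. The value 0.5 used in the example clears that threshold with some margin, which is why the classifier ends up fairly confident in the wrong class.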