
How to create a model with external predictions? #253

Closed
roughentomologyx opened this issue Dec 16, 2018 · 17 comments
@roughentomologyx

The scenario is the following:
There is a website to which I can send a 128x128 color PNG, and it gets classified either as a cat (confidence > 0) or as not-a-cat (confidence = 0). (I have full authorization to use the website however I want, and I can send as many requests as I like, to preempt any concerns.)
I would like to create an adversarial example with Foolbox (an image that is a car but gets classified as a cat with >90% confidence), but I can't get my head around how to wrap the website as a model so that I can attack it with the Foolbox boundary attack. Here's a template of what I have:

images = tf.placeholder(tf.int8, (None, 128, 128, 3))
label = ["cat", "not-a-cat"]

def get_prediction(image):
    #gets the confidence for the target, between 0 and 1 from the website
    return confidence #float32

Any help/hints would be appreciated, thank you in advance.

@wielandbrendel
Member

That's actually pretty straightforward. Think of the website as the model: instead of a TensorFlow or PyTorch model giving you the predictions, it's the website. Hence, you should implement a subclass of the Foolbox Model base class, for which you only need to implement batch_predictions and num_classes. It should look similar to this:

from foolbox.models.base import Model

class WebsiteModel(Model):

    def __init__(
            self,
            model,
            bounds,
            num_classes,
            channel_axis=1,
            preprocessing=(0, 1)):

        super(WebsiteModel, self).__init__(bounds=bounds,
                                           channel_axis=channel_axis,
                                           preprocessing=preprocessing)
        self._model = model  # whatever object wraps the website API
        self._num_classes = num_classes

    def batch_predictions(self, images):
        # GET THE PREDICTIONS FROM THE WEBSITE HERE: return an array of shape
        # (batch_size, num_classes); logit-like values are expected, but
        # zero-one outputs work too
        raise NotImplementedError

    def num_classes(self):
        return self._num_classes

@roughentomologyx
Author

roughentomologyx commented Dec 20, 2018

I tried my best to do what you suggested, but after a couple of epochs of uploading the pictures I always get:

line 63, in __call__
    _, is_adversarial = a.predictions(random_image)
line 309, in predictions
    assert predictions.ndim == 1
AssertionError

This happens as soon as I get a prediction for my noise picture to be a cat with confidence > 0.
I guess I messed up somewhere.

Here is my implementation:

def save_image_as(image, imagename):
    scipy.misc.toimage(image, cmin=0.0, cmax=1.0).save(imagename + ".png")

def img2array(imagename):
    img = plt.imread(imagename)
    rows,cols,colors = img.shape
    img_size = rows*cols*colors
    img_1D_vector = img.reshape(img_size)
    return img_1D_vector

def find_in_nestlist(mylist, char):
    for sub_list in mylist:
        if char in sub_list:
            return mylist.index(sub_list)
    raise ValueError("'{char}' is not in list".format(char = char))  
  
class WModel:
    def upload_and_get_result(self, image):
        self.dirpath = os.path.dirname(os.path.realpath(__file__))
        self.driver = webdriver.Chrome(os.path.join(self.dirpath, 'chromedriver.exe'))

        try:  # upload
            self.driver.get('https://example.com')
            self.driver.find_element_by_name('image').send_keys(os.path.join(self.dirpath, image))
            classify = self.driver.find_element_by_xpath("//input[@type='submit' and @value='Classify']")
            classify.submit()
        except Exception as e:
            print(e)

        try:  # get result of the form confidence (float32), class ("cat", "not-a-cat")
            ListOfResults = []
            result = self.driver.find_element_by_xpath("/html/body/pre").text  # import results
            SplitResult = result.split('},{', 4)  # format results
            for i in SplitResult:
                a = i.strip('[')
                a = a.strip(']')
                a = a.replace('"class":', '')
                a = a.replace('"confidence":', '')
                a = a.strip('{')
                a = a.strip('}')
                ListOfResults.append(a)
            res = ListOfResults[0].split(', ')
            self.driver.quit()

            if any("car" in sl for sl in res):
                b0 = float(res[find_in_nestlist(res, "car")][1])
            else:
                b0 = 0
            if any("cat" in sl for sl in res):
                b1 = float(res[find_in_nestlist(res, "cat")][1])
            else:
                b1 = 0
            if any("dog" in sl for sl in res):
                b2 = float(res[find_in_nestlist(res, "dog")][1])
            else:
                b2 = 0
            return np.array([[b0, b1, b2]])
        except Exception as e:
            print(e)
                        
class WebsiteModel(Model):  # wrapper Foolbox model around WModel
    def __init__(
            self,
            model=WModel(),
            bounds=(0, 1),
            num_classes=3,
            channel_axis=1,
            preprocessing=(0, 1)):
        super(WebsiteModel, self).__init__(bounds=bounds,
                                           channel_axis=channel_axis,
                                           preprocessing=preprocessing)
        self._model = model
        self._num_classes = num_classes

    def batch_predictions(self, images):
        # query the website once per image and stack the results into an
        # array of shape (batch_size, num_classes)
        predictions = []
        for image in images:
            rows, cols, colors = 64, 64, 4
            arr2img = image.reshape(rows, cols, colors)
            save_image_as(arr2img, "currentimg")
            predictions.append(self._model.upload_and_get_result("currentimg.png")[0])
        return np.array(predictions)

    def num_classes(self):
        return self._num_classes


fmodel = WebsiteModel(WModel(), (0, 1))
criterion0 = TargetClassProbability(1, p=0.5535189)
img_1D_vector = img2array('car.png')
image, label = img_1D_vector, 0  # get source image and label
start_1D_vector = img2array('catadversarial.png')
attack = foolbox.attacks.BoundaryAttack(fmodel, criterion=criterion0)
adversarial = attack(image, label, tune_batch_size=False, starting_point=start_1D_vector)

I think it might be worth mentioning that the website gives me either 100% confidence for not-a-cat or confidence > 0 for being a cat.

@wielandbrendel
Member

The traceback says it all: the dimension/shape of your predictions is not correct. I am not exactly sure, but I think the method upload_and_get_result should return something like [[0, 1]] (a 2D array of shape [1, 2] with the leading batch dimension that batch_predictions needs) in both cases. Please make sure it really does that for all samples.
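To make the shape requirement concrete, here is a minimal sketch; the helper name and the two-class layout are assumptions for illustration, not the Foolbox API:

```python
import numpy as np

def upload_and_get_result(confidence):
    # Hypothetical helper: map the website's single cat-confidence score to a
    # 2D array of shape (1, num_classes) with a leading batch dimension.
    # confidence > 0 means "cat"; confidence == 0 means "not-a-cat".
    not_a_cat = 1.0 if confidence == 0 else 0.0
    return np.array([[confidence, not_a_cat]], dtype=np.float32)

print(upload_and_get_result(0.91).shape)  # (1, 2)
print(upload_and_get_result(0.0))         # [[0. 1.]]
```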

@roughentomologyx
Author

roughentomologyx commented Dec 21, 2018

Thank you so much! I updated the code above.
It works now until exactly step 100, where I get the following error:

File "...\lib\site-packages\foolbox\adversarial.py", line 337, in batch_predictions
    assert predictions.shape[0] == images.shape[0]

I was able to observe that at step 100 images.shape[0] changes from 1 to 2.
What happens at step 100 that changes the value of images.shape[0]?
How can I escape this problem?

@wielandbrendel
Member

That's hard to tell - it probably has to do with what upload_and_get_result returns. Please take a closer look at that.

@roughentomologyx
Author

Okay, so what is upload_and_get_result supposed to return? Right now it's returning the prediction for a specific class as a float plus the class identifier, and I can't fit more inside a 2D array of shape (1, 2).

@wielandbrendel
Member

I think I know what happens: the BoundaryAttack uses batching, but your batch_predictions is already returning on the first image. I'd suggest you set tune_batch_size=False when you apply the attack; this should solve the issue. You might also want to fix your batch_predictions function to return an array that contains the logits for all images in the batch.
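The batching fix can be sketched like this; predict_one is a hypothetical stand-in for a single website round trip:

```python
import numpy as np

def predict_one(image):
    # Hypothetical stand-in for one website query; in the real model this
    # would upload the image and parse the response. Returns shape (num_classes,).
    return np.array([0.1, 0.9, 0.0], dtype=np.float32)

def batch_predictions(images):
    # Query once per image and stack the results into (batch_size, num_classes),
    # instead of returning inside the loop after the first image.
    return np.stack([predict_one(image) for image in images])

batch = np.zeros((2, 128, 128, 3))
print(batch_predictions(batch).shape)  # (2, 3)
```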

@roughentomologyx
Author

roughentomologyx commented Jan 2, 2019

I again updated the code above with what you mentioned, and it works better now! I also added a criterion to make sure I can classify my car.png as a cat with over 90% confidence. Also, the website model was updated at my request to handle multiple classes and now knows cat, dog and car.
The initialization fails to find an adversarial for the given criterion (label 1 (cat) with over 90% confidence), so I tried giving it a starting point as mentioned here: #117.

The starting point in my case is a noise picture that gets classified as a cat with over 90% confidence, therefore meeting the criterion. But when the attack tries to evaluate it, it says the starting point is not adversarial:

line 651, in initialize_starting_point
    assert a.image is not None, ('Invalid starting point provided.'
AssertionError: Invalid starting point provided. Please provide a starting point that is adversarial.

Shape and size are the same as car.png. Could you please help me again with one of your really competent suggestions? Thank you so much in advance!

@wielandbrendel
Member

wielandbrendel commented Jan 3, 2019

If you look into the code

if starting_point is not None:
    a.predictions(starting_point)
    assert a.image is not None, ('Invalid starting point provided.'
                                 ' Please provide a starting point'
                                 ' that is adversarial.')
you'll see that the method calls a.predictions(starting_point), which is basically equivalent to calling model.predictions(starting_point). I suspect that something in your preprocessing differs between calling the API directly and calling it via model.predictions; that is the best place to check. Alternatively, are you sure the API is deterministic (i.e. it always yields > 90% confidence)?

@roughentomologyx
Author

roughentomologyx commented Jan 3, 2019

There is indeed a difference between calling fmodel.predictions and batch_predictions. fmodel.predictions(startingpoint) returns [0. 0.91252393 0.00893538], which has shape (3,) (and stands for 91% confidence of a cat).
batch_predictions, which also gets called on the starting point, returns [[0. 0.91252393 0.00893538]], which has shape (1, 3).
But that is not really what you meant, I guess? If it is, how am I supposed to fix it? Also, the only preprocessing I chose was (0, 1), so subtracting 0 and dividing by 1 should do nothing, and as far as I can see this is the default preprocessing for pretty much everything in Foolbox, so there shouldn't be a difference, right?
I tried to check whether the website predictions are deterministic by uploading my starting point 100 times, and it always returned the values above, so I guess it is deterministic.

@wielandbrendel
Member

The difference between predictions and batch_predictions is expected. I can only guess at this point, but basically you have to check why the attack thinks that the starting point is not adversarial. This might simply have to do with the adversarial criterion you defined. Internally, the adversarial test is as follows:

https://github.com/bethgelab/foolbox/blob/master/foolbox/adversarial.py#L212-L234

Take the model.predictions output and test why the outcome of the above code is not adversarial.

@roughentomologyx
Author

roughentomologyx commented Jan 3, 2019

I think I figured the problem out: I thought there are two criteria that allow me to generate pictures with > 90% confidence: TargetClassProbability and ConfidentMisclassification. Both call softmax(fmodel.predictions(startingpoint)), which in my case returns [0.22224316 0.55351896 0.22423788].
For ConfidentMisclassification:

def is_adversarial(self, predictions, label):
    top1 = np.argmax(predictions)
    probabilities = softmax(predictions)
    return (np.max(probabilities) >= self.p) and (top1 != label)

So in my case 0.5535 (the probability value) gets compared against 0.9 (which I thought was the raw prediction value), which is false, and therefore the starting point is not adversarial.
I guess I simply misunderstood that p doesn't stand for the raw predicted value I want to reach but for the softmax probability.
For anyone else who encounters this problem:

print(foolbox.utils.softmax(fmodel.predictions(startingpoint)))
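To double-check the numbers above, the softmax can be reproduced with a few lines of NumPy, reusing the raw predictions quoted earlier in the thread:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # shift the maximum to 0 for stability
    return e / e.sum()

# raw model output for the starting point, as quoted above
preds = np.array([0.0, 0.91252393, 0.00893538])
probs = softmax(preds)
print(probs)               # ~[0.2222 0.5535 0.2242]
print(probs.max() >= 0.9)  # False -- so the criterion rejects the starting point
```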

@wielandbrendel
Member

@roughentomologyx Great! By default Foolbox thinks that prediction outputs are logits, but that is easy to adapt. You have two options: (1) you write your own criterion (just copy TargetClassProbability and remove the softmax) or (2) you slightly change your model so that it returns logits instead of probabilities (as we typically do for Keras models, check out https://github.com/bethgelab/foolbox/blob/master/foolbox/models/keras.py#L137-L142).
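Option (1) could look roughly like this; the class below is a hypothetical sketch, not Foolbox code, and in practice it would subclass foolbox.criteria.Criterion:

```python
import numpy as np

class TargetProbabilityNoSoftmax:
    # Hypothetical criterion sketch: like TargetClassProbability, but without
    # the softmax, for models that already return probabilities.
    def __init__(self, target_class, p):
        self.target_class = target_class
        self.p = p

    def is_adversarial(self, predictions, label):
        # predictions are assumed to already be probabilities
        return predictions[self.target_class] >= self.p

crit = TargetProbabilityNoSoftmax(target_class=1, p=0.9)
print(crit.is_adversarial(np.array([0.0, 0.91252393, 0.00893538]), label=0))  # True
```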

@roughentomologyx
Author

Thank you so much. Just changing p also works fine.
I have two last questions.
1: As far as I understand, the boundary attack takes the image and the starting point and tries to reduce the distance between them. So if my starting point is 90% cat and my image is 0% cat, the confidence for the adversarial will meet somewhere in the middle: the more it looks like the original, the lower the confidence. So generating a picture that looks like a car and gets classified as a cat with > 90% confidence seems pretty impossible unless I find a picture of a car that already gets classified as 80%+ cat.
Is there any decision-based model or any other way with Foolbox that would help me achieve my goal?
2: You helped me so much and did some really good work with Foolbox. Is there any way to support you or donate something to you guys?

@wielandbrendel
Member

@roughentomologyx Thanks for your kind words. You can always support Foolbox by spreading the word or by implementing a new attack. Also, we are always happy if you cite Foolbox and the Boundary Attack in your publications.

As for your question, the Boundary Attack will try to find an image that is adversarial (e.g. > 90% confidence) yet looks as close as possible to the original image. That is usually possible, and you often cannot notice a difference between the original and the adversarial image after optimisation. That's the (sometimes unsettling) essence of adversarial vulnerability.

@roughentomologyx
Author

Is there a way to stop the attack from converging? I always get "attack converged" after about 1500 steps. In your paper you run the attack for something like 20000 steps to get optimal results; I want to do that too.

@jonasrauber
Member

Closing this due to lack of activity. Please reopen if it's still relevant.
