Understanding FastGradientMethod usage and how to force output label to desired value #589

Closed
ftarlao opened this issue Sep 25, 2018 · 10 comments

@ftarlao
Contributor

ftarlao commented Sep 25, 2018

I have a Keras model (a CNN with a final softmax layer) that classifies RGB images into 5 categories (one-hot encoded).
A simplified version of my code follows:

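import tensorflow as tf
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils_keras import KerasModelWrapper

# Wrap the trained Keras model so that cleverhans can query it
wrap = KerasModelWrapper(model)
fgsm = FastGradientMethod(wrap, sess=session)
fgsm_params = {'eps': 16. / 256,   # perturbation magnitude
               'clip_min': 0.,     # keep pixel values in [0, 1]
               'clip_max': 1.}
x = tf.placeholder(tf.float32, shape=(None, img_rows, img_cols, nchannels))
# Symbolic tensor holding the adversarial version of x
adv_x = fgsm.generate(x, **fgsm_params)
# original_image is an array containing a single RGB image, shape=(1, 48, 48, 3)
adv_image = adv_x.eval(session=session, feed_dict={x: original_image})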

Question 1
As I understand it, 'eps' is the input variation step (the minimum change).
I have noticed that the final outcome is highly sensitive to eps; sometimes I need a high eps to obtain an effective adversarial image. Given an image O with label lO, FGM sometimes fails to produce an adversarial image O' with lO' != lO. For example, for lO = [0,0,1,0,0] we still obtain lO' = [0,0,1,0,0].

Does FGM always find a working adversarial image? Is it normal for it to fail? Is there a way to estimate the quality of the generated adversarial image (without running a prediction with the model)? Why is the eps step so important? Is there a way to tell FGM to try harder when searching for the adversarial image (e.g., more steps)?

Question 2
I have also experimented with the y and y_target params. Can you also explain what the 'y' and 'y_target' params are? I thought 'y_target' specifies that we want to generate an adversarial image that targets a specific category. For example, I thought that y_target = [[0,1,0,0,0]] in feed_dict should force the attack to generate an adversarial image that the model classifies as the 2nd class.
Am I right?
...or am I missing something?
P.S.: my problem is that setting y_target fails to produce adversarial images.

Please give me a few tips.
Regards

@ftarlao
Contributor Author

ftarlao commented Sep 25, 2018

If you prefer, or if you would like to earn reputation on SO, I have also posted the question here:

https://stackoverflow.com/questions/52501833/doubts-with-cleverhans-fastgradientmethod-fgm-adversarial-image-generation

@npapernot
Member

Q1:

  • FGSM (like any attack) is not guaranteed to find an adversarial image that the model misclassifies, because it makes approximations when solving the optimization problem that defines an adversarial example. The attack can fail to find adversarial images for various reasons; one common reason is gradient masking. You can read about it in this blog post and in this paper as well as this paper.
  • The eps step is important because it is the magnitude of the perturbation. The attack first computes the direction in which to perturb the image (using gradients of the model) and then takes a step of size eps in that direction. Hence, eps corresponds roughly to what one would intuitively think of as the "power" of the attack.
  • You can find a multi-step variant of FGSM in BasicIterativeMethod.
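
To make the role of eps and the multi-step idea concrete, here is a minimal NumPy sketch. It is not the cleverhans implementation: it assumes a toy linear softmax classifier (random `W`, `b`) so that the input gradient can be written by hand, and the helper names (`fgsm_step`, `iterative_fgsm`) are hypothetical. Only the parameter names `eps`, `eps_iter` and `nb_iter` are borrowed from cleverhans.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(x, y_onehot, W, b):
    probs = softmax(x @ W + b)
    return -np.sum(y_onehot * np.log(probs + 1e-12), axis=-1)

def input_gradient(x, y_onehot, W, b):
    # Gradient of the cross-entropy loss w.r.t. x for logits = x @ W + b
    return (softmax(x @ W + b) - y_onehot) @ W.T

def fgsm_step(x, y_onehot, W, b, eps, clip_min=0.0, clip_max=1.0):
    # Single step of size eps along the sign of the loss gradient (L-infinity FGSM)
    x_adv = x + eps * np.sign(input_gradient(x, y_onehot, W, b))
    return np.clip(x_adv, clip_min, clip_max)

def iterative_fgsm(x, y_onehot, W, b, eps, eps_iter, nb_iter,
                   clip_min=0.0, clip_max=1.0):
    # Repeated small steps, each projected back into the eps-ball around x
    x_adv = x.copy()
    for _ in range(nb_iter):
        x_adv = x_adv + eps_iter * np.sign(input_gradient(x_adv, y_onehot, W, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)    # stay within eps of the original
        x_adv = np.clip(x_adv, clip_min, clip_max)  # stay within the valid pixel range
    return x_adv

# Toy setup (assumption): 12 input features, 5 classes, random weights
rng = np.random.default_rng(0)
W = rng.normal(size=(12, 5))
b = np.zeros(5)
x = rng.uniform(size=(1, 12))
# Use the model's own prediction as the label
y = np.eye(5)[softmax(x @ W + b).argmax(axis=-1)]

for eps in (0.01, 0.1, 0.3):
    x_adv = fgsm_step(x, y, W, b, eps)
    flipped = softmax(x_adv @ W + b).argmax(axis=-1) != y.argmax(axis=-1)
    print("eps =", eps, "label changed:", bool(flipped[0]),
          "loss:", float(cross_entropy(x_adv, y, W, b)[0]))

x_iter = iterative_fgsm(x, y, W, b, eps=0.1, eps_iter=0.02, nb_iter=10)
print("iterative max perturbation:", float(np.abs(x_iter - x).max()))
```

Whether the label flips for a given eps depends on the model and the input, which is why the loop prints the outcome rather than assuming it. In both variants the perturbation never exceeds eps per coordinate.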

Q2: y is used to specify labels in the case of an untargeted attack (any wrong class is considered a success for the adversary), whereas y_target is used to specify a target class in the targeted attack case (the adversary is successful only if the model misclassifies the input as the chosen class). It is often the case that targeted attacks require more perturbation (i.e., higher eps values in the FGSM case) than untargeted attacks.
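
The difference between the two modes can be sketched in NumPy as follows. This is an illustration under assumptions, not the cleverhans code: the toy linear softmax model (`W`, `b`) and the `targeted` flag on the hypothetical `fgsm` helper are made up for the example. The untargeted attack steps up the loss gradient for the label `y`; the targeted attack steps down the loss gradient for `y_target`.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def input_gradient(x, labels_onehot, W, b):
    # Gradient of the cross-entropy loss w.r.t. x for logits = x @ W + b
    return (softmax(x @ W + b) - labels_onehot) @ W.T

def fgsm(x, labels_onehot, W, b, eps, targeted=False):
    grad = input_gradient(x, labels_onehot, W, b)
    # Untargeted: increase the loss on the given label.
    # Targeted: decrease the loss on the target label, so the sign flips.
    direction = -np.sign(grad) if targeted else np.sign(grad)
    return np.clip(x + eps * direction, 0.0, 1.0)

# Toy setup (assumption): 12 input features, 5 classes, random weights
rng = np.random.default_rng(1)
W = rng.normal(size=(12, 5))
b = np.zeros(5)
x = rng.uniform(0.3, 0.7, size=(1, 12))

pred = softmax(x @ W + b).argmax(axis=-1)
y = np.eye(5)[pred]                    # label used by the untargeted attack
y_target = np.eye(5)[(pred + 1) % 5]   # an arbitrary class other than the prediction

x_untargeted = fgsm(x, y, W, b, eps=0.1)
x_targeted = fgsm(x, y_target, W, b, eps=0.1, targeted=True)

new_untargeted = softmax(x_untargeted @ W + b).argmax(axis=-1)
new_targeted = softmax(x_targeted @ W + b).argmax(axis=-1)
print("untargeted success (any other class):", bool(new_untargeted[0] != pred[0]))
print("targeted success (exactly the target):",
      bool(new_targeted[0] == y_target.argmax(axis=-1)[0]))
```

The targeted success condition is stricter than the untargeted one, which is consistent with targeted attacks often needing a larger eps.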

@ftarlao
Contributor Author

ftarlao commented Sep 25, 2018

Thank you for the clarifications. I'll read the articles and also try BasicIterativeMethod.

@ftarlao
Contributor Author

ftarlao commented Sep 26, 2018

I copied your answer to the SO question. Regards

@michaelshiyu
Contributor

I am getting a Checkpointable error on the following line:
adv_x = fgsm.generate_np(x_validation, **fgsm_params)
What could be the reason?

Could you please provide the entire error traceback? Thanks.

@maithal

maithal commented Sep 12, 2019

adv_x = fgsm.generate_np(x_validation, **fgsm_params)
[INFO 2019-09-12 15:19:43,534 cleverhans] Constructing new graph for attack FastGradientMethod
Traceback (most recent call last):

File "", line 1, in
adv_x = fgsm.generate_np(x_validation, **fgsm_params)

File "c:\users\maithal1\src\cleverhans\cleverhans\attacks\attack.py", line 186, in generate_np
self.construct_graph(fixed, feedable, x_val, hash_key)

File "c:\users\maithal1\src\cleverhans\cleverhans\attacks\attack.py", line 158, in construct_graph
x_adv = self.generate(x, **new_kwargs)

File "c:\users\maithal1\src\cleverhans\cleverhans\attacks\fast_gradient_method.py", line 50, in generate
labels, _nb_classes = self.get_or_guess_labels(x, kwargs)

File "c:\users\maithal1\src\cleverhans\cleverhans\attacks\attack.py", line 278, in get_or_guess_labels
preds = self.model.get_probs(x)

File "c:\users\maithal1\src\cleverhans\cleverhans\utils_keras.py", line 188, in get_probs
return self.get_layer(x, name)

File "c:\users\maithal1\src\cleverhans\cleverhans\utils_keras.py", line 249, in get_layer
output = self.fprop(x)

File "c:\users\maithal1\src\cleverhans\cleverhans\utils_keras.py", line 225, in fprop
self.keras_model = KerasModel(new_input, layer_outputs)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\training.py", line 113, in __init__
super(Model, self).__init__(*args, **kwargs)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 79, in __init__
self._init_graph_network(*args, **kwargs)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\training\checkpointable\base.py", line 364, in _method_wrapper
method(self, *args, **kwargs)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 266, in _init_graph_network
self._track_layers(layers)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\keras\engine\network.py", line 379, in _track_layers
layer, name='layer-%d' % layer_index, overwrite=True)

File "C:\Users\maithal1\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\training\checkpointable\base.py", line 616, in _track_checkpointable
"Checkpointable.") % (type(checkpointable),))

TypeError: Checkpointable._track_checkpointable() passed type <class 'keras.engine.input_layer.InputLayer'>, not a Checkpointable.

This is the entire error traceback.
Thanks!

@michaelshiyu
Contributor

Did this help?

@maithal

maithal commented Sep 12, 2019

Nope! I am getting the error.

@michaelshiyu
Contributor

Nope! I am getting the error.

Could you please provide the training script so that I can try and reproduce the bug?

@maithal

maithal commented Sep 12, 2019

http://everettsprojects.com/2018/01/30/mnist-adversarial-examples.html

I am trying the exact same script and exact same method.
