
DeepFool doesn't exactly match the latest reference implementation #283

Closed · jonasrauber opened this issue Mar 14, 2019 · 12 comments

@jonasrauber
Member

This was reported to me by @max-andr. Most of the differences are actually explicitly mentioned in comments in our implementation, but we should check again whether we can match the reference implementation more closely and possibly mention the deviations in the docs, not just in comments.

@max-andr might create a PR to fix this

@jonasrauber
Member Author

related to #282

@jonasrauber
Member Author

Apparently, the DeepFool reference implementation changed after release (and after the Foolbox version was created), which explains some of the undocumented deviations; see e.g. LTS4/DeepFool@10cf642.

@jonasrauber
Member Author

In case we switch to logits instead of softmax, we might want to keep softmax as an option.

@wielandbrendel
Member

By softmax you mean cross entropy? The difference of cross entropies is equivalent to the difference in logits.
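To make that concrete, here is a quick numerical sanity check (a standalone sketch with made-up logits, not Foolbox code): for a fixed input, the log-sum-exp term cancels, so the difference of cross-entropy losses for two target classes equals the difference of the corresponding logits, just with the order swapped.

import numpy as np

logits = np.array([2.0, -1.0, 0.5])                    # arbitrary example logits
log_softmax = logits - np.log(np.sum(np.exp(logits)))
cross_entropy = -log_softmax                           # CE loss for each possible target class

j, k = 0, 2
# CE(target k) - CE(target j) == logit_j - logit_k, so gradients of the two
# difference formulations coincide as well
assert np.isclose(cross_entropy[k] - cross_entropy[j], logits[j] - logits[k])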

@jonasrauber
Member Author

yes, I meant cross-entropy

@wielandbrendel
Member

Ok, but at least in that sentence our cross-entropy implementation should be equivalent to the logit-based original implementation.

@jonasrauber
Member Author

> in that sentence

which sentence?

@max-andr

> By softmax you mean cross entropy? The difference of cross entropies is equivalent to the difference in logits.

Sorry, I overlooked the fact that they are exactly equivalent. Then the cross-entropy part is fine.
So the main problem we encountered was actually #282.

So regarding this issue, there is only one question left.
In the official PyTorch implementation, the overshoot is applied on every step. However, the original paper suggests applying the overshoot only at the end, which also agrees with the official MATLAB implementation.
Foolbox implements the former and adds the overshoot term on every iteration:
perturbed = perturbed + 1.05 * perturbation

This difference is explicitly mentioned in the comments, so Foolbox users are potentially aware of it, which is good.

However, a potential problem with this implementation (I'm not sure how thoroughly the authors of DeepFool tested their PyTorch implementation) is that this kind of overshooting may fail in some cases. Namely, we observed that the per-step perturbation can become a zero vector (i.e. the point is exactly at the decision boundary), so on every iteration we just add 1.05 * 0 = 0. The point then stays exactly on the decision boundary, and not on its opposite side as the idea of overshooting would suggest.

I think this is actually fine to some extent (although it differs from the original paper), but the main question is whether you count such a point (where two classes have exactly the same maximum logit) as an adversarial example in the end, or whether it is decided non-deterministically which class is the argmax.
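A minimal sketch of that stall, using a made-up toy linear classifier (not the actual Foolbox code):

import numpy as np

# toy binary classifier: the decision boundary is the hyperplane w·x + b = 0
w, b = np.array([1.0, -2.0]), 0.5
f = lambda x: float(np.dot(w, x) + b)

# suppose earlier DeepFool steps have placed the point exactly on the boundary
perturbed = np.array([2.7, 1.6])                   # f(perturbed) == 0

# the next DeepFool step for a linear classifier is -f(x) * w / ||w||^2,
# which is the zero vector here, so the per-step overshoot adds nothing
perturbation = -f(perturbed) * w / np.dot(w, w)
perturbed = perturbed + 1.05 * perturbation        # unchanged, f(perturbed) is still 0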

@jonasrauber
Member Author

If I am not mistaken, it is deterministic and well-defined (numpy.argmax returns the smaller index); nevertheless, it might not necessarily be what we want.

@wielandbrendel
Member

wielandbrendel commented Mar 15, 2019

The PyTorch implementation performs the overshoot in a better way, by multiplying the total deviation:

x_adv = x_original + (1 + overshoot) * (x_adv - x_original)
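For the boundary case discussed above, applying the overshoot once to the total perturbation still pushes the point across, even when the last per-step perturbation was zero. A sketch with the same kind of made-up toy linear classifier as above:

import numpy as np

w, b = np.array([1.0, -2.0]), 0.5
f = lambda x: float(np.dot(w, x) + b)

x_original = np.array([3.0, 1.0])      # f(x_original) = 1.5, i.e. still the original class
x_adv = np.array([2.7, 1.6])           # DeepFool result sitting exactly on the boundary, f = 0

overshoot = 0.05
x_adv = x_original + (1 + overshoot) * (x_adv - x_original)
print(f(x_adv))                        # -0.075: now strictly on the other side of the boundary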

@max-andr

> If I am not mistaken, it is deterministic and well-defined (numpy.argmax returns the smaller index)

Indeed. From the docs:
"In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned."

> The PyTorch implementation performs the overshoot in a better way, by multiplying the total deviation:
> x_adv = x_original + (1 + overshoot) * (x_adv - x_original)

Oh yes, I didn't notice that. The variant that multiplies the total perturbation should work better.

@max-andr

> nevertheless, it might not necessarily be what we want.

It seems that the Misclassification criterion would work (i.e. report the point as adversarial) roughly 50% of the time in cases where two logits are exactly the same:
https://github.com/bethgelab/foolbox/blob/master/foolbox/criteria.py#L184

def is_adversarial(self, predictions, label):
    top1 = np.argmax(predictions)
    return top1 != label

i.e. if the tied class has a smaller index than label (so top1 < label), the point on the decision boundary between the two classes will be counted as an adversarial example, but if it has a larger index (so top1 == label), it will not.
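A small illustration of that asymmetry (a standalone sketch of the criterion's logic with made-up logits, not using the Foolbox classes):

import numpy as np

def is_adversarial(predictions, label):
    return np.argmax(predictions) != label

tied = np.array([1.0, 3.0, 3.0])        # classes 1 and 2 share the maximum logit

print(is_adversarial(tied, label=2))    # True:  argmax is 1 < label, counted as adversarial
print(is_adversarial(tied, label=1))    # False: argmax equals label, not counted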

That said, a proper overshooting scheme (i.e. multiplying the total perturbation by 1 + overshoot) will make such cases on the decision boundary extremely unlikely.
