Paper Discussion: Adversarial Logit Pairing #40

Closed
Simsso opened this issue Sep 2, 2018 · 2 comments
Labels: meeting (Important key results of meetings and appointment reminders), research (Scientific items)


Simsso commented Sep 2, 2018

Discussion of the paper "Adversarial Logit Pairing" by Harini Kannan, Alexey Kurakin, Ian Goodfellow (16 Mar 2018).

Let's see whether having an issue for a discussion event is helpful or unnecessary overhead. At least it's a good way of documenting it.

Abstract:

In this paper, we develop improved techniques for defending against adversarial examples at scale. First, we implement the state of the art version of adversarial training at unprecedented scale on ImageNet and investigate whether it remains effective in this setting - an important open scientific question (Athalye et al., 2018). Next, we introduce enhanced defenses using a technique we call logit pairing, a method that encourages logits for pairs of examples to be similar. When applied to clean examples and their adversarial counterparts, logit pairing improves accuracy on adversarial examples over vanilla adversarial training; we also find that logit pairing on clean examples only is competitive with adversarial training in terms of accuracy on two datasets. Finally, we show that adversarial logit pairing achieves the state of the art defense on ImageNet against PGD white box attacks, with an accuracy improvement from 1.5% to 27.9%. Adversarial logit pairing also successfully damages the current state of the art defense against black box attacks on ImageNet (Tramer et al., 2018), dropping its accuracy from 66.6% to 47.1%. With this new accuracy drop, adversarial logit pairing ties with Tramer et al.(2018) for the state of the art on black box attacks on ImageNet.
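To make the idea in the abstract concrete for the discussion, here is a minimal sketch of a logit pairing objective: the usual classification loss on clean and adversarial examples, plus a penalty that pulls the logits of each clean/adversarial pair together. The function names, the squared L2 penalty, the equal weighting of clean and adversarial cross-entropy, and the default `pairing_weight` are assumptions for illustration, not the authors' implementation.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed stably from raw logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def logit_pairing_penalty(logits_a, logits_b):
    """Squared L2 distance between two logit vectors (assumed penalty form)."""
    return sum((a - b) ** 2 for a, b in zip(logits_a, logits_b))

def alp_loss(clean_logits, adv_logits, labels, pairing_weight=0.5):
    """Mean cross-entropy on clean and adversarial examples,
    plus pairing_weight times the mean logit pairing penalty."""
    n = len(labels)
    ce = sum(
        softmax_cross_entropy(c, y) + softmax_cross_entropy(a, y)
        for c, a, y in zip(clean_logits, adv_logits, labels)
    ) / (2 * n)
    pairing = sum(
        logit_pairing_penalty(c, a)
        for c, a in zip(clean_logits, adv_logits)
    ) / n
    return ce + pairing_weight * pairing
```

When the clean and adversarial logits coincide, the penalty vanishes and the loss reduces to plain cross-entropy; the further the adversarial logits drift from the clean ones, the larger the extra term.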

Simsso added the meeting and research labels on Sep 2, 2018
Simsso added this to the 11. Working Group Meeting milestone on Sep 2, 2018

Simsso commented Sep 3, 2018

Just FYI: I have added

  • the attack "PGD" to our wiki, as well as
  • the corresponding defense, and
  • the defense "Mixup".
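For anyone who has not read the wiki entry yet, here is a rough sketch of the PGD attack under an L-infinity budget: repeatedly step in the direction of the sign of the loss gradient, then project back into the allowed perturbation ball and the valid input range. The name `pgd_linf`, the `grad_fn` callback, and all parameter names are hypothetical, and the random starting point that PGD is commonly run with is left out to keep the sketch deterministic.

```python
def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def pgd_linf(x, grad_fn, eps, step_size, num_steps, lo=0.0, hi=1.0):
    """Projected gradient ascent on the loss inside an L-infinity ball.

    x         -- clean input as a list of floats
    grad_fn   -- hypothetical callback returning d(loss)/d(input) as a list
    eps       -- maximum per-coordinate perturbation
    step_size -- size of each signed gradient step
    lo, hi    -- valid input range (e.g. pixel values in [0, 1])
    """
    x_adv = list(x)
    for _ in range(num_steps):
        grad = grad_fn(x_adv)
        for i, g in enumerate(grad):
            moved = x_adv[i] + step_size * sign(g)
            # Project back into the eps-ball around the clean input.
            moved = min(max(moved, x[i] - eps), x[i] + eps)
            # Keep the result inside the valid input range.
            x_adv[i] = min(max(moved, lo), hi)
    return x_adv
```

With a toy linear loss whose gradient is a fixed weight vector, every coordinate with a nonzero weight ends up pushed to the edge of the budget, which is the expected behaviour for this kind of attack.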

PGD... Not to be confused with PDG (paper discussion group)


Simsso commented Sep 4, 2018

Potential follow-up paper:
Certified Defenses against Adversarial Examples (more out of curiosity than for its relevance to the challenge)
