
Significant and fundamental flaws in methodology, analysis, and conclusions #13

Open
carlini opened this issue Feb 26, 2019 · 1 comment


carlini commented Feb 26, 2019

This framework is designed to "systematically evaluate the existing adversarial attack and defense methods". The research community would be well served by such an analysis. When new defenses are proposed, authors must choose which set of attacks to apply in order to perform an evaluation. A systematic evaluation of which attacks have been most effective in the past could help inform the decision of which attacks should be tried in the future. Similarly, when designing new attacks, a comprehensive review of defenses could help researchers decide which defenses to test against.

Unfortunately, the analysis performed in the DeepSec paper is fundamentally flawed and does not achieve any of these goals. It accurately measures neither the power of attacks nor the efficacy of defenses. I have filed a number of issues that summarize the many ways in which the report is misleading in its methodology and analysis. (Almost all of the conclusions are misleading as a result of these flaws. I do not comment on the conclusions themselves, but I expect they will need to be completely re-written once correct results are obtained.)

The issues raised are ordered roughly by importance (brief sketches after the list illustrate #2, #3, and #9):

#1 Attacks are not run on defenses in an all-pairs manner
#2 Paper uses averages instead of the minimum for security analysis
#3 FGSM implementation is incorrect
#4 PGD adversarial training is implemented incorrectly
#5 Computing the average over different threat models is meaningless
#6 Comparing attack effectiveness is done incorrectly
#7 Epsilon values studied are too large to be meaningful
#8 Detection defenses set per-attack thresholds
#9 Attack success rate decreases with distortion bound
#10 Reporting success rate of unbounded attacks is meaningless
#11 Paper does not report attack success rate for targeted adversarial examples
#12 Discrepancies between tables, text, and code
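
To make #2 concrete, here is a minimal sketch (hypothetical numbers, not DEEPSEC code) of why the minimum over attacks, rather than the average, is the security-relevant statistic: an adversary simply uses the best attack for each example.

```python
import numpy as np

# correct[i, j] = 1 if the defended model still classifies example j
# correctly under attack i; three hypothetical attacks, four examples.
correct = np.array([
    [1, 1, 1, 1],   # attack A fails on every example
    [1, 1, 1, 1],   # attack B fails on every example
    [0, 0, 0, 0],   # attack C breaks every example
])

avg_robust_acc = correct.mean()               # averaging over attacks: 67%
min_robust_acc = correct.min(axis=0).mean()   # worst case over attacks: 0%

print(f"average: {avg_robust_acc:.0%}, worst case: {min_robust_acc:.0%}")
```

Averaging reports a model that attack C completely breaks as 67% robust; the minimum correctly reports 0%.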
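
For reference against #3, the standard untargeted FGSM of Goodfellow et al. is a single signed-gradient step of size epsilon. A minimal PyTorch sketch, assuming a classifier `model` and inputs `x`, `y` with pixels in [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One step along the sign of the input gradient, then clip back
    # to the valid pixel range.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```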
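
On #9: the set of perturbations allowed under bound epsilon is contained in the set allowed under any larger bound, so the success rate of a correctly implemented attack should be non-decreasing in epsilon; a measured decrease indicates the attack is failing to find adversarial examples it should find.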

ryderling (Owner) commented

Many thanks for your careful review of our methodology, experiments, and analysis.

To point out first, we evaluate the existing adversarial attack/defense methods in non-adaptive scenarios from a statistical or practical security perspective. That is, we try to make overall observations about different types of attacks and defenses, instead of only reporting whether one particular attack or defense is secure in the worst case. We hope different metrics can be useful for users with different purposes, rather than only reporting the worst-case results as suggested, which is essentially saying that no model is secure under strong adaptive attackers.

In addition, DEEPSEC is a first step toward evaluating the performance of adversarial attacks and defenses for a deeper understanding of adversarial machine learning. Although the observations we made in our evaluations might not be 100% exact under other settings or scenarios, we think more people can reuse this repository to examine other datasets, applications, and settings, and more interesting and meaningful observations can be obtained for different purposes. That is the reason we open-sourced DEEPSEC, and we welcome more contributions and discussion in this direction.
