Discrepancies between tables, text, and code #12

Open
carlini opened this issue Feb 26, 2019 · 7 comments

carlini commented Feb 26, 2019

Table XIII states that on CIFAR-10 the R+FGSM attack was executed with eps=0.05 and alpha=0.05 whereas the README in the Attack module of the open source code gives eps=0.1 and alpha=0.5. I assume the code is correct and the table is wrong. Table XIII states that the “box” constraint for CWL2 is set to -0.5, 0.5 but in the code the (correct) values of 0.0, 1.0 are used.

Other hyperparameters are completely missing (e.g., Table XIII does not give the number of iterations used for any of the gradient-based attacks). This is especially confusing when the default values differ from the original attack implementations; for example, this code sets the number of binary search steps for CW2 to 5 (and does not state this in the paper) whereas the original code uses the value 10; fortunately, this setting often has only a minimal impact on accuracy.

@ryderling
Owner

Because all of the code was restructured for better readability and consistency after we finished the paper, some discrepancies do exist, but most of them do not affect the final results. For example, for epsilon and alpha, both the code and Table XIII are correct: the two notations do not match exactly but produce the same result. In the paper, 'eps' = 'alpha' = 0.05 are absolute distortion values, whereas in the code eps=0.1 is the total epsilon distortion and alpha=0.5 is the ratio of the random step, so the absolute size of the random step is 0.1*0.5=0.05. As for the CWL2 box constraint, we made it the same as in the other attacks (FGSM, BIM, etc.) for consistency.

In general, we detailed only the hyperparameters that have a large impact on the attack or defense, rather than all parameters. As for the number of binary search steps, we suggest you carefully check the default value in CW2_Generation.py:
https://github.com/kleincup/DEEPSEC/blob/master/Attacks/CW2_Generation.py#L106
It is 10 binary search steps, not 5 as you mentioned.
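
To make concrete what the number of binary search steps controls, here is a minimal sketch of a search over the trade-off constant in the style commonly used for C&W-type attacks. It is not taken from DEEPSEC or from the original attack code: `search_constant`, `attack_succeeds`, the starting constant, and the toy success threshold are all hypothetical stand-ins for a real optimization run.

```python
def search_constant(attack_succeeds, steps, initial_const=1e-3):
    """Search for a small constant at which the (hypothetical) attack succeeds."""
    lower, upper = 0.0, 1e10
    const = initial_const
    best = None  # smallest successful constant seen so far
    for _ in range(steps):
        if attack_succeeds(const):
            best = const if best is None else min(best, const)
            upper = min(upper, const)
            const = (lower + upper) / 2
        else:
            lower = max(lower, const)
            # Until any success is seen, grow the constant tenfold.
            const = (lower + upper) / 2 if upper < 1e9 else const * 10
    return best


def toy_oracle(const):
    # Hypothetical stand-in: pretend the attack succeeds once const >= 0.3.
    return const >= 0.3


best_5 = search_constant(toy_oracle, steps=5)
best_10 = search_constant(toy_oracle, steps=10)
print(best_5, best_10)
```

In this toy setting both runs find a successful constant, and the extra steps only tighten it toward the threshold. That is one way to read the remark that the setting often has only a minimal impact on accuracy, although a real attack may behave differently.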

@carlini
Author

carlini commented Mar 16, 2019

If you read the original paper that proposes R+FGSM it defines alpha as the initial step size that's taken randomly, and then (epsilon-alpha) as the gradient step size. So clearly according to this definition the table numbers are incorrect: otherwise the gradient step taken would be wrong. To make consistent notation I would suggest you change this.

I don't understand what you mean by the box constraint is [-0.5, 0.5] on CW2. Setting this to the box constraint would clip actual images that you have which are in [0,1] to a maximum value of solid grey and allow the attack to introduce values of -0.5 which isn't within the typical data.

Again, this is a minor issue compared to the others.

@ryderling
Owner

The Table XIII numbers are correct.
Let me explain more clearly.
For CIFAR-10, the total distortion budget is epsilon = 0.1.
If the ratio of alpha to epsilon is 0.5, then the absolute size of the random step is alpha = 0.1 * 0.5 = 0.05,
and the remaining budget for the non-random step is 0.1 - 0.05 = 0.05.
Therefore, Table XIII is correct in stating that on CIFAR-10 the R+FGSM attack was executed with eps=0.05 and alpha=0.05.
As for the code, 'alpha' is actually the ratio of alpha to the budget epsilon, as we comment in the code.
You can find more details here:
https://github.com/kleincup/DEEPSEC/blob/master/Attacks/RFGSM_Generation.py#L92
https://github.com/kleincup/DEEPSEC/blob/master/Attacks/AttackMethods/RFGSM.py#L48

As for the difference in box constraints, the paper reports [-0.5, 0.5] because we normalized images into [-0.5, 0.5] when we ran the experiments and wrote the paper.
After we restructured our code, we normalized images into [0, 1], so the box constraint became [0, 1].
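
The relationship between the two box constraints can be illustrated with a small sketch. The helper names (`to_centered`, `to_unit`, `clip`) and the sample pixel values are made up for illustration and do not come from the DEEPSEC code. The box only makes sense together with the normalization it was chosen for.

```python
def to_centered(pixels):
    """Map pixel values from [0, 1] to [-0.5, 0.5]."""
    return [v - 0.5 for v in pixels]


def to_unit(pixels):
    """Map pixel values from [-0.5, 0.5] back to [0, 1]."""
    return [v + 0.5 for v in pixels]


def clip(pixels, lo, hi):
    """Clamp every pixel value into the box [lo, hi]."""
    return [min(max(v, lo), hi) for v in pixels]


pixels = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical image in [0, 1]

# Box matches the normalization: nothing changes.
matched = clip(pixels, 0.0, 1.0)

# Box from the other normalization: everything above 0.5 is capped at grey.
mismatched = clip(pixels, -0.5, 0.5)

# Shift to [-0.5, 0.5], clip with that box, shift back: nothing changes.
round_trip = to_unit(clip(to_centered(pixels), -0.5, 0.5))

print(matched, mismatched, round_trip)
```

Under these assumptions, a [-0.5, 0.5] box is equivalent to a [0, 1] box only if the images are shifted by 0.5 first. Applied directly to [0, 1] images, it caps bright pixels at 0.5.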

@carlini
Author

carlini commented Mar 17, 2019

Because the DeepSec paper doesn't give a definition of R+FGSM and cites Tramer et al., the only way to interpret what epsilon and alpha mean is by referring to their original paper. Equation 7 in Ensemble Adversarial Training defines

x^{adv} = x' + (\epsilon - \alpha) * sign(\nabla_{x'} J(x', y_{true}))

According to this definition, \epsilon should be equal to 0.1, and \alpha should be equal to 0.05, assuming you would like to first step 0.05 randomly and then step 0.05 again.
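
A minimal pure-Python sketch of the update above, under several assumptions: the random step is taken as \alpha times the sign of Gaussian noise (as I understand the same paper to define x'), pixels are clipped to [0, 1], and `toy_grad` is a hypothetical stand-in for the gradient of a real model's loss. The function names are illustrative and are not the DEEPSEC API.

```python
import random


def sign(v: float) -> int:
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def clip01(v: float) -> float:
    """Assumption: pixels live in [0, 1], so every step is clipped to that box."""
    return min(max(v, 0.0), 1.0)


def rfgsm(x, grad_fn, epsilon, alpha, rng):
    """Single R+FGSM step on a flat list of pixel values."""
    # Random step of size alpha: x' = x + alpha * sign(N(0, I)).
    x_prime = [clip01(xi + alpha * sign(rng.gauss(0.0, 1.0))) for xi in x]
    # Gradient step of size (epsilon - alpha), with the gradient taken at x'.
    grad = grad_fn(x_prime)
    return [clip01(xp + (epsilon - alpha) * sign(g)) for xp, g in zip(x_prime, grad)]


def toy_grad(x):
    # Hypothetical gradient of a linear loss J(x) = sum(w_i * x_i).
    return [1.0, -2.0, 0.5]


x = [0.2, 0.5, 0.8]
x_adv = rfgsm(x, toy_grad, epsilon=0.1, alpha=0.05, rng=random.Random(0))
max_shift = max(abs(a - b) for a, b in zip(x_adv, x))
print(x_adv, max_shift)
```

With \epsilon = 0.1 and \alpha = 0.05, each coordinate moves by at most 0.1 in total. If \epsilon and \alpha were both set to 0.05 in this notation, the gradient step would have size zero and the output would no longer depend on the gradient.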

@ftramer

ftramer commented Mar 17, 2019

I agree with Nicholas that this discrepancy in notation for R+FGSM is very confusing. Here's my take:

  • In our original paper on R+FGSM, \epsilon refers to the total perturbation budget, and \alpha refers to the budget for the initial random step. That is, you first take a random step of size \alpha, and then a non-random step of size \epsilon-\alpha, as Nicholas describes above.

  • Your code implements R+FGSM "correctly", but uses the command-line parameter "alpha" to refer to the ratio \frac{\alpha}{\epsilon} of the parameters of the original R+FGSM attack. This is super confusing. If you really want to think in terms of this ratio, just rename this parameter to \beta or something and comment it appropriately.

  • Table XIII reports yet a different set of parameters. What the table calls \alpha seems to be the same as the \alpha in our original paper (and so is different from the alpha parameter in the code...). What the table calls \epsilon, refers to the residual budget \epsilon-\alpha in our paper. Indeed, otherwise \epsilon=\alpha=0.05 doesn't make sense. In our original notation, this corresponds to a random step of 0.05, followed by a gradient step of size 0...

My suggestion for fixing this would be the following:

  • In your code, rename the ratio of \epsilon and \alpha to something else, e.g., \beta and comment this properly (or have the user pass in parameters \epsilon and \alpha that are defined as in the original paper).
  • In the table, use the parameters \epsilon and \alpha as they are defined in our paper, unless you clearly mention the difference in notation. So for MNIST, this should be \epsilon=0.3, \alpha=0.15 and for CIFAR10, this should be \epsilon=0.1, \alpha=0.05. The same thing holds for the R+LLC attack. This will make this table much easier to read, as all the attacks use the same \epsilon parameter. Defining \epsilon differently for the R+FGSM and R+LLC attacks is confusing, and seems to suggest that these attacks operate in a different attack model.
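
The three conventions discussed in this thread can be reconciled mechanically. The sketch below treats the original paper's (\epsilon, \alpha) as canonical. The class and function names are hypothetical, and the "table convention" is the reading of Table XIII given in this thread, not something confirmed by the paper itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RFGSMParams:
    """R+FGSM parameters in the original paper's notation."""
    epsilon: float  # total perturbation budget
    alpha: float    # size of the initial random step

    @property
    def gradient_step(self) -> float:
        # Size of the non-random step: epsilon - alpha.
        return self.epsilon - self.alpha


def from_code_convention(eps: float, alpha_ratio: float) -> RFGSMParams:
    """Code convention: eps is the total budget, alpha_ratio = alpha / eps."""
    return RFGSMParams(epsilon=eps, alpha=eps * alpha_ratio)


def from_table_convention(eps: float, alpha: float) -> RFGSMParams:
    """Table convention (as read in this thread): eps is the residual budget."""
    return RFGSMParams(epsilon=eps + alpha, alpha=alpha)


# CIFAR-10 values quoted in the thread.
code = from_code_convention(eps=0.1, alpha_ratio=0.5)
table = from_table_convention(eps=0.05, alpha=0.05)
print(code, table, code.gradient_step)
```

Both conversions land on the same canonical pair, \epsilon = 0.1 and \alpha = 0.05, with a gradient step of 0.05. This is consistent with the claim that the code and the table describe the same attack under different names.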

@ryderling
Owner

Thanks for your suggestion. I will rename the parameter 'alpha' to 'alpha_ratio'.

@ryderling
Owner

The parameter 'alpha' has been renamed to 'alpha_ratio' in 2c67afa.
