
PatchFool implementation #2163

Open · wants to merge 16 commits into main
Conversation

@sechkova (Contributor) commented May 24, 2023

Description

Initial draft implementation of the PatchFool attack from the paper:

Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?

Currently there is an example notebook of the attack in Colab. I plan to contribute the notebook too once it is ready.

Fixes # (issue)

Type of change

Please check all relevant options.

  • Improvement (non-breaking)
  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Testing

Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.

  • Test A
  • Test B

Test Configuration:

  • OS
  • Python version
  • ART version or commit number
  • TensorFlow / Keras / PyTorch / MXNet version

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@codecov-commenter commented May 24, 2023

Codecov Report

Attention: 21 lines in your changes are missing coverage. Please review.

Comparison is base (3de2078) 85.08% compared to head (da05de1) 85.16%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2163      +/-   ##
==========================================
+ Coverage   85.08%   85.16%   +0.07%     
==========================================
  Files         324      325       +1     
  Lines       29331    29480     +149     
  Branches     5409     5431      +22     
==========================================
+ Hits        24956    25106     +150     
+ Misses       2997     2973      -24     
- Partials     1378     1401      +23     
Files Coverage Δ
art/attacks/evasion/__init__.py 98.24% <100.00%> (+0.03%) ⬆️
art/estimators/pytorch.py 84.73% <76.92%> (-0.99%) ⬇️
art/attacks/evasion/patchfool.py 86.76% <86.76%> (ø)

... and 12 files with indirect coverage changes

@sechkova (Contributor Author)

This is only a draft implementation, but I wanted to discuss a few issues that I am facing.

The first one comes from getting the attention weights of a transformer model. I added one implementation for the ViT model that comes pre-trained from the torchvision models library (the paper's authors use DeiT, but I am more familiar with this model's architecture). The problem I see is that it is very challenging to implement one common method that extracts the weights, even across different implementations of the same model architecture. Extracting the weights in my case required tracing the model graph and even changing one of the operations.
One way to go would be to provide a classifier that works only for one specific model. Alternatively, the ART user who provides the model could also provide the method that extracts the weights, with ART supplying an abstract class and an example. But there may be a better option that I cannot see right now.
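As an illustration of the graph-tracing approach mentioned above, here is a minimal sketch using torchvision's feature-extraction utilities on a torchvision ViT; the model choice and the node name in `return_nodes` are assumptions and would have to be looked up for the concrete model:

```python
import torch
import torchvision
from torchvision.models.feature_extraction import (
    get_graph_node_names,
    create_feature_extractor,
)

# Pre-trained torchvision ViT; vit_b_16 is only an example.
model = torchvision.models.vit_b_16(weights="IMAGENET1K_V1").eval()

# Trace the model graph and list the node names that can be returned.
train_nodes, eval_nodes = get_graph_node_names(model)

# Hypothetical node name: the real one must be picked from `eval_nodes`,
# and getting the actual attention weights may additionally require changing
# the attention call itself (e.g. forcing it to return the weights), as noted above.
return_nodes = ["encoder.layers.encoder_layer_0.self_attention"]
extractor = create_feature_extractor(model, return_nodes=return_nodes)

x = torch.rand(1, 3, 224, 224)
features = extractor(x)  # dict mapping node names to intermediate outputs
```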

The second issue is that the PyTorch model I used behaves incorrectly if the benign input is cast to float, which makes it hard to test the attack (there is an example in the attack's notebook). Is this a problem coming from the mixture of frameworks? Have you seen such behaviour before?

@beat-buesser beat-buesser self-requested a review May 25, 2023 16:44
@beat-buesser beat-buesser self-assigned this May 25, 2023
@beat-buesser beat-buesser added the enhancement New feature or request label May 25, 2023
@beat-buesser (Collaborator)

Hi @sechkova, thank you very much for your pull request!

Regarding your first question, I agree that general support for all possible architectures is challenging, if not impossible. ART does have multiple model-specific estimators, for example art.estimators.object_detection.PyTorchYolo, that are easier to implement and maintain. I think this approach would be the best for your PR too.

About your second question, does the model you are working with expect integer arrays as input? If yes, you could accept float arrays as input to your new ART tools to follow the ART APIs, and convert them to integer arrays inside the tools before providing the input data to the model. We would have to investigate how this conversion affects the adversarial attacks.

@sechkova (Contributor Author) commented Jul 17, 2023

About your second question, does the model you are working with expect integer arrays as input? If yes, you could accept float arrays as input to your new ART tools to follow the ART APIs, and convert them to integer arrays inside the tools before providing the input data to the model. We would have to investigate how this conversion affects the adversarial attacks.

In the end I used convert_image_dtype from PyTorch, which both converts and scales the values, and now the model works properly. I couldn't figure out how the other attacks' implementations manage to handle this.
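For context, a minimal sketch of that conversion; the shapes and value ranges are only illustrative:

```python
import torch
from torchvision.transforms.functional import convert_image_dtype

# Benign uint8 image batch with values in [0, 255].
x_uint8 = torch.randint(0, 256, (1, 3, 224, 224), dtype=torch.uint8)

# convert_image_dtype both casts to float and rescales to [0, 1],
# unlike a plain .float() cast, which keeps the values in [0, 255]
# and is a plausible cause of the behaviour described earlier.
x_float = convert_image_dtype(x_uint8, torch.float32)
```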

@sechkova (Contributor Author)

Regarding your first question, I agree that general support for all possible architectures is challenging, if not impossible. ART does have multiple model-specific estimators, for example art.estimators.object_detection.PyTorchYolo, that are easier to implement and maintain. I think this approach would be the best for your PR too.

For now I added art.estimators.classification.PyTorchDeiT, but the way I've hardcoded the attention layers works, I think, only with either PyTorch < 2.0 or with 'TIMM_FUSED_ATTN' set to '0'.

@sechkova sechkova marked this pull request as ready for review August 25, 2023 13:56
@sechkova (Contributor Author)

@beat-buesser the PR is updated and the attack algorithm now shows good results.
Can you do an initial review?

What I think still needs to be resolved is the custom PyTorch DeiT classifier. For now I have implemented just the very basics needed for the attack to work with a pre-trained model from timm. It involves hardcoding the layer names, so there is a difference between PyTorch versions, which I've circumvented by setting 'TIMM_FUSED_ATTN' = '0' (you can see the example notebook below). It is certainly not a very elegant approach.
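For reference, a minimal sketch of the workaround described above; the model name is just an example:

```python
import os

# Set before importing timm so that it builds the non-fused attention
# blocks whose intermediate weights can be extracted.
os.environ["TIMM_FUSED_ATTN"] = "0"

import timm

model = timm.create_model("deit_tiny_patch16_224", pretrained=True).eval()
```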

Here is an example notebook that I wish to contribute once the implementation is finalised:
https://colab.research.google.com/drive/1QfdZEUI0hhO-AYFL12RZvB0dA95l2NAS?usp=sharing

@beat-buesser (Collaborator) left a comment


Hi @sechkova, thank you very much for implementing the PatchFool attack in ART! I have added a few comments in my review; please take a look and let me know what you think. In addition, could you please add a unit test in pytest format for the new attack class and a notebook showing how the implementation reproduces the original paper?

@@ -0,0 +1,258 @@
# MIT License
#
# Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2022

Suggested change
# Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2022
# Copyright (C) The Adversarial Robustness Toolbox (ART) Authors 2023

@@ -67,3 +67,4 @@
from art.attacks.evasion.wasserstein import Wasserstein
from art.attacks.evasion.zoo import ZooAttack
from art.attacks.evasion.sign_opt import SignOPTAttack
from art.attacks.evasion.patchfool import PatchFool

Suggested change
from art.attacks.evasion.patchfool import PatchFool
from art.attacks.evasion.patchfool import PatchFoolPyTorch

):
"""
Create a :class:`PatchFool` instance.
TODO

Is there still a TODO here?


def _generate_batch(self, x: "torch.Tensor", y: Optional["torch.Tensor"] = None) -> "torch.Tensor":
"""
TODO

Please update docstring.

def _get_patch_index(self, x: "torch.Tensor", layer: int) -> "torch.Tensor":
"""
Select the most influential patch according to a predefined `layer`.
TODO

Please update docstring.

def _get_attention_loss(self, x: "torch.Tensor", patch_idx: "torch.Tensor") -> "torch.Tensor":
"""
Sum the attention weights from each layer for the most influential patches
TODO

Please update docstring.


def pcgrad(self, grad1, grad2):
"""
TODO

Please update docstring.

"""
return self.model.patch_embed.patch_size[0]

def get_attention_weights(self, x: Union[np.ndarray, "torch.Tensor"]) -> "torch.Tensor":

I think this method could be of interest for other models too. Please move it to PyTorchEstimator and generalise it by making return_nodes a list of strings provided by the user as an argument.
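A minimal sketch of the generalised method suggested here; the exact signature and the use of torchvision's create_feature_extractor are assumptions based on this discussion, not the final implementation:

```python
from typing import Dict, List

import torch
from torchvision.models.feature_extraction import create_feature_extractor


def get_attention_weights(self, x: "torch.Tensor", return_nodes: List[str]) -> Dict[str, "torch.Tensor"]:
    """
    Return intermediate outputs (e.g. attention weights) for the graph nodes
    named in `return_nodes`, which the user supplies for their model.
    """
    # Building the extractor from user-supplied node names keeps the
    # estimator model-agnostic; gradients still flow through the outputs.
    extractor = create_feature_extractor(self.model, return_nodes=return_nodes)
    return extractor(x)
```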

)

@property
def patch_size(self):

Shouldn't the patch size be defined on the attack side? If yes, we could just reuse the existing PyTorchClassifier.

optim = torch.optim.Adam([perturbation], lr=self.learning_rate)
scheduler = torch.optim.lr_scheduler.StepLR(optim, step_size=self.step_size, gamma=self.step_size_decay)

for i_max_iter in tqdm(range(self.max_iter)):

The variable i_max_iter seems not to be used; you can replace it with _ to avoid the CodeQL alert.
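A minimal sketch of the suggested change (the verbose flag corresponds to the option added in a later commit):

```python
# `_` marks the loop counter as intentionally unused and silences the CodeQL alert.
for _ in tqdm(range(self.max_iter), disable=not self.verbose):
    ...
```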

Add a new evasion attack on vision transformers.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Skip the class token when calculating the most influential image
patch.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Update classifier to use DeiT from the timm library.
Fix algorithm details.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- Calculate the attention loss as negative log likelihood
- Clamp perturbations after random init

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- Fix input normalisation and scaling.
- Fix patch application to happen only once after final iteration
- Add skip_loss_att option

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Use tqdm indication bar showing the attack iterations.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- Move get_attention_weights to PyTorchEstimator and generalise it
  by making return_nodes a list of strings provided by the user as an argument.

- Define patch size on the attack side.
- Remove PyTorchClassifierDeiT and reuse the existing PyTorchClassifier.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add verbose option for tqdm.
Remove unused variable i_max_iter.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Use directly the attribute patch_layer.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
import os

Check notice (Code scanning / CodeQL): Unused import. Import of 'os' is not used.
import pytest

from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.classifier import ClassGradientsMixin

Check notice (Code scanning / CodeQL): Unused import. Import of 'ClassGradientsMixin' is not used.

from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.classifier import ClassGradientsMixin
from art.estimators.classification.pytorch import PyTorchClassifier

Check notice (Code scanning / CodeQL): Unused import. Import of 'PyTorchClassifier' is not used.
from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.classifier import ClassGradientsMixin
from art.estimators.classification.pytorch import PyTorchClassifier
from art.estimators.estimator import BaseEstimator

Check notice (Code scanning / CodeQL): Unused import. Import of 'BaseEstimator' is not used.
from art.estimators.classification.pytorch import PyTorchClassifier
from art.estimators.estimator import BaseEstimator

from tests.attacks.utils import backend_test_classifier_type_check_fail

Check notice (Code scanning / CodeQL): Unused import. Import of 'backend_test_classifier_type_check_fail' is not used.
@sechkova (Contributor Author)

Hi @sechkova, thank you very much for implementing the PatchFool attack in ART! I have added a few comments in my review; please take a look and let me know what you think. In addition, could you please add a unit test in pytest format for the new attack class and a notebook showing how the implementation reproduces the original paper?

@beat-buesser Can you advise how the tests should be defined? The PatchFool attack works on transformer models, using information from the attention layers to compute the attack. I can use a downloaded pre-trained model for the tests, but such models are usually trained on ImageNet, while the tests in ART use other, smaller test datasets. This causes issues with the number of classes, etc.

I added one initial draft test with the last commit (da05de1).
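For illustration, one way around the dataset mismatch could be to instantiate an untrained DeiT with a reduced input size and class count instead of an ImageNet-pretrained model; everything below, including the PatchFoolPyTorch constructor arguments, is an assumption rather than the PR's actual test:

```python
import numpy as np
import timm
import torch

from art.attacks.evasion import PatchFoolPyTorch
from art.estimators.classification.pytorch import PyTorchClassifier


def test_patchfool_generate():
    # Untrained DeiT resized to small inputs and 10 classes, so no ImageNet data is needed.
    model = timm.create_model("deit_tiny_patch16_224", pretrained=False, num_classes=10, img_size=32)
    classifier = PyTorchClassifier(
        model=model,
        loss=torch.nn.CrossEntropyLoss(),
        input_shape=(3, 32, 32),
        nb_classes=10,
        clip_values=(0.0, 1.0),
    )
    attack = PatchFoolPyTorch(classifier)  # constructor arguments are hypothetical
    x = np.random.rand(2, 3, 32, 32).astype(np.float32)
    x_adv = attack.generate(x)
    assert x_adv.shape == x.shape
```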

Labels: enhancement (New feature or request)
3 participants