# [Dynamic Generative Targeted Attacks with Pattern Injection](https://ieeexplore.ieee.org/document/10203887/)

- The cross-attention guided convolution module consists of a static convolutional kernel and a dynamic convolutional kernel that is computed from the input instance. This mix of static and dynamic kernels can therefore encode the global information of the dataset while also learning a specialized convolutional kernel for each input instance.
- The pattern injection module is designed to model the pattern or style information of the target class and to guide the generation of targeted adversarial examples. Concretely, we propose a pattern prototype that learns a global pattern representation over images from the target class, and we use this prototype to guide the generation of more transferable targeted adversarial examples.

![image.png](attachment:image.png)

### Instance-specific Attacks

However, these methods exhibit only modest transferability in the targeted-attack setting, because they **rely too heavily on the target label and the classification boundary of the white-box model**. They also all suffer from data-specific **overfitting**, because they ignore the global data distribution.

### Instance-agnostic Attacks

Unlike instance-specific attacks, instance-agnostic attacks learn a **universal perturbation** or a **generative function** to craft adversarial examples.

Existing generative attack methods apply **the same network weights to every input instance**, which may limit the transferability of adversarial examples. Most of them also **rely too heavily on the target label and the classification boundary of the white-box model**, ignoring the realistic data distribution of the target class.

## Preliminaries and Motivation

![image.png](attachment:image.png)

For an input image $x$, we propose to group all causes of $x$ into two categories: a **content-related cause $C$ and a content-independent cause $S$**, the latter of which can be called the style or pattern cause. This gives $C \to x \leftarrow S$ and $C \perp S$. According to human visual intuition, only the content variable $C$ is relevant to the predicted class $y$.

![image-2.png](attachment:image-2.png)

Deep learning models can learn not only the dependency between the content $C$ and the label $y$, but also the statistical correlation between the style $S$ and the label $y$ (i.e., $P_{\theta}(y \mid x, s)$).

![image-3.png](attachment:image-3.png)

We first propose to exploit the statistical correlation between $y_{t}$ and $S$ (i.e., $P_{\theta}(y_{t} \mid x_{adv}, s)$) by injecting the specific style or pattern of images from the given target class $y_{t}$, so as to generate transferable targeted adversarial examples.

![image-4.png](attachment:image-4.png)

$W$ is a smoothing operator with fixed weights, and $p_{t}$ represents the semantic pattern or style of the target class.



## Network Architecture

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

### Cross-attention guided dynamic convolution module

![image-5.png](attachment:image-5.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

$N$ denotes the number of learnable kernels, and $k$ denotes the kernel size.

![image-6.png](attachment:image-6.png)

![image-7.png](attachment:image-7.png)

![image-8.png](attachment:image-8.png)

![image-9.png](attachment:image-9.png)

We then obtain the attention weights $\alpha = [\alpha_{1}, \alpha_{2}, \cdots, \alpha_{N}] \in \mathbb{R}^{1 \times N}$ by applying average pooling to $att$.

$$
\Delta W = \alpha_{1} * W_{1} + \alpha_{2} * W_{2} + \cdots + \alpha_{N} * W_{N}
$$
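
The pooling and weighted-sum steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, `att` is assumed to have one row per pooled position and one column per kernel, and each kernel $W_{i}$ is stored as a flattened list. None of these layout choices are taken from the paper.

```python
def average_pool(att):
    """Average an M x N attention map over its M rows, giving N weights.

    Assumption: rows are the positions being pooled, columns index the kernels.
    """
    num_rows = len(att)
    num_kernels = len(att[0])
    return [sum(row[i] for row in att) / num_rows for i in range(num_kernels)]


def aggregate_kernels(alpha, kernels):
    """Return the weighted sum of the kernels: sum_i alpha[i] * kernels[i].

    Assumption: every kernel is a flat list of the same length.
    """
    size = len(kernels[0])
    return [sum(a * w[j] for a, w in zip(alpha, kernels)) for j in range(size)]


# Toy example with N = 2 kernels, each a flattened 2 x 2 kernel.
att = [[0.2, 0.8],
       [0.6, 0.4]]
kernels = [[1.0, 0.0, 0.0, 1.0],
           [0.0, 2.0, 2.0, 0.0]]

alpha = average_pool(att)                    # one weight per kernel
delta_w = aggregate_kernels(alpha, kernels)  # instance-specific kernel
print(alpha, delta_w)
```

Because $\alpha$ is computed from the input, each instance receives its own $\Delta W$ even though the $N$ base kernels are shared across the dataset.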

### Pattern injection module

$p_{t} = \{γ_{t}, β_{t}\}$

![image-10.png](attachment:image-10.png)

We update $p^{running}_{t}$ at each training iteration as $p^{running}_{t} = \lambda p_{t} + (1 - \lambda) p^{running}_{t}$.

During inference, $p_{t} = \{\gamma_{t}, \beta_{t}\}$ must be replaced with $p^{running}_{t}$.
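
The running-prototype update and the train/inference switch can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary layout with `gamma` and `beta` lists, the function names, and the value of `lam` are all assumptions made for the example.

```python
def update_running_prototype(running, current, lam):
    """Apply p_running = lam * p_t + (1 - lam) * p_running to gamma and beta."""
    return {
        key: [lam * c + (1.0 - lam) * r
              for c, r in zip(current[key], running[key])]
        for key in running
    }


def select_prototype(current, running, training):
    """Use the batch prototype p_t in training and p_running at inference."""
    return current if training else running


# Hypothetical two-channel prototype; the numbers are illustrative only.
p_running = {"gamma": [1.0, 1.0], "beta": [0.0, 0.0]}
p_t = {"gamma": [2.0, 0.0], "beta": [1.0, -1.0]}

p_running = update_running_prototype(p_running, p_t, lam=0.1)
p_used = select_prototype(p_t, p_running, training=False)
print(p_used)
```

A small $\lambda$ makes $p^{running}_{t}$ change slowly, so the prototype used at inference reflects many training batches rather than the last one alone.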

### Objective Function

![image-11.png](attachment:image-11.png)

![image-12.png](attachment:image-12.png)

![image-13.png](attachment:image-13.png)

![image-14.png](attachment:image-14.png)

![image-15.png](attachment:image-15.png)

![image-16.png](attachment:image-16.png)

![image-17.png](attachment:image-17.png)

### Theoretical Analyses

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

![image-5.png](attachment:image-5.png)

![image-6.png](attachment:image-6.png)

![image-7.png](attachment:image-7.png)

![image-8.png](attachment:image-8.png)

![image-9.png](attachment:image-9.png)

![image-10.png](attachment:image-10.png)