> Reference: Adversarial Attacks with Carlini & Wagner Approach: https://medium.com/@zachariaharungeorge/adversarial-attacks-with-carlini-wagner-approach-8307daa9a503

# Formulating the C&W Attack as an Optimization Problem

The Carlini & Wagner (C&W) attack formulates the generation of adversarial examples as an optimization problem: find the smallest perturbation of the input that causes the target model to misclassify it. The objective balances the imperceptibility of the perturbation against its effectiveness at inducing misclassification. The sections below walk through the formulation.

## 1. Defining the Objective Function

The C&W attack begins by defining an objective function, $J(x')$, that quantifies the goals of the attack.

$$
J(x') = \alpha \cdot \text{dist}(x, x') + \beta \cdot \text{loss}(f(x'), y_t)
$$

where:
- $x$ is the original input.
- $x'$ is the perturbed input.
- $\text{dist}(x, x')$ measures the perturbation, typically using the L2 or L∞ norm.
- $\text{loss}(f(x'), y_t)$ is the loss of the target model $f$ on the perturbed input with respect to the target class $y_t$; it is small when $f$ assigns $x'$ to $y_t$.
- $\alpha$ and $\beta$ are weights that balance the two objectives.

Minimizing $J(x')$ pursues two goals at once:
- **Minimizing the perturbation:** keeps the adversarial example visually similar to the original input.
- **Minimizing the misclassification loss:** pushes the target model toward confidently predicting the target class $y_t$ for the perturbed input. Neither term guarantees success on its own; the weights decide which goal dominates.
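As a rough illustration, the objective can be evaluated for a toy model. The sketch below is not the original C&W implementation: the linear classifier (`W`, `b`), the cross-entropy choice for the loss term, and all input values are assumptions made for this example.

```python
import math

def logits(x, W, b):
    """Toy linear classifier standing in for f: one logit per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def l2_dist(x, x_adv):
    """Euclidean distance between the original and the perturbed input."""
    return math.sqrt(sum((a - o) ** 2 for o, a in zip(x, x_adv)))

def target_loss(z, target):
    """Cross-entropy of softmax(z) w.r.t. the target class (assumed loss).
    It is small when the model confidently predicts `target`."""
    m = max(z)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
    return log_sum_exp - z[target]

def objective(x, x_adv, W, b, target, alpha=1.0, beta=1.0):
    """J(x') = alpha * dist(x, x') + beta * loss(f(x'), y_t)."""
    z = logits(x_adv, W, b)
    return alpha * l2_dist(x, x_adv) + beta * target_loss(z, target)

# Hypothetical 2-feature, 2-class model and inputs.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
x = [2.0, 0.0]        # original input, classified as class 0
x_adv = [0.5, 1.5]    # candidate perturbed input, classified as class 1

J_clean = objective(x, x, W, b, target=1)      # no perturbation, high loss
J_adv = objective(x, x_adv, W, b, target=1)    # some perturbation, low loss
print(J_clean, J_adv)
```

With equal weights, the candidate lowers the loss term but pays for it in the distance term, which is exactly the tension the two weights are meant to control.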

## 2. Optimization Algorithm

The C&W attack is iterative: each step adjusts the perturbation to lower $J(x')$, improving the chance of misclassification while keeping the perturbation small.

**Gradient Descent:** This common optimization algorithm computes the gradient of the objective function with respect to the perturbed input and moves the input in the opposite direction. Repeating the step drives the iterate toward an adversarial example. (Carlini and Wagner's own experiments use the Adam optimizer, a gradient-descent variant.)

$$
x'_{n+1} = x'_n - \eta \cdot \nabla_{x'} J(x'_n)
$$

where $x'_n$ is the perturbed input after $n$ steps (with $x'_0 = x$) and $\eta$ is the step size, which determines the magnitude of each adjustment.
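The update rule can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the original attack: the linear model (`W`, `b`), the squared L2 distance (chosen so the gradient is defined at $x' = x$), the cross-entropy loss, and the hyperparameter values are all picked for this example.

```python
import math

def logits(x, W, b):
    """Toy linear classifier standing in for f."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def grad_objective(x, x_adv, W, b, target, alpha, beta):
    """Gradient w.r.t. x' of alpha*||x'-x||_2^2 + beta*CE(f(x'), y_t)."""
    p = softmax(logits(x_adv, W, b))
    # Derivative of cross-entropy w.r.t. each logit: p_k - 1[k == target].
    dz = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
    grad = []
    for j in range(len(x)):
        d_dist = 2.0 * (x_adv[j] - x[j])
        d_loss = sum(dz[k] * W[k][j] for k in range(len(W)))
        grad.append(alpha * d_dist + beta * d_loss)
    return grad

def gradient_descent_attack(x, W, b, target, alpha=0.1, beta=1.0,
                            eta=0.1, steps=200):
    """Repeat x'_{n+1} = x'_n - eta * grad J(x'_n), starting from x'_0 = x."""
    x_adv = list(x)
    for _ in range(steps):
        g = grad_objective(x, x_adv, W, b, target, alpha, beta)
        x_adv = [xa - eta * gj for xa, gj in zip(x_adv, g)]
    return x_adv

# Hypothetical model and input: x starts in class 0, the target is class 1.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
x = [2.0, 0.0]

x_adv = gradient_descent_attack(x, W, b, target=1)
z_adv = logits(x_adv, W, b)
predicted = max(range(len(z_adv)), key=lambda k: z_adv[k])
print(x_adv, predicted)
```

Because this toy objective is convex, plain gradient descent with a small fixed step is enough here; real networks generally need more care with step size and initialization.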

## 3. Balancing Trade-offs

**Trade-off Parameter Tuning:** The weights $\alpha$ and $\beta$ in the objective function set the trade-off between a small perturbation and a low misclassification loss. Only the ratio $\beta / \alpha$ affects where the minimum lies, so in practice one weight is fixed and the other is tuned: a larger ratio favors successful misclassification at the cost of a larger perturbation, and a smaller ratio does the opposite.
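One way to tune the trade-off is a binary search for the smallest loss weight that still yields a successful attack; Carlini and Wagner's original formulation does this for a single constant on the loss term. The sketch below fixes $\alpha = 1$ and searches over $\beta$. The `attack_succeeds` callable is hypothetical: it stands for "run the optimization with this $\beta$ and report whether the result is misclassified", and it is assumed to be monotone in $\beta$.

```python
def search_tradeoff(attack_succeeds, lo=0.0, hi=10.0, steps=20):
    """Approximate the smallest beta in [lo, hi] for which the attack succeeds.

    attack_succeeds: hypothetical callable beta -> bool, assumed to flip
    from False to True exactly once as beta grows.
    Returns None when even the largest beta fails.
    """
    if not attack_succeeds(hi):
        return None
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if attack_succeeds(mid):
            hi = mid   # success: try a smaller weight on the loss term
        else:
            lo = mid   # failure: the loss term needs more weight
    return hi

# Stand-in for a real attack run: succeeds once beta reaches a threshold.
threshold = 0.37
best_beta = search_tradeoff(lambda beta: beta >= threshold)
print(best_beta)
```

Each probe in a real setting is a full optimization run, so the number of search steps is itself a cost/quality trade-off.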

## 4. Adaptability to Threat Models

The optimization problem is tailored to different threat models by considering different norms, such as the L2 norm (Euclidean distance) or the L∞ norm (maximum perturbation). This adaptability allows the C&W attack to address a variety of scenarios and evaluation criteria.

For example:
- For the L2 norm: $\text{dist}(x, x') = \|x - x'\|_2$
- For the L∞ norm: $\text{dist}(x, x') = \max(\|x - x'\|_\infty - \epsilon, 0)$, where $\epsilon$ is a threshold on the maximum perturbation, so only the part of the perturbation that exceeds $\epsilon$ is penalized.
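The two distance terms above translate directly into code. This is a small sketch with made-up input values; the function names are chosen for this example.

```python
import math

def l2_distance(x, x_adv):
    """||x - x'||_2: Euclidean size of the perturbation."""
    return math.sqrt(sum((a - o) ** 2 for o, a in zip(x, x_adv)))

def linf_distance(x, x_adv):
    """||x - x'||_inf: largest change to any single coordinate."""
    return max(abs(a - o) for o, a in zip(x, x_adv))

def linf_penalty(x, x_adv, epsilon):
    """max(||x - x'||_inf - epsilon, 0): zero inside the epsilon budget."""
    return max(linf_distance(x, x_adv) - epsilon, 0.0)

x = [0.0, 0.0, 0.0]
x_adv = [0.3, -0.4, 0.0]

print(l2_distance(x, x_adv))          # Euclidean norm of (0.3, -0.4, 0.0)
print(linf_distance(x, x_adv))        # largest coordinate change
print(linf_penalty(x, x_adv, 0.25))   # perturbation exceeds the budget
print(linf_penalty(x, x_adv, 0.5))    # perturbation is within the budget
```

The L2 term spreads the cost over all coordinates, while the L∞ penalty only reacts to the single largest change, which is why the two threat models tend to produce visually different perturbations.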

## 5. Handling Model Uncertainties

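To counter gradient masking, where a defense makes the model's gradients uninformative (deliberately or as a side effect), randomization can be introduced during optimization. In the original paper, randomization takes the form of running gradient descent from multiple random starting points near the original input; the noisy-gradient form below is a related heuristic that perturbs each gradient step instead.

$$
\tilde{g} = \nabla_{x'} J(x') + \xi, \qquad \xi \sim \text{random noise}
$$

Using $\tilde{g}$ in place of the exact gradient can keep the search from stalling where the true gradient is flat or misleading, although it does not guarantee success against every masking defense.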
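The noisy-gradient step can be sketched as follows. This illustrates the equation in this section rather than any part of a reference C&W implementation; the zero-mean Gaussian noise, the `sigma` values, and the function names are assumptions for this example.

```python
import random

def noisy_gradient(grad, sigma=0.01, rng=None):
    """Return grad + xi, with xi drawn i.i.d. from N(0, sigma^2) per coordinate."""
    rng = rng if rng is not None else random.Random()
    return [g + rng.gauss(0.0, sigma) for g in grad]

def averaged_noisy_gradient(grad, sigma, samples, rng=None):
    """Average several noisy draws; zero-mean noise cancels out on average."""
    rng = rng if rng is not None else random.Random()
    total = [0.0] * len(grad)
    for _ in range(samples):
        for j, v in enumerate(noisy_gradient(grad, sigma, rng)):
            total[j] += v
    return [t / samples for t in total]

# Hypothetical gradient of J at the current iterate.
grad = [1.0, -2.0]
rng = random.Random(0)  # fixed seed so the run is repeatable

print(noisy_gradient(grad, sigma=0.1, rng=rng))
print(averaged_noisy_gradient(grad, sigma=0.1, samples=5000, rng=rng))
```

Because the noise has zero mean, it does not bias the descent direction on average; it only adds jitter that may help the iterate leave regions where the reported gradient is close to zero.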

