# Tabular attack

The objective of this practical is to adapt a powerful attack from image classification to tabular data. As shown in the class, the main challenge is to respect domain constraints.

The translation table of constraints can be found here: https://arxiv.org/pdf/2112.01156, Table 1.

## Import package

It is good practice to import all necessary packages at the top of Python files or in the first code cell of a Python notebook.

In [1]:
import torch
import mlc
from mlc.datasets.dataset_factory import get_dataset
from sklearn.metrics import roc_auc_score
import numpy as np
from torch import nn
import torch.nn.functional as F
from torch import optim
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm
from typing import Tuple


We check that the correct versions are installed.

In [2]:
for pkg, version in [(mlc, "0.1.0")]:
    if version in pkg.__version__:
        print(f"OK: {pkg.__name__}=={pkg.__version__}.")
    else:
        print(f"Version mismatch: expected version {version} for package {pkg.__name__} but is currently {pkg.__version__}")

OK: mlc==0.1.0.


## Retrieve data

In this section we download and load a feature-engineered version of the LCLD dataset (`lcld_v2_iid`), a credit-scoring dataset of loan requests. The objective is to predict whether a loan will be repaid or not.
We only consider type, boundary and relationship constraints. All features are mutable.

In [3]:
dataset = get_dataset("lcld_v2_iid")
x, y = dataset.get_x_y()
metadata = dataset.get_metadata(only_x=True)


In [4]:
# Simplify the problem to non categorical
# Select only non-categorical features
non_cat_features = metadata[metadata["type"] != "cat"]["feature"].tolist()
x = x[non_cat_features]
metadata = metadata[metadata["feature"].isin(non_cat_features)].reset_index(drop=True)

In [5]:
# Splitting the data
splits = dataset.get_splits()
x_train, x_val, x_test = x.iloc[splits["train"]].to_numpy(), x.iloc[splits["val"]].to_numpy(), x.iloc[splits["test"]].to_numpy()
y_train, y_val, y_test = y[splits["train"]], y[splits["val"]], y[splits["test"]]


As you can see below, the dataset now only contains numerical features: 13 continuous and 10 discrete.

In [6]:
metadata["type"].value_counts()

type
real    13
int     10
Name: count, dtype: int64

Neural networks need scaled data to reach their best performance.
Min/max and standard scaling are the usual choices.
Attacks from image classification also assume inputs scaled to the [0, 1] range.
For simplicity, we use min/max scaling in this notebook.
However, constraint penalty functions must be evaluated in the unscaled (original) domain.
Hence we will make extensive use of the following transform / inverse transform functions.

In [7]:
class Scaler:
    def __init__(self, x_min, x_max):
        self.x_min = x_min
        self.x_max = x_max

        # Define the scale and set to 1 if equals to 0.
        scale = x_max - x_min
        constant_mask = scale < 10 * torch.finfo(torch.from_numpy(scale).dtype).eps
        scale = scale.copy()
        scale[constant_mask] = 1.0
        self.scale = scale

    def transform(self, x):
        x_min = self.x_min
        scale = self.scale

        if isinstance(x, torch.Tensor):
            x_min = torch.from_numpy(x_min).float()
            scale = torch.from_numpy(scale).float()

        return (x - x_min) / scale

    def inverse_transform(self, x):
        x_min = self.x_min
        scale = self.scale

        if isinstance(x, torch.Tensor):
            x_min = torch.from_numpy(x_min).float()
            scale = torch.from_numpy(scale).float()

        return x * scale + x_min



In [8]:
x_min = metadata["min"].to_numpy().astype("float")
x_max = metadata["max"].to_numpy().astype("float")

scaler = Scaler(x_min, x_max)

In [9]:
x_t = scaler.transform(x_train)

In [10]:
x_t.max()

1.0

In [11]:
x_it = scaler.inverse_transform(x_t)

In [12]:
np.max((x_train - x_it))

2.3283064365386963e-10

## Fit a Neural Network

### Architecture

We define a simple neural network architecture.

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = nn.Linear(x_train.shape[1], 128)
        self.l2 = nn.Linear(128, 128)
        self.l3 = nn.Linear(128, 128)
        self.l4 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.l1(x)
        x = self.l2(x)
        x = self.l3(x)
        x = self.l4(x)
        return x

We create a scaler module that scales the input before feeding it to the neural network.
To chain the two `nn.Module`s (`ScalerModule` and `Net`), we can use `nn.Sequential`: https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html.

In [15]:
class ScalerModule(nn.Module):
    def __init__(self, scaler):
        super(ScalerModule, self).__init__()
        self.scaler = scaler

    def forward(self, x):
        # Use the scaler stored on the module, not the global variable.
        x = self.scaler.transform(x)
        return x


### Training

We use class weights to give more importance to the underrepresented class during training. Here, the classes are imbalanced: the weights printed below show that class 0 accounts for about 80% of the training set. The imbalance can be far more severe in other domains. For instance, in fraud detection there are only a few frauds for a large number of legitimate transactions.

In [16]:
class_weight = torch.Tensor(
    1 - torch.unique(torch.tensor(y_train), return_counts=True)[1] / len(y_train)
)
print(f"Class weight {class_weight}")

Class weight tensor([0.2009, 0.7991])


Here we use the aforementioned nn.Sequential module.

In [18]:
model = nn.Sequential(ScalerModule(scaler), Net()).float()
optimizer = optim.AdamW(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=0.001,
)

In [None]:
def train_loop(dataloader, model, loss_fn, optimizer, batch_size):
    size = len(dataloader.dataset)
    # len(dataloader) is the exact number of batches, including the last partial one.
    for batch, (X, y) in tqdm(enumerate(dataloader), total=len(dataloader)):
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def val_loop(dataloader, model, loss_fn, epoch_i):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y[:, 1]).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Epoch {epoch_i}, Val Error: Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f}")



def train_model(model, x_train, y_train, x_val, y_val, optimizer, batch_size, loss_func, epochs):
    # Data processing
    train_dataset = TensorDataset(x_train, y_train)
    train_loader = DataLoader(
        dataset=train_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=2,
    )
    val_dataset = TensorDataset(x_val, y_val)
    val_loader = DataLoader(
        dataset=val_dataset,
        batch_size=2000,
        shuffle=True,
        num_workers=2,
    )

    # Main train loop
    for epoch in range(epochs):
        train_loop(train_loader, model, loss_func, optimizer, batch_size)
        val_loop(val_loader, model, loss_func, epoch)




In [20]:
loss = nn.CrossEntropyLoss(weight=class_weight)
train_model(
    model,
    torch.from_numpy(x_train).float(),
    torch.from_numpy(np.array([1 - y_train, y_train]).T).float(),
    torch.from_numpy(x_val).float(),
    torch.from_numpy(np.array([1 - y_val, y_val]).T).float(),
    optimizer,
    64,
    loss,
    3
)

7721it [00:13, 587.63it/s]                          


Epoch 0, Val Error: Accuracy: 67.0%, Avg loss: 0.200277


7721it [00:12, 597.86it/s]                          


Epoch 1, Val Error: Accuracy: 67.6%, Avg loss: 0.201033


7721it [00:12, 595.39it/s]                          


Epoch 2, Val Error: Accuracy: 66.9%, Avg loss: 0.199877


In [21]:
# Model prediction
y_score = model(torch.from_numpy(x_test).float()).detach().numpy()


In [22]:
# Model scoring
auc = roc_auc_score(y_test, y_score[:, 1])
print(f"The AUROC score of the model is {auc}")

The AUROC score of the model is 0.7120895698942986


## Generating adversarial examples

### PGD Attack

Below is the PGD attack as used for image classification.
The perturbation is bounded by a maximum L2 norm, called epsilon (eps), measured in the scaled [0, 1] domain.
We initially set the maximum perturbation to eps = 5.
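
As a standalone illustration of the L2 bound, the sketch below projects a perturbed batch back onto the L2 ball of radius eps around the original points. The helper name `project_l2` and the toy tensors are assumptions made for this example only; they are not part of the notebook.

```python
import torch

def project_l2(x_origin: torch.Tensor, x_adv: torch.Tensor, eps: float) -> torch.Tensor:
    """Project x_adv onto the L2 ball of radius eps centred on x_origin."""
    # The perturbation points from the original example to the adversarial one.
    delta = x_adv - x_origin
    norms = torch.norm(delta, p=2, dim=1, keepdim=True)
    # Shrink only the rows whose norm exceeds eps; clamp_min avoids a division by zero.
    factor = torch.clamp(eps / norms.clamp_min(1e-12), max=1.0)
    return x_origin + delta * factor

x0 = torch.zeros(2, 2)
xa = torch.tensor([[3.0, 4.0], [0.3, 0.4]])
projected = project_l2(x0, xa, eps=1.0)
print(projected)
```

The first row has norm 5 and is scaled down to norm 1, while the second row already lies inside the ball and is left unchanged.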

In [23]:
n_examples = 1000
eps = 5
n_iter = 100
alpha = 2*eps
eps_for_division=1e-10

In [24]:
def perturb(scaler, x_origin, x_adv, grad, eps, alpha, n_iter, iter):
    
    x_origin = scaler.transform(x_origin)
    x_adv = scaler.transform(x_adv)

    # Compute L2 perturbation
    grad_norms = (
        torch.norm(grad.view(x_adv.shape[0], -1), p=2, dim=1)
        + eps_for_division
    )  # nopep8
    grad = grad / grad_norms.view(x_adv.shape[0], 1)
    
    
    decay_steps = max(n_iter // 10, 1)
    decay_factor = iter // decay_steps
    l_alpha = alpha / (2 ** decay_factor)

    # Apply L2 perturbation
    x_adv = x_adv + l_alpha * grad

    # Project on L2: the perturbation points from the original to the adversarial example
    delta = x_adv - x_origin
    delta_norms = torch.norm(delta.view(x_adv.shape[0], -1), p=2, dim=1)
    factor = eps / delta_norms
    factor = torch.min(factor, torch.ones_like(delta_norms))
    delta = delta * factor.view(
        -1,
        1,
    )
    x_adv = x_origin + delta

    # Clamp
    x_adv = torch.clamp(x_adv, 0, 1)
    
    x_adv = scaler.inverse_transform(x_adv)

    return x_adv.detach()



def generate_adversarial2(model,  x, y, eps, alpha, iter, scaler, verbose=1):
    x_adv = x.clone().detach()

    iterable = range(iter)
    if verbose >0:
        iterable = tqdm(iterable)
    for i in iterable:
        x_adv.requires_grad = True
        output = model(x_adv)
        loss = F.cross_entropy(output, y)

        model.zero_grad()
        loss.backward()

        data_grad =  x_adv.grad.data
        # Pass the function's own iteration budget rather than the global n_iter.
        x_adv = perturb(scaler, x, x_adv, data_grad, eps, alpha, iter, i)
    return x_adv


## Tasks

1. Write a `is_constrained_adversarial` function that, for a set of examples x and their correct labels y, determines if:
- x is adversarial,
- x respects the boundary constraints,
- x respects the type constraints,
- x respects the feature relation constraints,
- all of the above.

For boundary, you can tolerate 10 * torch.finfo((x).dtype).eps difference, due to float precision.

Type constraints can be accessed with:
```
metadata["type"]
```
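
One possible way to check type constraints is sketched below on toy data. It assumes the type column only contains the values "real" and "int", as in the output of `metadata["type"].value_counts()`; the function name `respects_types` and the toy arrays are hypothetical.

```python
import numpy as np
import pandas as pd

def respects_types(x: np.ndarray, feature_types: pd.Series) -> np.ndarray:
    """Return one boolean per row: True when every "int" feature holds an integer value."""
    int_mask = (feature_types == "int").to_numpy()
    int_values = x[:, int_mask]
    # A value is a valid integer when rounding leaves it unchanged.
    return np.all(int_values == np.round(int_values), axis=1)

# Toy example: one real feature followed by two integer features.
toy_types = pd.Series(["real", "int", "int"])
toy_x = np.array([[0.5, 2.0, 3.0], [1.5, 2.0, 3.2]])
type_ok = respects_types(toy_x, toy_types)
print(type_ok)
```

The second row is rejected because its last integer feature holds 3.2.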

Feature relation constraints are:

```
int_rate = Feature("int_rate") / Constant(1200)
term = Feature("term")
installment = Feature("loan_amnt") * (
    (int_rate * ((Constant(1) + int_rate) ** term))
    / ((Constant(1) + int_rate) ** term - Constant(1))
)
g1 = ABS(Feature("installment") - installment) <= 0.1

g2 = Feature("open_acc") <= Feature("total_acc")

g3 = Feature("pub_rec_bankruptcies") <= Feature("pub_rec")
```
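
The relations above can be translated into a boolean check along the following lines. This is a minimal sketch on toy data: the column layout `COLS` and the function name `respects_relations` are assumptions for the example (in the notebook, column indices come from `feat_to_idx`), and it assumes `int_rate` is never zero.

```python
import numpy as np

# Hypothetical column layout for the toy example only.
COLS = {"loan_amnt": 0, "term": 1, "int_rate": 2, "installment": 3,
        "open_acc": 4, "total_acc": 5, "pub_rec": 6, "pub_rec_bankruptcies": 7}

def respects_relations(x: np.ndarray, cols=COLS) -> np.ndarray:
    """Return one boolean per row: True when g1, g2 and g3 all hold."""
    rate = x[:, cols["int_rate"]] / 1200
    term = x[:, cols["term"]]
    growth = (1 + rate) ** term
    expected = x[:, cols["loan_amnt"]] * (rate * growth) / (growth - 1)
    g1 = np.abs(x[:, cols["installment"]] - expected) <= 0.1
    g2 = x[:, cols["open_acc"]] <= x[:, cols["total_acc"]]
    g3 = x[:, cols["pub_rec_bankruptcies"]] <= x[:, cols["pub_rec"]]
    return g1 & g2 & g3

# Build a valid installment from the formula, then break one constraint per row.
rate = 12 / 1200
growth = (1 + rate) ** 36
valid_installment = 10000 * rate * growth / (growth - 1)
toy = np.array([
    [10000, 36, 12, valid_installment, 5, 10, 1, 0],      # satisfies everything
    [10000, 36, 12, valid_installment + 5, 5, 10, 1, 0],  # violates g1
    [10000, 36, 12, valid_installment, 12, 10, 1, 0],     # violates g2
])
relation_ok = respects_relations(toy)
print(relation_ok)
```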

Use the `feat_to_idx` function below to retrieve a feature column from x, e.g.
```
int_rate = x[:, feat_to_idx("int_rate")]
```

In [None]:
def feat_to_idx(feature:str) -> int:
    return metadata[metadata["feature"] == feature].index[0]

In [34]:
# YOUR CODE HERE

def is_constrained_adversarial(x, y, model, metadata) -> Tuple[float, float, float, float, float]:
    features = torch.from_numpy(x).float()
    with torch.no_grad():
        # The model outputs logits: compare predicted classes, not raw scores.
        y_pred = model(features).argmax(dim=1).numpy()
    y_class = np.argmax(y, axis=1)

    # An example is adversarial when the predicted class differs from the true class.
    percent_adv = (y_pred != y_class).mean()

    # Boundary constraints are checked in the original (unscaled) domain,
    # against both the minimum and the maximum of each feature.
    tol = 10 * torch.finfo(features.dtype).eps
    x_min = metadata["min"].to_numpy().astype("float")
    x_max = metadata["max"].to_numpy().astype("float")
    respects_boundary = ((x >= x_min - tol) & (x <= x_max + tol)).all(axis=1)
    percent_respects_boundary = respects_boundary.mean()

    # Type, relation and combined checks are still to be implemented.
    return percent_adv, percent_respects_boundary, 0.0, 0.0, 0.0

is_constrained_adversarial(x_test[:5], np.array([1 - y_test, y_test]).T[:5], model, metadata)


2. Verify your `is_constrained_adversarial` function by running it on the clean test set instead of an adversarial set. Few examples should be flagged as adversarial (only the misclassified ones), and all should satisfy the constraints.

In [None]:
# YOUR CODE HERE

3. Run PGD and evaluate the success rate of the attack based on the `is_constrained_adversarial` function.


In [None]:
# YOUR CODE HERE

4. Comment on your results.

YOUR TEXT HERE

5. Adapt PGD to respect type constraints.

PGD is designed for continuous values only, hence it generates real values even for integer features.
Write a function that converts these reals to integers while guaranteeing that neither the boundary constraints nor the epsilon bound are violated.
Integrate this function into PGD.

DO NOT remove/modify the cell with the original implementation of PGD, you will need it later.
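
One way to approach the conversion is to round each integer feature towards its original value, so that the perturbation can only shrink. The sketch below works in the unscaled domain and assumes that the original integer features hold integer values and that their bounds are integers too. The name `round_towards_origin` and the toy tensors are hypothetical.

```python
import torch

def round_towards_origin(x_origin: torch.Tensor, x_adv: torch.Tensor,
                         int_mask: torch.Tensor) -> torch.Tensor:
    """Round the integer features of x_adv towards x_origin (unscaled domain)."""
    # Floor when above the original value, ceil when below: |x_adv - x_origin|
    # never grows, so the epsilon bound and the boundaries are preserved.
    rounded = torch.where(x_adv > x_origin, torch.floor(x_adv), torch.ceil(x_adv))
    # Real-valued features are left untouched.
    return torch.where(int_mask, rounded, x_adv)

# Toy example: first feature is an integer, second is real.
int_mask = torch.tensor([True, False])
x_origin = torch.tensor([[2.0, 0.5], [5.0, 0.1]])
x_adv = torch.tensor([[3.7, 0.9], [3.2, 0.4]])
x_rounded = round_towards_origin(x_origin, x_adv, int_mask)
print(x_rounded)
```

Because min/max scaling is a per-feature affine map with a positive scale, a perturbation that shrinks feature by feature in the unscaled domain also shrinks in the scaled domain where epsilon is measured.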

In [None]:
# YOUR CODE HERE

7. Compare the success rate with the original implementation of PGD.

In [None]:
# YOUR CODE HERE

8. Comment on your results.

YOUR TEXT HERE

9. Write a function that, for a batch of samples X, returns the penalty value of each of the constraints above (zero when a constraint is satisfied, positive otherwise).
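
A possible shape for such a function is sketched below in PyTorch, so that the penalties stay differentiable. It assumes the common translation where a constraint `a <= b` is penalised by `max(0, a - b)`; check it against Table 1 of the paper linked at the top. The column layout `COLS`, the name `constraints_penalty` and the toy data are hypothetical.

```python
import torch

# Hypothetical column layout for the toy example only.
COLS = {"loan_amnt": 0, "term": 1, "int_rate": 2, "installment": 3,
        "open_acc": 4, "total_acc": 5, "pub_rec": 6, "pub_rec_bankruptcies": 7}

def constraints_penalty(x: torch.Tensor, cols=COLS) -> torch.Tensor:
    """Return an (n_samples, 3) tensor with the violation of g1, g2 and g3."""
    rate = x[:, cols["int_rate"]] / 1200
    term = x[:, cols["term"]]
    growth = (1 + rate) ** term
    expected = x[:, cols["loan_amnt"]] * (rate * growth) / (growth - 1)
    # relu(a - b) is zero when a <= b holds and grows with the violation.
    g1 = torch.relu(torch.abs(x[:, cols["installment"]] - expected) - 0.1)
    g2 = torch.relu(x[:, cols["open_acc"]] - x[:, cols["total_acc"]])
    g3 = torch.relu(x[:, cols["pub_rec_bankruptcies"]] - x[:, cols["pub_rec"]])
    return torch.stack([g1, g2, g3], dim=1)

rate = 12 / 1200
growth = (1 + rate) ** 36
valid_installment = 10000 * rate * growth / (growth - 1)
toy = torch.tensor([
    [10000, 36, 12, valid_installment, 5, 10, 1, 0],      # satisfies everything
    [10000, 36, 12, valid_installment + 5, 5, 10, 1, 0],  # g1 violated by about 4.9
    [10000, 36, 12, valid_installment, 12, 10, 1, 0],     # g2 violated by 2
], dtype=torch.float64)
penalties = constraints_penalty(toy)
print(penalties)
```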

In [None]:
# YOUR CODE HERE

10. Integrate the constraints penalty function into the loss of the PGD attack, as in CPGD (shown in class).
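
The idea can be sketched as follows, assuming the CPGD objective is the classification loss minus the constraint penalties, so that gradient ascent increases the loss while reducing violations. The toy model, the single toy constraint, the `weight` parameter and the name `cpgd_loss` are assumptions for illustration, not the reference implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

def toy_penalty(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical single constraint x[:, 0] <= x[:, 1], penalised by its violation.
    return torch.relu(x[:, 0] - x[:, 1])

def cpgd_loss(model, x_adv, y, penalty_fn, weight=1.0):
    """Classification loss to maximise, minus the weighted constraint violation."""
    ce = F.cross_entropy(model(x_adv), y, reduction="none")
    return (ce - weight * penalty_fn(x_adv)).mean()

torch.manual_seed(0)
toy_model = nn.Linear(2, 2)
x_adv = torch.tensor([[2.0, 1.0], [0.0, 1.0]], requires_grad=True)
y = torch.tensor([0, 1])

loss = cpgd_loss(toy_model, x_adv, y, toy_penalty)
loss.backward()
# x_adv.grad now mixes the misclassification and the constraint gradients.
print(x_adv.grad)
```

In the notebook, the model takes unscaled inputs (the scaler is its first layer), so the penalty can be evaluated directly on `x_adv` in the original domain before the gradient is passed to `perturb`.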



In [None]:
# YOUR CODE HERE

11. Compare the success rate with the previous implementations of PGD.


In [None]:
# YOUR CODE HERE

12. Comment on your results.

YOUR TEXT HERE