# Tabular attack

The objective of this practical is to adapt a powerful attack from image classification to tabular data. As shown in the class, the main challenge is to respect domain constraints.

The translation table of constraints can be found here: https://arxiv.org/pdf/2112.01156, Table 1.

## Import package

It is good practice to import all necessary packages at the top of Python files or in the first code cell of a Python notebook.

In [1]:
import torch
import mlc
from mlc.datasets.dataset_factory import get_dataset
from sklearn.metrics import roc_auc_score
import numpy as np
from torch import nn
import torch.nn.functional as F
from torch import optim
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm
from typing import Tuple


We check that the correct versions are installed.

In [2]:
for pkg, version in [(mlc, "0.1.0")]:
    if version in pkg.__version__:
        print(f"OK: {pkg.__name__}=={pkg.__version__}.")
    else:
        print(f"Version mismatch: expected version {version} for package {pkg.__name__} but is currently {pkg.__version__}")

OK: mlc==0.1.0.


## Retrieve data

In this section we will download and load a feature-engineered version of the LCLD dataset (`lcld_v2_iid`), which describes loans. The objective is to classify loans by their risk of default.
We only consider type, boundary and relationship constraints. All features are mutable.

In [3]:
dataset = get_dataset("lcld_v2_iid")
x, y = dataset.get_x_y()
metadata = dataset.get_metadata(only_x=True)


In [4]:
# Simplify the problem to non categorical
# Select only non-categorical features
non_cat_features = metadata[metadata["type"] != "cat"]["feature"].tolist()
x = x[non_cat_features]
metadata = metadata[metadata["feature"].isin(non_cat_features)].reset_index(drop=True)

print(metadata)

                                              feature                 min  \
0                                           loan_amnt              1000.0   
1                                                term                  36   
2                                            int_rate                5.31   
3                                         installment                4.93   
4                                           sub_grade                   0   
5                                          emp_length                   0   
6                                          annual_inc                33.0   
7                                                 dti                -1.0   
8                                            open_acc                   1   
9                                             pub_rec                 0.0   
10                                          revol_bal                 0.0   
11                                         revol_util                 0.0   

In [5]:
# Splitting the data
splits = dataset.get_splits()
x_train, x_val, x_test = x.iloc[splits["train"]].to_numpy(), x.iloc[splits["val"]].to_numpy(), x.iloc[splits["test"]].to_numpy()
y_train, y_val, y_test = y[splits["train"]], y[splits["val"]], y[splits["test"]]


As you can see below, the dataset only contains numerical values: 13 continuous and 10 discrete features.

In [6]:
metadata["type"].value_counts()

type
real    13
int     10
Name: count, dtype: int64

Neural networks need scaled data to perform well.
We usually use min/max or standard scaling.
Attacks from image classification also assume min/max scaling to the [0, 1] range.
For simplicity we will use min/max scaling in this notebook.
However, the constraint penalty functions need to be evaluated in the unscaled (original) domain.
Hence we will make extensive use of the following transform / inverse transform functions.

In [7]:
class Scaler:
    def __init__(self, x_min, x_max):
        self.x_min = x_min
        self.x_max = x_max

        # Define the scale and set to 1 if equals to 0.
        scale = x_max - x_min
        constant_mask = scale < 10 * torch.finfo(torch.from_numpy(scale).dtype).eps
        scale = scale.copy()
        scale[constant_mask] = 1.0
        self.scale = scale

    def transform(self, x):
        x_min = self.x_min
        scale = self.scale

        if isinstance(x, torch.Tensor):
            x_min = torch.from_numpy(x_min).float()
            scale = torch.from_numpy(scale).float()

        return (x - x_min) / scale

    def inverse_transform(self, x):
        x_min = self.x_min
        scale = self.scale

        if isinstance(x, torch.Tensor):
            x_min = torch.from_numpy(x_min).float()
            scale = torch.from_numpy(scale).float()

        return x * scale + x_min



In [8]:
x_min = metadata["min"].to_numpy().astype("float")
x_max = metadata["max"].to_numpy().astype("float")

scaler = Scaler(x_min, x_max)

In [9]:
x_t = scaler.transform(x_train)

In [10]:
x_t.max()

1.0

In [11]:
x_it = scaler.inverse_transform(x_t)

In [12]:
np.max((x_train - x_it))

2.3283064365386963e-10

## Fit a Neural Network

### Architecture

We define a simple neural network architecture.

In [13]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = nn.Linear(x_train.shape[1], 128)
        self.l2 = nn.Linear(128, 128)
        self.l3 = nn.Linear(128, 128)
        self.l4 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.l1(x)
        x = self.l2(x)
        x = self.l3(x)
        x = self.l4(x)
        return x
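
Note that `forward` above chains the linear layers without any activation in between, so the whole network reduces to a single affine map. A minimal sketch of a variant with ReLU activations is given below. `NetWithReLU` and its `in_features` / `hidden` parameters are hypothetical names that are not part of this notebook, and the results recorded further down were not produced with it.

```python
import torch
from torch import nn
import torch.nn.functional as F


class NetWithReLU(nn.Module):
    """Hypothetical variant of Net with ReLU between the linear layers."""

    def __init__(self, in_features: int, hidden: int = 128):
        super().__init__()
        self.l1 = nn.Linear(in_features, hidden)
        self.l2 = nn.Linear(hidden, hidden)
        self.l3 = nn.Linear(hidden, hidden)
        self.l4 = nn.Linear(hidden, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        x = F.relu(self.l3(x))
        return self.l4(x)  # raw logits, as expected by nn.CrossEntropyLoss


# Toy forward pass on random inputs with 23 features (assumed input width).
logits = NetWithReLU(in_features=23)(torch.rand(4, 23))
```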

We create a scaler module that will scale the input based on a scaler before feeding the results to the neural network.
To chain two such nn.Module (Net and ScalerModule), we can use the nn.Sequential nn.Module: https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html.

In [14]:
class ScalerModule(nn.Module):
    def __init__(self, scaler):
        super(ScalerModule, self).__init__()
        self.scaler = scaler

    def forward(self, x):
        # Use the scaler stored on the module rather than the global variable
        x = self.scaler.transform(x)
        return x


### Training

We use class weights to give more importance to the underrepresented class during training. Here, the classes are imbalanced (roughly 80% / 20%), and the imbalance can be much larger in other domains. For instance, in fraud detection there are only a few frauds for a large number of legitimate transactions.

In [15]:
class_weight = torch.Tensor(
    1 - torch.unique(torch.tensor(y_train), return_counts=True)[1] / len(y_train)
)
print(f"Class weight {class_weight}")

Class weight tensor([0.2009, 0.7991])


Here we use the aforementioned nn.Sequential module.

In [16]:
model = nn.Sequential(ScalerModule(scaler), Net()).float()
optimizer = optim.AdamW(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=0.001,
)

In [17]:
def train_loop(dataloader, model, loss_fn, optimizer, batch_size):
    size = len(dataloader.dataset)
    for batch, (X, y) in tqdm(enumerate(dataloader), total=int(size/batch_size)):
        # if batch % 10 == 0:
        #     print(f"Batch {batch}.")
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def val_loop(dataloader, model, loss_fn, epoch_i):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y[:, 1]).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Epoch {epoch_i}, Val Error: Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f}")



def train_model(model, x_train, y_train, x_val, y_val, optimizer, batch_size, loss_func, epochs):
    # Data processing
    train_dataset = TensorDataset(x_train, y_train)
    train_loader = DataLoader(
        dataset=train_dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=2,
    )
    val_dataset = TensorDataset(x_val, y_val)
    val_loader = DataLoader(
        dataset=val_dataset,
        batch_size=2000,
        shuffle=True,
        num_workers=2,
    )

    # Main train loop
    for epoch in range(epochs):
        train_loop(train_loader, model, loss_func, optimizer, batch_size)
        val_loop(val_loader, model, loss_func, epoch)




In [18]:
loss = nn.CrossEntropyLoss(weight=class_weight)

train_model(
    model,
    torch.from_numpy(x_train).float(),
    torch.from_numpy(np.array([1 - y_train, y_train]).T).float(),
    torch.from_numpy(x_val).float(),
    torch.from_numpy(np.array([1 - y_val, y_val]).T).float(),
    optimizer,
    64,
    loss,
    3
)

7721it [00:13, 585.41it/s]                          


Epoch 0, Val Error: Accuracy: 64.8%, Avg loss: 0.200121


7721it [00:12, 605.64it/s]                          


Epoch 1, Val Error: Accuracy: 66.5%, Avg loss: 0.200017


7721it [00:13, 590.48it/s]                          


Epoch 2, Val Error: Accuracy: 61.0%, Avg loss: 0.200782


In [19]:
# Model prediction
y_score = model(torch.from_numpy(x_test).float()).detach().numpy()


In [20]:
# Model scoring
auc = roc_auc_score(y_test, y_score[:, 1])
print(f"The AUROC score of the model is {auc}")

The AUROC score of the model is 0.712450798254593


## Generating adversarial examples

### PGD Attack

Below is the PGD attack for image classification.
The perturbation is bounded by a maximum L2 norm, called epsilon (eps).
We initially set the maximum perturbation to eps = 5.

In [21]:
n_examples = 1000
eps = 5
n_iter = 100
alpha = 2*eps
eps_for_division=1e-10

In [22]:
def perturb(scaler, x_origin, x_adv, grad, eps, alpha, n_iter, iter):
    x_origin = scaler.transform(x_origin)
    x_adv = scaler.transform(x_adv)

    # Compute L2 perturbation
    grad_norms = (
        torch.norm(grad.view(x_adv.shape[0], -1), p=2, dim=1)
        + eps_for_division
    )  # nopep8
    grad = grad / grad_norms.view(x_adv.shape[0], 1)
    
    
    decay_steps = max(n_iter // 10, 1)
    decay_factor = iter // decay_steps
    l_alpha = alpha / (2 ** decay_factor)

    # Apply L2 perturbation
    x_adv = x_adv + l_alpha * grad

    # Project on L2: the perturbation is measured from the original point
    delta = x_adv - x_origin
    delta_norms = torch.norm(delta.view(x_adv.shape[0], -1), p=2, dim=1)
    factor = eps / delta_norms
    factor = torch.min(factor, torch.ones_like(delta_norms))
    delta = delta * factor.view(
        -1,
        1,
    )
    x_adv = x_origin + delta

    # Clamp
    x_adv = torch.clamp(x_adv, 0, 1)
    
    x_adv = scaler.inverse_transform(x_adv)

    return x_adv.detach()



def generate_adversarial2(model,  x, y, eps, alpha, iter, scaler, verbose=1):
    x_adv = x.clone().detach()

    iterable = range(iter)
    if verbose >0:
        iterable = tqdm(iterable)
    for i in iterable:
        x_adv.requires_grad = True
        output = model(x_adv)
        loss = F.cross_entropy(output, y)

        model.zero_grad()
        loss.backward()

        data_grad =  x_adv.grad.data
        # Pass the local iteration budget rather than the global n_iter
        x_adv = perturb(scaler, x, x_adv, data_grad, eps, alpha, iter, i)
    return x_adv
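
The projection step keeps the perturbation `x_adv - x_origin` inside the L2 ball of radius eps. A minimal standalone sketch on toy data is shown below. The helper name `project_l2` is hypothetical, and the small clamp on the norm is an added safeguard against division by zero.

```python
import torch


def project_l2(x_adv: torch.Tensor, x_origin: torch.Tensor, eps: float) -> torch.Tensor:
    """Project x_adv back into the L2 ball of radius eps centred on x_origin."""
    delta = x_adv - x_origin
    delta_norms = torch.norm(delta.view(delta.shape[0], -1), p=2, dim=1)
    # Shrink only the rows whose perturbation exceeds eps.
    factor = torch.clamp(eps / delta_norms.clamp_min(1e-12), max=1.0)
    return x_origin + delta * factor.view(-1, 1)


x_origin = torch.zeros(2, 3)
x_adv = torch.tensor([[3.0, 4.0, 0.0],   # norm 5: outside the ball of radius 1
                      [0.3, 0.0, 0.4]])  # norm 0.5: already inside
x_proj = project_l2(x_adv, x_origin, eps=1.0)
```

The first row is rescaled onto the surface of the ball, while the second row is left unchanged.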


## Tasks

1. Write a `is_constrained_adversarial` function that, for a set of examples x and their correct labels y, determines if:
- x is adversarial,
- x respects the boundary constraints,
- x respects the type constraints,
- x respects the feature relation constraints,
- all of the above.

For boundary, you can tolerate 10 * torch.finfo((x).dtype).eps difference, due to float precision.

Type constraints can be accessed with:
```
metadata["type"]
```

Feature relation constraints are:

```
int_rate = Feature("int_rate") / Constant(1200)
term = Feature("term")
installment = Feature("loan_amnt") * (
    (int_rate * ((Constant(1) + int_rate) ** term))
    / ((Constant(1) + int_rate) ** term - Constant(1))
)
g1 = ABS(Feature("installment") - installment) <= 0.1

g2 = Feature("open_acc") <= Feature("total_acc")

g3 = Feature("pub_rec_bankruptcies") <= Feature("pub_rec")
```
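
As a worked example of g1, the sketch below computes the installment implied by the formula for a single loan. It assumes that `int_rate` is an annual percentage rate, which the division by 1200 suggests. The helper names and the numeric values are illustrative and are not taken from the dataset.

```python
def expected_installment(loan_amnt: float, int_rate: float, term: int) -> float:
    """Monthly installment implied by g1 (int_rate assumed to be an annual percentage)."""
    monthly_rate = int_rate / 1200
    growth = (1 + monthly_rate) ** term
    return loan_amnt * (monthly_rate * growth) / (growth - 1)


def g1_violation(installment: float, loan_amnt: float, int_rate: float, term: int) -> float:
    """Amount by which g1 is violated; 0.0 means the constraint holds."""
    gap = abs(installment - expected_installment(loan_amnt, int_rate, term))
    return max(0.0, gap - 0.1)


# Illustrative loan: 10,000 borrowed at 12% per year over 36 months.
payment = expected_installment(loan_amnt=10_000, int_rate=12.0, term=36)
print(round(payment, 2))
```

A reported installment within 0.1 of `payment` gives a violation of 0.0; any larger gap is returned as a positive penalty.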

Use the `feature_to_idx` function below to retrieve a feature in x, e.g.
```
int_rate = x[:, feature_to_idx("int_rate")]
```


In [23]:
def feature_to_idx(feature:str) -> int:
    return metadata[metadata["feature"] == feature].index[0]

In [54]:
# YOUR CODE HERE
# model(torch.from_numpy(x_t).float()[:5])

def verify_type_constraints(features: torch.Tensor, metadata) -> bool:
    # Values in a float tensor always satisfy the "real" type,
    # so only the integer features need to be checked.
    int_idx = metadata.index[metadata["type"] == "int"].tolist()
    int_values = features[:, int_idx]
    # An integer feature is valid when its value has no fractional part.
    return bool((int_values == int_values.round()).all().item())

def verify_feature_relations(features: torch.Tensor) -> bool:
    # Extract relevant features
    int_rate = features[: , feature_to_idx("int_rate")] / 1200
    term = features[:, feature_to_idx("term")]
    loan_amnt = features[:, feature_to_idx("loan_amnt")]
    installment = features[:, feature_to_idx("installment")]
    open_acc = features[:, feature_to_idx("open_acc")]
    total_acc = features[:, feature_to_idx("total_acc")]
    pub_rec_bankruptcies = features[:, feature_to_idx("pub_rec_bankruptcies")]
    pub_rec = features[:, feature_to_idx("pub_rec")]

    if (int_rate * 1200).le(0.01).any():
        print("Warning: int_rate is very low, check for possible division by zero.")
        
    expected_installment = loan_amnt * (
        (int_rate * ((1 + int_rate) ** term))
        / ((1 + int_rate) ** term - 1)
    )

    # Constraints
    g1 = (installment - expected_installment).abs() <= 0.1
    respects_g1 = g1.sum().item() == len(g1)
    print(f"Max deviation was: {(installment - expected_installment).abs().max().item()}")

    g2 = open_acc <= total_acc
    respects_g2 = g2.sum().item() == len(g2)
    print(f"Max difference in accs was: {(open_acc - total_acc).max().item()}")
    
    g3 = pub_rec_bankruptcies <= pub_rec
    respects_g3 = g3.sum().item() == len(g3)
    print(f"Max difference in pub rec was: {(pub_rec_bankruptcies - pub_rec).max().item()}")

    respects_all = respects_g1 and respects_g2 and respects_g3
    print(f"Respects feature relations: {respects_all} (g1: {respects_g1}, g2: {respects_g2}, g3: {respects_g3})")
    return respects_all

def is_constrained_adversarial(x, y, model, metadata) -> Tuple[float,float,float,float,float]:
    """
    - x is adversarial,
    - x respects the boundary constraints,
    - x respects the type constraints,
    - x respects the feature relation constraints,
    - all of the above.
    """
    features = torch.from_numpy(x).float()
    norm_features = scaler.transform(features)
    y_pred = model(features).detach().numpy()
    
    y_pred_class = np.argmax(y_pred, axis=1)
    y_class = np.argmax(y, axis=1)

    correct = (y_pred_class == y_class).astype(int)
    percent_correct = correct.mean()
    
    # - x is adversarial (Prediction different from true label)
    percent_adv = 1 - percent_correct

    # - x respects the boundary constraints,
    respects_boundary = norm_features <= (1 + 10 * torch.finfo(features.dtype).eps)
    percent_respects_boundary = respects_boundary.sum().item() / (respects_boundary.shape[0] * respects_boundary.shape[1])

    # - x respects the type constraints
    respects_type_constraint = verify_type_constraints(features, metadata)

    # - x respects the feature relation constraints
    respects_features_relations = verify_feature_relations(features)

    constraints_met = (
        percent_adv,
        percent_respects_boundary,
        respects_type_constraint, 
        respects_features_relations
    )

    return (*constraints_met, all(constraints_met))
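
The task asks for a decision per example, with both the lower and the upper bound checked. A possible sketch of such per-sample checks on toy data is given below. `check_constraints`, `int_idx` and the absolute tolerance `tol` are hypothetical names and choices, and the inputs are assumed to be in the original (unscaled) domain.

```python
import numpy as np


def check_constraints(x, x_min, x_max, int_idx, tol=1e-6):
    """Return per-sample boolean masks for boundary and type constraints.

    x, x_min and x_max are in the original (unscaled) domain; int_idx lists
    the columns that hold integer features.
    """
    within_bounds = np.all((x >= x_min - tol) & (x <= x_max + tol), axis=1)
    int_values = x[:, int_idx]
    valid_types = np.all(int_values == np.round(int_values), axis=1)
    return within_bounds, valid_types


# Toy data: column 0 is real-valued in [0, 1], column 1 is an integer in [0, 10].
x_min, x_max = np.array([0.0, 0.0]), np.array([1.0, 10.0])
x = np.array([[0.5, 3.0],    # valid
              [-0.2, 3.0],   # below the lower bound
              [0.5, 3.4]])   # integer feature is not an integer
bounds_ok, types_ok = check_constraints(x, x_min, x_max, int_idx=[1])
print(bounds_ok & types_ok)
```

Combining the masks with `&` (together with an adversarial mask and a feature relation mask) gives the "all of the above" result per example, and the mean of a mask gives a success rate.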

2. Verify your `is_constrained_adversarial` function by running it on the test set instead of an adversarial set. Few examples should be adversarial, but all should pass the constraints.

In [25]:
# YOUR CODE HERE
# is_constrained_adversarial(x_test[:5], np.array([1 - y_test, y_test]).T[:5], model, metadata)
is_constrained_adversarial(x_test, np.array([1 - y_test, y_test]).T, model, metadata)

Max deviation was: 431.1004333496094
Max difference in accs was: 0.0
Max difference in pub rec was: 0.0
Respects feature relations: False (g1: False, g2: True, g3: True)


(0.3877889052049913, 1.0, False, False, False)

3. Run PGD and evaluate the success rate of the attack based on the `is_constrained_adversarial` function.


In [26]:
# YOUR CODE HERE
from art.attacks.evasion import ProjectedGradientDescent
from art.estimators.classification import PyTorchClassifier

classifier = PyTorchClassifier(model=model, loss=loss, input_shape=(x_train.shape[1],), nb_classes=2, device_type="cpu")
attack_projected = ProjectedGradientDescent(
    estimator=classifier,
    eps=16 / 255 * 784**0.5,
    norm=2,
)
x_copy = x_test.astype(np.float32).copy()
y_copy = y_test.astype(np.int64).copy()
# By passing also the y copy into the adversarial examples generation function, an accuracy of 0% was achieved
x_test_adv_projected = attack_projected.generate(x_copy)



In [None]:
adversarial_examples = torch.from_numpy(x_test_adv_projected)

no_constraint_examples_results = is_constrained_adversarial(x_test, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained from real examples: {no_constraint_examples_results}\n\n")

post_convert_results = is_constrained_adversarial(x_test_adv_projected, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained from adversarial examples: {post_convert_results}")

Max deviation was: 431.1004333496094
Max difference in accs was: 0.0
Max difference in pub rec was: 0.0
Respects feature relations: False (g1: False, g2: True, g3: True)
Results obtained from real examples: (0.3877889052049913, 1.0, False, False, False)


Max deviation was: 431.17205810546875
Max difference in accs was: 0.0011386871337890625
Max difference in pub rec was: 0.012365341186523438
Respects feature relations: False (g1: False, g2: False, g3: False)
Results obtained from adversarial examples: (0.6122157344449934, 0.9421065483515576, False, False, False)


4. Comment your results.

#### Real dataset
On the real test set, 38.8% of the examples are flagged as adversarial, i.e. misclassified (61.2% accuracy), so the model is not particularly strong.\
The boundary constraints, g2 and g3 are satisfied as expected, but g1 is not: for at least one example the gap between `installment` and the expected installment is far larger than the 0.1 tolerance (431.1004, from an expected value of 483.3204 against an actual value of 52.200). The type check also returns `False` on the real data, which is unexpected.

#### Adversarial dataset
On the adversarial set, the share of adversarial examples is noticeably larger (61.2% against 38.8%), which indicates both that the attack was successful and that the evaluation metric behaves sensibly.\
The remaining constraints all fail, as expected: only 94.2% of the feature values stay within bounds, and the type and feature relation checks return `False`.

5. Adapt PGD to respect type constraints.

PGD is implemented for continuous numerical values only, hence it generates real values.
Write a function that converts reals to integers and guarantees that the conversion breaks neither the boundary nor the epsilon constraints.
Integrate this function into PGD.

DO NOT remove/modify the cell with the original implementation of PGD, you will need it later.

In [None]:
# YOUR CODE HERE
from art.attacks.evasion import ProjectedGradientDescent
from art.estimators.classification import PyTorchClassifier

classifier = PyTorchClassifier(model=model, loss=loss, input_shape=(x_train.shape[1],), nb_classes=2, device_type="cpu")
attack_projected = ProjectedGradientDescent(
    estimator=classifier,
    eps=eps,
    norm=2,
)
x_copy = x_test.astype(np.float32).copy()
y_copy = y_test.astype(np.int64).copy()


x_test_pre_convert = attack_projected.generate(x_copy)

                                                                    

In [41]:
def PGD_convert_reals_to_integer(x_adversarial: np.ndarray, x_original: np.ndarray, metadata, eps) -> np.ndarray:
    x_rounded = x_adversarial.copy()
    int_features = metadata[metadata["type"] == "int"].index.tolist()

    x_rounded[:, int_features] = np.round(x_rounded[:, int_features])
    delta = x_rounded - x_original
    delta_norm = np.linalg.norm(delta, ord=np.inf, axis=1)

    for i in range(len(x_rounded)):
        if delta_norm[i] > eps:
            scale = eps / delta_norm[i]
            x_rounded[i] = x_original[i] + (x_rounded[i] - x_original[i]) * scale

    x_min = metadata["min"].to_numpy().astype(float)
    x_max = metadata["max"].to_numpy().astype(float)
    x_rounded = np.clip(x_rounded, x_min, x_max)
    
    x_rounded[:, int_features] = x_rounded[:, int_features].astype(int)
    return x_rounded

x_test_post_convert = PGD_convert_reals_to_integer(x_test_pre_convert, x_copy, metadata, eps=eps)
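
One way to guarantee that the conversion breaks neither the bounds nor the epsilon budget is to round each integer feature towards its original value instead of to the nearest integer. The sketch below illustrates this idea on toy data. It is an alternative to the function above rather than a description of it, `round_towards_original` is a hypothetical name, and it assumes the original integer features are valid integers within their bounds.

```python
import numpy as np


def round_towards_original(x_adv, x_orig, int_idx):
    """Round the integer features of x_adv towards their original values.

    Rounding towards x_orig never increases |x_adv - x_orig| for any feature,
    so an Lp budget and box bounds satisfied by x_adv still hold afterwards.
    """
    x_out = x_adv.copy()
    delta = x_adv[:, int_idx] - x_orig[:, int_idx]
    # np.trunc rounds towards zero, i.e. it shrinks the perturbation.
    x_out[:, int_idx] = x_orig[:, int_idx] + np.trunc(delta)
    return x_out


x_orig = np.array([[0.20, 5.0], [0.70, 2.0]])
x_adv = np.array([[0.25, 6.7], [0.60, 0.4]])   # column 1 must be an integer
x_int = round_towards_original(x_adv, x_orig, int_idx=[1])
```

Because every coordinate of the perturbation shrinks or stays the same, no rescaling or clipping is needed after the conversion, at the cost of a slightly weaker perturbation.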

7. Compare the success rate with the original implementation of PGD.

In [55]:
no_constraint_examples_results = is_constrained_adversarial(x_test_pre_convert, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained before type restriction: {no_constraint_examples_results}\n\n")

post_convert_results = is_constrained_adversarial(x_test_post_convert, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained after type restriction: {post_convert_results}")

Max deviation was: 431.29803466796875
Max difference in accs was: 0.00324249267578125
Max difference in pub rec was: 0.03526002913713455
Respects feature relations: False (g1: False, g2: False, g3: False)
Results obtained before type restriction: (0.6122110947950087, 0.9393474695399414, False, False, False)


Max deviation was: 431.1859130859375
Max difference in accs was: 0.0
Max difference in pub rec was: 0.0
Respects feature relations: False (g1: False, g2: True, g3: True)
Results obtained after type restriction: (0.7483685830740929, 1.0, True, False, False)


8. Comment your results.

Before the conversion, g1, g2 and g3 all fail; after the conversion, g2 and g3 pass while g1 still fails.
As expected, the type constraint check in `is_constrained_adversarial` fails on the raw PGD output and passes on the converted data: the third element of the result tuple is `False` before the conversion and `True` after it. The boundary check also improves, from 93.9% of the feature values within bounds to 100%.

9. Write a function that, for a sample X, returns the constraint penalty for the constraints above.

In [30]:
# YOUR CODE HERE
def constraint_penalty(features: torch.Tensor) -> torch.Tensor:
    # Map the features back to the original (unscaled) domain and extract the relevant ones
    print(features)
    features = model[0].scaler.inverse_transform(features)
    int_rate: torch.Tensor             = features[:, feature_to_idx("int_rate")]
    term: torch.Tensor                 = features[:, feature_to_idx("term")]
    loan_amnt: torch.Tensor            = features[:, feature_to_idx("loan_amnt")]
    installment: torch.Tensor          = features[:, feature_to_idx("installment")]
    open_acc: torch.Tensor             = features[:, feature_to_idx("open_acc")]
    total_acc: torch.Tensor            = features[:, feature_to_idx("total_acc")]
    pub_rec_bankruptcies: torch.Tensor = features[:, feature_to_idx("pub_rec_bankruptcies")]
    pub_rec: torch.Tensor              = features[:, feature_to_idx("pub_rec")]
    
    int_rate = int_rate / 1200
    expected_installment = loan_amnt * (
        (int_rate * ((1 + int_rate) ** term))
        / ((1 + int_rate) ** term - 1)
    )

    print(f"Expected installment: {expected_installment}, installment: {installment}")
    # g1 tolerates a gap of up to 0.1, so only the excess is penalised
    g1_penalty = ((expected_installment - installment).abs() - 0.1).clamp_min(0)
    g2_penalty = (open_acc - total_acc).clamp_min(0)
    g3_penalty = (pub_rec_bankruptcies - pub_rec).clamp_min(0)

    return g1_penalty + g2_penalty + g3_penalty

10. Integrate the constraint penalty function in the loss of the PGD attack as in CPGD (shown in class).



In [31]:
# YOUR CODE HERE
from art.attacks.evasion import ProjectedGradientDescent
from art.estimators.classification import PyTorchClassifier
from torch import Tensor

class ConstraintLoss(nn.CrossEntropyLoss):
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        print(input.shape)
        print(target.shape)
        ce_loss = super().forward(input, target)
        penalty = constraint_penalty(input).mean()
        loss = ce_loss + penalty
        return loss

constraint_loss = ConstraintLoss()

classifier = PyTorchClassifier(model=model, loss=constraint_loss, input_shape=(x_train.shape[1],), nb_classes=2, optimizer=optimizer, device_type="cpu")
attack_projected = ProjectedGradientDescent(
    estimator=classifier,
    eps=eps,
    norm=2,
    max_iter=20
)
x_copy = x_test.astype(np.float32).copy()
y_copy = y_test.astype(np.int64).copy()

x_test_constraint_l = attack_projected.generate(x_copy)

                                                        

torch.Size([32, 2])
torch.Size([32])
tensor([[-0.1719,  0.1545],
        [-0.0504,  0.0361],
        [-0.4274,  0.4002],
        [ 0.0802, -0.0938],
        [ 0.5210, -0.5036],
        [ 0.4330, -0.4225],
        [ 0.1290, -0.1359],
        [-0.5887,  0.5534],
        [-0.1173,  0.0931],
        [ 0.5733, -0.5492],
        [ 0.1978, -0.2070],
        [-0.7872,  0.7342],
        [ 0.1500, -0.1644],
        [-0.0495,  0.0338],
        [-0.3402,  0.3039],
        [ 0.0715, -0.0818],
        [ 0.0223, -0.0380],
        [ 0.0241, -0.0387],
        [-0.0094, -0.0078],
        [-0.2594,  0.2276],
        [ 0.6017, -0.5856],
        [ 0.0355, -0.0546],
        [ 0.3495, -0.3484],
        [-0.5410,  0.5027],
        [-1.2285,  1.1491],
        [ 0.6748, -0.6515],
        [ 0.2264, -0.2324],
        [ 0.2501, -0.2546],
        [-0.3273,  0.3050],
        [-0.1651,  0.1509],
        [ 0.0292, -0.0430],
        [-0.4640,  0.4253]], grad_fn=<AddmmBackward0>)


RuntimeError: The size of tensor a (2) must match the size of tensor b (23) at non-singleton dimension 1
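
A loss module such as `nn.CrossEntropyLoss` only receives the model outputs and the targets, whereas the constraint penalty has to be computed from the adversarial inputs. The sketch below shows one gradient step written by hand so that both terms are available. `toy_penalty` and `cpgd_step` are hypothetical names, the linear model is a stand-in for the trained classifier, and subtracting the penalty (because the attack ascends the objective) is an assumption to check against the CPGD formulation shown in class. Projection and clamping would follow as in `perturb`.

```python
import torch
from torch import nn
import torch.nn.functional as F


def toy_penalty(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical constraint x[:, 0] <= x[:, 1]; the penalty is 0 when it holds."""
    return (x[:, 0] - x[:, 1]).clamp_min(0)


def cpgd_step(model, x_adv, y, step_size, penalty_fn):
    """One gradient-ascent step on cross-entropy minus the constraint penalty."""
    x_adv = x_adv.clone().detach().requires_grad_(True)
    # The penalty is evaluated on the inputs x_adv, not on the model outputs.
    objective = F.cross_entropy(model(x_adv), y) - penalty_fn(x_adv).mean()
    (grad,) = torch.autograd.grad(objective, x_adv)
    # Normalise the gradient per example, as in the L2 version of PGD.
    grad = grad / (grad.norm(p=2, dim=1, keepdim=True) + 1e-10)
    return (x_adv + step_size * grad).detach()


torch.manual_seed(0)
model = nn.Linear(2, 2)            # stand-in for the trained classifier
x = torch.tensor([[0.9, 0.1]])     # violates the toy constraint
y = torch.tensor([0])
x_next = cpgd_step(model, x, y, step_size=0.1, penalty_fn=toy_penalty)
```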

11. Compare the success rate with previous implementations of PGD.


In [None]:
# YOUR CODE HERE
no_constraint_examples_results = is_constrained_adversarial(x_test_adv_projected, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained without constraints set: {no_constraint_examples_results}\n\n")

constraint_examples_results = is_constrained_adversarial(x_test_constraint_l, np.array([1 - y_test, y_test]).T, model, metadata)
print(f"Results obtained with constraints set: {constraint_examples_results}")

Max deviation was: 430.99298095703125
Max difference in accs was: 0.0042877197265625
Max difference in pub rec was: 0.0794677734375
Respects feature relations: False (g1: False, g2: False, g3: False)
Results obtained without constraints set: (0.7412281617474777, 0.9156183831404803, False, False, False)


Max deviation was: 430.99285888671875
Max difference in accs was: 0.005474090576171875
Max difference in pub rec was: 0.11863107979297638
Respects feature relations: False (g1: False, g2: False, g3: False)
Results obtained with constraints set: (0.7057835556885588, 0.9196476165664149, False, False, False)


12. Comment your results.

The PGD attack passes only two values per example to the loss function: the loss receives the model outputs (logits of shape `[batch, 2]`) rather than the 23 input features, so `constraint_penalty` cannot be evaluated there, hence the `RuntimeError` above. I was unfortunately unable to finish this exercise.
However, I would expect the result to be adversarial examples that satisfy all the checks in `is_constrained_adversarial` while still being misclassified, thereby bypassing the constraint checks.