## Using ML anonymization to defend against attribute inference attacks

### Load data

In [1]:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score

import warnings
# Filter out all warnings
warnings.filterwarnings("ignore")


#### First, import the packages required for the privacy analysis and mitigation. This notebook needs the `holisticai` package, which you can install by running:
```bash
!pip install holisticai[all]
```

In [2]:
from holisticai.datasets import load_dataset
loaded = load_dataset(dataset='adult', preprocessed=False, as_array=False)
df = pd.DataFrame(data=loaded.data, columns=loaded.feature_names)
df['class'] = loaded.target.apply(lambda x: 1 if x == '>50K' else 0)

In [3]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Identify categorical features
categorical_features = X.select_dtypes(include=['category']).columns

# Create transformers for numerical and categorical features
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))])

# Combine transformers into a preprocessor using ColumnTransformer
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, X.select_dtypes(exclude=['category']).columns),
        ('cat', categorical_transformer, categorical_features)
    ])

# Fit the preprocessor on the training data, then apply it to both sets
X_train_transformed = preprocessor.fit_transform(X_train)
X_test_transformed = preprocessor.transform(X_test)

### Train decision tree model

In [4]:
from sklearn.tree import DecisionTreeClassifier

DTC = DecisionTreeClassifier()
DTC.fit(X_train_transformed, y_train)
# Predict values
y_pred = DTC.predict(X_test_transformed)
y_proba = DTC.predict_proba(X_test_transformed)
print('Base model accuracy: ', DTC.score(X_test_transformed, y_test))

Base model accuracy:  0.8185075237997748


### BlackBox Attack

In [5]:
from holisticai.privacy.metrics import BlackBoxAttack

attack_feature = 'education'
predictions_of_attack_feature_1 = BlackBoxAttack(attack_feature, X_train, y_train, X_test, y_test)
print(accuracy_score(X_test[attack_feature], predictions_of_attack_feature_1))


0.8921076875831713


#### This means that the attack correctly infers the attacked feature for about 89% of the test-set records.
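To make an attack accuracy like the one above easier to interpret, the sketch below illustrates the general idea on synthetic data. It is not the `holisticai` implementation. It assumes, judging only from the call signature (which receives data and labels but no trained model), that the attacker trains a classifier on the remaining features plus the label to predict the hidden feature. All names and data here are hypothetical. A majority-class baseline is included as a reference point, since raw accuracy on its own says little when one category dominates.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 500

# Toy data: `sensitive` is the feature the attacker wants to recover.
sensitive = rng.integers(0, 3, size=n)
known = rng.normal(size=(n, 2))
# The label depends on the sensitive feature, which is what leaks information.
label = (sensitive + known[:, 0] > 1).astype(int)

# Attacker inputs: the features they already know plus the label.
attack_inputs = np.column_stack([known, label])
train, test = slice(0, 400), slice(400, n)

# Hypothetical attack model; any classifier could stand in here.
attack_model = DecisionTreeClassifier(max_depth=4, random_state=0)
attack_model.fit(attack_inputs[train], sensitive[train])
attack_accuracy = attack_model.score(attack_inputs[test], sensitive[test])

# Reference point: always guess the most frequent value seen in training.
majority_value = np.bincount(sensitive[train]).argmax()
baseline_accuracy = float(np.mean(sensitive[test] == majority_value))

print(f"attack accuracy:   {attack_accuracy:.2f}")
print(f"majority baseline: {baseline_accuracy:.2f}")
```

In the notebook, the same kind of baseline could be computed for `X_test[attack_feature]` to judge how much the attack adds over a naive guess.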



### Anonymize the data to improve privacy

In [6]:
from holisticai.privacy.mitigation import Anonymize

X_train = X_train.set_index(pd.Series(range(len(X_train))))
features = X_train.columns
QI = ['education', 'marital-status', 'age']
anonymizer = Anonymize(100, QI, categorical_features=list(categorical_features), features_names=features)
anon = anonymizer.anonymize(X_train, y_train)
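
Assuming the first argument to `Anonymize` (100 here) plays the role of k in k-anonymity, that is, the minimum number of records that should share each combination of quasi-identifier values, one way to sanity-check an anonymized table is to count the size of each quasi-identifier group. The sketch below does this on a toy DataFrame. `smallest_group_size` is a hypothetical helper, not part of `holisticai`.

```python
import pandas as pd

def smallest_group_size(frame: pd.DataFrame, quasi_identifiers: list) -> int:
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    return int(frame.groupby(quasi_identifiers, observed=True).size().min())

# Toy data: two distinct (education, age) combinations.
toy = pd.DataFrame({
    "education": ["HS-grad", "HS-grad", "Bachelors", "Bachelors", "Bachelors"],
    "age": [30, 30, 45, 45, 45],
    "hours": [40, 38, 50, 45, 60],
})

k_observed = smallest_group_size(toy, ["education", "age"])
print(k_observed)
```

Under the same assumption, calling the helper with `anon` and `QI` would be expected to return a value of at least 100.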

### Train decision tree model on anonymized data


In [7]:
# Refit the preprocessor on the anonymized training data, then transform the test set
X_train_transformed_anon = preprocessor.fit_transform(anon)
X_test_transformed = preprocessor.transform(X_test)

In [8]:
DTC_anon = DecisionTreeClassifier()
DTC_anon.fit(X_train_transformed_anon, y_train)
# Predict values
y_pred_anon = DTC_anon.predict(X_test_transformed)
y_proba_anon = DTC_anon.predict_proba(X_test_transformed)

print('Anonymized model accuracy: ', DTC_anon.score(X_test_transformed, y_test))

Anonymized model accuracy:  0.8026410072678882


### BlackBox Attack on Anonymized model

In [9]:
# Collect the column dtypes of the original training data
dtypes_to_apply = X_train.dtypes.to_dict()
# Cast the anonymized data back to those dtypes
anon = anon.astype(dtypes_to_apply)

attack_feature = 'education'
predictions_of_attack_feature_2 = BlackBoxAttack(attack_feature, anon, y_train, X_test, y_test)
print(accuracy_score(X_test[attack_feature], predictions_of_attack_feature_2))

0.5658716347630259


#### This means that the attack correctly infers the attacked feature for about 57% of the test-set records.
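
The privacy/utility trade-off can be summarized by comparing the four printed figures directly. The snippet below copies the values from the outputs above; because the decision trees are trained without a fixed `random_state`, a rerun may give slightly different numbers.

```python
# Figures copied from the printed outputs in this notebook.
base_accuracy = 0.8185075237997748
anon_accuracy = 0.8026410072678882
base_attack_accuracy = 0.8921076875831713
anon_attack_accuracy = 0.5658716347630259

# Cost of anonymization in model accuracy.
accuracy_drop = base_accuracy - anon_accuracy
# Reduction in attribute inference accuracy.
attack_drop = base_attack_accuracy - anon_attack_accuracy

print(f"Model accuracy drop:  {accuracy_drop:.3f}")
print(f"Attack accuracy drop: {attack_drop:.3f}")
```

In this run, anonymization costs under two percentage points of model accuracy while lowering attack accuracy by roughly 33 percentage points.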
