
**Permutation Feature Importance (PFI):**

The idea behind PFI is to evaluate the importance of a feature by measuring the increase in the model's prediction error after the feature's values are permuted. Permuting breaks the relationship between the feature and the outcome while keeping the marginal distribution $ P(x_j) $ unchanged. The steps are as follows:

1. Measure the baseline error of the model using the test set without permuting any features.
2. For the feature of interest, permute its values across the observations in the test set, thus creating a new dataset where the association between the feature and the outcome is broken.
3. Measure the error of the model on this new dataset.
4. The importance of the feature is then the difference between the error with the permuted feature and the baseline error.
5. Repeat steps 2 to 4 several times with different random permutations and average the differences to get the Permutation Feature Importance score for that feature.
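
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the helper `pfi_scores`, the toy data, and the stand-in `predict` function are all assumptions introduced here, and mean squared error is used only as an example error metric.

```python
import numpy as np

def pfi_scores(predict, X, y, error_fn, n_repeats=10, seed=0):
    """Mean increase in error after permuting each column of X (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    baseline = error_fn(y, predict(X))                 # step 1: baseline error
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        diffs = np.empty(n_repeats)
        for r in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])        # step 2: permute feature j
            diffs[r] = error_fn(y, predict(X_perm)) - baseline  # steps 3-4: error increase
        importances[j] = diffs.mean()                  # step 5: average over repeats
    return importances

# Hypothetical toy data: y depends on column 0 only, column 1 is pure noise
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(500, 2))
y_toy = 3.0 * X_toy[:, 0]

def predict(X):
    # Stand-in for a fitted model's predict method
    return 3.0 * X[:, 0]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

toy_importances = pfi_scores(predict, X_toy, y_toy, mse)
```

In this toy setup the informative column receives a large positive score, while the noise column scores exactly zero because the stand-in model never reads it.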

**Testing Importance using Permutations (PIMP):**

PIMP is a method to test the statistical significance of the Permutation Feature Importance scores. This is done by creating a null distribution of feature importance scores under the hypothesis that the feature is not informative (i.e., it has no relationship with the response variable). The steps are:

1. Permute the response variable $ y $ and retrain the model with the permuted $ y $ and original features $ X $ to compute feature importance under the null hypothesis $ H_0 $. This is repeated for a number of repetitions to create a distribution of importance scores under $ H_0 $.
2. Train the model with the original $ X $ and $ y $.
3. For each feature, fit a probability distribution to the null importance scores (can be Gaussian, lognormal, gamma, or non-parametric).
4. Compute the actual feature importance from the model trained in step 2 (under $ H_1 $, the alternative hypothesis).
5. Compute the p-value of the actual feature importance as the upper-tail probability under the fitted null distribution, i.e., the probability of observing an importance at least as large under $ H_0 $.

The PIMP method allows one to assess whether the observed importance is significantly different from what would be expected by random chance, thus providing a statistical significance test for feature importance scores.
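
As a small illustration of steps 3 and 5, the sketch below computes a p-value from a set of null importance scores in two ways: with a Gaussian fit and with a non-parametric (empirical) estimate. The function names and the simulated null scores are assumptions made for this example, and the `+1` correction in the empirical version is one common convention, not something prescribed by the text above.

```python
import math
import numpy as np

def gaussian_p_value(null_scores, observed):
    """Upper-tail p-value under a normal distribution fitted to the null scores."""
    mu = float(np.mean(null_scores))
    sigma = float(np.std(null_scores))  # maximum-likelihood estimate (ddof=0)
    if sigma == 0.0:
        # Degenerate null distribution: all null scores are identical
        return 0.0 if observed > mu else 1.0
    z = (observed - mu) / sigma
    # Normal survival function expressed through the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def empirical_p_value(null_scores, observed):
    """Non-parametric p-value: share of null scores >= observed, with a +1 correction."""
    null_scores = np.asarray(null_scores)
    return (1 + int(np.sum(null_scores >= observed))) / (1 + null_scores.size)

# Hypothetical null importances, simulated only for illustration
rng = np.random.default_rng(0)
null_scores = rng.normal(loc=0.0, scale=0.01, size=200)

p_strong = gaussian_p_value(null_scores, observed=0.05)  # far in the upper tail
p_weak = gaussian_p_value(null_scores, observed=0.0)     # a typical null value
```

The empirical version avoids any distributional assumption, but its smallest attainable p-value is limited by the number of null repetitions.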


In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.utils import shuffle
from scipy.stats import norm, gamma, lognorm

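# Assume data_original is a pandas DataFrame with the last column being the binary target,
# and trained_model is an already fitted scikit-learn classifier (both defined elsewhere).
# Ideally X and y below are a held-out test set rather than the training data.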
X = data_original.iloc[:, :-1]  # Features
y = data_original.iloc[:, -1]   # Labels (binary)

# Function to compute permutation importance for a single feature
def permutation_importance(model, X, y, feature, n_repeats=30):
    baseline_accuracy = accuracy_score(y, model.predict(X))
    scores = np.zeros(n_repeats)
    for n in range(n_repeats):
        X_permuted = X.copy()
        X_permuted[feature] = shuffle(X_permuted[feature].values, random_state=None)
        permuted_accuracy = accuracy_score(y, model.predict(X_permuted))
        scores[n] = baseline_accuracy - permuted_accuracy
    return scores

# Calculate the actual importances for each feature with unpermuted y
actual_importances = {feature: permutation_importance(trained_model, X, y, feature) for feature in X.columns}

# actual_importances[feature] holds the n_repeats PFI scores (accuracy drops) for that feature

# Initialize null importances
null_importances = {column: [] for column in X.columns}

from sklearn.base import clone  # refit a fresh copy so trained_model is not overwritten

n_repetitions = 100  # Number of repetitions for null distribution
for _ in range(n_repetitions):
    y_permuted = shuffle(y.to_numpy(), random_state=None)
    null_model = clone(trained_model).fit(X, y_permuted)  # Retrain a copy with permuted y
    for column in X.columns:
        # Pass the feature name explicitly; one permutation per null model
        null_importance = permutation_importance(null_model, X, y_permuted, column, n_repeats=1)
        null_importances[column].extend(null_importance)

# Assuming a normal distribution for the null importances
p_values = {}
for column in X.columns:
    # Fit a normal distribution to the null importances
    mu, std = norm.fit(null_importances[column])
    # Average the repeated actual scores (the dict is keyed by feature name)
    actual = np.mean(actual_importances[column])
    # Upper-tail p-value: probability of an importance at least this large under H0
    p_values[column] = norm.sf(actual, mu, std)

# Print p-values for each feature
print(p_values)