ncaptier/radshap

radshap

This repository provides a Python tool for highlighting how much each region of interest (ROI) contributes to the prediction of a radiomic model. Given a trained radiomic model and the ROIs of an image, it estimates the Shapley value of each ROI with respect to the model's prediction.
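For reference, the standard Shapley value of region $i$ among the $n$ regions $N = \{1, \dots, n\}$ of an image can be written as below. The notation is ours rather than taken from the package: $f$ denotes the trained model, $\mathrm{agg}$ the aggregation function, $X_S$ the rows of the region-feature array restricted to the subset $S$, and $v_\emptyset$ the reference value used when no region is present.

```latex
\phi_i = \sum_{S \subseteq N \setminus \{i\}}
         \frac{|S|!\,(n - |S| - 1)!}{n!}
         \left[ v(S \cup \{i\}) - v(S) \right],
\qquad
v(S) =
\begin{cases}
  f\!\left(\mathrm{agg}(X_S)\right) & \text{if } S \neq \emptyset, \\
  v_\emptyset                        & \text{if } S = \emptyset .
\end{cases}
```

With this definition the contributions sum to $f(\mathrm{agg}(X_N)) - v_\emptyset$, the gap between the full prediction and the reference value.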

Graphical abstract

a. Schematic view of a generic aggregated radiomic model. b. Computation of a Shapley value for a specific region.

Documentation

https://radshap.readthedocs.io/en/latest

Install

Install the latest stable version from PyPI:

pip install radshap

Install from source:

pip install git+https://github.com/ncaptier/radshap.git

Experiments

We provide a Jupyter notebook that illustrates the method with PET images and simple aggregation strategies.

We provide a Jupyter notebook that illustrates the method with PET images and custom aggregation strategies.

We provide a Jupyter notebook that illustrates a robust strategy for computing Shapley values.

Examples

Explanation with Shapley values

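import joblib
from radshap.shapley import Shapley

# Load a previously trained model from disk
model = joblib.load("trained_logistic_regression.joblib")

# Explain the predicted probability of the positive class;
# the features of the regions are aggregated with a mean
shap = Shapley(predictor=lambda x: model.predict_proba(x)[:, 1],
               aggregation=("mean", None))

# X (to be provided): 2D array of shape (n_instances, n_instance_features), one row per region
shapvalues = shap.explain(X)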
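To make the idea concrete, here is a minimal brute-force sketch in plain NumPy of what a Shapley value over regions means under mean aggregation. It is not the radshap implementation: `exact_region_shapley`, `toy_predictor`, `mean_agg`, the toy coefficients and the reference value of 0.5 are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_region_shapley(X, predictor, aggregate, empty_value):
    """Brute-force Shapley values for the rows (regions) of X.

    Hypothetical helper for illustration; not part of the radshap API.
    """
    n = X.shape[0]

    def value(subset):
        # Value of a coalition: prediction on the aggregated rows,
        # or the reference value when no region is present.
        if len(subset) == 0:
            return empty_value
        return float(predictor(aggregate(X[list(subset)]))[0])

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Toy data: 4 regions described by 3 features each (made-up numbers)
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(4, 3))
coef = np.array([0.5, -1.0, 2.0])


def toy_predictor(x):
    # Stand-in for `model.predict_proba(x)[:, 1]`
    return 1.0 / (1.0 + np.exp(-(x @ coef)))


def mean_agg(Xsub):
    # Average the selected regions into one model input of shape (1, n_features)
    return Xsub.mean(axis=0, keepdims=True)


phi = exact_region_shapley(X_toy, toy_predictor, mean_agg, empty_value=0.5)
full_prediction = float(toy_predictor(mean_agg(X_toy))[0])

# Efficiency property: contributions sum to f(all regions) - empty_value
print(np.isclose(phi.sum(), full_prediction - 0.5))
```

The enumeration visits every subset of regions, so its cost grows exponentially with the number of regions; it is only meant to show the definition on a small example.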

Robust explanation with Shapley values

import joblib
from radshap.shapley import RobustShapley

# Load a previously trained model from disk
model = joblib.load("trained_logistic_regression.joblib")

# Xback (to be provided): 2D array of shape (n_samples_background, n_input_features)
shap = RobustShapley(predictor=lambda x: model.predict_proba(x)[:, 1],
                     aggregation=("nanmean", None),
                     background_data=Xback)

# X (to be provided): 2D array of shape (n_instances, n_instance_features), one row per region
shapvalues = shap.explain(X)

Explanation with Shapley values and custom aggregation function

import numpy as np
import joblib
from radshap.shapley import Shapley

# Load a previously trained model from disk
model = joblib.load("trained_linear_regression.joblib")

# Compute the average prediction to approximate a "random" prediction with no information (required for RadShap)
predictions = np.load("predictions.npy")
mean_pred = predictions.mean()

def custom_agg_function(Xsub):
    """ Aggregate an arbitrary subset of regions (Xsub array with an arbitrary
    number of rows) into a valid aggregated input for the predictive model.

    Parameters
    ----------
    Xsub: 2D array of shape (n_instances, n_instance_features)

    Returns
    -------
    agg_input: 2D array of shape (1, n_input_features)
    """

    ... # aggregate information from the different regions in Xsub (i.e. rows)
    ... # to obtain a valid aggregated input for the predictive model

    return agg_input

shap = Shapley(predictor=lambda x: model.predict(x),
               aggregation=custom_agg_function,
               empty_value=mean_pred)

# X (to be provided): 2D array of shape (n_instances, n_instance_features), one row per region
shapvalues = shap.explain(X)
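As one possible illustration of the contract described in the docstring above, the sketch below aggregates the regions by concatenating the per-feature mean and maximum. The function `mean_max_agg` and the toy array are hypothetical and only show the expected input and output shapes; the aggregation you need depends on how your own model was trained.

```python
import numpy as np


def mean_max_agg(Xsub):
    """Hypothetical aggregation: concatenate the per-feature mean and
    maximum over the regions (rows) of Xsub."""
    Xsub = np.atleast_2d(Xsub)
    agg_input = np.concatenate([Xsub.mean(axis=0), Xsub.max(axis=0)])
    # Return a single model input of shape (1, 2 * n_instance_features)
    return agg_input.reshape(1, -1)


# Three regions described by two features each (made-up numbers)
Xsub = np.array([[1.0, 4.0],
                 [3.0, 2.0],
                 [5.0, 0.0]])
print(mean_max_agg(Xsub))  # [[3. 2. 5. 4.]]
```

Because the function accepts any number of rows, it can be evaluated on every subset of regions, which is what a Shapley computation over regions requires.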

License

This project is licensed under a custom open-source license (see the LICENSE.md file for more details).

Acknowledgements

This package was created as a part of the PhD project of Nicolas Captier in the Laboratory of Translational Imaging in Oncology (LITO) of Institut Curie.