<a href="https://colab.research.google.com/github/JayJaewonYoo/RadiomicsLiverFibrosisDetection/blob/main/inference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Inference using radiomics-based liver fibrosis detection models
This notebook demonstrates how to run the radiomics-based liver fibrosis detection models proposed in the paper [Non-invasive Liver Fibrosis Screening on CT Images using Radiomics](https://arxiv.org/abs/2211.14396).

## Imports

In [1]:
!python3 -m pip install numpy==1.26.4
!python3 -m pip install scikit-learn==1.5.1
!python3 -m pip install pandas==2.2.2
!python3 -m pip install onnxruntime==1.17.1
!python3 -m pip install skops==0.10.0

Collecting scikit-learn==1.5.1
  Downloading scikit_learn-1.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Downloading scikit_learn-1.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.4 MB)
Installing collected packages: scikit-learn
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.5.2
    Uninstalling scikit-learn-1.5.2:
      Successfully uninstalled scikit-learn-1.5.2
Successfully installed scikit-learn-1.5.1
Collecting onnxruntime==1.17.1
  Downloading onnxruntime-1.17.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.3 kB)
Collecting coloredlogs (from onnxruntime==1.17.1)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime==1.17.1)
  Downloading humanfriendly-10.0-py2.py3

In [2]:
# The following set of imports are loaded directly to check versions
import numpy as np
import sklearn
import pandas
import onnxruntime
import skops

# The following imports are required for loading and processing the data before running the model
# They are also required if skops is used
from sklearn import set_config
from pandas import DataFrame

# This import is used to run the model from ONNX; note that numpy is also required
from onnxruntime import InferenceSession

# This import is used to load and run the model saved with skops
import skops.io as sio

## Versions used for this notebook example

In [3]:
!python --version

Python 3.10.12


In [4]:
print(np.__version__)
print(sklearn.__version__)
print(pandas.__version__)

1.26.4
1.5.1
2.2.2


In [5]:
# Only one of these will be necessary depending on how the model is loaded
print(onnxruntime.__version__)
print(skops.__version__)

1.17.1
0.10.0
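
The version checks above can also be done programmatically. The following is a minimal sketch that compares installed package versions against the versions pinned in this notebook, using only the standard library. The names `PINNED_VERSIONS` and `find_version_mismatches` are hypothetical helpers introduced for illustration and are not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in the install cell of this notebook.
PINNED_VERSIONS = {
    "numpy": "1.26.4",
    "scikit-learn": "1.5.1",
    "pandas": "2.2.2",
    "onnxruntime": "1.17.1",
    "skops": "0.10.0",
}


def find_version_mismatches(pinned):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for package, (expected, installed) in find_version_mismatches(PINNED_VERSIONS).items():
        print(f"{package}: expected {expected}, found {installed}")
```

An empty result means the environment matches the pinned versions. Only one of onnxruntime and skops needs to match, depending on how the model is loaded.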


## Download model files
The model files can be found in the GitHub repository.

In [6]:
!mkdir -p persisted_models/

In [7]:
!wget https://github.com/JayJaewonYoo/RadiomicsLiverFibrosisDetection/raw/refs/heads/main/persisted_models/liverfibrosisdetectionmodel.skops -P persisted_models/
!wget https://github.com/JayJaewonYoo/RadiomicsLiverFibrosisDetection/raw/refs/heads/main/persisted_models/liverfibrosisdetectionmodel.onnx -P persisted_models/

--2024-10-05 04:15:34--  https://github.com/JayJaewonYoo/RadiomicsLiverFibrosisDetection/raw/refs/heads/main/persisted_models/liverfibrosisdetectionmodel.skops
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/JayJaewonYoo/RadiomicsLiverFibrosisDetection/refs/heads/main/persisted_models/liverfibrosisdetectionmodel.skops [following]
--2024-10-05 04:15:34--  https://raw.githubusercontent.com/JayJaewonYoo/RadiomicsLiverFibrosisDetection/refs/heads/main/persisted_models/liverfibrosisdetectionmodel.skops
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32078 (31K) [application/octet-stream]
Saving to: ‘

## Generating data

The following cells generate toy data to demonstrate how the input data should look and how to perform inference with the public liver fibrosis models. The toy data is not indicative of any real-world radiomics data and is used purely for demonstration.

Note that the input data has data type float32 (`np.float32`).

In [8]:
features = ['original_firstorder_Energy', 'original_firstorder_Kurtosis', 'original_firstorder_Skewness', 'lbp-3D-k_firstorder_Maximum', 'wavelet-LHL_glszm_SmallAreaHighGrayLevelEmphasis']
toy_feature_data = {
    'original_firstorder_Energy': np.random.uniform(low=7e+02, high=5e+04, size=(20, 1)),
    'original_firstorder_Kurtosis': np.random.uniform(low=3e+00, high=3e+01, size=(20, 1)),
    'original_firstorder_Skewness': np.random.uniform(low=-3e-01, high=4e+00, size=(20, 1)),
    'lbp-3D-k_firstorder_Maximum': np.random.uniform(low=2e+00, high=1e+01, size=(20, 1)),
    'wavelet-LHL_glszm_SmallAreaHighGrayLevelEmphasis': np.random.uniform(low=7e-01, high=2e+00, size=(20, 1)),
}
toy_data = DataFrame(data=np.concatenate([toy_feature_data[feature] for feature in features], axis=1), columns=features, dtype=np.float32)

In [9]:
toy_data.head()

| | original_firstorder_Energy | original_firstorder_Kurtosis | original_firstorder_Skewness | lbp-3D-k_firstorder_Maximum | wavelet-LHL_glszm_SmallAreaHighGrayLevelEmphasis |
|---|---|---|---|---|---|
| 0 | 35107.714844 | 19.26366 | -0.285537 | 9.950083 | 1.03098 |
| 1 | 11977.683594 | 5.789762 | 3.633077 | 2.280541 | 1.706167 |
| 2 | 41588.144531 | 17.310648 | 2.977247 | 8.721092 | 1.422808 |
| 3 | 28302.169922 | 3.197089 | 1.912581 | 2.842958 | 1.528551 |
| 4 | 3265.68457 | 29.984007 | 0.60736 | 6.906138 | 1.473105 |
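
When real radiomics features replace the toy data, it may help to check the input before inference. The sketch below selects the five feature columns used in this notebook, puts them in the same order, and casts them to float32. The helper name `prepare_features` is hypothetical, and whether the models depend on column order is an assumption, so the reordering is a precaution rather than a documented requirement.

```python
import numpy as np
import pandas as pd

# Feature names copied from the toy-data cell of this notebook.
EXPECTED_FEATURES = [
    "original_firstorder_Energy",
    "original_firstorder_Kurtosis",
    "original_firstorder_Skewness",
    "lbp-3D-k_firstorder_Maximum",
    "wavelet-LHL_glszm_SmallAreaHighGrayLevelEmphasis",
]


def prepare_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a float32 DataFrame holding only the expected features, in order."""
    missing = [name for name in EXPECTED_FEATURES if name not in df.columns]
    if missing:
        raise KeyError(f"Missing feature columns: {missing}")
    # Extra columns are dropped; values are cast to the dtype used in this notebook.
    return df[EXPECTED_FEATURES].astype(np.float32)
```

Calling `prepare_features(my_dataframe)` raises a `KeyError` that lists any missing features, which is easier to diagnose than a failure inside the model.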


## Loading and inference using skops.io saved model

### Verifying that there are no untrustworthy types in the saved model

In [10]:
unknown_types = sio.get_untrusted_types(file="persisted_models/liverfibrosisdetectionmodel.skops")
print(unknown_types)

[]
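
The check above can be turned into a guard so that a file is loaded only when its untrusted types have been reviewed. This is a minimal sketch, not the repository's own loading code: `unexpected_types` and `load_if_trusted` are hypothetical helper names, and the sketch assumes that `sio.load` accepts a list of type names for `trusted`, as it does in the skops version pinned in this notebook.

```python
def unexpected_types(untrusted_types, allowed_types=()):
    """Return the untrusted type names that are not in the reviewed allow-list."""
    return sorted(set(untrusted_types) - set(allowed_types))


def load_if_trusted(path, allowed_types=()):
    """Load a .skops file only if every untrusted type has been reviewed."""
    # Imported lazily so the pure check above can be used without skops installed.
    import skops.io as sio

    untrusted = sio.get_untrusted_types(file=path)
    unexpected = unexpected_types(untrusted, allowed_types)
    if unexpected:
        raise ValueError(f"Refusing to load {path}: unreviewed types {unexpected}")
    # Every remaining untrusted type is on the allow-list, so it is passed as trusted.
    return sio.load(path, trusted=untrusted)
```

For the model in this notebook, `load_if_trusted("persisted_models/liverfibrosisdetectionmodel.skops")` should behave like a plain `sio.load`, because the list of untrusted types printed above is empty.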


### Loading in the skops.io saved liver fibrosis detection model

In [11]:
liverfibrosisdetectionmodel_skops = sio.load("persisted_models/liverfibrosisdetectionmodel.skops")

### Generating predictions using the skops.io saved liver fibrosis detection model
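The `set_config(transform_output="pandas")` call configures scikit-learn transformers to return pandas DataFrames, which ensures that the input is processed as a DataFrame.

The probabilities returned by the model can be thresholded to predict whether the input radiomic features are indicative of liver fibrosis. The decision threshold can be adjusted depending on the desired sensitivity and specificity of the model: lowering it increases sensitivity at the cost of specificity, and raising it does the opposite.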

In [12]:
set_config(transform_output="pandas")

probabilities = liverfibrosisdetectionmodel_skops.predict_proba(toy_data)[:, 1]
print(probabilities)

decision_threshold = 0.5 # Setting decision threshold to 0.5 for this example
predictions = probabilities >= decision_threshold
print(predictions)

[0.70843535 0.60461219 0.70266035 0.50842795 0.74156593 0.58430507
 0.6399475  0.59643141 0.64439643 0.84082755 0.41273224 0.62485578
 0.39055361 0.6102398  0.7945987  0.59036812 0.8222094  0.69808621
 0.55235688 0.5796561 ]
[ True  True  True  True  True  True  True  True  True  True False  True
 False  True  True  True  True  True  True  True]
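
To illustrate how the decision threshold trades sensitivity against specificity, here is a small sketch using NumPy only. The probabilities and labels are hypothetical values made up for this example (the toy data in this notebook has no ground-truth labels), and `sensitivity_specificity` is a helper name introduced here for illustration.

```python
import numpy as np


def sensitivity_specificity(probabilities, labels, threshold):
    """Compute (sensitivity, specificity) for predictions at a given threshold."""
    probabilities = np.asarray(probabilities, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    predictions = probabilities >= threshold

    true_pos = np.sum(predictions & labels)
    false_neg = np.sum(~predictions & labels)
    true_neg = np.sum(~predictions & ~labels)
    false_pos = np.sum(predictions & ~labels)

    # Guard against division by zero when a class is absent.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else float("nan")
    return float(sensitivity), float(specificity)


# Hypothetical probabilities and ground-truth labels (1 = fibrosis, 0 = no fibrosis).
example_probabilities = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.35, 0.5, 0.65):
    sens, spec = sensitivity_specificity(example_probabilities, example_labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this made-up data, the lowest threshold catches every positive case but produces a false positive, while the highest threshold removes the false positive and misses one positive case.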


## Loading and inference using ONNX saved model

### Loading in the ONNX saved liver fibrosis detection model

In [13]:
with open("persisted_models/liverfibrosisdetectionmodel.onnx", "rb") as f:
    onnx = f.read()
sess = InferenceSession(onnx, providers=["CPUExecutionProvider"])

### Helper function used to convert the input dataframes into valid inputs for the ONNX model

The hyphens in the feature names are converted to underscores because the ONNX conversion appears to perform that renaming automatically.

In [14]:
def convert_onnx_inputs(input_df):
    return {c.replace("-", "_"): input_df[c].values.reshape((-1, 1)) for c in input_df.columns}
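
Because the renaming is inferred rather than documented, it can be worth comparing the keys of the converted inputs with the names the model actually expects. The sketch below is a hypothetical helper (`diff_input_names` is not part of the repository) that reports mismatches in both directions. With a loaded session, the expected names can be read with `[i.name for i in sess.get_inputs()]`. The names in the example are assumptions used only to show the behavior.

```python
def diff_input_names(feed, model_input_names):
    """Return (missing, unexpected) input names for a feed dictionary.

    missing:    names the model expects but the feed does not provide.
    unexpected: names in the feed that the model does not know.
    """
    feed_names = set(feed)
    model_names = set(model_input_names)
    missing = sorted(model_names - feed_names)
    unexpected = sorted(feed_names - model_names)
    return missing, unexpected


# Assumed example: two feature columns and the input names a converted model might expose.
feature_columns = ["original_firstorder_Energy", "lbp-3D-k_firstorder_Maximum"]
assumed_model_inputs = ["original_firstorder_Energy", "lbp_3D_k_firstorder_Maximum"]

raw_feed = {name: None for name in feature_columns}
renamed_feed = {name.replace("-", "_"): None for name in feature_columns}

print(diff_input_names(raw_feed, assumed_model_inputs))
print(diff_input_names(renamed_feed, assumed_model_inputs))
```

If both returned lists are empty, the feed dictionary matches the model inputs by name.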

### Generating predictions using the ONNX saved liver fibrosis detection model
If a decision threshold of 0.5 is used, the predictions can be read directly from `outputs[0]`, which contains 0 for F0 and 1 for F1-F4.

Note that the probabilities differ from those of the skops.io saved model because of the conversion to ONNX format. For most inputs the predictions agree, but the difference can change the prediction for some patients.

In [15]:
# The ONNX conversion changes the probabilities slightly. At decision thresholds close to 0.5 the
# predictions are not affected, but differences appear at thresholds far from 0.5 (for example, 0.9).
outputs = sess.run(None, convert_onnx_inputs(toy_data))
probabilities = outputs[1][:, 1]
print(probabilities)

# predictions = outputs[0] == 1
decision_threshold = 0.5 # Setting decision threshold to 0.5 for this example
predictions = probabilities >= decision_threshold
print(predictions)

[0.85515195 0.7004497  0.8481285  0.5168511  0.8917022  0.6639492
 0.7595607  0.68594635 0.76656127 0.9654036  0.33062413 0.7350545
 0.29111573 0.71025866 0.9373647  0.67501915 0.95533097 0.84242755
 0.60357803 0.6553689 ]
[ True  True  True  True  True  True  True  True  True  True False  True
 False  True  True  True  True  True  True  True]
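
The effect of the conversion can be quantified by comparing the two sets of probabilities directly. The sketch below uses the first five probabilities printed by the skops.io and ONNX cells in this notebook. The helper name `compare_probabilities` and the threshold of 0.8 are illustrative choices, not part of the original workflow.

```python
import numpy as np


def compare_probabilities(reference, converted, threshold):
    """Return the largest absolute difference and the indices whose prediction flips."""
    reference = np.asarray(reference, dtype=float)
    converted = np.asarray(converted, dtype=float)
    max_abs_diff = float(np.max(np.abs(reference - converted)))
    # A prediction "flips" when the two models fall on opposite sides of the threshold.
    flipped = np.flatnonzero((reference >= threshold) != (converted >= threshold))
    return max_abs_diff, flipped.tolist()


# First five probabilities printed above for the skops.io and ONNX models.
skops_probabilities = [0.70843535, 0.60461219, 0.70266035, 0.50842795, 0.74156593]
onnx_probabilities = [0.85515195, 0.7004497, 0.8481285, 0.5168511, 0.8917022]

for threshold in (0.5, 0.8):
    max_diff, flipped = compare_probabilities(skops_probabilities, onnx_probabilities, threshold)
    print(f"threshold={threshold}: max |diff|={max_diff:.4f}, flipped indices={flipped}")
```

For these five samples the two models agree at a threshold of 0.5 but disagree on several samples at 0.8, which is consistent with the note that thresholds far from 0.5 are more sensitive to the conversion.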
