# Rule-based scores for unusual WMH distribution frequencies

This notebook implements rule-based scores that aim to identify participants with unusual white matter hyperintensity (WMH) distribution frequencies, based on an N=3525 multicentre memory clinic cohort.

In [None]:
import numpy as np
import scipy.ndimage  # import the submodule explicitly so scipy.ndimage.label is available
import SimpleITK as sitk

## Parameters

All relevant parameters for this notebook can be set in the cells below; only an input file is required.

### Input 

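- **lesion_prevalence_filename:** file containing the lesion prevalence map for this dataset. The code below divides its voxel values by N, so it is read as per-voxel counts of participants with a lesion. It can be downloaded from: https://doi.org/10.34894/FYL9ID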

In [None]:
lesion_prevalence_filename = r"metavcimap_memory_clinic_n3525_mni_space.nii.gz"

In [None]:
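N = 3525  # number of participants in the cohort behind the prevalence map
# Flattened copy of the prevalence map; the score functions below do not use it
lesion_prevalence = sitk.GetArrayFromImage(sitk.ReadImage(lesion_prevalence_filename)).ravel()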

In [None]:
# Load the prevalence map and convert it to a NumPy array
lesion_prevalence_image = sitk.ReadImage(lesion_prevalence_filename)
lesion_prevalence_array = sitk.GetArrayFromImage(lesion_prevalence_image)

# Per-voxel lesion probability (count / N) and its complement (1 - probability)
lesion_prevalence_probabilities = lesion_prevalence_array / N
lesion_prevalence_probabilities_inverted = 1 - lesion_prevalence_probabilities
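
To make the conversion above concrete, here is a minimal sketch on a toy array. The cohort size of 10 and the counts are hypothetical values chosen for illustration, not data from the prevalence map.

```python
import numpy as np

# Hypothetical toy cohort size and per-voxel lesion counts (not real data)
n_participants = 10
toy_counts = np.array([[0, 1], [5, 10]], dtype=np.float64)

# Fraction of participants with a lesion in each voxel
toy_probabilities = toy_counts / n_participants
# Weight used by the scores: close to 1 where lesions are rare, 0 where every participant has one
toy_inverted = 1 - toy_probabilities

print(toy_probabilities)
print(toy_inverted)
```

Voxels that are rarely affected in the cohort therefore carry the largest weight in both scores.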

## Score 1
The first rule-based score (RB Score 1) was derived from WMH distribution frequencies and assigns a high score to participants with WMH voxels in regions where lesions are rare in the whole cohort. For each WMH voxel, 1 minus the cohort probability of a lesion in that voxel was computed, and these values were summed over all WMH voxels. The score was adjusted per participant for total normalized WMH volume by dividing it by the square root of that volume.
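
Written out as a formula, this description reads as follows. The notation is introduced here for illustration and is not part of the original description: $x_v \in \{0, 1\}$ is the participant's WMH mask at voxel $v$, $c_v$ is the number of cohort participants with a lesion at $v$, and $p_v = c_v / N$.

$$
S_1 = \frac{\sum_v (1 - p_v)\, x_v}{\sqrt{\sum_v x_v}}
$$

The denominator is the square root of the participant's WMH volume, counted in voxels.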

In [None]:
def score1(patientXArray):
    score = np.sum(lesion_prevalence_probabilities_inverted * patientXArray)

    #volumeCorrection = 1                             # = no volume correction
    volumeCorrection = np.sqrt(np.sum(patientXArray)) # = divide by sqrt lesion volume

    return score / volumeCorrection
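
As a sanity check, the sketch below applies the same calculation to four toy voxels. The function `score1_sketch`, its arguments and the numbers are hypothetical. Unlike `score1` above, it takes the weights as an argument and returns 0 for an empty mask, which is an assumption made here to avoid a division by zero.

```python
import numpy as np

def score1_sketch(mask, inverted_probabilities):
    """Sum of (1 - p) over lesion voxels, divided by sqrt(lesion voxel count)."""
    mask = np.asarray(mask, dtype=np.float64)
    n_lesion_voxels = mask.sum()
    if n_lesion_voxels == 0:
        return 0.0  # assumption: an empty mask scores 0 rather than NaN
    return float(np.sum(inverted_probabilities * mask) / np.sqrt(n_lesion_voxels))

# Hypothetical 1 - p values for four voxels (not real data)
toy_inverted = np.array([0.9, 0.5, 0.99, 0.2])

common = np.array([0, 1, 0, 1])   # lesions where WMH are frequent in the cohort
unusual = np.array([1, 0, 1, 0])  # lesions where WMH are rare in the cohort

print(score1_sketch(common, toy_inverted))   # (0.5 + 0.2) / sqrt(2)
print(score1_sketch(unusual, toy_inverted))  # (0.9 + 0.99) / sqrt(2)
```

Both masks have the same volume, so the difference in score comes entirely from where the lesions are located.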

## Score 2
The second rule-based score (RB Score 2) assigns a high score to lesions (of at least ten voxels in size) at locations where fewer than five participants had a lesion. It was implemented by masking out all locations where five or more participants in the dataset had a lesion, and summing 1 minus the probability of a lesion over the remaining voxels of each sufficiently large lesion. The score was adjusted per participant for total normalized WMH volume by dividing it by the square root of that volume.
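
One way to write this rule as a formula is given below. The notation is introduced here for illustration and is not part of the original description: $x_v \in \{0, 1\}$ is the participant's WMH mask at voxel $v$, $c_v$ is the number of cohort participants with a lesion at $v$, $p_v = c_v / N$, $\ell$ ranges over the participant's connected lesions, and $|\ell|$ is the lesion size in voxels.

$$
S_2 = \frac{\sum_{\ell \,:\, |\ell| \ge 10} \; \sum_{v \in \ell} (1 - p_v)\, [c_v < 5]}{\sqrt{\sum_v x_v}}
$$

Here $[c_v < 5]$ equals 1 when fewer than five participants had a lesion at $v$ and 0 otherwise.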

In [None]:
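def score2(patientXArray):
    # Label connected lesions using a full 3x3x3 structuring element (26-connectivity)
    label, num_features = scipy.ndimage.label(patientXArray, np.ones((3,3,3)))
    try:
        # Size of each lesion, counted over voxels with a non-zero inverted probability
        perLesionSize = scipy.ndimage.labeled_comprehension(lesion_prevalence_probabilities_inverted, label, range(1, num_features+1), np.count_nonzero, np.float64, 0)
        # Per-lesion sum of (1 - probability), restricted to voxels where fewer than five participants had a lesion
        perLesionScore = scipy.ndimage.labeled_comprehension(lesion_prevalence_probabilities_inverted * (lesion_prevalence_probabilities_inverted > (1 - 5/N)), label, range(1, num_features+1), np.sum, np.float64, 0)

        #volumeCorrection = 1                             # = no volume correction
        volumeCorrection = np.sqrt(np.sum(patientXArray)) # = divide by sqrt lesion volume

        # Only lesions of at least ten voxels contribute to the score
        return np.sum(perLesionScore[perLesionSize >= 10]) / volumeCorrection
    except Exception:
        # Sentinel value returned when the score cannot be computed
        return -1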
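
The sketch below walks through the Score 2 steps on a small synthetic volume. It is a simplified illustration, not a replacement for `score2`: the function `score2_sketch`, its parameters and the toy data are hypothetical, per-lesion sums are computed with `np.bincount` instead of `labeled_comprehension`, and an empty mask is assumed to score 0.

```python
import numpy as np
from scipy import ndimage

def score2_sketch(mask, counts, n_participants, max_count=5, min_size=10):
    """Toy version of the RB Score 2 steps; parameter names are hypothetical."""
    mask = np.asarray(mask) > 0
    if not mask.any():
        return 0.0  # assumption: an empty mask scores 0
    inverted = 1 - counts / n_participants
    rare = counts < max_count  # fewer than max_count participants had a lesion here
    # 26-connectivity, matching the np.ones((3,3,3)) structuring element
    labels, n_lesions = ndimage.label(mask, structure=np.ones((3, 3, 3)))
    flat_labels = labels.ravel()
    # Index 0 is background, so drop it to get one entry per lesion
    sizes = np.bincount(flat_labels, minlength=n_lesions + 1)[1:]
    rare_scores = np.bincount(flat_labels, weights=(inverted * rare).ravel(),
                              minlength=n_lesions + 1)[1:]
    return float(rare_scores[sizes >= min_size].sum() / np.sqrt(mask.sum()))

# Hypothetical cohort of 100: the first 20 voxels are rarely affected (2 participants),
# the last 20 are commonly affected (50 participants)
n_participants = 100
toy_counts = np.full((1, 1, 40), 50.0)
toy_counts[..., :20] = 2.0

toy_mask = np.zeros((1, 1, 40), dtype=np.uint8)
toy_mask[..., 0:12] = 1    # lesion A: 12 voxels in the rare region, contributes
toy_mask[..., 14:18] = 1   # lesion B: 4 voxels in the rare region, too small
toy_mask[..., 25:37] = 1   # lesion C: 12 voxels in the common region, masked out

print(score2_sketch(toy_mask, toy_counts, n_participants))  # 12 * 0.98 / sqrt(28)
```

Only lesion A contributes to the numerator, while all 28 lesion voxels enter the volume correction.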

In [None]:
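# "filename.nii" is a placeholder for a participant's WMH segmentation. The scores multiply it
# voxel-wise with the prevalence map, so it must have the same dimensions as that map.
unseen_nii = sitk.ReadImage("filename.nii")
unseen_matrix = sitk.GetArrayViewFromImage(unseen_nii)

print("Score 1", score1(unseen_matrix))
print("Score 2", score2(unseen_matrix))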