Bring metrics in PAM50 anatomical dimensions in `sct_process_segmentation` (#3977)

* add script

* add linear interpolation

* integrate normalization in process seg

* Remove unused imports

* Clarify comments

* Clarify iterations over dicts

* Clarify names of nb_slices variables

* Set default value for `-vert` flag.

* Get abs path for arguments.vertfile

* Rename fname_vert_level to fname_vert_levels

* Update authors

* Clarify variable names

* Add sct_progress_bar to monitor metrics aggregation

* Add TODO if len(metrics_inter) == len(slices_PAM50)

* Remove empty lines at the file end

* remove default value for -vert

* change fname levels for level

* remove condition if len ==

* fix lint

* change suffix for PAM50 in csv

* put NaN if key is length

* Create PAM50.csv filename in more robust way (it was failing for paths including `.` like /home/GRAMES.POLYMTL.CA)

* change image orientation to RPI

* remove levels 49 and higher

* fix parser for normalize

* add testing of normalize PAM50

* add -normalize PAM50 in batch processing

* change row for PAM50 csv file

* add cached file

* remove -vert argument

* remove -vert for sct_process_segmentation with -normalize PAM50

* separate -normalize PAM50 into different arg

* remove PAM50 description from arg -normalize

* remove trailing whitespace

* remove PAM50 option in SepreateNormArgs

* fetch arguments.normalize_PAM50 argument and change in condition

* if -normalize-pam50 run using pam50 metrics, output one csv file only

* remove unnecessary raise

* remove whitespace

* remove extra \n

* remove indent block with PAM50 condition

* continuation underline fix

* remove whitespace lines

* change -normalize PAM50 argument for -normalize-PAM50 1

* remove filename_pam50 since only one .csv file is output now

* `test_cli_sct_process_segmentation.py`: Fixup test

This commit fixes 2 issues:
  - The index # didn't match the IS slice #.
  - The hardcoded value was too precise.
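A minimal sketch of the "too precise" problem (values illustrative, not the actual cached results from this PR): comparing floats with a tolerance keeps the test robust to small numerical changes such as interpolation tweaks.

```python
import math

# Hypothetical measured CSA value; the real test compares against cached results.
measured_csa = 77.8357

# Comparing with a relative tolerance instead of exact equality avoids
# failures when the pipeline's numerics shift in the nth decimal place.
assert math.isclose(measured_csa, 77.84, rel_tol=1e-3)
```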

* `metrics_to_PAM50.py`: Remove `levels_2_skip` variable

The purpose of this commit is to reduce the number of
lines of code, which saves readers some time/effort.

Rather than storing levels[0] and levels[-1] in a
variable, we can just access those values directly
from `levels` itself.

We also sort `levels`, that way we don't need to use
`min` or `max` -- we can just use [0] and [-1].
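The sorting idea above in two lines (toy values, not real vertebral levels): once `levels` is sorted, indexing replaces `min`/`max` calls.

```python
# e.g. vertebral levels found in a labeled image, deduplicated and sorted
levels = sorted({4, 2, 7, 3})

# After sorting, the first/last elements ARE the min/max
assert levels[0] == min(levels) == 2
assert levels[-1] == max(levels) == 7
```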

* `metrics_to_PAM50.py`: Iterate through slices, not levels

The purpose of this commit is to pre-compute the slice
variables, which saves some lines of code inside the loop.

Also, later on, this will help us compute `scales` outside
the loop, which is important for letting us combine the
two loops.

(NB: The index `[::len(levels)-1]` looks complicated, but
don't worry, I'm going to remove that indexing later on
when I combine the two loops.)

* `metrics_to_PAM50.py`: Replace `nb_slices_<>` with `len()` calls

This change is a bit nitpicky and probably unnecessary, but I think
that `len()` is pretty self-explanatory -- I don't know that we need
an entire variable to represent it.

This saves a few lines of code.

* `metrics_to_PAM50.py`: Compute `scale_mean` outside the loop

This change lets us combine the two loops.

* `metrics_to_PAM50.py`: Use `metrics_inter` for both loops

This change also lets us combine the two loops.

* `metrics_to_PAM50.py`: Merge two loops into one

This change saves us from having to duplicate
the same instructions across two loops. It also
allows us to more easily compare the different
cases (levels[0], levels[1:-1], levels[-1]),
since the variable assignments are grouped
together.

* `metrics_to_PAM50.py`: Replace `np.empty`+`np.fill` with `np.full`

This saves us a line of code! Source:
https://stackoverflow.com/a/26289777
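The equivalence in miniature -- both arrays end up identical, but `np.full` does it in one step:

```python
import numpy as np

n = 5

# Before: allocate, then fill
arr_a = np.empty(n)
arr_a.fill(np.nan)

# After: one call
arr_b = np.full(n, np.nan)

assert np.isnan(arr_a).all() and np.isnan(arr_b).all()
```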

* `metrics_to_PAM50.py`: Use dictionary comprehensions

This saves us quite a bit of space, too!
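A sketch of the pattern (metric names here are illustrative): one comprehension replaces a loop that builds the dict entry by entry.

```python
import numpy as np

metric_names = ["area", "diameter_AP", "diameter_RL"]  # illustrative keys
nb_slices = 4

# One line initializes every metric to a NaN-filled array
metrics_dict = {k: np.full(nb_slices, np.nan) for k in metric_names}

assert set(metrics_dict) == set(metric_names)
assert all(np.isnan(v).all() for v in metrics_dict.values())
```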

* `metrics_to_PAM50.py`: Combine `levels` into single step

This is another space-saving commit.

* `metrics_to_PAM50.py`: Move dict init closer to loop

This helps to group related code:
  - The dictionary initialization
  - The loop that fills up the dictionary

* `metrics_to_PAM50.py`: Simplify interpolation by extracting out `diff`

This helps us to avoid using a bunch of `len()` calls, and makes the
conditionals a lot easier to read.
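A minimal sketch of the trim logic for the first level (toy lists, names mirror the real variables but the values are illustrative): `diff` is the length mismatch between the interpolated values and the target PAM50 slices.

```python
metrics_inter = [1.0, 2.0, 3.0, 4.0, 5.0]
slices_PAM50 = [10, 11, 12]

diff = len(metrics_inter) - len(slices_PAM50)  # here: 2
if diff > 0:
    # Too many interpolated values: drop the extras from the end
    metrics_inter = metrics_inter[:-diff]
elif diff < 0:
    # Too few: drop the unmatched target slices instead
    slices_PAM50 = slices_PAM50[:-abs(diff)]

assert len(metrics_inter) == len(slices_PAM50) == 3
```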

* `download.py`: Use `sct_testing_data-PR3977.zip`

This lets us temporarily test the new release, prior to
making any actual changes to the `sct_testing_data` repo

* `batch_processing.sh`: Use new option (`-normalize-PAM50`)

* Change `csa_pam50_PAM50.csv` -> `csa_pam50.csv`

* add missing `-` for `-normalize-PAM50`

* remove -vertfile error

* add error if normalize-PAM50 and vertfile doesn't exist

* move initialization of normalize_pam50 arg for vertfile check

* add function get all available vertebral levels

* force levels of PAM50 or native space if -normalize-PAM50 1

* check if levels is empty before setting the value

* use image.getNonZeroValue instead of get_all_vertebral_level

* add getNonZeroValues

* remove function get_all_vertebral_levels

* remove import of get_all_vertebral_level

* add updated cached results for csa_pam50.csv

* update row for csa_pam50.csv with updated cached results

* change row of csv file to check

* change error to FileNotFoundError

* Restore old permissions (100644 -> 100755)

* `download.py`: Replace dummy link with `r20230207` release

---------

Co-authored-by: valosekj <jan.valosek@upol.cz>
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
3 people committed Feb 8, 2023
1 parent e64f7a2 commit 2f592a9
Showing 8 changed files with 566 additions and 6 deletions.
2 changes: 2 additions & 0 deletions batch_processing.sh
@@ -93,6 +93,8 @@ sct_process_segmentation -i t2_seg.nii.gz -vert 2:3 -o csa_c2c3.csv
sct_detect_pmj -i t2.nii.gz -c t2 -qc "$SCT_BP_QC_FOLDER"
# Compute cross-section area at 60 mm from PMJ averaged on a 30 mm extent
sct_process_segmentation -i t2_seg.nii.gz -pmj t2_pmj.nii.gz -pmj-distance 60 -pmj-extent 30 -qc "$SCT_BP_QC_FOLDER" -qc-image t2.nii.gz -o csa_pmj.csv
# Compute morphometrics in PAM50 anatomical dimensions
sct_process_segmentation -i t2_seg.nii.gz -vertfile t2_seg_labeled.nii.gz -perslice 1 -normalize-PAM50 1 -o csa_pam50.csv
# Go back to root folder
cd ..

4 changes: 2 additions & 2 deletions spinalcordtoolbox/download.py
@@ -32,8 +32,8 @@
},
"sct_testing_data": {
"mirrors": [
"https://github.com/spinalcordtoolbox/sct_testing_data/releases/download/r20210330230310/sct_testing_data-r20210330230310.zip",
"https://osf.io/download/60629509229503022e6f107d/",
"https://github.com/spinalcordtoolbox/sct_testing_data/releases/download/r20230207/sct_testing_data-r20230207.zip",
"https://osf.io/5twvs/?action=download",
],
"default_location": os.path.join(__sct_dir__, "data", "sct_testing_data"),
},
10 changes: 10 additions & 0 deletions spinalcordtoolbox/image.py
@@ -599,6 +599,16 @@ def getNonZeroCoordinates(self, sorting=None, reverse_coord=False, coordValue=Fa

return list_coordinates

def getNonZeroValues(self, sorting=True):
"""
This function returns all the unique non-zero values that the image contains.
If sorting is set to True, the list will be sorted.
"""
list_values = list(np.unique(self.data[self.data > 0]))
if sorting:
list_values.sort()
return list_values

def getCoordinatesAveragedByValue(self):
"""
This function computes the mean coordinate of group of labels in the image. This is especially useful for label's images.
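Standalone sketch of what `getNonZeroValues` computes, using a toy label array in place of an `Image`'s `.data` attribute (the real method lives on the `Image` class above):

```python
import numpy as np

# Toy labeled volume: 0 = background, other integers = vertebral level labels
data = np.array([[0, 3, 2],
                 [2, 0, 5]])

# Same core expression as getNonZeroValues: unique non-zero labels
list_values = list(np.unique(data[data > 0]))

assert list_values == [2, 3, 5]  # np.unique already returns sorted values
```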
73 changes: 73 additions & 0 deletions spinalcordtoolbox/metrics_to_PAM50.py
@@ -0,0 +1,73 @@
# Functions to interpolate metrics (from sct_process_segmentation) into the PAM50 anatomical dimensions
# Author: Sandrine Bédard & Jan Valosek

import numpy as np
from spinalcordtoolbox.image import Image
from spinalcordtoolbox.aggregate_slicewise import Metric
from spinalcordtoolbox.template import get_slices_from_vertebral_levels


def interpolate_metrics(metrics, fname_vert_levels_PAM50, fname_vert_levels):
"""
Interpolates metrics perlevel into the PAM50 anatomical dimensions.
:param metrics: Dict of class Metric(). Output of spinalcordtoolbox.process_seg.compute_shape.
:param fname_vert_levels_PAM50: Path to the PAM50_levels.nii.gz (PAM50 labeled segmentation).
:param fname_vert_levels: Path to subject's vertebral labeling file.
:return metrics_PAM50_space: Dict of class Metric() in PAM50 anatomical dimensions.
"""
# Load PAM50 labeled segmentation
im_seg_labeled_PAM50 = Image(fname_vert_levels_PAM50)
im_seg_labeled_PAM50.change_orientation('RPI')
# Load subject's labeled segmentation
im_seg_labeled = Image(fname_vert_levels)
im_seg_labeled.change_orientation('RPI')

# Get unique integer vertebral levels (but exclude 0, 49, and 50, as these aren't vertebral levels)
levels = sorted(int(level) for level in np.unique(im_seg_labeled.data) if 0 < int(level) < 49)

# Get slices corresponding to each level
level_slices_PAM50 = [get_slices_from_vertebral_levels(im_seg_labeled_PAM50, level) for level in levels]
level_slices_im = [get_slices_from_vertebral_levels(im_seg_labeled, level) for level in levels]

# Find the mean scaling between the image and PAM50 (excluding first and last levels)
scales = [len(slices_PAM50)/len(slices_im) for slices_PAM50, slices_im
in zip(level_slices_PAM50[1:-1], level_slices_im[1:-1])]
scale_mean = np.mean(scales)

# Initialize a metrics dict filled by NaN with number of rows equal to number of slices in PAM50 template
z = im_seg_labeled_PAM50.dim[2] # z == number of slices
metrics_PAM50_space_dict = {k: np.full([z], np.nan) for k in metrics.keys()}
# Loop through each level's slices, populating the metrics dict
for level, slices_PAM50, slices_im in zip(levels, level_slices_PAM50, level_slices_im):
# Prepare vectors for the interpolation
if level in [levels[0], levels[-1]]:
# Note: since the first/last levels can be incomplete, we use the mean scaling factor from all other levels
x_PAM50 = np.linspace(0, scale_mean * len(slices_im), int(scale_mean * len(slices_im)))
x = np.linspace(0, scale_mean * len(slices_im), len(slices_im))
else:
x_PAM50 = np.arange(0, len(slices_PAM50), 1)
x = np.linspace(0, len(slices_PAM50) - 1, len(slices_im))
# Loop through metrics
for key, value in metrics.items():
if key != 'length':
metric_values_level = value.data[slices_im]
# Interpolate in the same number of slices
metrics_inter = np.interp(x_PAM50, x, metric_values_level)
# Scale interpolation of first and last levels (to account for incomplete levels)
diff = len(metrics_inter) - len(slices_PAM50)
if level == levels[0]:
# If the first level, scale from level below
if diff > 0:
metrics_inter = metrics_inter[:-diff]
elif diff < 0:
slices_PAM50 = slices_PAM50[:-abs(diff)]
elif level == levels[-1]:
# If the last level, scale from level above
if diff > 0:
metrics_inter = metrics_inter[diff:]
elif diff < 0:
slices_PAM50 = slices_PAM50[abs(diff):]
metrics_PAM50_space_dict[key][slices_PAM50] = metrics_inter

# Convert dict of ndarrays to dict of Metric() objects
return {k: Metric(data=np.array(v), label=k) for k, v in metrics_PAM50_space_dict.items()}
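The core resampling step above, isolated as a sketch (values are illustrative, not real CSA measurements): a per-slice metric measured over 4 subject slices is interpolated onto 6 PAM50 slices for one mid-cord level, exactly as in the `else` branch.

```python
import numpy as np

metric_values_level = np.array([70.0, 72.0, 74.0, 76.0])  # subject-space values
n_pam50 = 6  # number of PAM50 slices for this level

# Target grid: one point per PAM50 slice; source grid: the subject's slices
# stretched over the same interval
x_PAM50 = np.arange(0, n_pam50, 1)
x = np.linspace(0, n_pam50 - 1, len(metric_values_level))
metrics_inter = np.interp(x_PAM50, x, metric_values_level)

assert len(metrics_inter) == n_pam50
# np.interp preserves the endpoints of the source data
assert metrics_inter[0] == 70.0 and metrics_inter[-1] == 76.0
```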
36 changes: 32 additions & 4 deletions spinalcordtoolbox/scripts/sct_process_segmentation.py
@@ -8,8 +8,7 @@
#
# ---------------------------------------------------------------------------------------
# Copyright (c) 2014 Polytechnique Montreal <www.neuro.polymtl.ca>
# Author: Benjamin De Leener, Julien Touati, Gabriel Mangeat
# Modified: 2014-07-20 by jcohenadad
# Author: Benjamin De Leener, Julien Touati, Gabriel Mangeat, Sandrine Bédard, Jan Valosek, Julien Cohen-Adad
#
# About the license: see the file LICENSE.TXT
#########################################################################################
@@ -31,12 +30,15 @@
from spinalcordtoolbox.process_seg import compute_shape
from spinalcordtoolbox.scripts import sct_maths
from spinalcordtoolbox.csa_pmj import get_slices_for_pmj_distance
from spinalcordtoolbox.metrics_to_PAM50 import interpolate_metrics
from spinalcordtoolbox.centerline.core import ParamCenterline
from spinalcordtoolbox.image import add_suffix, splitext
from spinalcordtoolbox.image import add_suffix, splitext, Image
from spinalcordtoolbox.reports.qc import generate_qc
from spinalcordtoolbox.utils.shell import SCTArgumentParser, Metavar, ActionCreateFolder, parse_num_list, display_open
from spinalcordtoolbox.utils.sys import init_sct, set_loglevel, __sct_dir__
from spinalcordtoolbox.utils.fs import get_absolute_path
from spinalcordtoolbox import __data_dir__
from spinalcordtoolbox.utils import sct_progress_bar

logger = logging.getLogger(__name__)

@@ -234,6 +236,14 @@ def get_parser():
"Given the risks and lack of consensus surrounding CSA normalization, we recommend thoroughly reviewing "
"the literature on this topic before applying this feature to your data.\n"
)
optional.add_argument(
'-normalize-PAM50',
metavar=Metavar.int,
type=int,
choices=[0, 1],
default=0,
help="Set to 1 to bring the metrics into the PAM50 anatomical dimensions per slice. The options -vertfile and -perslice 1 must also be specified."
)
optional.add_argument(
'-qc',
metavar=Metavar.folder,
@@ -375,10 +385,13 @@ def main(argv: Sequence[str]):
append = bool(arguments.append)
levels = arguments.vert
fname_vert_level = arguments.vertfile
normalize_pam50 = arguments.normalize_PAM50
if not os.path.isfile(fname_vert_level):
logger.warning(f"Vertebral level file {fname_vert_level} does not exist. Vert level information will "
f"not be displayed. To use vertebral level information, you may need to run "
f"`sct_warp_template` to generate the appropriate level file in your working directory.")
if normalize_pam50:
raise FileNotFoundError("The vertebral level file must exist to use -normalize-PAM50.")
fname_vert_level = None # Discard the default '-vertfile', so that we don't attempt to find vertebral levels
if levels:
raise FileNotFoundError("The vertebral level file must exist to use `-vert` to group by vertebral level.")
Expand All @@ -397,6 +410,8 @@ def main(argv: Sequence[str]):
qc_dataset = arguments.qc_dataset
qc_subject = arguments.qc_subject

if normalize_pam50 and not perslice:
parser.error("Option '-normalize-PAM50' requires option '-perslice 1'.")
if distance_pmj is not None and fname_pmj is None:
parser.error("Option '-pmj-distance' requires option '-pmj'.")
if fname_pmj is not None and distance_pmj is None and not perslice:
@@ -410,6 +425,18 @@
param_centerline=param_centerline,
verbose=verbose,
remove_temp_files=arguments.r)
if normalize_pam50:
fname_vert_level_PAM50 = os.path.join(__data_dir__, 'PAM50', 'template', 'PAM50_levels.nii.gz')
metrics_PAM50_space = interpolate_metrics(metrics, fname_vert_level_PAM50, fname_vert_level)
if not levels: # If no levels -vert were specified by user
if verbose == 2:
# Get all available vertebral levels from PAM50 template to only include slices from available levels in .csv file
levels = Image(fname_vert_level_PAM50).getNonZeroValues()
else:
# Get all available vertebral levels to only include slices from available levels in .csv file
levels = Image(fname_vert_level).getNonZeroValues()
metrics = metrics_PAM50_space # Set metrics to the metrics in PAM50 space to use instead
fname_vert_level = fname_vert_level_PAM50 # Set vertebral levels to PAM50
if fname_pmj is not None:
im_ctl, mask, slices, centerline, length_from_pmj = get_slices_for_pmj_distance(fname_segmentation, fname_pmj,
distance_pmj, extent_pmj,
@@ -422,7 +449,8 @@
np.savetxt(fname_ctl_csv + '.csv', centerline, delimiter=",")
else:
length_from_pmj = None
for key in metrics:
# Aggregate metrics
for key in sct_progress_bar(metrics, unit='iter', unit_scale=False, desc="Aggregating metrics", ascii=True, ncols=80):
if key == 'length':
# For computing cord length, slice-wise length needs to be summed across slices
metrics_agg[key] = aggregate_per_slice_or_level(metrics[key], slices=slices,
