Bring metrics in PAM50 anatomical dimensions in `sct_process_segmentation` (#3977)

* add script

* add linear interpolation

* integrate normalization in process seg

* Remove unused imports

* Clarify comments

* Clarify iterations over dicts

* Clarify names of nb_slices variables

* Set default value for `-vert` flag.

* Get abs path for arguments.vertfile

* Rename fname_vert_level to fname_vert_levels

* Update authors

* Clarify variable names

* Add sct_progress_bar to monitor metrics aggregation

* Add TODO if len(metrics_inter) == len(slices_PAM50)

* Remove empty lines at the file end

* remove default value for -vert

* change fname levels for level

* remove condition if len ==

* fix lint

* change suffix for PAM50 in csv

* put NaN if key is length

* Create PAM50.csv filename in more robust way (it was failing for paths including `.` like /home/GRAMES.POLYMTL.CA)

* change image orientation to RPI

* remove levels 49 and higher

* fix parser for normalize

* add testing of normalize PAM50

* add -normalize PAM50 in batch processing

* change row for PAM50 csv file

* add cached file

* remove -vert argument

* remove -vert for sct_process_segmentation with -normalize PAM50

* separate -normalize PAM50 into different arg

* remove PAM50 description from arg -normalize

* remove trailing whitespace

* remove PAM50 option in SepreateNormArgs

* fetch arguments.normalize_PAM50 argument and change in condition

* if -normalize-pam50 run using pam50 metrics, output one csv file only

* remove unnecessary raise

* remove whitespace

* remove extra \n

* remove indent block with PAM50 condition

* continuation underline fix

* remove whitespace lines

* change -normalize PAM50 argument for -normalize-PAM50 1

* remove filename_pam50 since only one .csv file is output now

* `test_cli_sct_process_segmentation.py`: Fixup test

This commit fixes 2 issues:
  - The index # didn't match the IS slice #.
  - The hardcoded value was too precise.
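A minimal sketch of the "too precise" problem (values illustrative, not the actual cached results from this PR): comparing floats with a tolerance keeps the test robust to small numerical changes such as interpolation tweaks.

```python
import math

# Hypothetical measured CSA value; the real test compares against cached results.
measured_csa = 77.8357

# Comparing with a relative tolerance instead of exact equality avoids
# failures when the pipeline's numerics shift in the nth decimal place.
assert math.isclose(measured_csa, 77.84, rel_tol=1e-3)
```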

* `metrics_to_PAM50.py`: Remove `levels_2_skip` variable

The purpose of this commit is to reduce the number of
lines of code, which saves readers some time/effort.

Rather than storing levels[0] and levels[-1] in a
variable, we can just access those values directly
from `levels` itself.

We also sort `levels`, that way we don't need to use
`min` or `max` -- we can just use [0] and [-1].
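The sorting idea above in two lines (toy values, not real vertebral levels): once `levels` is sorted, indexing replaces `min`/`max` calls.

```python
# e.g. vertebral levels found in a labeled image, deduplicated and sorted
levels = sorted({4, 2, 7, 3})

# After sorting, the first/last elements ARE the min/max
assert levels[0] == min(levels) == 2
assert levels[-1] == max(levels) == 7
```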

* `metrics_to_PAM50.py`: Iterate through slices, not levels

The purpose of this commit is to pre-compute the slice
variables, which saves some lines of code inside the loop.

Also, later on, this will help us compute `scales` outside
the loop, which is important for letting us combine the
two loops.

(NB: The index `[::len(levels)-1]` looks complicated, but
don't worry, I'm going to remove that indexing later on
when I combine the two loops.)

* `metrics_to_PAM50.py`: Replace `nb_slices_<>` with `len()` calls

This change is a bit nitpicky and probably unnecessary, but I think
that `len()` is pretty self-explanatory -- I don't know that we need
an entire variable to represent it.

This saves a few lines of code.

* `metrics_to_PAM50.py`: Compute `scale_mean` outside the loop

This change lets us combine the two loops.

* `metrics_to_PAM50.py`: Use `metrics_inter` for both loops

This change also lets us combine the two loops.

* `metrics_to_PAM50.py`: Merge two loops into one

This change saves us from having to duplicate
the same instructions across two loops. It also
allows us to more easily compare the different
cases (levels[0], levels[1:-1], levels[-1]),
since the variable assignments are grouped
together.

* `metrics_to_PAM50.py`: Replace `np.empty`+`np.fill` with `np.full`

This saves us a line of code! Source:
https://stackoverflow.com/a/26289777
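The equivalence in miniature -- both arrays end up identical, but `np.full` does it in one step:

```python
import numpy as np

n = 5

# Before: allocate, then fill
arr_a = np.empty(n)
arr_a.fill(np.nan)

# After: one call
arr_b = np.full(n, np.nan)

assert np.isnan(arr_a).all() and np.isnan(arr_b).all()
```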

* `metrics_to_PAM50.py`: Use dictionary comprehensions

This saves us quite a bit of space, too!
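A sketch of the pattern (metric names here are illustrative): one comprehension replaces a loop that builds the dict entry by entry.

```python
import numpy as np

metric_names = ["area", "diameter_AP", "diameter_RL"]  # illustrative keys
nb_slices = 4

# One line initializes every metric to a NaN-filled array
metrics_dict = {k: np.full(nb_slices, np.nan) for k in metric_names}

assert set(metrics_dict) == set(metric_names)
assert all(np.isnan(v).all() for v in metrics_dict.values())
```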

* `metrics_to_PAM50.py`: Combine `levels` into single step

This is another space-saving commit.

* `metrics_to_PAM50.py`: Move dict init closer to loop

This helps to group related code:
  - The dictionary initialization
  - The loop that fills up the dictionary

* `metrics_to_PAM50.py`: Simplify interpolation by extracting out `diff`

This helps us to avoid using a bunch of `len()` calls, and makes the
conditionals a lot easier to read.
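A minimal sketch of the trim logic for the first level (toy lists, names mirror the real variables but the values are illustrative): `diff` is the length mismatch between the interpolated values and the target PAM50 slices.

```python
metrics_inter = [1.0, 2.0, 3.0, 4.0, 5.0]
slices_PAM50 = [10, 11, 12]

diff = len(metrics_inter) - len(slices_PAM50)  # here: 2
if diff > 0:
    # Too many interpolated values: drop the extras from the end
    metrics_inter = metrics_inter[:-diff]
elif diff < 0:
    # Too few: drop the unmatched target slices instead
    slices_PAM50 = slices_PAM50[:-abs(diff)]

assert len(metrics_inter) == len(slices_PAM50) == 3
```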

* `download.py`: Use `sct_testing_data-PR3977.zip`

This lets us temporarily test the new release, prior to
making any actual changes to the `sct_testing_data` repo

* `batch_processing.sh`: Use new option (`-normalize-PAM50`)

* Change `csa_pam50_PAM50.csv` -> `csa_pam50.csv`

* add missing `-` for `-normalize-PAM50`

* remove -vertfile error

* add error if normalize-PAM50 and vertfile doesn't exist

* move initialization of normalize_pam50 arg for vertfile check

* add function get all available vertebral levels

* force levels of PAM50 or native space if -normalize-PAM50 1

* check if levels is empty before setting the value

* use image.getNonZeroValue instead of get_all_vertebral_level

* add getNonZeroValues

* remove function get_all_vertebral_levels

* remove import of get_all_vertebral_level

* add updated cached results for csa_pam50.csv

* update row for csa_pam50.csv with updated cached results

* change row of csv file to check

* change error to FileNotFoundError

* Restore old permissions (100644 -> 100755)

* `download.py`: Replace dummy link with `r20230207` release

---------

Co-authored-by: valosekj <jan.valosek@upol.cz>
Co-authored-by: Joshua Newton <joshuacwnewton@gmail.com>
3 people committed Feb 8, 2023
1 parent e64f7a2 commit 2f592a9
Showing 8 changed files with 566 additions and 6 deletions.
2 changes: 2 additions & 0 deletions batch_processing.sh
@@ -93,6 +93,8 @@ sct_process_segmentation -i t2_seg.nii.gz -vert 2:3 -o csa_c2c3.csv
sct_detect_pmj -i t2.nii.gz -c t2 -qc "$SCT_BP_QC_FOLDER"
# Compute cross-section area at 60 mm from PMJ averaged on a 30 mm extent
sct_process_segmentation -i t2_seg.nii.gz -pmj t2_pmj.nii.gz -pmj-distance 60 -pmj-extent 30 -qc "$SCT_BP_QC_FOLDER" -qc-image t2.nii.gz -o csa_pmj.csv
# Compute morphometrics in PAM50 anatomical dimensions
sct_process_segmentation -i t2_seg.nii.gz -vertfile t2_seg_labeled.nii.gz -perslice 1 -normalize-PAM50 1 -o csa_pam50.csv
# Go back to root folder
cd ..

4 changes: 2 additions & 2 deletions spinalcordtoolbox/download.py
@@ -32,8 +32,8 @@
},
"sct_testing_data": {
"mirrors": [
"https://github.com/spinalcordtoolbox/sct_testing_data/releases/download/r20210330230310/sct_testing_data-r20210330230310.zip",
"https://osf.io/download/60629509229503022e6f107d/",
"https://github.com/spinalcordtoolbox/sct_testing_data/releases/download/r20230207/sct_testing_data-r20230207.zip",
"https://osf.io/5twvs/?action=download",
],
"default_location": os.path.join(__sct_dir__, "data", "sct_testing_data"),
},
10 changes: 10 additions & 0 deletions spinalcordtoolbox/image.py
@@ -599,6 +599,16 @@ def getNonZeroCoordinates(self, sorting=None, reverse_coord=False, coordValue=Fa

return list_coordinates

def getNonZeroValues(self, sorting=True):
"""
This function returns all the unique non-zero values that the image contains.
If sorting is set to True, the list will be sorted.
"""
list_values = list(np.unique(self.data[self.data > 0]))
if sorting:
list_values.sort()
return list_values

def getCoordinatesAveragedByValue(self):
"""
This function computes the mean coordinate of group of labels in the image. This is especially useful for label's images.
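Standalone sketch of what `getNonZeroValues` computes, using a toy label array in place of an `Image`'s `.data` attribute (the real method lives on the `Image` class above):

```python
import numpy as np

# Toy labeled volume: 0 = background, other integers = vertebral level labels
data = np.array([[0, 3, 2],
                 [2, 0, 5]])

# Same core expression as getNonZeroValues: unique non-zero labels
list_values = list(np.unique(data[data > 0]))

assert list_values == [2, 3, 5]  # np.unique already returns sorted values
```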
73 changes: 73 additions & 0 deletions spinalcordtoolbox/metrics_to_PAM50.py
@@ -0,0 +1,73 @@
# Functions to interpolate metrics (from sct_process_segmentation) into the PAM50 anatomical dimensions
# Author: Sandrine Bédard & Jan Valosek

import numpy as np
from spinalcordtoolbox.image import Image
from spinalcordtoolbox.aggregate_slicewise import Metric
from spinalcordtoolbox.template import get_slices_from_vertebral_levels


def interpolate_metrics(metrics, fname_vert_levels_PAM50, fname_vert_levels):
"""
Interpolates metrics perlevel into the PAM50 anatomical dimensions.
:param metrics: Dict of class Metric(). Output of spinalcordtoolbox.process_seg.compute_shape.
:param fname_vert_levels_PAM50: Path to the PAM50_levels.nii.gz (PAM50 labeled segmentation).
:param fname_vert_levels: Path to subject's vertebral labeling file.
:return metrics_PAM50_space: Dict of class Metric() in PAM50 anatomical dimensions.
"""
# Load PAM50 labeled segmentation
im_seg_labeled_PAM50 = Image(fname_vert_levels_PAM50)
im_seg_labeled_PAM50.change_orientation('RPI')
# Load subject's labeled segmentation
im_seg_labeled = Image(fname_vert_levels)
im_seg_labeled.change_orientation('RPI')

# Get unique integer vertebral levels (but exclude 0, 49, and 50, as these aren't vertebral levels)
levels = sorted(int(level) for level in np.unique(im_seg_labeled.data) if 0 < int(level) < 49)

# Get slices corresponding to each level
level_slices_PAM50 = [get_slices_from_vertebral_levels(im_seg_labeled_PAM50, level) for level in levels]
level_slices_im = [get_slices_from_vertebral_levels(im_seg_labeled, level) for level in levels]

# Find the mean scaling between the image and PAM50 (excluding first and last levels)
scales = [len(slices_PAM50)/len(slices_im) for slices_PAM50, slices_im
in zip(level_slices_PAM50[1:-1], level_slices_im[1:-1])]
scale_mean = np.mean(scales)

# Initialize a metrics dict filled by NaN with number of rows equal to number of slices in PAM50 template
z = im_seg_labeled_PAM50.dim[2] # z == number of slices
metrics_PAM50_space_dict = {k: np.full([z], np.nan) for k in metrics.keys()}
# Loop through each level's slices, populating the metrics dict
for level, slices_PAM50, slices_im in zip(levels, level_slices_PAM50, level_slices_im):
# Prepare vectors for the interpolation
if level in [levels[0], levels[-1]]:
# Note: since the first/last levels can be incomplete, we use the mean scaling factor from all other levels
x_PAM50 = np.linspace(0, scale_mean * len(slices_im), int(scale_mean * len(slices_im)))
x = np.linspace(0, scale_mean * len(slices_im), len(slices_im))
else:
x_PAM50 = np.arange(0, len(slices_PAM50), 1)
x = np.linspace(0, len(slices_PAM50) - 1, len(slices_im))
# Loop through metrics
for key, value in metrics.items():
if key != 'length':
metric_values_level = value.data[slices_im]
# Interpolate in the same number of slices
metrics_inter = np.interp(x_PAM50, x, metric_values_level)
# Scale interpolation of first and last levels (to account for incomplete levels)
diff = len(metrics_inter) - len(slices_PAM50)
if level == levels[0]:
# If the first level, scale from level below
if diff > 0:
metrics_inter = metrics_inter[:-diff]
elif diff < 0:
slices_PAM50 = slices_PAM50[:-abs(diff)]
elif level == levels[-1]:
# If the last level, scale from level above
if diff > 0:
metrics_inter = metrics_inter[diff:]
elif diff < 0:
slices_PAM50 = slices_PAM50[abs(diff):]
metrics_PAM50_space_dict[key][slices_PAM50] = metrics_inter

# Convert dict of ndarrays to dict of Metric() objects
return {k: Metric(data=np.array(v), label=k) for k, v in metrics_PAM50_space_dict.items()}
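The core resampling step above, isolated as a sketch (values are illustrative, not real CSA measurements): a per-slice metric measured over 4 subject slices is interpolated onto 6 PAM50 slices for one mid-cord level, exactly as in the `else` branch.

```python
import numpy as np

metric_values_level = np.array([70.0, 72.0, 74.0, 76.0])  # subject-space values
n_pam50 = 6  # number of PAM50 slices for this level

# Target grid: one point per PAM50 slice; source grid: the subject's slices
# stretched over the same interval
x_PAM50 = np.arange(0, n_pam50, 1)
x = np.linspace(0, n_pam50 - 1, len(metric_values_level))
metrics_inter = np.interp(x_PAM50, x, metric_values_level)

assert len(metrics_inter) == n_pam50
# np.interp preserves the endpoints of the source data
assert metrics_inter[0] == 70.0 and metrics_inter[-1] == 76.0
```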
36 changes: 32 additions & 4 deletions spinalcordtoolbox/scripts/sct_process_segmentation.py
@@ -8,8 +8,7 @@
#
# ---------------------------------------------------------------------------------------
# Copyright (c) 2014 Polytechnique Montreal <www.neuro.polymtl.ca>
# Author: Benjamin De Leener, Julien Touati, Gabriel Mangeat
# Modified: 2014-07-20 by jcohenadad
# Author: Benjamin De Leener, Julien Touati, Gabriel Mangeat, Sandrine Bédard, Jan Valosek, Julien Cohen-Adad
#
# About the license: see the file LICENSE.TXT
#########################################################################################
@@ -31,12 +30,15 @@
from spinalcordtoolbox.process_seg import compute_shape
from spinalcordtoolbox.scripts import sct_maths
from spinalcordtoolbox.csa_pmj import get_slices_for_pmj_distance
from spinalcordtoolbox.metrics_to_PAM50 import interpolate_metrics
from spinalcordtoolbox.centerline.core import ParamCenterline
from spinalcordtoolbox.image import add_suffix, splitext
from spinalcordtoolbox.image import add_suffix, splitext, Image
from spinalcordtoolbox.reports.qc import generate_qc
from spinalcordtoolbox.utils.shell import SCTArgumentParser, Metavar, ActionCreateFolder, parse_num_list, display_open
from spinalcordtoolbox.utils.sys import init_sct, set_loglevel, __sct_dir__
from spinalcordtoolbox.utils.fs import get_absolute_path
from spinalcordtoolbox import __data_dir__
from spinalcordtoolbox.utils import sct_progress_bar

logger = logging.getLogger(__name__)

@@ -234,6 +236,14 @@ def get_parser():
"Given the risks and lack of consensus surrounding CSA normalization, we recommend thoroughly reviewing "
"the literature on this topic before applying this feature to your data.\n"
)
optional.add_argument(
'-normalize-PAM50',
metavar=Metavar.int,
type=int,
choices=[0, 1],
default=0,
help="Set to 1 to bring the metrics into the PAM50 anatomical dimensions per slice. The options -vertfile and -perslice 1 must also be specified."
)
optional.add_argument(
'-qc',
metavar=Metavar.folder,
@@ -375,10 +385,13 @@ def main(argv: Sequence[str]):
append = bool(arguments.append)
levels = arguments.vert
fname_vert_level = arguments.vertfile
normalize_pam50 = arguments.normalize_PAM50
if not os.path.isfile(fname_vert_level):
logger.warning(f"Vertebral level file {fname_vert_level} does not exist. Vert level information will "
f"not be displayed. To use vertebral level information, you may need to run "
f"`sct_warp_template` to generate the appropriate level file in your working directory.")
if normalize_pam50:
raise FileNotFoundError("The vertebral level file must exist to use -normalize-PAM50.")
fname_vert_level = None # Discard the default '-vertfile', so that we don't attempt to find vertebral levels
if levels:
raise FileNotFoundError("The vertebral level file must exist to use `-vert` to group by vertebral level.")
Expand All @@ -397,6 +410,8 @@ def main(argv: Sequence[str]):
qc_dataset = arguments.qc_dataset
qc_subject = arguments.qc_subject

if normalize_pam50 and not perslice:
parser.error("Option '-normalize-PAM50' requires option '-perslice 1'.")
if distance_pmj is not None and fname_pmj is None:
parser.error("Option '-pmj-distance' requires option '-pmj'.")
if fname_pmj is not None and distance_pmj is None and not perslice:
@@ -410,6 +425,18 @@
param_centerline=param_centerline,
verbose=verbose,
remove_temp_files=arguments.r)
if normalize_pam50:
fname_vert_level_PAM50 = os.path.join(__data_dir__, 'PAM50', 'template', 'PAM50_levels.nii.gz')
metrics_PAM50_space = interpolate_metrics(metrics, fname_vert_level_PAM50, fname_vert_level)
if not levels: # If no levels -vert were specified by user
if verbose == 2:
# Get all available vertebral levels from PAM50 template to only include slices from available levels in .csv file
levels = Image(fname_vert_level_PAM50).getNonZeroValues()
else:
# Get all available vertebral levels to only include slices from available levels in .csv file
levels = Image(fname_vert_level).getNonZeroValues()
metrics = metrics_PAM50_space # Set metrics to the metrics in PAM50 space to use instead
fname_vert_level = fname_vert_level_PAM50 # Set vertebral levels to PAM50
if fname_pmj is not None:
im_ctl, mask, slices, centerline, length_from_pmj = get_slices_for_pmj_distance(fname_segmentation, fname_pmj,
distance_pmj, extent_pmj,
@@ -422,7 +449,8 @@
np.savetxt(fname_ctl_csv + '.csv', centerline, delimiter=",")
else:
length_from_pmj = None
for key in metrics:
# Aggregate metrics
for key in sct_progress_bar(metrics, unit='iter', unit_scale=False, desc="Aggregating metrics", ascii=True, ncols=80):
if key == 'length':
# For computing cord length, slice-wise length needs to be summed across slices
metrics_agg[key] = aggregate_per_slice_or_level(metrics[key], slices=slices,
